The n Habits of Highly Defective Windows Applications

Home
Back To Tips Page

OK, so it's a pun on "The 7 Habits of Highly Effective People". But there are a number of programming techniques that I've found virtually instantly identify a defective program. In addition, there are a number of techniques that mark a program as deeply suspect, and likely to break. Finally, there are a number of techniques which represent poor style, some exceptionally poor, which violate principles of modularity and which result in programs that are clumsy to construct, difficult to debug, and impossible to maintain.

Many of these are simple well-known programming issues; for example, the use of constants where a #define is more appropriate. For some reason, I find that there is a tendency of programmers to assign certain integers, such as control IDs, to random hard-coded values. This is a serious problem. But there are a number of techniques that also indicate serious flaws in a program. I will summarize these here, and clicking on one will link to a discussion of it.

Generally, when I get a program from a client who is having problems, I can solve most of the problems by looking for the items I describe here. My track record is that I have been shown code on a page, or on a screen, containing one or more of the techniques I discuss here, and I point to it and say "There's your problem!", usually within the first 30 seconds. It sometimes takes another ten or twenty minutes to explain to the programmer what has gone wrong as a consequence, and demonstrate exactly what can go wrong in the future, but thus far I've never been wrong in my assessment. On one recent consulting trip, I spotted a Sleep() inside a WaitFor..., and said "there's the problem", and then explained "Your program will fail under the following conditions (a), (b), (c) and the behavior it will exhibit on failure will be (d), (e) and (f)" and they said "Yes, that's exactly where and how it is failing!". As it turned out, there was nothing wrong with that part of the program that a complete rewrite wouldn't solve. And the implications on the rest of the program architecture were significant, so they now have several weeks' work to fix the problem, but they now understand why it couldn't possibly have worked, except by a great miracle. I'll talk more about the failure mode details later.

I have a little awk script I run on *.cpp, and it locates these "hotspots". Usually in less than half an hour I can locate more problems than I have been asked to solve just by studying the code near the hotspots. (I consider this a seriously good revenue generator, since instead of the 1-day or 3-day fix expected, I often get a week or more out of it, since the client, like the one described above, says "Yes, we never did figure out why that was failing..." followed by "go ahead and fix that while you're in there"

I've put some Severity notifications in these descriptions. My rating of these is

Note that I have not had time to complete all the points in this article, but it has been sitting in my computer for months and I thought it time to finally get it out, finished or not!

Threading

Processes

Clipboard

Timing

Synchronization

Memory Allocation

Dialogs, forms, and controls

I/O

Program Structure


AfxExitThread(), _endthread(), _endthreadex(), or ExitThread()

Severity: potentially fatal

These will result in erroneous programs. The only correct way to exit a thread is to either exit from the top-level thread function (straight worker threads in MFC or pure C), to do a PostQuitMessage from within a UI thread, or do a PostThreadMessage(WM_QUIT,0,0) to a UI thread from outside it.

Why? Because these functions cause the thread to terminate immediately. (Well, from the viewpoint of the thread, "immediately"; the DLL_THREAD_DETACH calls will still be properly made, so it is not as bad as using TerminateThread). Any pending objects on the stack will not have their destructors called. Resources will leak. Worse still, if some of those objects are synchronization objects, such as a CSingleLock or CMultiLock object, the lock will never be released, which will eventually result in the program blocking indefinitely.

So, you claim, you don't do any of this? Sure. Right. That's this week. A year from now, when you are working deep in the thread logic, and discover a failure to synchronize, and add one of these, you are now dead. You just don't know it yet.

Then there's the issue of expectations. When a top-level thread function looks like this

/* static */ UINT CMyClass::threadfunc(LPVOID p)
   {
    while(somecondition)
       { /* thread loop */
        DoSomething();
       } /* thread loop */
    return 0;
   } // CMyClass::threadfunc

and you decide to add something useful to the completion, such as

/* static */ UINT CMyClass::threadfunc(LPVOID p)
   {
    while(somecondition)
       { /* thread loop */
        DoSomething();
       } /* thread loop */
    wnd->PostMessage(UWM_THREAD_COMPLETED);
    clean up thread state
    return 0;
   } // CMyClass::threadfunc

this is based on the assumption that you will actually exit the thread through the top-level thread function. Any premature exit of the thread handled at some lower level will bypass this code. So the program will occasionally malfunction in bizarre, possibly non-reproducible ways. 

Why would you ever want to exit the thread prematurely? Many valid reasons. The file that the thread is supposed to open isn't accessible. The network connection it was managing was disconnected. You had an internal data structure consistency error.

So how do you exit a thread prematurely? Simple: throw an exception. For example, I would write

/* static */ UINT CMyClass::threadfunc(LPVOID p)
   {
    while(somecondition)
       { /* thread loop */
        TRY 
          { /* try */
           DoSomething();
          } /* try */
        CATCH(CMyTerminateException, e)
          { /* catch */
           e->Delete();
           break;
          } /* catch */
       } /* thread loop */
    wnd->PostMessage(UWM_THREAD_COMPLETED);
    clean up thread state
    return 0;
   } // CMyClass::threadfunc

To define your own exception, you can just derive from the general MFC CException class, e.g.,

class CMyException : public CException {
    public:
        CMyException() { }
        virtual ~CMyException() { }
};

If you want to pass parameters in the CMyException, just add parameters to the constructor, and appropriate member variables. To throw the exception, just write

throw new CMyException;

In pure C, you should use the __try/__except mechanism and you can only throw integer exceptions, but the technique is the same.

Note that __try/__finally will guarantee cleanup, but only if the thread does not terminate; so while the __finally clause is "guaranteed" to execute, this guarantee applies only if control passes through it. An ExitProcess or ExitThread (or anything that conceals them) will bypass this effect.

Note that in MFC, explicitly terminating a thread is always a potentially fatal operation, because the per-thread handle maps are not cleaned up, including temporary objects. This will leak storage, as well as leaving various kernel objects orphaned. AfxEndThread will actually clean up the MFC state correctly, but has all the other hazards, so don't do it.

Note also that the documentation of _beginthread is misleading; it suggests that calling _endthread "helps to ensure proper recovery of resources allocated for the thread", suggesting that it makes sense for you to do this. Yet it also states that , "_endthread or _endthreadex is called automatically when the thread returns from the routine passed as a parameter", without explaining how it knows to call which function (it is actually because _beginthread will implicitly call _endthread and _beginthreadex will actually call _endthreadex, which you can easily see by reading the source code for the C runtime in

d:\Program Files\Microsoft Visual Studio\vc98\crt\src\thread.c

and

d:\Program Files\Microsoft Visual Studio\vc98\crt\src\threadex.c

(Providing you installed the C runtime source; if you didn't, you should have. Go install it now)

What they really mean is that you should not call ExitThread, which would not clean up these resources. There is no justification for explicitly calling _endthread or _endthreadex yourself.


_beginthread(), _beginthreadex(), or CreateThread() in an MFC program

Severity: potentially fatal

MFC requires certain internal state be set up properly for its correct functioning in a multithreaded environment. This includes creating (and upon completion, destroying) the per-thread handle maps. If you use these methods, you bypass MFC's maintenance of its internal state.

If you call no MFC functions from within the thread, and use no MFC objects in any form, you might get away with using _beginthread, _beginthreadex or CreateThread. But that means that using any MFC function inadvertently opens you to potentially fatal problems. Since there is essentially no reason to use these, you should use only AfxBeginThread.


CreateThread in a C program

Severity: serious, potentially fatal

The C runtime comes in two flavors: the single-threaded libraries (normal, debug, and DLL) and the multithreaded libraries (normal, debug, and DLL). It is essential that you use the multithreaded C library (in any of its forms). Unfortunately, the default for a C application is to use the single-threaded library. This library gains some efficiency at the cost of losing all thread safety. 

If you call _beginthread or _beginthreadex, the functions will generate a compiler error if you have not gone into Project | Settings | Code Generation and selected the multithreaded library. Therefore, you know that if you get a compilation error, you have to enable the multithreaded library. If, however, you use CreateThread, this is an API function, which is always defined. Therefore you can create threads which use the single-threaded C library from multiple threads.

This technique is perfectly safe if you never call any C library functions from the thread. However, maintaining this restriction is difficult over the lifetime of the program. Since _beginthread or, if you need all the capabilities, _beginthreadex, already provide everything you need to do, there is little if any reason for using CreateThread.


TerminateThread

Severity: potentially fatal

This is always a bad idea. Always. It is to my deep embarrassment that in my early writings on threading, in Win32 Programming, I made the mistake of using this. At least I said that I knew it was a Bad Idea, but I was under a fair amount of pressure to finish the book, and was not as careful as I should have been. I have documented the proper ways of terminating a blocked thread in a companion essay

The problem with calling TerminateThread is that it will instantly terminate the thread, and will even suppress the DLL_THREAD_DETACH calls that many DLLs rely on for correct behavior.

Never use this call. If you think you need it, recode your program so you don't need it.


Polling for thread completion

Severity: inefficient

A common failure mode in multithreaded programming is to have some thread "wait" for another by asking "are we there yet? are we there yet?". All this accomplishes is to waste a significant amount of CPU time, since, unless you have a multiprocessor, the thread that is "waiting" will consume its entire time slice asking for a condition that cannot possibly be set to true because during the time that thread is running, the thread that is being waited for is not running and therefore cannot possibly change the state!

In a multiprocessor, it means the waiting processor simply wastes a significant amount of its time waiting for a thread to finish; that time could be better spent doing something productive, such as running one of the other threads of your application.

It doesn't matter how you poll; some people use a Boolean variable, waiting for it to go TRUE, and some use GetThreadExitCode. Both of these are equivalent, and wrong.

There are two techniques for dealing with thread completion. The simplest method, which I will explain why has serious defects in a different section of this essay, is to simply do a WaitFor... operation on the thread handle. The thread handle remains non-signaled as long as the thread is running (or potentially runnable) and becomes signaled when the thread terminates. Therefore, the thread that is waiting for the completion uses no CPU time while it is waiting. However, since this blocks the waiting thread, and that can be the main GUI thread, this is not generally recommended in the main thread.

The second method is to have the thread PostMessage a completion message (user-defined) to the GUI thread. This can be handled by either a view or the main frame. The thread that receives the message then can enable/disable controls respond to the completion, and so on.


Blocking the main GUI thread

Severity: deeply suspect

A problem that comes up fairly often in the newsgroups deals with the fact that the "GUI is non-responsive", usually to something like a "Cancel" button not being able to cancel a running thread. It is essential that you not block the GUI thread if you expect features like this to work. It is generally a Very Bad Idea to block the GUI thread.

The usual excuse is "but I don't want to have the user start another thread" or something like that. Fine. This is trivial. Just disable the controls that would initiate a new thread. For example,

void CMyView::OnStartComputation()
   {
    AfxBeginThread(...);
    threadrunning = TRUE;
   }
void CMyView::OnUpdateStartComputation(CCmdUI * pCmdUI)
   {
    pCmdUI->Enable(!threadrunning);
   }

I then use a user-defined message which the thread posts back to the main GUI thread just after the thread loop completes, using

wnd->PostMessage(UWM_THREAD_FINISHED);

In the handler for wnd, add

ON_MESSAGE(UWM_THREAD_FINISHED, OnThreadFinished)

or

ON_REGISTERED_MESSAGE(UWM_THREAD_FINISHED, OnThreadFinished)
LRESULT CMyView::OnThreadFinished(WPARAM, LPARAM)
   {
     threadrunning = FALSE;
   }

For a dialog, read my essay on dialog control management. What I do is

void CMyDialog::OnStartComputation()
   {
    AfxBeginThread(...);
    threadrunning = TRUE;
    updateControls();
   }
void CMyDialog::updateControls()
   {
    ...
    c_StartComputation.EnableWindow(!threadrunning);
   }

The only difference in the handler for handling the OnThreadFinished event in a dialog is that I call updateControls() explicitly. 

For more details on handling user-defined messages, see my essay on message management.

The use of a blocking call on the main GUI thread, or alternatively a long computation, produces unexpected and generally unpleasant effects on the user interface. The failure to put up a wait cursor exacerbates this.


Using PeekMessage anywhere

Severity: serious, potentially fatal, inefficient, poor style. (It loses in a lot of ways!)

First rule: PeekMessage is obsolete

Second rule: If you think you need PeekMessage, consult rule 1.

Actually, PeekMessage can be useful, if used with extreme care. For example, the MFC message pump uses it, and that code demands very careful study if you ever think you want to use PeekMessage. It is subtle code. If you cannot be at least as subtle as this code, don't even consider trying to use it.

But in most cases, just pretend you never heard of PeekMessage. Most people who try to use it get it so badly wrong that its very existence has negative effect on program quality. You are always safe if you apply the two rules above. While they are a heuristic for testing the validity of a program, one of my criteria for determining if a program is seriously defective is the mere presence of this API call. In all cases I have seen this used, it has been used incorrectly. So as a simple heuristic I can state that nearly everyone gets it wrong, and therefore, its very presence is a good indicator of a fundamentally flawed program design.

If you think you have to use it, you are almost certainly better off using threads. But I've seen some pretty bad code when threads are used. So if you think you need it in a single-threaded program, your structure is probably wrong, and you should have used multiple threads. And if you think you need it in a multithreaded application, you're almost certainly wrong, since there is nothing meaningful it can do.

For example, here is an adaptation from a piece of code I found on the MFC newsgroup. It is actually a hybrid of several serious errors I've found in multithreaded programs. I decided to discuss all of these in one synthesized example.

// The global variable that controls and reports the thread state
BOOL running = FALSE;
CRITICAL_SECTION lock;

void CMyClass::OnInitThread()
  {
   EnterCriticalSection(&lock);
   running = TRUE;
   LeaveCriticalSection(&lock);
   AfxBeginThread(threadfunc, parms);
   // Do not block the GUI thread while the worker is running
   while(running)
     {
      MSG msg;
      if(PeekMessage(&msg, 0, 0, NULL, PM_REMOVE))
        { 
         TranslateMessage(&msg);
         DispatchMessage(&msg);
        }
     }

   DoSomething();
  }
UINT threadfunc(LPVOID p)
   {
    BOOL isrunning;
    while(TRUE)
      {
       EnterCriticalSection(&lock);
       isrunning = running
       LeaveCriticalSection(&lock);
       if(!isrunning)
          break;
       ...do thread thing here...
       if(thread_has_completed_its_work)
         { 
          EnterCriticalSection(&lock);
          running = FALSE;
          LeaveCriticalSection(&lock);
         }
      }
    }

What's Wrong With This Picture? 

To be blunt, nearly everything. The AfxBeginThread is salvageable, as is the code represented by the "...do thread thing here...". The test for the thread having completed its work. The function call for doing something when the thread has finished. Nothing else is salvageable. This is a horror of the first order.

First, there is a global variable in use. There is no need for a global variable. There are a lot of ways of dealing with this, but a global variable is an indication there is something wrong with the structure. See, for example, my essays on worker threads, and on callbacks. These show why you would not ever need a global variable.

Then there is the global thread function. This is not necessary. This can be a static class member function (again, see my previously-cited essays on worker threads and callbacks).

The BOOL variable access is carefully synchronized, which is unnecessary, and it is not declared volatile, which is necessary.

But the worst horror of all is the PeekMessage loop with its accompanying comment. This is absolutely, positively incorrect program structure. The failure in thinking here is that the actions which must follow the thread completion must physically follow the thread invocation, in the same function in which the thread invocation appears. This is completely nonsensical. The completion actions are syntactically unrelated to the thread initiation. They are only required to temporally follow the execution of the thread.

The PeekMessage loop is absurd here. It means that while the thread is running, half the CPU cycles go to running a thread that is doing absolutely nothing but consuming CPU cycles in a completely pointless and nonproductive fashion!

The correct solution is to get rid of the PeekMessage loop entirely. There is no justification for its existence.

One of several correct approaches can be characterized by the code below:

class CMyWndClass : public CSomeWindowBaseClass {
    ... lots of stuff here
    protected:
      volatile BOOL running;
      static UINT threadfunc(LPVOID p);
      void RunThread();
      afx_msg LRESULT OnThreadFinished(WPARAM, LPARAM);
};

void CMyWndClass::OnInitiateThread()
   {
    running = FALSE; 
    updateControls(); // dialogs only
    AfxBeginThread(threadfunc, this);
   } 

That's all that is required. Note that there is absolutely nothing else happens in this function! That's because there is nothing else that needs to happen.

When this function returns, the built-in message loop resumes. If there is nothing to do, the message loop blocks on the GetMessage, thus not consuming any CPU cycles because there is nothing else to do. In the best case, if this were the only application requiring service on the CPU, 100% of the CPU cycles would go to accomplishing useful work in the thread, instead of 50% to a useless PeekMessage loop and 50% to the thread. Read: if your task took ten minutes under the previous implementation, it takes five minutes under this one. That performance improvement matters.

Now what does the thread function look like? For details, consult my essay on worker threads, but the code is simply expressed as

/* static */ UINT CMyWndClass::threadfunc(LPVOID p)
   {
     ((CMyWndClass *)p)->RunThread();
     return 0; // return value never, ever is of interest
   }

void CMyWndClass::threadfunc()
   {
    while(running)
      { /* thread loop */
       ...do thread thing here...
       if(thread_has_completed_its_work)
         break;
      } /* thread loop */
    PostMessage(UWM_THREAD_FINISHED); // See my essay on Message Management
   }

LRESULT CMyWndClass::OnThreadFinished(WPARAM, LPARAM)
   {
    DoSomething();
    running = FALSE;
    updateControls(); // dialogs only
    return 0;
   }

This is not only efficient, but note that it does not do any synchronization! Why not? Clearly, the running flag is being accessed by two different threads: the GUI-level thread can set it to FALSE by the user clicking a Stop button or selecting a Stop menu item or any other of a variety of ways. So why don't we need synchronization?

Synchronization is needed only when concurrent access can produce incorrect results. The main GUI thread never reads the value; it only writes it. The worker thread only reads it, but if there is any timing error it really doesn't matter in the slightest. The worst case is that the worker thread might go through the loop one more time if there was a concurrent access that caused the thread to see a TRUE although concurrently there was a FALSE being set. So functionally, this is the same as if the user had hesitated perhaps an extra nanosecond on the click. So no synchronization is required.

What about the argument that "I don't want the user to start the thread(s) again until it(they) finish". Well, the first example code doesn't actually prevent that from happening! So it would be possible to start a new of thread running each time the operation was re-selected. You need some way to disable the initiation of new threads. The simplest way to do this in a view-based app that is using menu items or toolbar buttons is to simply have an UPDATE_COMMAND_UI handler:

void CMyView::OnUpdateStartThreads(CCmdUI * pCmdUI)
   {
     pCmdUI->Enable(!running);
   }

The controls are disabled if the thread is running. For a dialog, or a CFormView, where you have a button on the dialog or form, the updateControls handler I describe in my essay on control management would be

void CMyDialog::updateControls()
   {
    ...other stuff here
    c_StartThread.EnableWindow(!running);
   }

Note how much simpler the code is. There are no temporal dependencies that are enforced by syntactic sequencing. The system is fully asynchronous, requires a minimum of synchronization (none), consumes no CPU cycles by polling for messages, and is easier to understand.


Failure to use volatile

Severity: potentially fatal

Any variable that is shred between two threads and can be changed concurrently must be marked as volatile. This keeps the compiler from doing certain optimizations, such as code motions, constant propagation and common subexpression elimination, on a variable whose value might change spontaneously.

Note that declare a variable as volatile does not make it thread-safe. But a thread-safe variable, properly handled with synchronization primitives, can still produce incorrect results in the Release version if the variable is not declared volatile.

Example:

while(running)
    { /* thread loop */
     ...bunch of stuff here
    } /* thread loop */

looks like a perfectly good loop, where running is, for example, a member variable of a class and can be modified by some other thread. For example, you might do

void CMyDialog::OnCancel()
   {
    thread->running = FALSE;
   }

However, the optimizing compiler might detect that within the thread loop, nothing modifies the running variable. If it believes this, it feels perfectly free to transform the program as if it had been coded

if(running)
   { /* thread loop */
     compiler_generated_label:
       ...bunch of stuff here
     goto compiler_generated_label;
    } /* thread loop */

Now, you say, that's ridiculous! That's a completely different program! Not so. From the viewpoint of code optimization the two programs are absolutely identical. This is because the optimizer assumes single-thread semantics. And this assumption is completely wrong.

If, however, you declared the member variable as

volatile BOOL running;

then the compiler recognizes that some mechanism it is not aware of can change the value, and it will not do the transformation. 


Failure to synchronize access to a variable between threads

Severity: potentially fatal

No variable or structure should be accessed from multiple threads without being accessed in a thread-safe fashion. Note that any thread-safe access must also be multiprocessor-safe. 

Note that this applies only to variables where concurrent access can produce incorrect results. There are situations in which concurrent access can never result in an incorrect result. One such example is discussed in this essay in the section on the (mis)use of PeekMessage.

However, for situations in which concurrent access can produce an incorrect result, you must provide proper synchronization. This means that even trivial actions like

volatile int a;
a++; // not safe!

cannot possibly work. If you look at the generated code, you will find that it actually takes a couple instructions to increment a, but even if the compiler generated an inc instruction, that takes two memory cycles, which means that in a multiprocessor environment the other processor could simultaneously fetch the initial value. Then both processors would increment the value (say, from 17 to 18) and the value 18 would be stored back, first by one processor, then by the other. So executing a++ twice would cause the value to go from 17 to 18, which doesn't make much sense.

If you believe this cannot happen, you are wrong. I first found this in an operating system in 1968. I found that instead of doing the multithread-safe increment, it did a multithread-unsafe increment. I reported this to my manager, who reported it to IBM. In addition I fixed it. Suddenly, the MTBF of the system went up; in retrospect, examining the "indeterminate" crash memory dumps, we determined that one crash a week happened because there was a thread preemption between the instruction that tested the value and the instruction that set the value.

For simple operations, like increment, decrement, simple addition and subtraction, and a compare-and-exchange, there are Interlocked... operations, so the correct form of the code would be

volatile long a;
InterlockedIncrement(&a);

There are also InterlockedDecrement, InterlockedExchangeAdd, and InterlockedCompareAndExchange, among others.

However, generally you need to provide higher-level synchronization on objects, because very few changes can take place in a single instruction. So you should use CRITICAL_SECTION or a Mutex, or the MFC classes CCriticalSection or CMutex to provide the necessary synchronization.

There is a persistent piece of misinformation which I have said back to me often enough to have realized this has now achieved the status of Programming Legend: by declaring a variable as volatile access to it is implicitly synchronized. Nothing could be further from the truth. The truth is that volatile in no way provides synchronization. What it does do is inform the other threads that they cannot cache the value, which is not at all the same! There is absolutely, positively, no synchronization whatsoever provided by declaring a variable as volatile. You, and you alone, are responsible for providing synchronization. If, however, you also fail to declare the variable as volatile, the cached value may be used by another thread, which would also be incorrect. You must do both.

See my companion essay on synchronization.


Overriding the Run() method in a CWinThread class

Severity: fatal. No "potentially" about it. Programs that are written this way invariably do not even work, from day one.

Basically, there is no reason to do this. Or the reasons are so exotic that you have to be a very advanced user to do this sanely.

Hint: I've been writing threaded code for over a decade and not once, not ever, have I had a need to override the Run() method.

Invariably this is the result of a misunderstanding of the purpose of a UI thread.

The purpose of a UI thread is to have a message pump running. This is important for dispatching events such as socket messages and other asynchronous events from libraries such as SAPI (the speech API). Overriding the Run method will block these messages. The most common failure mode is to put an infinite loop in the Run() method. This can be assumed to make no sense. If you need an infinite loop, use a general worker thread. If you need a UI thread, you do not need an infinite loop in it.

Whatever you are trying to do, overriding Run is going to be the wrong way to accomplish it. Putting an infinite loop in the Run method is always a mistake.


Putting an infinite loop in the InitInstance handler of a UI thread

This makes no sense at all. Ever.

It defeats the purpose of having a UI thread. There is absolutely no way this could ever make sense in a UI thread. Use a worker thread. It will block the message pump, which is the only reason you would ever need a UI thread. Therefore it represents a fundamental design flaw.

Whatever was intended by this, the choice of putting it in a UI thread is always a mistake.


SendMessage between threads

Severity: serious, potentially fatal

Never use SendMessage between threads. The result of this can be a deadlock situation. The deadlock can be unpredictable and therefore difficult to debug if it appears to be happening. So the easiest way to avoid this is to never use it.

Use PostMessage to send messages between threads. This also means that a worker thread cannot use most methods on controls, since they all end up being SendMessage calls. Here are just a few of the operations you must not do on a control or window from a thread which does not own it. I don't intend to reproduce the entire list here. Assume that if you are doing anything to a window (or a control, which is just a window) from a thread, except PostMessage, your program is at serious risk.

Note that I've had people say "But I'm not sending any messages to the controls; I use MFC methods!" which is actually the same as saying "I send messages to controls". MFC is just a nice syntactic wrapper on hundreds of SendMessage operations.

Message prefix Example messages MFC equivalents
BM_ BM_GETCHECK CButton::GetCheck
BM_SETCHECK CButton::SetCheck
CB_ CB_ADDSTRING CComboBox::AddString
CB_DELETESTRING CComboBox::DeleteString
CB_FINDSTRING CComboBox::FindString
CB_GETCOUNT CComboBox::GetCount
CBEM_ CBEM_DELETEITEM CComboBoxEx::DeleteItem
CBEM_GETITEM CComboBoxEx::GetItem
CBEM_INSERTITEM CComboBoxEx::InsertItem
DM_ DM_GETDEFID CDialog::GetDefID
DM_SETDEFID CDialog::SetDefID
DTM_ DTM_GETRANGE CDateTimeCtrl::GetRange
DTM_GETSYSTEMTIME CDateTimeCtrl::GetSystemTime
DTM_SETRANGE CDateTimeCtrl::GetRange
EM_ EM_GETLIMITTEXT CEdit::GetLimitText
EM_GETLINE CEdit::GetLine
EM_GETLINECOUNT CEdit::GetLineCount
HDM_ HDM_DELETEITEM CHeaderCtrl::DeleteItem
HDM_GETITEM CHeaderCtrl::GetItem
HDM_INSERTITEM CHeaderCtrl::InsertItem
HDM_SETITEM CHeaderCtrl::SetItem
IPM_ IPM_CLEARADDRESS CIPAddressCtrl::ClearAddress
IPM_GETADDRESS CIPAddressCtrl::GetAddress
IPM_ISBLANK CIPAddressCtrl::IsBlank
IPM_SETADDRESS CIPAddressCtrl::SetAddress
LB_ LB_ADDSTRING CListBox::AddString
LB_DELETESTRING CListBox::DeleteString
LB_GETCURSEL CListBox::GetCurSel
LB_SETCURSEL CListBox::SetCurSel
LVM_ LVM_DELETEITEM CListCtrl::DeleteItem
LVM_EDITLABEL CListCtrl::EditLabel
LVM_GETITEM CListCtrl::GetItem
MCM_ MCM_GETCURSEL CMonthCalCtrl::GetCurSel
MCM_GETMONTHRANGE CMonthCalCtrl::GetMonthRange
MCM_SETRANGE CMonthCalCtrl::SetRange
PBM_ PBM_GETPOS CProgessCtrl::GetPos
PBM_SETRANGE32 CProgressCtrl::SetRange32
PBM_STEPIT CProgressCtrl::Stepit
PSM_ PSM_ADDPAGE CPropertySheet::AddPage
PSM_GETPAGEINDEX CPropertySheet::GetPageIndex
PSM_GETACTIVEPAGE CPropertySheet::GetActivePage
RB_ RB_GETBANDCOUNT CRebarCtrl::GetBandCount
RB_GETBKCOLOR CRebarCtrl::GetBkColor
RB_SETBANDINFO CRebarCtrl::SetBandInfo
SB_ SB_GETTEXT CStatusBar::GetPaneText
CStatusBarCtrl::GetText
SB_SETTEXT CStatusBar::SetPaneText
CStatusBarCtrl::SetText
SBM_ SBM_GETPOS CScrollBar::GetPos
SBM_SETPOS CScrollBar::SetPos
SBM_SETRANGE CScrollBar::SetRange
STM_ STM_GETICON CStatic::GetIcon
STM_GETIMAGE CStatic::GetImage
STM_SETICON CStatic::SetIcon
STM_SETIMAGE CStatic::SetImage
TB_ TB_ISBUTTONENABLED CToolbarCtrl::IsButtonEnabled
TB_PRESSBUTTON CToolbarCtrl::PressButton
TB_SETSTATE CToolBarCtrl::SetState
TBM_ TBM_CLEARSEL CSliderCtrl::ClearSel
TBM_GETNUMTICS CSliderCtrl::GetNumTics
TBM_SETBUDDY CSliderCtrl::SetBuddy
TCM_ TCM_ADJUSTRECT CTabCtrl::AdjustRect
TCM_GETITEM CTabCtrl::GetItem
TCM_SETIMAGELIST CTabCtrl::SetImageList
TTM_ TTM_ADDTOOL CToolTipCtrl::AddTool
TTM_GETTOOLINFO CToolTipCtrl::GetToolInfo
TTM_NEWTOOLRECT CToolTipCtrl::SetToolRect
TVM_ TVM_DELETEITEM CTreeCtrl::DeleteItem
TVM_GETCOUNT CTreeCtrl::GetCount
TVM_GETITEMRECT CTreeCtrl::GetItemRect
UDM_ UDM_GETBUDDY CSpinCtrl::GetBuddy
UDM_SETRANGE32 CSpinCtrl::SetRange32

Believing a thread starts immediately/Believing a thread does not start immediately

Severity: potentially fatal

Rule 1: any time you assume that a non-suspended thread starts immediately, you are wrong
Rule 2: any time you assume there is a delay in starting a non-suspended thread, you are wrong
Rule 3: there are no exceptions for non-suspended threads. See rules 1 and 2.

An example of the first failure is seen by the number of people who pass a local variable to a thread:

SOMESTRUCT s;
s.whatever = somevalue;
s.thing = something;
AfxBeginThread(threadfunc, &s);

The first failure is that the code will be something like

UINT threadfunc(LPVOID p)
    SOMESTRUCT * s = (SOMESTRUCT *)p;
    ...stuff
    ... = s->whatever; // invalid access

By the time the thread gets around to accessing the s->whatever field, the struct is long gone. The creating function returned, its stack frame is gone, and the space formerly used to hold SOMESTRUCT s is now being used for some other purpose. It gets even worse if the contents of SOMESTRUCT were pointers; they now point to someplace, but where? Who knows? Certainly serious malfunctions will result if they are used to load or store anything.

I've seen a solution posted that looked like this:

UINT threadfunct(LPVOID p)
   {
     SOMESTRUCT s = *(SOMESTRUCT *)p;

and the explanation was that by making a copy, the contents would now be saved. Sorry to disappoint the poster, but this is every bit as incorrect as the previous example, and for the same reasons. All you know about a thread, once you've called the ::CreateThread call (by whatever wrapper your environment demands, such as _beginthread, _beginthreadex, or one of the forms of AfxBeginThread), is that at some point in the indeterminate future, the thread will start. And at some other indefinite point in the future, the thread will end.

If you need to pass information across a thread boundary, it must be space that is either statically allocated, or is allocated on the heap, and it must be guaranteed to remain correct for the entire lifetime of the thread. Anything less will not work.

Sometimes you have a single thread that does one thing. In this case, it sometimes can simply use member variables of the class that spawns it. For example, a thread started by a CView-derived or CDialog-derived class may use member variables of the class instance. Note that it is your responsibility to see that this thread has completely terminated before allowing the CView or CDialog to be destroyed. Rather than use individual members, it is often best to have a struct/class which contains all the variables needed for the thread.

Sometimes you need many threads to do certain tasks on behalf of a single view, document, or other class. In this case, it would be impossible to use a single variable or set of variables to hold the state. In this case, you must actually allocate heap memory and pass a pointer to this heap memory to your thread. The memory must be freed by the thread when it has finished using it.

The complementary problem is the illusion that the thread still exists after the return from ::CreateThread or its wrappers. For example, here is a classic error:

CMyThread * t = (CMyThread *)AfxBeginThread(RUNTIME_CLASS(CMyThread), ...);
t->m_bAutoDelete = FALSE;

Whoops! If you think this code can work, rethink what you are doing. It cannot work. By the time the assignment is done to t, the thread might have finished, and because m_bAutoDelete was implicitly TRUE to start, the thread object referred to by t is gone, it has been returned to the heap, the space has been reallocated for some other purpose, and you just stored the value "1" somewhere, but you have no idea where! Perhaps you merely corrupted the heap. Perhaps you have damaged some other data structure. You have no idea. None whatsoever. And someday your code is going to crash and burn in a spectacular manner.

This is one of the many reasons that keeping thread object pointers around beyond the immediate creation of the thread is often useless. If you need to set member variables, you need to do

CMyThread * t = (CMyThread *)AfxBeginThread(RUNTIME_CLASS(CMyThread), ..., CREATE_SUSPENDED, ...);
t->m_bAutoDelete = FALSE;
t->anythingelse = something;
t->ResumeThread();

Now you can depend on the fact that the thread has not started running, because it was created suspended. Until you execute ResumeThread, you are safe. Once ResumeThread is called, the thread could be terminated before the call returns. (and if you think this is impossible, you have not thought out what threading means).

One of the common questions which arises under these conditions is "But I can't do x until I know the thread is executing!". If that is so, you need to rethink what you are doing. If you need to do w, start a thread, wait for the thread to be running, and then do x, you have to stop thinking in a procedural fashion and think in an event-driven fashion. The correct approach is to do w, and start the thread. That's it. That's as far as you go. If x needs to be done after the thread starts, why is it not done in the thread? That's the first question to ask. If there is actually a good reason (not just a failure in design) that this must be so, then the correct behavior is to have the thread PostMessage a notification to the main GUI thread, or possibly PostThreadMessage if it was spawned by some other UI thread, and when that message is received, you are now free to do x.

I've seen solutions of the form:

HANDLE h = ::CreateEvent(NULL, TRUE, FALSE, NULL);
AfxBeginThread(threadfunc, (LPVOID)h);
::WaitForSingleObject(h, INFINITE);

where the thread contains

UINT threadfunc(LPVOID p)
    {
     HANDLE h = (HANDLE)p;
     ... thread setup goes here...
     ::SetEvent(h);
     ... rest of thread goes here...
    }

which is a bit clunky; it is not a clean solution. Worse still, I've seen the passing of two event handles; after the creator comes out of the ::WaitForSingleObject, it then does something and does a ::SetEvent to tell the thread it can now continue; the thread, after it did the ::SetEvent to resume its creator, did a ::WaitForSingleObject to wait for a resume notification. This is really clumsy, and shows a failure to give up procedural, sequential thought models. Junk code like this, and rewrite it into something simple. Nearly all the time I see code like this, there is actually a much simpler way to write it. The few times a better solution was not feasible, it was because of fundamental design failures elsewhere that could not be easily rewritten. Poor design is poor design, and there is usually not a single point where it goes bad, but lots of places that go bad.


ExitProcess() and exit()

Severity: potentially fatal

Programmers have been erroneously trained to "exit the program if there is an error". Much of this training was simply incorrect, and was often based upon people who learned how to program when programs were encoded in pieces of cardboard and run overnight.

In a modern interactive program it is never appropriate to "exit the program". Exiting the program is an admission of total failure. It is always an error. Therefore, an ExitProcess call, or as it is disguised by the C library, exit, is always inappropriate. If you think you have an "unrecoverable error", you are wrong. All errors are recoverable. The recovery may mean that a whole lot of menu items, controls, etc. are now disabled, but the condition is always recoverable with respect to keeping the program running. To think that such a thing as an "urecoverable" error that necessitates exiting the program could ever exist is a deep flaw in thinking.

I once bought a third party database library, CodeBase. It was without a doubt the worst subroutine library I have had the misfortune to be subjected to. Its attitude was that if an error occurred, there were two errors: those caused by incorrect programming, which would pop up and obscenely cryptic MessageBox, underneath the application because the programmer did not understand that using a NULL handle made the dialog box a child of the desktop, and those errors which were caused by its perception that data was incorrect, in which case it called exit instead of returning an error code that said "data corrupted". They would not even listen to an analysis of what was wrong; they gave lame excuses such as "If the index is bad, we can't proceed". Fine, but that has nothing to do with the program. Any subroutine library that contains an ExitProcess or exit is so fundamentally erroneous that it can only be ignored as a piece of legitimate software. Its inclusion in a project probably dooms the project. After all, do you want to take the call from your customer saying "I was working in the program and it just went away!"? Do you think you can reproduce it?

The arrogance that says a subroutine writer has the right to determine that my program should shut down requires a serious attitude adjustment on the part of that programmer. No subroutine has the right to terminate a program. Only the user has the right to terminate a program. Now, the user may choose to do so because an internal error has rendered the program unusable, but by that time, there should have been detailed analyses of the failure mode presented to the user, the error would have been logged in a way that can be presented to a technical support person, and operations the user might do that would now be invalid should be disabled.

So you say, "If that error occurs, the user cannot proceed!". Fine. This means you end up disabling all the menu items and toolbar items except those which are still viable, such as Exit and About... (the latter is critical if the user calls tech support!)

If an error occurs in a subroutine, return an error code. Or throw an exception. But never, ever call ExitProcess. In addition, in a GUI program, if you call ExitProcess, the process stops. The user gets no chance to decide if buffers should be saved; the mechanism that asks the question has been bypassed; objects which may need to be cleaned up (for example, semaphores, which have an existence independent of the process itself) are left corrupted; Mutexes which are currently locked will generate WAIT_ABANDONED notifications, which guarantee that other processes which are depending on them will also have to enter a nonrecoverable error situation; data which has been buffered but not yet written to the kernel, such as file buffers and record buffers, are lost. Because of these hazards, I have classified this as potentially fatal.

It doesn't matter what you need to do to avoid ExitProcess. Do it. It doesn't matter if the program is a GUI program or an ordinary console app; never call ExitProcess (or exit). I have managed to avoid using this (except for the occasional trivial ten-line test program, and even then I get nervous) for something like 20 years. It isn't hard.


PostQuitMessage() and PostMessage(WM_QUIT)

Severity: potentially fatal. If you are lucky, merely serious

Using these on the main GUI thread is almost always an error. The only context in which they ever made sense was in the WM_DESTROY (or the WM_NCDESTROY) handler of a pure C API program. They never made sense in any other context in pure C programming, and they never make sense in an MFC program.

The way to shut a program down is to PostMessage(WM_CLOSE) to the main window. This invokes the standard closedown actions, which typically include such actions as prompting the user to save unsaved buffers. If you get a WM_QUIT message to the message loop, the message loop shuts down immediately and control proceeds to ExitInstance. Nothing else in the program associated with shutdown will take place. This will eventually lead to fatal situations; if not immediately, certainly later (often before the product is released, and if not then, usually on the first round of maintenance). 

The only valid place these can make sense is to shut down a worker thread with a message queue, known somewhat erroneously as a "UI" thread, and then only if you have a fair degree of confidence that everything else in the thread is shut down. Usually I will PostMessage a message to the UI thread asking it to shut itself down, and when it has determined it is in a clean state, it does a PostQuitMessage to itself.


PostMessage(WM_CLOSE)

The only time an application should shut down is when the user or the system asks it to shut down. This means that the user has used one of the numerous means to ask a shutdown of the app (the close box, Alt+F4, the system menu, the Task Manager), or the user is logging off or the system is shutting down. Otherwise, this has all the charm of using ExitProcess() or exit() and is an error. If the program is seriously broken, you should still keep it running...just don't let it do anything.

Sometimes, shutdown is a multiphase operation. The OnClose handler only initiates the close operation, but does not actually call the superclass OnClose handler which eventually gets to the built-in DefWindowProc which calls DestroyWindow. This is deferred until the necessary conditions are met, such as network connections quiesce, the robotic arm has been retracted to its safe position, and similar long-period delay operations. In this case, the superclass OnClose handler will be called when the necessary conditions are met.


Using Events for Mutual Exclusion

Severity: potentially fatal

Events (those objects created by ::CreateEvent or by using CEvent) can be used for synchronization, but they are inadequate for doing mutual exclusion. The reason is that many threads can pass an Event. For mutual exclusion use a Mutex (created by ::CreateMutex) or CMutex or a CRITICAL_SECTION. If you are using an Event for mutual exclusion, you are using the incorrect mechanism and need to change. 


Entering a Critical Section before a Lengthy Operation

Severity: serious, potentially fatal

A Critical Section of any sort (a section of code which is protected from concurrent execution on the same data item) should be as short as possible. Generally there are a very small number of lines of code between the locking and the unlocking, and furthermore these lines do not usually contain lengthy loops. The longer you spend in a critical section, the higher the probability that another thread will block on the same lock. This decreases total throughput of your program.

But one thing that is always a mistake is to block before a blocking operation. I have actually seen code that locks a buffer in the following fashion (this was in the worker thread). 

// in the shared data area:
volatile DWORD length;
----------------------------
length = 0;
while(reading)
   { /* read loop */
    EnterCriticalSection(&bufferlock);
    ReadFile(comport, &buffer[length], bufferlength - length, &bytesRead, NULL);
    length += bytesRead;
    LeaveCriticalSection(&bufferlock);
    mainthread->PostMessage(UWM_DATA_READ);
   } /* read loop */

The main GUI loop was then supposed to read the data by doing

LRESULT CView::OnDataRead(WPARAM, LPARAM)
   {
    EnterCriticalSection(&bufferlock);
    ... use data in buffer...
    length = 0; // reset the length
    LeaveCriticalSection(&bufferlock);
   }

There are two deep and unrecoverable bugs in this code. The programmer had analyzed it as "I can only have one thread at a time accessing the buffer and the length, so I have done proper synchronization". However, the crucial problem was that there was never really a time when the buffer was unlocked. By the time the PostMessage was processed, the buffer was locked again! Consider the execution of the loop: it leaves the critical section, posts the message, loops around, locks the critical section again, and blocks on the ReadFile. So by the time the message is processed in the main GUI thread, the thread is blocked when it tries to enter the critical section. When the worker thread unblocks, it fills in the buffer, releases the lock, posts a message, loops around, and locks the critical section. Boom!

Now the second fatal bug is the fact that there is no test for buffer overrun. That is, if you look at this, it just keeps reading data from the serial port. Because the main thread can never get a chance to reset the position to zero, it merely keeps increasing. When the position gets to be greater than bufferlength, the computation bufferlength - position gives a negative number, but since the input is a DWORD, this turns out to be a massively large positive number. Since the buffer is not as long as the large negative number suggests, the inevitable result is that the I/O Manager will reject the ReadFile by throwing an access fault exception. 

The trick here is to never, ever lock the buffer during the I/O operation. The best method would be to have the thread allocate an input buffer on the stack, and upon completion, allocate a block of storage to hold the data and PostMessage a pointer to the heap block to the main GUI thread which handles it. This is similar to the technique that I describe in my essay on worker threads, except I wouldn't use CString * for the data because the data could contain embedded NUL bytes, which would be incompatible with CString. I might PostMessage a simple BYTE * pointer in WPARAM and the length in LPARAM, or I might use a CArray<BYTE, BYTE> *, or, if I were a user of STL, some STL array structure. At no time would any locking be required.

Alternatively, it is worth observing that if I put a lock simply around the setting and examination of the buffer length, there would only be a very small window of locking, and furthermore the locking would essentially be guaranteed to complete in a nominally brief period of time (excluding effects of thread priorities). However, I would be disinclined to use this method when a simpler method already exists. Note that the performance issue doesn't arise because the entire allocation and message handling task completes in less than a single character time at any serial port speed, and would therefore be undetectable. Again, refer to my essay on "Optimization, Your Worst Enemy". Why introduce complexity when it is not needed?


Using SetCurrentDirectory/_chdir

Severity: poor style/potentially fatal

There are some very rare occasions in which it makes sense to change the working directory. Every one of them is at the explicit request of the user. It is never appropriate to simply change it. Possibly the worst example of the misuse of SetCurrentDirectory (or its C-library equivalent, _chdir) is in programs that try to do recursive tree walks during wildcard searches. The problem with this call is that it changes the working directory for the entire process, meaning that every thread sees the effect of this. Since many threads open or create files based on the current directory setting, a thread that creates a file while another thread is changing the directory will end up creating it in some random place. 

The changing of a directory should only be a consequence of the user requesting a change to a specific directory. Changing it at any other time, for any other reason, that does not involve the user explicitly specifying the directory (often as part as executing one of the file dialogs or a ShBrowseForFolder dialog), can be thought of as a serious error. The programmer does not have the right to spontaneously change the current directory. To do so is at the very least exceedingly poor style, and at worst it can have potentially fatal consequences if you have multiple threads. 


Polling for process completion

Severity: inefficient

A common failure mode in multithreaded programming is to have some thread "wait" for another by asking "are we there yet? are we there yet?". All this accomplishes is to waste a significant amount of CPU time, since, unless you have a multiprocessor, the thread that is "waiting" will consume its entire time slice asking for a condition that cannot possibly be set to true because during the time that thread is running, the thread that is being waited for is not running and therefore cannot possibly change the state! Putting this thread in a separate process is no different from polling for thread completion in the same process. GetExitCodeProcess is as pointless as GetExitCodeThread. If you want to see the process exit code, you do this after you have determined the process has finished. Since most processes don't return a meaningful code, even that is usually pointless.

In a multiprocessor, it means the waiting processor simply wastes a significant amount of its time waiting for a thread to finish; that time could be better spent doing something productive, such as running one of the other threads of your application, or some other application, or a thread of the application you are waiting for.

There are three techniques for dealing with process completion. The simplest method, which I will explain why has serious defects in a different section of this essay, is to simply do a WaitFor... operation on the process handle. The process handle remains non-signaled as long as the process is running (or potentially runnable) and becomes signaled when the process terminates. Therefore, the thread that is waiting for the completion uses no CPU time while it is waiting. However, since this blocks the waiting thread, and that can be the main GUI thread, this is not generally recommended in the main thread.

The second method is useful when you have a console-style app for which you have redirected stdout, which you are receiving via an unnamed pipe in your application. When the pipe is broken, it typically means that the application has finished, since very few applications would spontaneously close stdout. It is usually closed implicitly when the process finishes.

The third method is to create a thread that waits for the process to complete, and posts a user-defined message to the appropriate window (view or mainframe) to indicate that the process has finished. This is discussed in the next section and in an accompanying essay.


TryEnterCriticalSection

Severity: inefficient

I find this a pointless operation. The whole idea of a critical section is that you should block if you can't enter it. But here is a call that simply returns FALSE if you can't enter the critical section. What can you do? Well, retry the operation. This is very much like polling. If you need a timeout, use a Mutex and a WaitForSingleObject with a timeout. So I find that this does not seem to solve any problem that matters, but the few times I've seen it used it results in a merely inefficient program, and in one case an erroneous program.

And, I should point out, that the following are not equivalent:

::WaitForSingleObject(mutex, timeout);
while(!TryEnterCriticalSection(&lock)) 
       Sleep(deltaT);

In the first form, the wait blocks the thread until either the mutex handle becomes signaled or the timeout occurs, whichever occurs first. In the second form, if the TryEnterCriticalSection fails, the thread always blocks for the deltaT time. If the lock becomes signaled 0.1ms after the TryEnterCriticalSection, the loop still waits the full deltaT before testing again. Think about it This is like calling up tech support, finding there is someone ahead of you in the queue, and waiting five minutes before calling back. There is an excellent chance you will never, ever get support.


Blocking the main GUI thread waiting for process completion

Severity: potentially fatal

This is almost always an error. For one thing, if you are capturing output from the process, you won't be able to display it, because even if the I/O is done in a separate thread, the updating of the main GUI controls is handled by the GUI threads. So the net result is that the program appears to not work at all. Never block the main GUI thread if you can help it. Certainly never block it waiting for process completion.

Note that this is more serious than just blocking a thread and getting a non-responsive GUI, which is merely annoying to the user. Your whole program can simply hang. So it goes beyond poor interface strategy and falls into the potentially fatal class.

As with asynchronous thread notification, you simply disable all controls, menu items, etc. that are inappropriate while the process is running. You've seen this happen when you run a Build under the VC++ IDE. If you drop down the Build menu, there are lots of menu items enabled, but Stop Build is disabled. If you select a build option and the compiler is running, all the other options are disabled but Stop Build is enabled. This is simple OnUpdateCommandUI handling.

What I do is have a Boolean that enables/disables the controls. It is set to TRUE when a process is running and FALSE when a process completes.

class waitInfo {
   public:
      waitInfo(HANDLE p, CWnd * wnd) {process = p; recipient = wnd; }
      HANDLE process;
      CWnd * recipient;
};

...AfxBeginThread(waiter, new waitInfo(procinfo.hProcess, this));
processRunning = TRUE;


/* static */ UINT CMyView::waiter(LPVOID p)
    {
     waitInfo * info = (waitInfo *)p;
     ::WaitForSingleObject(info->process, INFINITE);
    recipeint->PostMessage(UWM_PROCESS_FINISHED, (WPARAM)info->process);
    delete info;
    return 0;
   }

LRESULT CMyView::OnProcessFinished(WPARAM, LPARAM)
   {
    processRunning = FALSE;
    return 0;
   }

void CMyView::OnRunProcess(CCmdUI * pCmdUI)
   {
    pCmdUI->Enable(!processRunning);
   }

Using setjmp/longjmp in a C++ program

Severity: fatal. No "potentially" about it.

These are functions that are such an outrageous kludge that I'm not going to dignify them with an explanation of how they work internally. They don't work, never did. The only reason they gave the appearance of working was the incredibly poor quality of the PDP-11 C compiler, which ran in 64K of memory and generated lousy code. The technique couldn't possibly work in any modern optimizing compiler, and didn't. Been there, done that. 

The essence of setjmp was that it marked a place to which control would transfer when a longjmp was executed. It would return with 0 when called directly and nonzero (the longjmp value) if it was an abnormal return.

Today, this is handled with exceptions. setjmp is replaced by TRY/CATCH and longjmp is replaced by throw. Key here is that in C++, these are first-class concepts in the language and not subject to the horrendous bugs that setjmp/longjmp generated in most compilers (the compiler was working correctly; the longjmp logic would produce erroneous results). But the major failure mode of longjmp is that it bypasses all destructors on the stack, with inevitably fatal consequences.

Bottom line: if you have setjmp/longjmp in a C++ program, you have a fatally flawed program. Remove them and replace them with C++ exception handling.

Note: If you believe there is any justification for setjmp/longjmp in a C++ program, you are wrong. Don't try to justify the situation, fix it. (Seriously, I've had people argue this point with me. It takes less time to fix the problem than to try to justify why it makes sense).


Using setjmp/longjmp in any GUI program

Severity: potentially fatal (except if the GUI program is written in C++, in which case it is always a fatal error)

Yes, I've done it. In Win16. With optimizations turned off, because the brain-dead third-party library we were using demanded that we support it. I would never support it willingly. There is no excuse for using it in Win32. The problem in a GUI program is that you can end up with state that is not cleaned up, operations that do not complete, and a plethora of other problems. And I cover these in the next section.


Using setjmp/longjmp in any program

Severity: potentially fatal (except if the program is written in C++, in which case it is always a fatal error)

If you are programming in C, use Structured Exception Handling. If you want portability, you cannot tolerate setjmp/longjmp in any case because it is only barely functional. To make it work correctly, you must declare every local variable as volatile, and that's just the start. You must not use the register qualification on any local variable or parameter. There are limits how setjmp can be called. There are nonportable limitations on the contexts in which longjmp can be called. Managing the longjmp buffer is a real royal pain. They don't nest without a lot of effort. Given we now have linguistic mechanisms that get rid of the need for this horror, it is best to avoid it.

Essentially, if I see this in a program, I know the program is working only by amazing good fortune.


Using the clipboard for anything other than copy or cut

Severity: fatal. This is so incredibly bogus that it is impossible to classify it as anything else.

No, it isn't fatal to your program. Programs that violate this rule will probably continue to work just fine. Just don't ever be caught in a dark alley with one of your users!

You, the programmer, do not have the right to modify the clipboard contents. Only the user has the right to modify those contents. Thus, the only time you are permitted to modify the contents of the clipboard is when the user has requested you to by doing a copy-related or cut-related action. You may not modify it at any other time. I've seen people use it for interprocess communication. This is absolutely beyond any shadow of a doubt the WRONG thing to do. There is no way to emphasize this strongly enough. Only if the user has an explicit operation that says "Put this in the clipboard" are you permitted to change the clipboard contents. It could be a menu item, a popup menu item, Ctrl+C, Ctrl+X, a pushbutton that says "copy", or anything else. Then, and only then, are you permitted to change the clipboard contents.

Imagine the following: your program is running. It fires off another program to run. It puts something in the clipboard to send to that program. However, I left out something in this sequence. The real sequence is


Sleep(n)

Severity: potentially fatal

There are valid reasons for using Sleep. However, the most common usages I find of them are completely wrong-headed and in most cases result in fatal flaws in the programs.

First, you have to understand what the value of n means: it means, "Stop this thread for at least n milliseconds". The thread will resume sometime after that. Perhaps a long time after that. So if you say "Sleep(5)" the likelihood that your thread will be running 5ms later is nearly zero. This is because the basic clock is 10 or 15 ms (or in MS-DOS, 55 ms, but I consider those systems dead), so it cannot possibly run sooner than one clock tick. But if a higher priority thread is running, it will not be preempted, so it might be several timeslices before the thread which is now runnable can actually run. Even if you think you have control of the priorities, you don't, because the Balance Set Manager thread may be running, and it can boost any thread to priority 15 and give it a double timeslice, and in NT, there are file system and other kernel threads, and Deferred Procedure Calls (DPCs) that will also preempt. Read my essay "Time is the simplest thing" for more details.

Mostly, when I spot a Sleep() call in a program, I ask "Why is this here?" and the answer is "Because the threads won't work unless it is". What this means is the multithreading architecture is irrevocably flawed and needs to be rethought and perhaps rewritten. Throwing a Sleep() call may give the illusion that you have solved the problem, but in fact there is a fundamental problem, and the Sleep() call disguises it. However, on a faster machine, or a different version of the operating system, or a different mix of applications, or a different loading factor on your app, or most especially, on a multiprocessor, you will discover that the fundamental architecture is flawed. At that point, you don't have a choice.

True story: some time ago I was consulting with a client, doing a code walkthrough of their multithreaded application. I spotted a Sleep() and knew immediately that this was where the problem was. I didn't know what the problem was, but I knew I had found it. After some explanation by their programmer (which included "the threading doesn't work right if it isn't there") I understood the code, and said "This will fail under the following conditions" and proceeded to outline the conditions, and said "and it will fail in the following ways" and listed a set of ways the program would fail. They were impressed. I had described exactly the conditions under which it failed and the manner of the failures, without ever having seen the program run. It had taken me perhaps 15 seconds to spot the Sleep() call (they had said "Here's the function that is failing" so I didn't have to look for it). How did I do this so quickly? Because it was by no means the first such instance I had seen! And in every previous time I had been absolutely correct.

Another case I see all too often is after a ::CreateProcess call there is a Sleep() call, "to give the process time to come up". No matter what value is chosen here, it is wrong. There is no correct value for this Sleep() call! Either it is too short, or too long, but it will never be "just right". There is a simpler method if you are waiting for a GUI app to become active, with its window fully displayed. Use the WaitForInputIdle API call. You can put a timeout on it, but this timeout is usually fairly large and is intended to indicate that you are in really deep yogurt, somewhere down in the boysenberries. If the timeout is selected properly, it will be fairly long, and it means that the process has failed to come up at all.

There are occasional cases in which a Sleep() makes sense. But they are so rare, and so special-case, that I can't give a general rule by which you can determine that it is a valid usage. But keep in mind that most uses are not valid or sensible. And also it means that any constant you choose for the timing value is probably wrong, also. In my multithreading lab, I demonstrate a case of using Sleep(5000) to make the program work. I then force the students to consider all the ways this can fail, and why any value, 5000, 10000, 1000, etc. will always be wrong. Then I force them to rethink how the synchronization between the threads is done. When we are finished, we have a correct, Sleep-free program.

There are times when Sleep(0) is handy to produce interesting animation effects in multithreaded programs; Sleep(0) forces the current thread to yield and allows another thread to run. This is in general inefficient, because it induces a lot of gratuitous context swaps and scheduling operations, but for some animation effects to illustrate the effects of multithreading it really makes a nice visual effect. But you have to be willing to give up a fair amount of CPU resource to use this, and it is very special case.


Using SleepEx()

Severity: potentially fatal, or absolutely essential

SleepEx() has all of the charm of Sleep(). It differs only slightly in that it takes two parameters; one of which is the sleep interval and one of which is a Boolean indicating if pending Asynchronous Procedure Calls (APCs) should be taken. The combination SleepEx(0, TRUE) has the characteristic that it does not really suspend the thread, but allows any pending APCs to be taken. If there are any APCs, they will be processed before the SleepEx continues. Thus, if you are using asynchronous I/O, this is actually not only a useful call, but in some ways an essential call (You could also use WaitFor...Ex() calls). So you can feel free to use it in the context of pending asynchronous callback I/O. If you are not using one of the WaitFor...Ex() calls you must use SleepEx(n, TRUE).


Polling for time

Severity: inefficient to potentially fatal

This is one of the most common errors I've seen. The idea is to somehow ask the system about how much time has elapsed and then take some action based on a change in time. Most of the code I've seen to do this is usually bizarre or convoluted. And mostly seems to be based on some illusions about what time is. As a prerequisite to doing anything with time, you should read my essay on time.

First, you can't poll for time. This just wastes CPU time with no guarantees that anything meaningful will happen.

As pointed out in my essay, the resolution of a time value has nothing to do with the precision that is used to represent it. For example, a time value that can be stored in milliseconds. I've seen code that goes like this

SYSTEMTIME t0;
GetSystemTime(&t0);

while(TRUE)
   { /* pause loop */
    SYSTEMTIME now;
    GetSystemTime(&now);
    if(abs(t0.wMilliseconds - now.wMilliseconds) > 1)
      { /* 1 ms passed */
       break;
      } /* 1 ms passed */
   } /* pause loop */

Actually, this code was written because the programmer said "But you said that the resolution of the Sleep() function was only 10ms, so this gets around that!". Actually, the system time increments by some number of milliseconds on each clock tick, and consequently if you write a program that simply reads the time in a tight loop, the minimum difference you will see is 10ms or 15ms. So this code is identical to Sleep(1) without the simplicity, and will resume sometime between 1ms and 10ms after the call. So the above code is fatally flawed. Not only is it inefficient, it doesn't do what it claims to do.

An even worse case was one where the programmer wanted to activate something at 8 a.m.. The code was similar to this:

void DoAtEight()
   {
    while(TRUE)
      { /* Wait loop */
       CTime t = CTime::GetCurrentTime();
       CString s = t.Format(_T("%H:%M:%S"));
       if(s == _T("08:00:00"))
         { /* 8am */
          ...do processing...;
         } /* 8am */
      } /* Wait loop */
   }

Now this code cannot possibly work reliably. First, the programmer observed that the machine ran somewhat sluggishly and Task Manager showed that the application was consuming 100% of the CPU time. This should not be surprising; the code sits there and keeps asking "are we there yet?" "are we there yet?" and will only be suspended when it has exhausted its timeslice. But what is worse, this code cannot work. Consider: a timeslice is about 200ms (the details vary based on the fundamental time tick). So suppose that at 07:59:59.99, the last time read, the thread executing this code is suspended. There are ten other threads waiting their turn. The next time the waiting thread is scheduled, it reads the time, and finds that the time is now 08:00:01.13. So at no time will the string representing the time ever be 08:00:00. Now on some days this will work, but on other days it will fail. In addition, in the unlikely chance that the time read was actually 08:00:00, if the processing takes less than one second, there is an excellent chance that the next time through the loop the  time might again be 08:00:00, causing the processing code to be executed two or more times, instead of the once that was desired. All in all, possibly the worst possible way to accomplish the task.

My proposed solution was to do something like this:

void DoAtEight()
    {
     BOOL processed = FALSE;
     while(TRUE)
       { /* wait loop */
        CTime t = CTime::GetCurrentTime();
        if(t.GetHour() >= 8 && !processed)
          { /* process it */
           processed = TRUE;
           ...do processing...;
          } /* process it */
        else
        if(t.GetHour() < 8)
           processed = FALSE;
        Sleep(60000); // wait for a minute, or a bit more
       } /* wait loop */
     }

This is still not optimal; it wakes up once a minute, performs its test, and goes back to sleep. It would be even better to use a waitable timer.

void DoAtEight()
   {
    HANDLE timer = ::CreateWaitableTimer(NULL, FALSE, NULL);
    BOOL processed = FALSE;
    while(TRUE)
      { /* wait loop */
       SYSTEMTIME t0;
       ::GetSystemTime(&t0);
       if(t0.wHour > 8)
         { /* adjust to tomorrow */
          ULARGE_INTEGER t;
          ::SystemTimeToFileTime(&t0, (FILETIME *)&t);
          t += 24ui64 * // hours in a day
               60ui64 * // minutes in an hour
               60ui64 * // seconds in a minute
             1000ui64 * // milliseconds in a second
             1000ui64 * // microseconds in a millisecond
               10ui64;  // 100 ns units in a microsecond
          ::FileTimeToSystemTime((FILETIME *)&t, &t0);
         } /* adjust to tomorrow */
       t0.wHour = 8;
       t0.wMinute = 0;
       t0.wSecond = 0;
       t0.wMilliseconds = 0;
       LARGE_INTEGER alarm;
       ::SystemTimeToFileTime(&t0, (FILETIME *)&alarm);
       ::SetWaitableTimer(timer, &alarm, 0, NULL, NULL, FALSE);
       ::WaitForSingleObject(timer, INFINITE);
       ...do processing...
      } /* wait loop */
   }

Use of malloc/free in C++ programs

Severity: poor style, potentially harmful

There is no reason I can imagine to use malloc or free in a C++ program. If you need them, there are useful alternatives. For example, if you need a dynamically-allocated buffers of n bytes, the simplest way is not to write

    BYTE * buffer = malloc(n);

but to write

    BYTE * buffer = new BYTE[n]; 

They perform the same action, so why use an obsolete allocation mechanism.

Why does this matter? Consider the following: I need an array of objects, so I allocate them as follows:

typedef struct {
     int x;
     int y;
} Coordinate;
Coordinate * coordinates = (Coordinate *)malloc(n * sizeof(Coordinate));

Looks good, right? Works fine, right? Sure. This week. But now I change the structure to

typedef struct {
   int x;
   int y;
   CString description;
} Coordinate;

Now what happens? The malloc operation does not call the CString constructor. You have garbage. Your program crashes. 

The correct solution was to have always written the code as

Coordinate * coordinates = new Coordinate[n];

This works, and it will always work correctly, even if you later add members to the structure that require constructors. Writing code that breaks when apparently irrelevant changes are made is always a Bad Idea. And since there is no possible justification for using malloc, why bother writing incorrect code?

Yes, I've heard the argument, "malloc is faster". I have some advice for anyone who advocates this: Get A Life. You are clueless. An allocation takes thousands of instructions. The trivial overhead introduced by using new is unmeasurable. This is a classic example of the "let's optimize at the instruction level" when the real costs are at the architecture level. If you are allocating frequently enough that you think this trivial overhead can matter, you are very confused; why are you allocating at all? The real issue is to make programs efficient, and you don't accomplish that by worrying about unmeasurable overheads when the real costs (such as the actual cost of the allocator) swamp it by two to four orders of magnitude.

An even worse situation I've seen is the mixing of malloc with delete, or even new with free. There is no possible justification for such insanity. If it exists, you are wrong. Fatally wrong (no "potentially" here!). Fix the code. Pretend malloc and free are obsolete and should never, ever be used in a C++ program.


Use of GlobalAlloc/GlobalLock/GlobalUnlock/GlobalFree

Severity: inefficient, deeply suspect (except those cases where it is mandated)

These APIs are essentially obsolete, with a very few exceptions. Only when you have one of these exceptions should you ever consider these as viable entities. Otherwise, use new/delete. They are relatively inefficient in terms of both time and space. 

The contexts that I am aware of are the clipboard, where you must provide a global memory handle This must be allocated by GlobalAlloc. You use GlobalLock to make its address available, GlobalUnlock when you are done with it, and you never call GlobalFree on a clipboard handle, under any circumstances. 

There are certain uses with Blobs in OLE controls that use GlobalAlloc.

The DEVNAMES structure and certain fields of the PRINTDLG (contained in CPrintDialog) use global memory handles for backward compatibility with 16-bit Windows.

There may be other uses, but the documentation will always tell you if a global handle is needed. Otherwise, avoid them entirely.


Using LocalAlloc

Severity: poor style

There is no reason to use LocalAlloc. This is a throwback for compatibility with old Win16 programs. Ignore its existence. The only reason to use LocalFree is for the buffer that is returned from FormatMessage. There may be other uses that are mandated, since I have not read or memorized all 5,000 or so API calls. But unless the use is mandated by an API call, there is absolutely no reason to consider this as useful.


Doing a delete of an array object without []

Severity: potentially fatal

When you do a delete on an array of objects, it is crucial that you include the [] in the syntax, that is, if data is an array of objects,

delete data;    // wrong!
delete [] data; // right

The problem is that the first form only deletes the space occupied by the array. The second for deletes the space occupied by the array, but first calls the destructors on every object in the array.

For example, consider the class

class MyData {
    public:
      int x;
      int y;
};

If I write

MyData * data = nwe MyData[n];

then n objects are allocated. If I do 

delete data;

the array is deleted. Everything appears to work. But consider, if I do

class MyData {
    public:
      int x;
      int y;
      CString s;
};

The the call

delete data;

will delete the space, but the CString destructor will not be called on each element. The effect will be a storage leak. If, however, I write

delete [] data;

then the destructor is called on each element of the array, and proper object destruction is done for each contained object. Note that the effect of failing to write the delete correctly is that the addition of an object requiring a destructor will mean the program stops working properly in some way (usually a memory leak, but it could be more serious), and you won't know what went wrong.


Using non-const static variables in functions

Severity: deeply suspect to potentially fatal

By "static" variables here I mean a declaration of the form

static type name;

and all variants, when they appear inside a function. An example is

int DoSomething()
   {
    static int count = 0;
    ...
    count++;
    }

This may give the illusion of being modular, but in fact it is a very poor programming technique. I find that this is a technique which is a throwback to the more primitive forms of C programming. It is inappropriate in a Windows programming environment. The reasons deal with functions that need to be reentrant across class instances and functions which could appear in a multithreading context.

The worst and most typical form is

LPCTSTR ConvertToString(SomeType * t)
    {
      static TCHAR result[SOME_SIZE_HERE];
      wsprintf(_T("%d:%02d/%d"), t->seconds / 60, t->seconds % 60, t->divisor);
      return result;
    }

A piece of code like this in any program is an invitation to disaster. The first thing I do when I find it is rip it out and replace it with something sensible. There is no point in wasting time trying to debug a program that is this deeply flawed. When I'm working with MFC, I tend to return CString data types, since the allocation is automatically handled.

When would the above code fail? Consider the case

CString s;
s.Format(_T("%s and not greater than %s"), CovertToString(t1), ConvertToString(t2));

Both ConvertToString operations use the same buffer and return it, so by the time the format string is processed, the buffer computed by the conversion of t2 is clobbered by the result of the conversion to t1 (remember that parameters are always evaluated right-to-left).

Consider the case of multithreading, where two threads execute a call to ConvertToString. The value seen is unpredictable, and on a multiprocessor the string could potentially change while being read!

The use of const static variables is perfectly acceptable, for example, 

double ConvertToSomething(double f)
   {
    static const factors[] = {1.0, 1.5, 1.723, 1.891, 1.9843};
    ... do some conversion here, using factors[i] based on 
    ... some criterion appropriate
    return result;
   }

The code of the form

CString cvMonth(int n)
    {
     static const LPCTSTR names[] = {_T("Jan"), _T("Feb"), _T("Mar"), ...etc };
     return names[n];
    }

however, is probably a Very Bad Idea since it binds a particular language into the source code, which is a losing idea when internationalization is required. You could do

CString cvMonth(int n)
   {
    static const UINT names[] = { IDS_JAN, IDS_FEB, IDS_MAR, ...etc. };
    CString result;
    result.LoadString(names[i]);
    return result;
   }

But even this is a Bad Idea if you have to deal with any culture that has more than 12 months in its year, or whose month index would not correspond to the Western European/American civil conventions that January is the first (zeroth) month. Use the National Language Support APIs, but that's a different essay to be written at some future time.

There is one exception to this: cached values. A cache keeps a precomputed value, but--and this is critical--it has a way of detecting if the cache is invalid. You can frequently get nice performance speedups if you use cached values, but here you have to weigh the cost of cache management with the cost of not caching. This management includes both the execution costs of determining if the cache is valid and the architectural costs imposed if explicit cache invalidation is required (and the concomitant costs of failure to invalidate and using incorrect data). Here you have to use good judgment, and put lots of explicit comments explaining exactly why the use of a non-const static value is a valid thing to do. And even then, I'd be more inclined to keep the cache in a member variable of a class that was involved in the computation, rather than in a static variable. The only reason to use static variables is if the cache maintains validity across multiple objects.


Using new without any reason to

For some reason, people use new when there is absolutely no reason to do so. The most common failure of thinking is when it comes to passing pointers to objects. For example, if I have an API that requires a parameter SOMEDATATYPE *, the solution is obvious:

    SOMEDATATYPE data;
    SomeAPI(&data);

This has not stopped programmers from thinking that the type of the argument has to match the type of a variable, so they will do

    SOMEDATATYPE * data;
    data = new SOMEDATATYPE;
    SomeAPI(data);
    ... use it
    delete data;

It is hard to imagine why the second form could ever be imagined to make sense. It introduces a gratuitous concept, memory allocation, to a place where memory allocation is not needed, and requires a corresponding delete. This not only adds intellectual baggage, it obscures the purpose of the code, and introduces two possibilities for error. The first is that the new failed (note the lack of any check for its success!) and the second is that in the section called "use it" that the code gets very long, and someone puts, in a later modification, a return FALSE or some similar construct into the code which bypasses the delete, resulting in a storage leak.

Another example is to create a member variable for some type of window, e.g., a dynamically-created CButton:

class whatever : public CFormView {
    ...
    protected:
        CButton * mybutton;
};

then in the code do

    mybutton = new CButton;
    mybutton->Create(...stuff here...);

This has the same problem as above. This introduces a gratuitous need for a concept that has no relevance, allocation; it requires that you remember to put into the constructor an assignment

    mybutton = NULL;

and in the destructor

    delete mybutton;

all of which is pointless intellectual overhead. Instead, do

    protected:
        CButton mybutton;

and then you can do

    mybutton.Create(...stuff here...);

Note that concepts such as initialization of the pointer, destruction of the object, and the use of new are replaced by a trivial concept: declaring a variable. There is virtually no reason to allocate a CWnd-derived class dynamically when its extent is the extent of the parent class.

Just about the only place where you do this dynamically is to allocate a CDialog *-derived object for a modeless dialog.

Another example is "but what if I need an array of integers?" What if you do? CArray or std::vector exist, and work quite well. For example, consider the case where you need an array of integers which represent the selected items in a multi-select listbox:

    CArray<int, int>selections;
    int n = c_MyListBox.GetSelCount();
    selections.SetSize(n);
    c_MyListBox.GetSelItems(n, selections.GetData());

works nicely, and has the advantage that you don't need to remember to call delete.

There are lots of valid reasons to use new. Foremost among them is passing information across thread boundaries. Information to be passed is put into a heap-allocated object, and a pointer to that object is passed from one thread to another. When the receiving thread is finished with the object, it calls delete.


Creating Controls Dynamically

Severity: deeply suspect, poor style (unless there really is no alternative!)

There are lots of valid reasons for creating controls dynamically. One correspondent pointed out that he had to create controls based on an XML description. I've done them from database schemata. But 99% of the time I see someone creating a control dynamically, they're doing it for the wrong reasons. And sometimes when they do it, they do it in the wrong way.

It is rare to need to create a control dynamically (in spite of the fact there are lots of valid reasons, they only rarely apply). However, it is also unnecessary in general to do a new to allocate an object to reference the control. I commonly see people doing things like

BOOL CMyDialog::OnInitDialog()
   {
    CDialog::OnInitDialog();
    if(sometestorother)
       { /* we could stop */
        CButton stop = new CButton;
        CRect r(175, 54, 205, 65);
        stop->Create("Stop", BS_PUSHBUTTON, r, this, IDC_STOP);
       } /* we could stop */

Just about every line in the above code is about as wrong as you can get. The problems are

Key here is that none of these lines are required, or are even correct. I would simply create the control in the Dialog editor, size it and place it where I wanted it, and then do

BOOL CMyDialog::OnInitDialog()
   {
    CDialog::OnInitDialog();
    c_Stop.ShowWindow(sometestorother ? SW_SHOW : SW_HIDE);

Note how much simpler and cleaner this is. I don't have to compute a position, I don't have to allocate anything, I don't have to delete anything, and I don't have to do a Create

If I had to create a control at runtime, there is no reason to do a new if there is not some dynamic quantity being created. Just declare yourself, outside the scope of the magic AFX comments, an appropriate variable

CMyCustomControl ctl;

Don't do a new unless there is a compelling reason to do so (there can be, but most of the instances I've seen where new is used to create a control variable are just a waste of conceptual energy).

When you need to create the control, just make sure your custom control has a Create method, and write

CString caption;
caption.LoadString(IDS_STOP);
VERIFY(ctl.Create(caption,and,parms,you,want,here));

To create a control based on a custom window class, you can select the "custom control" (the little human head icon) from the toolbox. Use it to place the control. Specify the class name of the control class in the Properties box. Select the desired styles. The problem with the styles is that you can't just use symbolic names like WS_VISIBLE, which is a real pain; you have to go into winuser.h and figure out the correct hex constants to use. Sigh. Make sure you have registered the window class before you try to create the window. You may do this in the InitInstance handler for a dialog-based app, or in the constructor for a CDialog- or CFormView-derived class. Or see my essay on self-registering classes.

I use VERIFY to make sure that if there were an error, I will see it at the time the creation is done, instead of having some other part of my program fail much later for reasons I will have to work hard to understand. In some cases I might store the result and take other action, but in most cases, creation failures represent coding errors which, once fixed, no longer exist. This catches the failure early.

Hints: making dynamic creation easy

Suppose I wanted to create a pushbutton. I would know that, for example, its left edge, right edge, top edge, and/or bottom edge (choose at least one horizontal and one vertical) would have to line up in some way with some other control. Or be a certain percentage distance from a left, right, top, and/or bottom of the window.

Rather than compute the button size, which must be in pixels, based on some criterion, or trying to mess with Dialog Box Units, which are a waste of intellectual effort, I would use one of several tricks.

When I had to do this in a product I delivered, I created a form which had several hidden controls: a combo box, a radio button, a check box, a static label, and an edit control. I created these as disabled, invisible controls. I did not have space on the form to put them (the entire body of the form was a giant ListBox that contained these controls inside it), so I simply made the ListBox a little smaller vertically and resized it in the OnInitialUpdate handler of the form. Thus it overlapped the invisible controls, but I could still see them in the dialog editor (hint: if you ever discover you are creating physically overlapping controls, rethink what you are doing. There is always a better way. I'll trade off more complex code for ease of long-term maintenance any day!

When I needed to create an Edit control, I had a variable, c_EditPrototype, which got the value of the edit control that I put on the dialog. I wrote something along these lines:

void CMyForm::CreateEdit(int n)
   {
    // Create an edit control on line n of the listbox
    CRect r;
    c_EditPrototype.GetWindowRect(&r);
    CRect line;
    c_Data.GetItemRect(&line);
    int left = ...complex computation of left edge of edit control;
    CRect edit(left, line.top, left + r.Width(), line.top + r.Height());
    CEdit * ctl = new CEdit;
    ctl->Create(styles, edit, &c_Data, controlid++);
    ...lots more stuff here, including saving the CEdit * for later deletion
   }

One key insight here is that I did not have to worry about the font size or screen resolution. The edit control prototype was the width desired (all edit controls would be the same width), the height was automatically set when the dialog was created, taking into account the font size, resolution, etc., so all I had to do was copy them. 

The buttons were a bit trickier; what I did was create a button with no caption text. This gave me my minimum width; I then added GetTextExtent pixels to that to get the new desired size.

Bottom line, be lazy. Let the system do as much for you as it can. 

If I had wanted a caption, I would have stored the caption in the STRINGTABLE and loaded it dynamically. I would never hard-code English text into a product program. If I wanted a button with a non-language caption, I would create it as

c_Button.Create(_T("<<"), BS_PUSHBUTTON, r, this, IDC_REVERSE);

Note the use of _T( ) around the string constant so it is Unicode-aware.

Positioning

Sometimes it is easy. Suppose I need to create a custom control centered in a form, just above another control. We have a desired width and height.

CRect client;
GetClientRect(&client);

CRect below;
c_WhateverControl.GetWindowRect(&below); // the control below the one we want
ScreenToClient(&below); // convert to client coordinates
gap = 2 * ::GetSystemMetrics(SM_CYEDGE); // allow nice gap between

int left = (client.Width() - width) / 2;
CRect ctlrect(left, below.top - height - gap,
               left + width, below.top - gap, this, ctlid);

You can compute all the geometry you want, but if constants appear in any context other than multipliers or divisors, you are in trouble. Note that I even compute the gap based on the screen driver's opinion of the height of the ideal 3-D horizontal edge.

If you know where you want it, and how big it will be, you can simply create, in the dialog editor, a static control, such as a frame or rectangle. Mark it as invisible. Change its ID from IDC_STATIC to something useful. Create a control variable for it. For example, I might call it IDC_CUSTOM and the variable c_CustomFrame.

CRect w;
c_CustomFrame.GetWindowRect(&w);
ScreenToClient(&w);
c_WhateverControl.Create(styles, w, this, ctlid);
c_CustomFrame.DestroyWindow(); // no longer need frame

Note that there is no computation of the size; the size has already been computed for me by the dialog handler when it instantiated the static control, and I am creating a custom control in the same area.


Using hardwired constants for any size or position

Severity: serious

It is a common practice to want to rearrange controls on a dialog at run time, for example, when the dialog is resized. An unfortunate characteristic of such code is that it tends to include hardwired constants, which is almost always an unrecoverable flaw.

This is because dialogs are always expressed in terms of Dialog Box Units, which are an abstract coordinate system. When a dialog is instantiated, the system "realizes" the dialog box by computing a function based on the default system font. From that point on, everything is in terms of pixels. For a very high resolution (e.g., 1600 ×1200) the dialog may take many more pixels than if the display is running at some lower (e.g., 800 × 600), because it tries to keep an approximately same-size-on-the-screen effect of a given dialog. The downside is that any attempt to use a hardwired constant is doomed from the start..

The proper method is to use one of the basic resolution-sensitive parameters of the system to scale the dimensions properly. Two of the key parameters are the border size and the resize border size. For example, I might choose to separate a set of buttons by a multiple of the resize border size. In a very high resolution, this size may be doubled over what it is at a lower resolution. Useful computations I include in my positioning code might be

UINT vsep = 3 * ::GetSystemMetrics(SM_CXBORDER);
UINT colsep = 2 * ::GetSystemMetrics(SM_CXEDGE);

The most useful values I have found for ::GetSystemMetrics that are handy to know about are shown in the table below. However, there are many other parameters which can be interesting.

SM_CXBORDER
SM_CYBORDER
Width and height, in pixels, of a window border. This is equivalent to the SM_CXEDGE value for windows with the 3-D look.
SM_CXEDGE
SM_CYEDGE
Dimensions, in pixels, of a 3-D border. These are the 3-D counterparts of SM_CXBORDER and SM_CYBORDER
SM_CXFIXEDFRAME,
SM_CYFIXEDFRAME
Thickness, in pixels, of the frame around the perimeter of a window that has a caption but is not sizable. SM_CXFIXEDFRAME is the heightof the horizontal border and SM_CYFIXEDFRAME is the width of the vertical border.
SM_CXSIZEFRAME,
SM_CYSIZEFRAME
Thickness, in pixels, of the sizing border around the perimeter of a window that can be resized. SM_CXSIZEFRAME is the width of the horizontal border, and SM_CYSIZEFRAME is the height of the vertical border.
SM_CXHTHUMB Width, in pixels, of the thumb box in a horizontal scroll bar.
SM_CYVTHUMB Height, in pixels, of the thumb box in a vertical scroll bar.
SM_CYCAPTION Height, in pixels, of a normal caption area.
SM_CXVSCROLL,
SM_CYVSCROLL
Width, in pixels, of a vertical scroll bar; and height, in pixels, of the arrow bitmap on a vertical scroll bar.

 In those rare cases where you may need to create controls dynamically, hardwiring any integer value for the width or height is uniformly a blunder. I have found the most convenient way to create controls dynamically is to lay down a "prototype" control on the dialog. I mark this control as invisible, and create a control member variable for it. Then, for example, if I need to create a control at runtime I use the dimensions of this control to determine the dimensions of the control I want to create, most often the height.

CStatic c_PrototypeStatic;   // declared in the class by ClassWizard
//...create the control at position x,y holding text 'caption'
CRect r;
c_PrototypeStatic.GetWindowRect(&r);
ScreenToClient(&r);
CClientDC dc(&c_PrototypeStatic);
int width = dc.GetTextExtent(caption).cx;
width += 2 * ::GetSystemMetrics(SM_CXEDGE);
r.right = r.left + width;

CRect w(x, y, x + r.Width(), y + r.Height());
...Create(caption, SS_LEFT, w, this, IDC_STATIC);

Note that I did not show anything to the left of the Create method because there are lots of ways I could create a window as simply calling

CStatic * wnd = new CStatic;
wnd->Create(...);

or have an array 

wnd[i] = new CStatic;
wnd[i]->Create(...);

or have a variable declared in the class and write

c_MyLabel.Create(...);

Here's another trick I use a lot: I need, for whatever reason, to create a window at runtime at a specific location on a dialog. Often this is a custom control of some sort. To get it properly positioned on the dialog, I do not ever rely on coordinates that I "wire in". Instead, I place an invisible CStatic control with an ID other than IDC_STATIC on the dialog, and use its coordinates to create my desired custom window. For example

CStatic c_Frame; // created by ClassWizard

CMyControl c_Whatever; // added by hand in the protected area

BOOL CMyDialog::OnInitDialog()
   {
    ... lots of stuff here...
    CRect r;
    c_Frame.GetWindowRect(&r);
    ScreenToClient(&r);
    UINT id = c_Frame.GetDlgCtrlId();
    c_Whatever.Create(r, this, id);
    c_Frame.DestroyWindow();
    ... more stuff here...
   } // CMyDialog::OnInitDialog

Note some tricks here. I can actually reuse the control ID because I destroy the static window (although this is not actually necessary, I find it cleaner). The Create method shown here is the Create method I have written for my own control. Your Mileage May Vary and you may have additional parameters.

Another time I use this trick is when I want to limit the amount the user can resize a resizable dialog. What I will do is create an invisible static control that represents the minimum size. This is much cleaner than picking some control to serve the purpose, because if you rearrange the controls the control you originally selected may no longer be the rightmost or bottommost. 

To create an OnGetMinMaxInfo handler, you either have to add it by hand or go to the Class tab in the ClassWizard and change the type from dialog to general CWnd. It should have been obvious to Microsoft that a resizable dialog would need this (they have the style flags available to determine this) but this is one of the many defects of ClassWizard.

void CMyDialog::OnGetMinMaxInfo(MINMAXINFO * lpMMI)
   {
    CRect r;
    c_Frame.GetWindowRect(&r);
    ScreenToClient(&r);
    lpMMI.ptMinTrackSize.x = r.right;
    lpMMI.ptMinTrackSize.y = r.bottom;
   }

Because I have the frame to reference, if I add new controls that I don't want covered on a resize, I only have to adjust the size of the hidden frame.

However, note that at all times I have never, ever once used a hardwired constant to determine either the position or the size of any control, or other distance, on the dialog.


Using hardwired constants for control IDs

Severity: potentially (usually, when small values are used) fatal

It is frequently useful, and sometimes necessary, to be able to create controls at run time. This is a practice which ought to be avoided. Typical silly reasons are largely similar to "I only need the progress control when I'm doing the [read, write, computation]". This is a reason so silly that I cannot imagine why anyone would ever think it made sense. It is so trivial to hide a control when you don't need it (and even create it without the Visible property checked) and show it when you need it that this never makes sense. It is usually compounded by the tendency to hardwire control coordinates, which is even less excusable. An even sillier excuse is "Well, if I create it in the dialog editor and then hide it, it takes up space". Yes, and the price of gasoline in downtown Pittsburgh on 18-Mar-03 was $1.659/10. That is about as relevant to the discussion. People who talk this way act as if the infinitesimal amount of space required to hold the window information actually matters. In fact, the code required to create and manage the window probably takes up more space! This is another case of "optimizing" the wrong parameter, with no justification whatsoever.

A typical sort of insanity tends to look like

m_progressbar = new CProgressCtrl;
m_progressbar->Create(...., 1);

This embodies at least two serious blunders. The use of a dynamically-created CProgressCtrl is another silly idea. You could create a CProgressCtrl variable, not a CProgressCtrl *, and it would work just as well. Without involving the additional complexity of the allocation, and necessitating the corresponding delete.

But why create the control at all? Place it on the dialog in the dialog editor! You get a nice control ID already assigned, you can use the ClassWizard to create a binding to a control variable. Just uncheck the Visible attribute. Then when you need it, replace the complex lines shown above with a simple

c_progressbar.ShowWindow(SW_SHOW);

when done with it, do a ShowWindow with SW_HIDE.

Now, when it comes to using hardwired numbers, note the use of the number "1". There is no explanation of why so many people choose the number "1", but in a dialog this is fatal. The control which is the OK button has ID IDOK, which a quick inspection of the header files will show is ID number 1. Hence, most attempts to bind the control will fail with ASSERT failures when the binding is tried, because the control with ID 1 is already assigned.

Typically, for most of us when creating controls, we pick a range of integers in the range 10000 to 32767. Most importantly, we use a #define to define the base of this range, so if there is a conflict for some reason, it is trivial to compensate for it. The reason for the choice of values is that the dialog editor tends to assign much smaller numbers, and the values in the range greater than 32767 would often result in a serious conflict with the existing Microsoft menu item ranges.


Using GetDlgItem

Severity: poor style

This is hardly ever necessary. It is certainly not necessary for getting (as some have argued with me) the window reference for a control. Just use the dialog editor to create a control variable. I discuss this in depth in my companion essay. Key here is that if you subclass a control later, and override one or more of its built-in methods, GetDlgItem guarantees that your program will stop behaving correctly. 

There is rarely an excuse for using it; I write perhaps one a year, and consider very carefully the alternatives before doing it.


Using UpdateData

Severity: poor style

This is one of those ideas so bad that if it hadn't already been invented, nobody would be dumb enough to invent it. It is based on the illusion that you want memory variables to be exact replications of control state. We all know from basic design principles that replicating information is usually fatal unless you are extremely careful about how you manage the state. My experience with UpdateData was that I found programs harder to write, harder to debug, and harder to maintain than programs that didn't use it. Almost every time I get a dialog that has failure modes such as inconsistent control behavior, incorrect results depending on what order controls are used in, and similar problems, I look for UpdateData. I find it. I discover the programmer was clueless as to how to use it in a way that would guarantee these problems would not exist, which is not surprising given how poorly designed it is. So I just rewrite the code to eliminate it (see my essays on Avoiding UpdateData, Avoiding GetDlgItem, and Dialog Control Control Management). At that point the problems simply disappear.


Writing an OnCommand handler

Severity: poor style, potentially harmful, potentially fatal

I have not encountered any situation in which this is a valid approach to solving any problem. In all cases I have seen it, it is someone who doesn't understand either Windows or MFC trying to do something that is much more easily (and often correctly) done by using normal MFC message dispatching. In eight years of MFC programming, I have only ever written on OnCommand handler, and that was to investigate how OnCommand works to explain command routing. Every example I have found in code, newsgroup postings, etc. was the result of a beginning programmer being totally clueless. Some if this code persists for years, confusing the next maintainer who has no idea how to modify the code. The only solution is to rip this code out and replace it with something that makes sense.


Writing a PreTranslateMessage handler

Severity: deeply suspect

This is an overused and usually inappropriately-used technology. Many programmers use it as a substitute for doing proper subclassing of events. There are many legitimate reasons to use it, but they are far and away overshadowed by inappropriate misuse. Generally, the severe warning signs of misuse are

Appropriate uses include testing for mouse and keyboard events without attaching those to a specific control ID (for example, handling F1 help), handling complex tasks which MFC interferes with, such as editing labels in a tree control contained in a property page (the Enter key to terminate label editing will also close the property sheet if you don't do the override), and, within a subclass handling messages sent to that control.


Writing an OnCtlColor handler

Severity: (exceptionallly) poor style

This seems like it would be a legitimate activity. It rarely is. OnCtlColor is part of the parent class, and therefore suggests that somehow the parent class should know what color to use for a child window. This is a throwback to the days of early 16-bit Windows programming, when the paradigms had not been properly thought out. It violates every sensible definition of modularity that exists. I have no idea how a parent control can know how a child control should paint itself. Years ago, I started writing reflected WM_CTLCOLOR handlers for child classes, and it seemed much saner. All I did was take the incoming WM_CTLCOLOR message and send it to the window whose handle was provided. That window handled the coloring.  I have only seen one sensible use of OnCtlColor, when the programmer wanted to change the color of every child control in a dialog (and I question why this was a valid thing to do). The correct approach is to subclass the control, and use the reflected WM_CTLCOLOR_xxx message by subclassing the child control and creating an =WM_CTLCOLOR reflected-message handler in that subclass. MFC will then route the message to the child class, which is where the decisions belong. Key warning sign: there is a test for both the control message type and a specific control ID in the body of the handler.


Control Management

Severity: potentially fatal (when done poorly, that is, distributed throughout the dialog)

If you do the same computation in two places, pretty soon you won't have the same computation.

This is why we use subroutines.

Unfortunately, this knowledge seems to be lost on programmers who try to manipulate controls in a CDialog- or CFormView-derived class.

I blame some of this on Charles Petzold, because I learned from his Win16 book back in 1989 or so, and it teaches a number of horrendously bad Windows programming habits, which I then had to break. Essentially, what I did was say "There is something wrong with this; how do I do this right in Windows?" and it took me a couple years to get the paradigms right. 

So I wrote an essay on how I do dialog control management in dialogs and CFormViews. It works, it is highly effective, and it is the way I have written control management for about eight years. It consolidates all the control logic in one place. It would have been nicer if Microsoft had given us ON_UPDATE_COMMAND_UI handlers in dialogs, an extension that seems blindingly obvious, but they didn't. I have written an essay on this technique. 

Bottom line: when I get a dialog that malfunctions in terms of enabling, I simply look for all instances of ShowWindow and EnableWindow, and then consolidate them in one function I call updateControls. I was given one app which updated several controls, in 23 different places, using 6 different algorithms, all of which were wrong!  Furthermore, depending on what the user had most recently done, controls that should have been enabled remained disabled, or controls which should have been disabled remained enabled. Any programming style that has this sort of result is flawed beyond hope of redemption. Abandon that style.


Manipulating Global Variables

Severity: poor style, deeply suspect, serious

There are very, very few global variables needed in a C++/MFC program. Generally, these variables represent key global state which is the raison d'ętre of the program. Other global variables, incidental to the actual computations being performed, are usually a mistake.

Perhaps the worst mistake committed is the use of global variables which are used or manipulated by a dialog. This results in a dialog which cannot be reused easily, and introduces possibilities of serious errors during maintenance.

Now let's get something established at the start here: I come from the school of "Global Variables Considered Harmful". I mean that literally. My Ph.D. dissertation advisor and one of my committee members wrote a paper of that title:

W. Wulf and M. Shaw. Global variables considered harmful. Sigplan Notices, 8(2):28--34, February 1973. 17.

Since there is no reason to use global variables (except for the basic program state being manipulated), my rule is that no dialog I ever write will use a global or static variable, with the possible exception of a table labeled as a static const table and declared in the source module of the dialog.

Note that many programmers show a proclivity to stuff all sorts of variables into the CWinApp-derived class or the CMainFrame class, and sprinkle constructs like

((CMyApp *)AfxGetApp())->somevariable

throughout the program. This is functionally identical to the use of global variables. Unfortunately, they rationalize this as "I'm not using a global variable, I'm using a member variable of a class!" This is arrant nonsense. The real issue is the encapsulation of information, and statements like the one above demonstrate a profound violation of the concept of encapsulation. Having a document, a view, a dialog, or whatever sharing some variable in the CWinApp-derived class violates all sorts of principles of encapsulation, and as such is no better than the use of a global variable.

Dialogs which need to obtain values from, or communicate values to, their parent can easily do so by using SendMessage. Read my essay on messaging for more information. This in particular applies to modeless dialogs. A toolbox-style dialog, that is, one that is supposed to apply to the current view, I implement by having it SendMessage to the mainframe:

AfxGetMainWnd()->SendMessage(...);

and have the mainframe do a GetActiveView() to forward the message.

in the message map
ON_REGISTERED_MESSAGE(UWM_SOME_VALUE_CHANGED, OnSomeValueValued)

...the handler
LRESULT CMainFrame::OnSomeValueChanged(WPARAM wParam, LPARAM lParam)
   {
     CView * view = GetActiveView();
     if(view == NULL)
       return 0; // or other appropriate value indicating failure
     return view->SendMessage(UWM_SOME_VALUE_CHANGED, wParam, lParam);
   }

Note that I can't do this in the dialog because GetActiveView is a member of CMainFrame. This also makes the dialog easily reusable, since the only thing a client needs to know is that it sends a message to the mainframe. Note my preference for Registered Window Messages, described in my essay on message management.


Manipulating or calling members of another window class from within a window class

Severity: poor style, serious

This is a very common practice, and as far as I can tell, it evolves from the C attitude that "I can do anything I want to any data structure at any time I feel like it". This is an unhealthy attitude to bring to OO programming. The key question is, why would you ever need to manipulate the member variables or call the member methods of sibling or parent window class? Hint: there are no right answers. The only time you want to manipulate variables in general (and there are people who are more fanatic than I on this, who assert that this is even too much) is to set the public members of a CDialog class before doing a DoModal() call. They argue, and I philosophically agree with them, that these should not be public variables, but parameters to the constructor. And if they need to be queried, access functions should be provided to be called after DoModal() returns. We see exactly this approach in many MFC classes such as CFileDialog. However, Microsoft's ClassWizard is not especially programmer-friendly, and does not provide us the support we need to do this easily. It also makes sense to be able to call the methods of child controls. But that's as far as it makes sense to go. Reaching into the parent is poor style. Manipulating variables in the CWinApp is inexcusable. Repeat after me: "modularity is good. modularity is good. modularity is good!"

The inter-window communication mechanism is messages. In general, you shouldn't even need a specifically-stored pointer to a window class to send a message. A dialog would query state by doing a GetParent()->SendMessage(...) which is clean because the dialog then becomes entirely self-contained.

In general, windows are independent entities and have no business knowing about each other in any detail, and certainly don't need to know anything about the internal implementations of each other. Communication is via well-specified interfaces. For example, a dialog may mention a set of public variables which should be initialized before DoModal is called and a set of public variables (perhaps the same set) from which values can be obtained after DoModal is completed. Other than that, nothing about the implementation needs to be known, particularly such nonsense as indexes into ListBoxes that don't exist, which is what UpdateData does (see my essays on "Avoiding UpdateData" and "Who Owns the GUI?"). Otherwise, if a dialog needs to make a real-time query of its parent, it should use SendMessage, and if it needs to set a value, it should use SendMessage.

Views do not need to know about each other. In fact, it is inconceivable to me that two views should know any details of each other, even that they exist. If you do something in a view, the only reason another view needs to know about it is because the first view has changed something. And that's what a CDocument-derived class is for. If a view makes a change, it notifies the document, and the document undertakes to notify any other views. Or the view can simply ask the document to notify other views, using UpdateAllViews. All views (if any others) will be notified, but the view that is doing the notification doesn't have to know that any other views even exist.

One objection usually turns out to be something along the lines of "Well, I can write it syntactically, so why shouldn't I be allowed to do it?" Sure, and you can program in assembly code, too, but that doesn't mean that it makes sense. Good OO programming is a discipline.

I elaborate a lot more on this set of problems in my companion essay on the design of dialogs and controls.


Manipulating the controls of a dialog from outside the dialog

Severity: (exceptionally) poor style, serious

I cannot imagine how this could ever make sense. A dialog is an object, and the controls represent an internal implementation detail of the class. So nothing outside that class should know its implementation details. It is true that the ClassWizard creates the control member variables as public, but I consider this a deep blunder representing an inadequate design of ClassWizard (it isn't the worst or most egregious flaw in ClassWizard). I've seen people trying to reach into modeless dialogs from outside it calling things like

(CEdit *)(modeless->GetDlgItem(IDC_NAME))->SetWindowText(s);

which is about as bad as you can get; second worst is something like this, executed by a child dialog:

(CEdit *)(GetParent()->GetDlgItem(IDC_NAME))->GetWindowText(s);

These represent depths of tastelessness in OO programming that set my teeth on edge. For example, this style presumes that you know the implementation of the object being manipulated. So if you change the implementation for some reason, it means that places in your program you never heard about now have to be changed, violating every known principle of good, modular design. For example, see my companion essay "Who owns the GUI?".

For example, suppose I create a derived class based on a standard control, such as CEdit or CListBox. I then override methods of these controls. For example, my ListBox class that automatically maintains its horizontal scrollbar is described in another essay. I override InsertString, AddString, ResetContents, and DeleteString. Now suppose you want to add something to this ListBox. Instead of notifying the dialog to add something, so it correctly calls CHListBox::InsertString, you have done

((CListBox *)dlg->GetDlgItem(IDC_SOMELIST))->AddString(s);

which will call CListBox::AddString, not CHListBox::AddString

So, you say, suppose I do it "right" and access the correct member variable, which is already bound to the correct type? Wouldn't this solve the problem?

dlg->SomeList.AddString(s);

That solves one problem, but now it means that the dialog contents can be modified in ways that are not expected. Suppose I am using an owner-draw ListBox, and I want to modify the structure that is used. Now I have to locate all the places inside my program where that operation is performed, and modify all of them as well! If I did it only in my dialog, I would have the control localized to exactly one place. This is why I frequently make such structures protected within the dialog.

The correct specification of this is to have a method which is invoked to perform the data setting or retrieval; it is up to that method to decide, within the confines of the dialog, how to implement the functionality that effects the desired change or retrieval. Nothing outside the dialog needs to know how this is done. The cleanest method by far is to use SendMessage as the mechanism. So I would do something like

modeless->SendMessage(UWM_GET_NAME, buffer, size);

where I would write the specification

/*******************************************************************
*                            UWM_GET_NAME
* Inputs:
*        WPARAM: (WPARAM)(LPTSTR) pointer to a buffer to be filled
*        LPARAM: (LPARAM)(DWORD)  the size of the buffer
* Result: LRESULT
*        Logically void, zero, always
* Effect:
*        Fills the buffer with the name supplied by the user
*******************************************************************/
#define UWM_GET_NAME see_my_essay_on_messaging

This defines a clean modular interface between the dialog and its clients.

Think of messages being a clean way to invoke virtual methods, and you have the right idea.

However, this leads to another of the horrors: simulating control messages from outside the dialog by sending messages to the dialog. This is never sensible. For example, someone wants to get the effect of clicking the "Do Something" button. They will code

dialog->SendMessage(WM_COMMAND, 
                    (WPARAM)MAKELONG(IDC_DO_SOMETHING, BN_CLICKED), 
                    (LPARAM)dialog->GetDlgItem(IDC_DO_SOMETHING));

(that is, assuming they can actually read the documentation; most do something far sillier:

dialog->SendMessage(BN_CLICKED, 0, 0)

or something equally bad). This pathetic attempt at notification is in the same class as manipulating the controls. It is just the wrong approach. Note that the sender had to know the control ID of the control, the type of the control, etc. These are all violations of object integrity.

The only correct solution is to use SendMessage to send a user-defined message to the dialog. Then the dialog decides how that message is handled. It might call the same functions as the button-click handler; it might not. It is none of your business, as the initiator, how this is implemented.

The correct, and for all practical purposes, the only correct, method of requesting a dialog, CFormView, or other such window perform something is to send it a user-defined message. For example, SendMessage or PostMessage. As an example, 

dialog->SendMessage(UWM_SOME_USER_MESSAGE);

with the possible addition of a WPARAM or LPARAM value.

One of the nicer features of VS7 (Visual Studio .NET 2002 and Visual Studio .NET 2003) is that you can declare the variables protected. You should declare all member variables, especially control variables, and all message handler functions, protected. The only public variables to a dialog are those you use to pass information into and out of the dialog; and by "pass into" I mean "those variables whose values must be set before creating the dialog", and by "pass out of" I mean "those variables whose values contain information to be retrieved after the dialog has terminated". They are never used to communicate information to anyone during the lifetime of the dialog, and particularly they are never accessed by any child dialogs.


Drawing on the dialog

Severity: insanely poor style, hard to get working (if ever), easy to break

Many people are tempted to draw things on the dialog, by overriding the OnPaint handler in their CDialog-derived class. Assume that for all practical purposes this is always a mistake. As far as I can tell, the only construable excuse might be to use a bitmap image as the background for the dialog, and even them I'm deeply suspect of this excuse.

Don't do it! Don't even bother trying! Else you will end up posting questions of the form "I tried to draw ... on my dialog, and it ....<description of numerous unexpected results>". These usually include random disappearances, solid disappearances, flickering, ovewriting other controls, and a few other things I can't recall at the moment. Not that it matters--the problems go away when you give up trying to write on the dialog.

If you need to draw on a dialog, create a static control, and draw in that. This is always the simplest answer. For example, I once needed to draw arrows that went from box to box on a dialog. To do this, I simple created a CArrow subclass, and in its OnPaint handler I used 1 bit to indicate its direction (N/S, E/W--all arrows are only horizontal or vertical), and two bits to indicate which ends got arrowheads (00 - neither, 01 - N/E, 10 - S/W, 11 - both ends). The 00 combination gave me "thick lines" so I could draw L-shaped and other non-simple arrows. Not once did I draw anything on the surface of the dialog.

I've never tried to draw on a dialog, and I've never had problems. Every week or two someone posts a question on the newsgroup that starts out with that "I tried to draw ... on my dialog" and then describes some bizarre behavior. Patient: "Doctor, it hurts when I do this" Doctor: "Then don't do that!". The easiest way to solve a problem is to avoid choosing a way that is going to fail, and use a way that is known to work, is easier to use, has no known problems, and is utterly reliable.

Even to do the background of a dialog, I simply put a static control with the WS_CLIPSIBLINGS style down, and put the background bitmap in it. I've even done it using OnPaint and StretchBlt to fill the dialog area, which works pretty well for most situations, but looks really bad if you have a resizable dialog that can get very much smaller or very much larger than the background bitmap, but this is a defect in StretchBlt and doing your own painting on the dialog won't make it go away.


Using GetDC

There is rarely a need to use GetDC. The need is so rare and unusual that it is safe to assume that if you are writing GetDC( ), you have made a mistake. Part of the reason for this is that the number of times I see people write GetDC( ) without writing ReleaseDC is horrendous.

Generally, you don't need to get a DC at all. Certainly if you think you need to get a DC for drawing, you should be immediately suspect. If you are doing "rubber-band" objects (lines, selection rectangles, etc.) you need a DC other than in the OnPaint handler, and occasionally you may need to compute a text width, but the rest of the time, ask yourself, "Do I really need a DC?" If the answer is "Yes, of course, to paint" and you aren't doing rubber-banding in an OnMouseMove handler, you have probably made a fundamental error. If you are not computing some text width somewhere, the answer again is that you have probably made a fundamental error.

Given that you need a DC, the correct way in MFC to do it is to use the CClientDC data type, e.g. instead of writing

    CDC * dc = GetDC();
    ... do something with DC
    ReleaseDC(dc);

you should be writing

    CClientDC dc(this);
    ...do something with DC

Note that no release is required because the destructor of the CClientDC handles the release.


Thinking you have one monitor

Severity: deeply suspect

OK, I confess: I've done this far too much. Back in the Bad Old Days of single monitors, you could tell where something was by simply looking at the coordinates of the window. If they were negative, you were off-screen somewhere. If any of them exceeded ::GetSystemMetrics(SM_CXSCREEN) and/or ::GetSystemMetrics(SM_CYSCREEN) meant you were offscreen in the other direction. I've even heard that people will "move a window offscreen" instead of using the more sensible ::ShowWindow(SW_HIDE) by setting the coordinates to be large enough in some direction to force the window offscreen. And the other thing I got wrong was fixing a different bug: if you store the window placement somewhere, such as in the Registry, and the screen resolution is changed (usually to a lower value), the window will end up offscreen. This results in a window being unintentially invisible if its former coordinates were no longer on-screen given the new resolution, and often it was nearly impossible to force the window back on-screen. So I'd compare the coordinates of the window to the coordinates of the screen, and if it was out of range, I'd force it onscreen.

Alas, it ain't so any longer. Multiple-monitor support has been with us for several years, and it is now time to abandon the old ideas and move on with the rest of the world.

 


Having an #include of the app file in every module

Severity: poor style, inefficient (development)

Unfortunately, the ClassWizard does this for you. I consider this one of the numerous serious defects in the ClassWizard. There are virtually no modules in your program, except the CWinApp-derived module and the CMainFrame-derived module, that need access to the application object (many components may invoke methods of the generic CWinApp class such as LoadCursor and such, but they do not need the CWinApp-derived header file included). The inclusion of this file makes it impossible to reuse the class in any other app. In general, I try to immediately remove this #include. Mostly I can just remove it. However, in many situations all I need to replace it with is

#include "resource.h"

which is all you usually need.

Not only does the presence of this file tempt you to access member variables and methods of the CWinApp class, which is usually poor style, but it also means that every time you happen to modify the CWinApp class, every module that includes its header has to be recompiled, whether the module needs the definitions or not, which is simply stupid. If you aren't using anything in the module, it should not have a dependency. This links to the notion of "global header files" which is equally sloppy.


Global (or "umbrella") header files

Severity: unbelievably poor style

There is an extremely unfortunate habit that many programmers develop, which is to dump every possible definition of every possible variable into a single header file, with some name like "global.h" or "all.h" or something like that. This technique has offended me for decades. I always said this was a really stupid thing to do, because it means that every change you make recompiles nearly everything. I was generally told I was just flaming.

Now I have proof. If you want to see it, get a copy of Ellen Borison's PhD dissertation, 

“Program Changes and the Cost of Selective Recompilation”, PhD Dissertation, Department of Computer Science, Carnegie-Mellon University, May, 1989.

In this she demonstrates beyond the shadow of any doubt that this technique is always a poor methodology.

A module should include exactly those header files it requires, and no more. There are two methods that can be adopted. One is to do all the #includes "flat" style, that is, no header file includes any other header file. A slightly cleaner method is that a header file which requires that some other header file be included do the #include itself, and that all #include files have some sort of protection against multiple inclusion.

In addition, there is a tendency to say "well, since I have included these six files in every compilation unit, I really need to make a single #include that contains all six files". OK, maybe that makes sense. Just barely. But that is the start of the "slippery slope" effect, wherein you then add a seventh, and an eighth, and a twenty-second, and a thirty-fifth, and pretty soon every header file you have is defined in that header file and no matter what you do, everything recompiles. This is lousy programming style. It introduces gratuitous interdependencies that do not exist. It also tempts you to using sloppy programming style; since you have all those definitions, you just concatenate accesses to get such grotesquely ugly expressions as

BOOL changed = (CEdit *)((CMyApp *)AfxGetApp())->modeless->(GetDlgItem(IDC_NAME)))->GetModify();

Don't laugh. I saw this in one program! It was memorable. Read my sections in this essay on avoiding global variables and manipulating dialog controls to see why this incorporates some of the grossest programming I have ever encountered. 

If you need a global function (one whose behavior is defined entirely by its input parameters, for example), just create a file to hold it, create a header file to define it, and include that header file in the very few modules that use the function. In the presence of modern browsing technology (such as the Class View of VC++), there is no loss of ease of use, and you don't end up with modules which have every sort of mishmash of functions in them. A good programmer I knew years ago said "You know you are in trouble if you write a file called "util.c" that holds all your "utility" functions. You probably are clueless about program structure". I learned from that, and have never since written such a file, independent of what name it might have. That was something like 25 years ago.


Reaching across classes

Severity: potentially fatal

There is a perennial set of blunders that start out with a question of the form

"I want to access the contents of my [document, view, dialog] from the mainframe. I am doing ... and it doesn't work. What am I doing wrong?"

This is usually accompanied by some grotesque code example such as

CEdit * data = (CEdit *)(&my_View->m_InputName)

or something equally appalling. There is no excuse for writing such execrable code. If you have something that involves a view, put the code in the view. Use messages to transmit information if necessary. For example, I might write

CView * view = GetCurrentView();
CString * name = NULL;
if(view != NULL)
    name = (CString *)view->SendMessage(UWM_QUERY_NAME);

although it is more likely I would not do anything in the mainframe that required such precise information from a view. You should neither know nor care about the type of the view. A view which understands the message will respond to it. One which does not will ignore it, and you will get NULL or 0 or FALSE back. Maintain your abstractions!

In a sane and responsible world, control variables would have been declared protected; it is a design defect of the ClassWizard that makes them public. Similarly, a document must never reach into a view, a view should never reach into a dialog (except for setting input and retrieving output variables of a modal dialog), and so on. Just because the language allows some horrible kludge is not a justification for writing one. 

A variant of this is the one where the questioner says: "I have this custom control and it needs to call a method/access a variable of the parent class. I did the following..." usually accompanied by some ghastly example of the form

classs CSomething : public CStatic {
     public:
        CMyParent * parent;
        ...
    };

and an access of the form

parent->SomeMethod();
parent->SomeVariable = ...;
... = parent->SomeVariable;

They are often perplexed that the code does not compile. Usually some helpful soul on the newsgroup will point out that they need to #include "MyParent.h" or some equally bad piece of advice.

This usually solves the problem about the program compiling. It does not solve the more fundamental problem, that the person writing the code is totally clueless about program structure.

A dialog or child control does not need to know anything about its parent. Ever. It has a set of clean interfaces that specify its behavior, entirely in terms of the control or dialog itself. It is up to the outside world to meet the constraints of this interface specification. 

A dialog or control that knows anything at all about its parent is a design so poor that there is no salvaging it. See my essay on Dialog and Control Design.


Overriding CWinThread::Run or CWinApp::Run

Severity: Almost always poor style. Abysmally poor style

Every time I have seen someone override CWinThread::Run (which is CWinApp::Run, because CWinApp is a subclass of CWinThread) it has represented something that could have been done more readily by another mechanism. Simpler, faster, easier, and more reliable code would result. It is almost always the result of failing to understand how MFC works, and kludging in something based upon having once read a book on pure Win32 programming.

For example, people who create UI threads, that override CWinThread::Run and put an infinite loop inside, have missed the fact that they should have used a worker thread, and not a UI thread (a similar mistake is made if they put the infinite loop in their InitInstance handler and return FALSE when it exits). There was no reason to use a UI thread!

Another common failure is to create a thread, and override CWinThread::Run to have a GetMessage or PeekMessage loop. There is no point to this! That's what the Run method already does! And it can be safely assumed that the loop is probably wrong. For example, a PeekMessage loop which has no blocking condition merely creates a mechanism for consuming 100% of the CPU.

What is the proper solution?

First, don't use a UI thread if you meant to use a worker thread. This eliminates a number of failure modes.

One excuse is "I need to poll for incoming events, but do a loop that is doing something useful otherwise". There is already a better mechanism for doing this, the OnIdle handler. What you do in the OnIdle handler is part of a computation; for example, in one application I got, each invocation of OnIdle tested a flag to see if a report was being generated. If one was, it generated one line of the report and returned TRUE. It kept this up until the report was completed (or cancelled). If the report completed or was cancelled, the pending-report flag was cleared; if this flag was cleared, it returned FALSE. This was a way to get "background processing" in a 16-bit Windows app. In Win32, this would actually be better handled by firing off a thread to create the report.

Another excuse is "I need to process a message to cancel the loop", which is silly, because there is no need to process a message to cancel a loop. Read my essay on worker threads, including the section on thread cancellation.

I've seen even weirder structures, so convoluted that even trying to explain them here would take more effort than I'm willing to expend. They were all wrong. In many cases, the program simply never worked correctly, because the whole mechanism that was used was so deeply flawed that any illusion that it worked was just that: an illusion. Under load conditions, the mechanism collapsed under its own weight.

I've even seen someone put a TranslateMessage/DispatchMessage handler in the middle of a CWinThread::Run, for a thread that had no windows! And wondered why this was not handling thread messages (duh! Thread messages aren't associated with windows! That's why they're called thread messages). Exactly how it was going to handle them escapes me.

To handle thread messages, create a WM_APP-based message, or better still, a RegisterWindowMessage value, and use that in an ON_THREAD_MESSAGE or ON_REGISTERED_THREAD_MESSAGE entry in the Message Map. See my essay on Message Management for more details on this, as well as my essay on UI threads.

As far as I can tell, there is no need to ever override CWinThread::Run. Assume, the instant you decide to do this, that you are wrong. That assessment is correct so often that the times that overriding CWinThread::Run would be correct are measured in events-per-decade. Except that in nearly a decade of MFC programming, I have yet to find a reason to ever override CWinThread::Run. And I've done some pretty sophisticated UI threads.

Summary

I'm sure that I've manage to offend nearly everyone who has read this with some opinion or the other, because I've slammed your favorite way of doing things. That's fine. I want you to think about what you are doing. Programs have the unfortunate characteristic that just because you can do something in a certain way, you have not proven that is the most effective way to do things. Often programmers invent idioms which they use over and over because they once carefully figured out how to reach inside abstractions and extract what they need; this does not make that programming methodology clean or effective. Just because "it works" does not make it "good". 

I work in a world in which I not only write code, but I have to maintain it a year or two after I wrote it. I cannot build "fragile" programs that might break. They are often maintained (perhaps usually maintained) by unskilled labor, which includes myself a year or two later. I cannot afford to do things that make simple changes complex.

I use multithreading a lot. I have to be exceedingly careful to get it right; any technique that is error prone is eventually fatal. I may not see conditions arise during testing that arise in normal use, and there can be thousands of copies of my program out there being used. Anything that is deadlock-possible will deadlock, and this reflects badly on my clients' products, and in turn reflects badly on me. I am in the habit of delivering robust applications that I can tackle modifications on after having seen tens of thousands of lines of other code for months. Adopting fairly rigid styles which are, if not impervious to changes under maintenance, are extremely insensitive to changes under maintenance, is critical (I prefer impervious when I can achieve it).

So this essay not only captures techniques which I use in my ordinary programming, but also captures a set of bugs, stylistic errors, and fragilities I encounter in nearly every program I am given to maintain. It is very sad, but most programs I get are fairly poorly written (probably because the well-written ones don't need the kind of repairs I can give, sort of like healthy people not needing to see physicians as often as those with medical problems--I remember a pre-Monty-Python radio program ["I'm Sorry, I'll Read That Again"] in which John Cleese said something along the lines of "I'm tired of being forced to look into people's filthy mouths! I'd like to fix something nice and clean, like a broken arm, someday" and the reply "But you're a dentist". So I sometimes have that feeling, too). So I've learned to look for these problems when I get symptoms that they would cause, and I find them. And I fix them. 

But it is a whole lot easier to not commit these crimes against good programming in the first place.

[Dividing Line Image]

The views expressed in these essays are those of the author, and in no way represent, nor are they endorsed by, Microsoft.

Send mail to newcomer@flounder.com with questions or comments about this web site.
Copyright © 2002-2003 FlounderCraft Ltd./The Joseph M. Newcomer Co. All Rights Reserved
Last modified: May 14, 2011