Time is the Simplest thing...
...which is actually the title of a science fiction story by Clifford D. Simak; my edition was published in 1961. Fairly unmemorable is my recollection, something that can't often be said about his work.
But in fact, time isn't simple, not in operating systems. The number of startlingly naive assumptions that I've seen made about time are astonishing. This essay attempts to clear things up
This is the most common misconception. There is not, nor has there ever been, an attempt to make Windows a real-time operating system. If you have the slightest belief that this is possible, forget it. Windows, like most other general-purpose operating systems, has only a vague notion of time, and accuracy is not a concept it cares terribly much about. A true Real-Time Operating System (RTOS) is designed with time as its most important parameter. Any time you believe Windows might support real-time constraints, remember to recite this mantra: "Windows is not a Real Time Operating System, Windows is not a Real Time Operating System, ..." and repeat it until you understand it.
A company called VenturCom (www.vci.com) produces a real-time OS kernel that allows Windows NT to run on top of it. They also have real-time Windows CE. This is an add-on. Note that other than knowing of the existence of this company, I know no other details other than the ones I learned talking to them at a trade show. So I neither endorse nor recommend this product; but I don't have any reason to say it is not what they claim it to be. No data. Make your own evaluation. Go to their Web site and learn everything I know.
There are many things that can control what happens with timing. Most of these apply to ordinary timing and to multimedia timers. I will state here that I know virtually nothing about the multimedia timers in Win32, but what I do know is that everything that can affect timing normally can interfere with them as well.
The first, and one of the most significant, impacts on timing is the system clock quantum. The system clock ticks at a predetermined rate, and anything that is going to happen happens when those clock ticks happen. Between clock ticks, in essence, there is no way for the system to have any idea what time it is. The clock timings for Windows are shown below.
|Operating System||System Clock|
|Windows 95, 98, Me||55 ms|
|Windows NT, Uniprocessor HAL||10 ms|
|Windows NT, Multiprocessor HAL||15 ms|
|Windows NT 4.0 Alpha||7.5 ms|
No matter what time you request, you will not get a resolution better than the above times. This means that if, on Windows 98, you request a Sleep(10) to sleep 10 milliseconds, you might find that you don't get control back for 55ms. This means your loop will execute 18 times per second, not 100 times as you might expect. The way this works, or at least a working explanation of it, is that there is a queue of events which are to be scheduled at a certain time. When the timer ticks, that queue is examined, and any event whose time has expired is marked for execution. When it next requests a sleep time, a time is computed for when it should be awakened, and the event is scheduled for that time. The next time the timer ticks, the event will be scheduled if its time has expired.
Gating error is a common phenomenon when you have discrete sampling intervals. This was first documented when the early automated event counters, used to measure nuclear decay events, were being constructed; I think this dates back to the late 1940s. It may have been known before that, but my readings on history of technology suggest that the phenomenon was certainly identified as something interesting at that time.
You see this phenomenon if your car has an odometer which records only whole miles (as many modern ones do). You live 3.3 miles from a destination. Some days when you travel there, your odometer reports 3 miles, sometimes it reports 4. Why? If it was really reading 18002.8 miles, adding 3.3 miles gives 18006.1, but you see 18002 go to 18006, so it looks like 4 miles. Other days, you might start at 18143.1 miles and adding 3.3 miles gives 18146.4 miles, so you see 18143 go to 18146, or 3 miles.
If you schedule a sleep for 10 ms more than 10 ms before the next 55-ms clock tick of Win9x, you will find that it occurs less than 55ms. In fact, applying normal statistical sampling techniques, you will find that your probability of hitting the middle of a sample will give you a normal distribution around 55/2ms, or 27.5ms. So mostly you will see 27.5ms, with a few rare instances taking only 10ms and a few rare instances taking 55ms. But this is three times slower than you expect. If you were trying to do an animation, it says that you will run no better than 18 frames per second, but the intervals between those frames can vary by a factor of more than 5, from as little as 10ms to as long as 55ms. This is not real-time by any stretch of the imagination. Essentially, no matter what interval you choose, you round up to 1/2 the quantum above it and expect that will be the best you can do, on the average. It is more pronounced at shorter intervals because the error is a larger percentage of the total time interval.
While it is a little better on NT, running between 10ms and 15ms quantization, there are other considerations that can make this unrealizable.
Saying your thread is schedulable because its wait time has completed does not imply that it will run immediately. The scheduler has to get around to running it. Even if it is the highest-priority feasible thread, it has to make its way through the scheduler.
If the thread has been idle for a while, and was the only feasible thread in the process, your process has become a candidate for being paged out. The first thing that happens when a process becomes schedulable is that its "working set", the collection of pages that are the likeliest ones to be used by the process, are guaranteed to be in memory. This can take tens of milliseconds for each page on a reasonable system. If the working set is large, it can be hundreds of milliseconds between the time the timer expires and the thread can really execute.
Threads have priorities. The Windows schedulers are very simplistic, because over the years we have found that simple schedulers work best. What this means is that on a single-processor system, the highest priority feasible thread always runs. A feasible thread is one which is not blocked on I/O, message wait, a synchronization object, or a timer wait, and all of whose pages are in memory and ready for use (that's all of the pages of the working set, not necessarily all of the pages of the process, which can be very, very large, larger than physical memory). So if your timer expires, and there is a higher-priority thread ready to run, your thread won't get scheduled. In fact, it won't even be looked at until that higher-priority thread blocks. This could take a very long time. Certainly hundreds of milliseconds is not out of the question. Note that even if the high-priority thread exhausts its timeslice, if there is no other thread to run at the same priority, that thread will be scheduled again. And if any thread can preempt it, that thread is at the same or higher priority, either preempting it because it is now its turn ("round-robin within priority level"), or preempting it because it is already a higher priority. A higher-priority thread will preempt any running thread, taking away the rest of its timeslice.
In Windows NT, there are a number of important kernel threads, all of which are running at a much higher priority than user applications. The entire file system, for example, uses high-priority kernel threads. Once one of these threads starts running, it will run "to completion", that is, until it suspends itself on a block of some sort. And your thread will wait.
There are a couple quirks in the simplistic scheduler model, primarily to give better responsiveness to users. These add to the confusion.
Any foreground process will have its threads boosted to 2 units above the background priority. This makes the program you are working on more responsive to the mouse and keyboard than a program that is running in the background, for example, checking email.
When a thread leaves an I/O wait condition, it is given a one-time-only priority boost for its first timeslice after coming out of wait. This allows it to preempt other threads. Some of the priority boosts, from the device driver header file, are shown in the table below. Note that these are not necessarily absolute boosts in priority (their significance is carefully not documented, so we don't know if they are simple integers added to the current priority or they are converted to some fractional-priority boost). But what this means, for example, is that a background program that comes out of I/O wait on an audio device is very likely to be able to exceed the priority of your foreground task that is simply doing a long computation. And moving the mouse is going to give your foreground thread, the one talking to the GUI, priority over the background thread of your foreground task that is talking to a serial port. (It doesn't know the difference between a background or foreground thread; all it knows is the thread that was waiting came out of a mouse wait, so that thread gets boosted).
This says that if you have a background task that is controlling something, that it will almost never be able to compete with the foreground task. Your scheduling delays can be arbitrarily long.
Hardware interrupts preempt everything. In addition to the interrupts themselves, the NT I/O architecture uses a mechanism called the deferred procedure call (DPC) to handle most of the I/O operations at an interruptible level. However, this interruptible level is a software level higher than all schedulable threads. So while the DPC mechanism allows NT to work with devices that have high interrupt rates and low latency tolerance, the DPC mechanism further reduces the guarantees that an application has about running.
And interrupts and the DPC can request, at least in NT, the file system threads be scheduled, adding further delays.
There was a significant defect in either the specification or the implementation of IDE 1.0-compliant devices. I'm not sure which it was. But it means that the IDE disk and CD-ROM drivers of your machine compensate for this by keeping the application program locked out for as long as 1500 ms (that's 1.5 seconds!) while they try to compensate for real or imagined hardware defects. The general advice is that if you have any need for a responsive system, you must not have any IDE devices on your machine. Use SCSI-only secondary storage, CD-ROMs, and/or DVDs.
Many programmers try to depend on WM_TIMER events coming in at fairly small intervals (a typical question shows a SetTimer with an interval of 10ms, followed by the question "Why don't I get my timer messages quickly enough?", usually in the same message stating that Win9x is the platform.
WM_TIMER messages cannot come in faster than the scheduling quantum effects. That is, just like the Sleep() calls, you will see a uniform distribution with a normal curve centered on half the quantum interval. But it is far worse than that for WM_TIMER.
There are two "special" messages in the "message queue" of a Windows application. One of these is WM_PAINT, a message which is never in the message queue. What happens is that in the window description a bit is set indicating there is a need to paint. When there are no other messages in the message queue, and no WM_TIMER to send, Windows simulates a WM_PAINT message! It is never, ever in the queue. Doing a SendMessage of WM_PAINT will have unpredictable results. This is because in the process of synthesizing the WM_PAINT message, Windows also computes the clipping region and the clipping rectangle for the OnPaint handler. And prepares to handle the BeginPaint/EndPaint requests. Without these actions, the behavior of the OnPaint handler is undefined. Hence, if you want a window painted now, you call the UpdateWindow function to force the WM_PAINT to be created and processed immediately. This is the equivalent of sending a WM_PAINT message yourself, but does all the setup and teardown correctly.
The other special message, as suggested above, is WM_TIMER. When a timer fires, some information is placed in the kernel's window object to indicate this condition. When there are no other messages in the queue, but there is a timer request pending, the WM_TIMER message is sent. This means that if you are pushing the mouse around rapidly, or you are using PostMessage, the WM_TIMER message won't be seen for an arbitrarily long time. The WM_TIMER message is synthesized "on the fly" as needed. So even after your thread is scheduled, it can take quite a long time before the WM_TIMER is seen!
I have a friend with whom I've worked in the past who does a lot of real-time computer music work. He has actually done experiments, and has found delays measured on older processors (I don't recall the speed, but it was in the 200MHz days) of up to 400ms from the time the condition at the hardware device generated the interrupt until the user application was dispatched. This is real-time if you are monitoring the temperature in a hot water heater, but not real-time if you are doing computer music. The audio perceptual system is the most precise one we have. An ordinary person can easily detect a 30ms irregularity in a "steady" drumbeat; a professional drummer can detect a 10ms irregularity. Anyone, without training, can detect a 2ms synchronization error between two sharp percussion sounds. Win9x, NT and 2000 are simply not capable of doing this from the application level, by better than two orders of magnitude.
Basically, at application level you can't. Not without a real-time add-on like the one from VenturCom. For some tasks, such as those which require fast response to interrupts, what you do is write a device driver (there are several books out on how to do this, including one I co-authored: Developing Windows NT Device Drivers, by Dekker & Newcomer). But at high sustained interrupt rates you get into other problems, such as sucking down so many CPU resources in your driver that no program can run; it doesn't matter if, in the short term, you can meet the desired latency requirements if the application never has time to compute the next data packet to send.
But even drivers can't handle timing intervals of less than the system clock quantum. The very worst possible piece of hardware (which we actually describe in the book, without identifying the turkeys responsible for its existence) required the driver take a regular 1ms interrupt to refresh the onboard RAM of the card. This is simply not possible in any general-purpose NT environment. There is a distinct danger in letting hardware engineers design products without consulting with the software people who have to deal with the consequences of the hardware design. These are the same people who design PCI cards that only require 95% of the total available PCI bus bandwidth to meet the desired data rate, and require 100MB of contiguous physical memory to store the data. Some things are not just not doable in a general-purpose operating system.
Many designers deal with this by making intelligent peripherals. Boards which contain their own CPU and which contain all the really time-sensitive code. Some of these processors are programmed using one of the many embedded-system real-time kernels. Others are just programmed "bare". In some cases, the software for the attached microprocessor is loaded into the card explicitly when the driver starts up; in other cases, it is in ROM or reprogrammable FLASH memory and is never, or rarely, changed.
There are a number of API calls to retrieve time values. Some of these have resolutions specified in the range of milliseconds, or 100s of nanoseconds. Most of these are simulations. For example, while there is an API call to give you the current time, in milliseconds, in fact it basically tells you the number of ticks that have occurred in the system timer, multiplied by the system timer interval. Thus, you won't see resolution better than 55ms on Win9x. It doesn't tell you how much of that elapsed time was actually spent doing your program. As a performance counter, it is completely meaningless.
QueryPerformanceCounter returns a reasonably accurate rendering of time, at least on Intel platforms. This is because on Intel platforms it returns the contents of a "cycle counter", converted to timing units based on the processor frequency. Because the cycle counter actually counts the number of CPU cycles, it gives a reasonable approximation of real time. However, executing the same loop twice may give different results because the presence or absence of data in the cache will change the number of cycles required to execute the program. So it is at best an approximation, and you have to do some more careful statistical analysis to figure out what is really going on.
Windows is not a real-time operating system. Windows is not a real-time operating system. Windows is not a real-time operating system.
What this means is that if you schedule a timer callback, a WM_TIMER, or a Sleep, you will see the event somewhere between the time you specified, and several weeks from now (my favorite example is the terabyte image processing algorithm which uses asynchronous, non-blocking I/O and runs at high-priority so it "finishes faster". Nothing else will run until it finishes!)
While you can often deal with some time-critical events in restricted contexts, the operating system, like Unix, Linux, and every other general-purpose OS that preceded it, makes no guarantees about real-time performance. You must not depend on this, particularly in any environment in which the failure to meet real-time constraints can cause something to fail in an unpleasant fashion.
This is not a design failure of Windows. Complaining that Windows doesn't do real-time is about as meaningful as complaining that the socket wrench you bought at the hardware store can't do a decent weld, and is lousy for driving nails. If you choose the wrong tool for the job, it is not the fault of the designer of the tool. No general-purpose operating system is a real-time system, any more than a good real-time system makes a good general-purpose operating system. Misusing a tool will only get you in trouble, and only a poor craftsman blames his tools. A really bad one chooses the wrong tools for the job, and then blames the tools.
The views expressed in these essays are those of the author, and in no way represent, nor are they endorsed by, Microsoft.