Windows 7

ClearType in Windows 7

One of my big pet peeves with ClearType prior to Windows 7 was that it only anti-aliased horizontally with sub-pixels. This is great for small fonts, because at such a small scale traditional anti-aliasing has a smudging effect, reducing clarity and increasing the font’s weight. For large fonts, however, it introduces some very noticeable aliasing on curves, as best seen in the ‘6’ and ‘g’ here:

"" rendered with GDI

You’ve probably noticed this on websites everywhere, but have come to accept it. Depending on your browser and operating system, you can probably see it in the title here. This problem is solved in Windows 7 with the introduction of DirectWrite, which combines ClearType’s horizontal anti-aliasing with regular vertical anti-aliasing when using large font sizes:

"" rendered with DirectWrite
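To make the horizontal-versus-vertical distinction concrete, here is a minimal sketch of how one pixel of black-on-white text could be shaded. It is a simplified model, not what ClearType or DirectWrite actually do internally (real ClearType also filters across neighbouring sub-pixels to limit colour fringing). The names `Rgb` and `shadePixel` are hypothetical. Each row of input holds the glyph coverage of the pixel's red, green, and blue stripes; passing one row models horizontal-only sub-pixel anti-aliasing, while passing several rows averages them, which is the regular vertical anti-aliasing added for large text.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical pixel type for this sketch: one value (0..255) per channel.
struct Rgb { int r, g, b; };

// Shades one pixel of black text on a white background.
// Each entry in `rows` is one vertical sample of the pixel and holds the
// glyph coverage (0.0 to 1.0) of its red, green and blue stripes.
// One row  = horizontal sub-pixel anti-aliasing only.
// Many rows = the same, plus plain vertical anti-aliasing by averaging.
Rgb shadePixel(const std::vector<std::array<double, 3>>& rows)
{
    assert(!rows.empty());

    std::array<double, 3> coverage = {0.0, 0.0, 0.0};
    for (const auto& row : rows)
        for (int i = 0; i < 3; ++i)
            coverage[i] += row[i] / rows.size();

    // Full coverage gives black (0), no coverage gives white (255).
    auto channel = [](double c) {
        return static_cast<int>(std::lround(255.0 * (1.0 - c)));
    };
    return {channel(coverage[0]), channel(coverage[1]), channel(coverage[2])};
}
```

With a single row, an edge that only half covers the green stripe changes just the green channel. With two rows where the glyph covers only the top half of the pixel, all three channels come out mid-grey, which is the softening you see on shallow curves.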

Of course, DirectWrite affects more than just Latin characters. Any glyphs with very slight angles will see a huge benefit, such as hiragana:

"まこと" rendered with GDI and DirectWrite

Unfortunately, this isn’t a free upgrade. For whatever reason, Microsoft didn’t make the old GDI functions use DirectWrite’s improvements, so to take advantage of them all your old GDI and DrawText code will need to be rewritten against Direct2D and DirectWrite. An old WM_PAINT procedure like this:

PAINTSTRUCT ps;
HDC hdc = BeginPaint(hwnd, &ps);

HFONT font = CreateFont(-96, 0, 0, 0, FW_NORMAL,
                        0, 0, 0, 0, 0, 0, 0, 0, L"Calibri");

SelectObject(hdc, (HGDIOBJ)font);

RECT rc;
GetClientRect(hwnd, &rc);

DrawText(hdc, L"", 9, &rc,
         DT_LEFT | DT_TOP); // format flags are an assumption

EndPaint(hwnd, &ps);

Will turn into this:

ID2D1Factory *d2df;

D2D1CreateFactory(D2D1_FACTORY_TYPE_SINGLE_THREADED,
   __uuidof(ID2D1Factory), 0, (void**)&d2df);

IDWriteFactory *dwf;

DWriteCreateFactory(DWRITE_FACTORY_TYPE_SHARED,
   __uuidof(IDWriteFactory), (IUnknown**)&dwf);

IDWriteTextFormat *dwfmt;

dwf->CreateTextFormat(L"Calibri", 0, DWRITE_FONT_WEIGHT_REGULAR,
   DWRITE_FONT_STYLE_NORMAL, DWRITE_FONT_STRETCH_NORMAL,
   96.0f, L"en-us", &dwfmt);


RECT rc;
GetClientRect(hwnd, &rc);

D2D1_SIZE_U size = D2D1::SizeU(rc.right - rc.left,
                               rc.bottom - rc.top);

ID2D1HwndRenderTarget *d2drt;

d2df->CreateHwndRenderTarget(D2D1::RenderTargetProperties(),
   D2D1::HwndRenderTargetProperties(hwnd, size), &d2drt);

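ID2D1SolidColorBrush *d2db;

// Brush color is an assumption; black matches GDI's default text color.
d2drt->CreateSolidColorBrush(D2D1::ColorF(D2D1::ColorF::Black), &d2db);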

D2D1_SIZE_F layoutSize = d2drt->GetSize();
D2D1_RECT_F layoutRect = D2D1::RectF(0.0, 0.0,
   layoutSize.width, layoutSize.height);

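// Direct2D drawing calls must sit between BeginDraw and EndDraw.
d2drt->BeginDraw();
d2drt->DrawText(L"", 9, dwfmt, layoutRect, d2db);
d2drt->EndDraw();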

This is no small change, and considering this API won’t work on anything but Vista and Windows 7, you’ll be cutting out a lot of users if you specialize for it. While you could probably make a clever DrawText wrapper, Direct2D and DirectWrite are really set up to get you the most benefit if you’re all in. Hopefully general libraries like Pango and Cairo will get updated backends for it.

DirectWrite has other benefits too, like sub-pixel rendering. When you render text in GDI, glyphs will always get snapped to pixels. If you have two letters side by side, it will choose to always start the next letter 1 or 2 pixels away from the last—but what if the current font size says it should actually be a 1.5 pixel distance? In GDI, this will be rounded to 1 or 2. This is also noticeable with kerning, which tries to remove excessive space between specific glyphs such as “Vo”. Because of this, most of the text you see in GDI is very slightly warped. It’s much more apparent when animating, where it causes the text to have a wobbling effect as it constantly snaps from one pixel to the next instead of smoothly transitioning between the two.

DirectWrite’s sub-pixel rendering helps to alleviate this by doing exactly that: glyphs can now start rendering at that 1.5 pixel distance, or any other point in between. Here you can see the differing space between the ‘V’ and ‘o’, as well as a slight difference between the ‘o’s with DirectWrite on the right side, because they are being rendered on sub-pixel offsets:

"Volcano" close-up comparison with GDI and DirectWrite
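The rounding drift described above is easy to see with a little arithmetic. The following is a minimal sketch under a simplified model: it only tracks where each glyph starts along a line, and it assumes pixel snapping means rounding each glyph's advance width to a whole pixel. The function name `penPositions` is hypothetical and this is not how either GDI or DirectWrite is implemented.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns the x position (in pixels) at which each glyph starts, given the
// ideal advance width of every glyph.
// snapToPixels = true  : each advance is rounded to a whole pixel first,
//                        a simplified stand-in for GDI-style snapping.
// snapToPixels = false : advances are used as-is, so glyphs may start
//                        between pixels (sub-pixel positioning).
std::vector<double> penPositions(const std::vector<double>& advances,
                                 bool snapToPixels)
{
    std::vector<double> starts;
    starts.reserve(advances.size());

    double pen = 0.0;
    for (double advance : advances) {
        assert(advance >= 0.0);
        starts.push_back(pen);
        pen += snapToPixels ? std::round(advance) : advance;
    }
    return starts;
}
```

For four glyphs that each want to advance 1.5 pixels, the snapped version starts them at 0, 2, 4, and 6, while the fractional version starts them at 0, 1.5, 3, and 4.5. By the fourth glyph the snapped text is already a pixel and a half wider than the font asked for, and that error is what shows up as warping and wobble.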

The difference between animating with sub-pixel rendering and without is staggering when we view it in motion:

"Volcano" animation comparison with GDI and DirectWrite

Prior to DirectWrite the normal way to animate like this was to render to a texture with monochrome anti-aliasing (that is, without ClearType), and transform the texture while rendering. The problem with that is the transform will introduce a lot of imperfections without expensive super-sampling, and of course it won’t be able to use ClearType. With DirectWrite you get pixel-perfect ClearType rendering every time.

Apparently WPF 4 is already using Direct2D and DirectWrite to some degree, and hopefully high-quality text will be integrated into Flash in the future. Firefox has also been looking at adding DirectWrite support, but I haven’t seen any news of WebKit (Chrome/Safari) or Opera doing the same. It looks like Firefox might actually get it in before Internet Explorer. Edit: it looks like Internet Explorer 9 will use DirectWrite. I wonder which will go gold with the feature first?

Direct2D and DirectWrite are included in Windows 7, but Microsoft has backported them in the Platform Update for Windows Server 2008 and Windows Vista so there’s no reason people who are sticking with Vista should be left out. Are there people sticking with Vista?

Windows 7 to support non-OEM CableCARD

TV-on-PC users rejoice! CableCARD support is finally coming to PC expansion cards available through retail channels.

Windows has long used the Broadcast Driver Architecture (BDA) to communicate with TV tuner cards, but the folks in charge of CableCARD had a major problem with it: there's no DRM support. Because of this they forbade selling any add-on cards alone, and any TV tuners you could buy would only work with analog or ClearQAM (unencrypted) channels, which typically means low-def or local channels only. The only way to get CableCARD support on a PC was to buy a full OEM setup that included the tuners.

One of the new features in Windows 7 is the new PBDA (Protected BDA) API which, you guessed it, supports DRM. With PBDA, WDDM, and HDCP, the signal can be protected from the tuner all the way to the monitor. Microsoft kept quiet and avoided acknowledging any questions about it during the test, but many testers speculated it would be part of a bigger push from Microsoft to open up CableCARD add-on support, and it turns out we were right. I wouldn't be surprised to see announcements of new hardware from Hauppauge and other tuner manufacturers.

I watch a lot of TV—usually in the form of a small box in the corner of the screen while I'm coding, so I've got plenty of time. I currently have two Hauppauge HVR-2250 cards for a total of four tuners. This works great for my local channels like NBC and FOX but there are always some shows I like on cable channels, so I'll be looking forward to some of the new hardware, like Ceton's new 6-tuner CableCARD behemoth.

Windows 7 is RTMed

After a week of speculation, it's finally been confirmed. Today, 7600 was signed off as the final RTM build for Windows 7.

Feature-wise, Windows 7 is a compelling evolution. It fixes a lot of the issues people had with Vista and adds a number of great user-, IT-, and developer-focused features. Things like Direct2D and GDI improvements, User Mode Scheduling, improved NUMA support, improved concurrency, SSD support, and improved power management will all work together to provide higher performance compared to previous OSes. Libraries, greater multimedia support (such as AAC and AVC), mouse gestures, Media Center, and a completely redesigned taskbar provide a better user experience. I think this is definitely the best Windows to date -- better than XP, and better than Vista.

Testing Windows 7 was a very frustrating experience. In contrast to previous betas, where we got a regular stream of builds to test, in Windows 7 we got only two: Beta 1 and the RC. A lot of us had our bugs marked as not reproducible in internal builds, with no way to check whether that was true. Worse yet, shortly after the RC came out many of us had a lot of bug reports disappear when Microsoft told us not to report any bugs that didn't cause the OS to bluescreen or fail installing, so there may well be a large number of unfixed cosmetic and usability issues in the RTM.

Instead, Microsoft created a much smaller team of special testers called Test Pilots who, along with TAP partners, would be the ones to get interim builds and provide the majority of the useful feedback. I'm not sure who this team was made up of, but I would guess they are testers from past betas who chose to devote most of their waking hours to testing.

This triggered something I'd never expected to see—somewhat of a revolt among testers who felt that their feedback was doing nothing. Morale went down, bug reports stopped coming in, and a lot of heated discussion happened between testers. Even the die-hard testers realized something was wrong, some of them feeling the need to mark their discussions to differentiate them as a "proud" tester.

Some believe Steve Sinofsky (who replaced Jim Allchin as the head of the Windows division) is the reason for this total restructuring of the Windows beta, but as far as I know nothing of the sort has been confirmed. Either way, with Microsoft seemingly frustrated at our performance and our frustration at not being able to test properly, it feels like we were of little use this time around despite submitting a large number of bugs. I would not be surprised if the tech beta gets scrapped entirely for Windows 8.

User Mode Scheduling in Windows 7

Don’t use threads. Or more precisely, don’t over-use them. It’s one of the first things fledgling programmers learn after they start using threads. This is because threading involves a lot of overhead. In short, using more threads may improve concurrency, but it will give you less overall throughput as more processing is put into simply managing the threads instead of letting them run. So programmers learn to use threads sparingly.

When normal threads run out of time, or block on something like a mutex or I/O, they hand off control to the operating system kernel. The kernel then finds a new thread to run, and switches back to user-mode to run the thread. This context switching is what User Mode Scheduling looks to alleviate.

User Mode Scheduling can be thought of as a cross between threads and thread pools. An application creates one or more UMS scheduler threads—typically one for each processor. It then creates several UMS worker threads for each scheduler thread. The worker threads are the ones that run your actual code. Whenever a worker thread runs out of time, it is put on the end of its scheduler thread’s queue. If a worker thread blocks, it is put on a waiting list to be re-queued by the kernel when whatever it was waiting on finishes. The scheduler thread then takes the worker thread from the top of the queue and starts running it. Like the name suggests, this happens entirely in user-mode, avoiding the expensive user->kernel->user-mode transitions. Letting each thread run for exactly as long as it needs helps to solve the throughput problem. Work is only put into managing threads when absolutely necessary instead of in ever smaller time slices, leaving more time to run your actual code.
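The queueing described above can be sketched as a toy simulation. This is not the Windows UMS API and it creates no real threads: `Step`, `Worker`, and `Scheduler` are hypothetical names, and each "worker" is just a function that runs one slice of work and reports why it stopped. It only illustrates the bookkeeping, with a ready queue, a waiting list, and a stand-in for the kernel re-queuing blocked workers.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Why a worker handed control back to the scheduler.
enum class Step { Yielded, Blocked, Finished };

// Hypothetical stand-in for a UMS worker thread: a name plus a function
// that runs one slice of work.
struct Worker {
    std::string name;
    std::function<Step()> runSlice;
};

// Toy scheduler: one ready queue and one waiting list, all in user mode.
class Scheduler {
public:
    void add(Worker w) { ready_.push_back(std::move(w)); }

    // Stand-in for the kernel reporting that blocked workers can run again.
    void completeAllWaits() {
        for (auto& w : waiting_) ready_.push_back(std::move(w));
        waiting_.clear();
    }

    // Runs workers from the front of the ready queue until it is empty and
    // returns the order in which slices were executed.
    std::vector<std::string> runUntilIdle() {
        std::vector<std::string> trace;
        while (!ready_.empty()) {
            Worker w = std::move(ready_.front());
            ready_.pop_front();
            trace.push_back(w.name);
            switch (w.runSlice()) {
            case Step::Yielded:  ready_.push_back(std::move(w));   break;
            case Step::Blocked:  waiting_.push_back(std::move(w)); break;
            case Step::Finished: break;
            }
        }
        return trace;
    }

    std::size_t waitingCount() const { return waiting_.size(); }

private:
    std::deque<Worker> ready_;
    std::vector<Worker> waiting_;
};
```

A worker that yields goes to the back of the queue, a worker that blocks is parked until `completeAllWaits` is called, and the scheduler never leaves its own loop to pick the next worker. In the real thing that loop is what saves the round trip through the kernel.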

A good side effect of this is UMS threads also help to alleviate the cache thrashing problems typical in heavily-threaded applications. Forgetting your data sharing patterns, each thread still needs its own storage for stack space, processor context, and thread-local storage. Every time a context switch happens, some data may need to be pushed out of caches in order to load some kernel-mode code and the next thread’s data. By switching between threads less often, cache can be put to better use for the task at hand.

If you have ever had a chance to use some of the more esoteric APIs included with Windows, you might be wondering why we need UMS threads when we have fibers, which offer similar co-operative multitasking. Fibers have a lot of special exceptions. There are things that aren’t safe to do with them. Libraries that rely on thread-local storage, for instance, will likely walk all over themselves if used from within fibers. A UMS thread, on the other hand, is a full-fledged thread: it supports TLS and has no real special cases to keep in mind while using it.

I still wouldn’t count out thread pools just yet. UMS threads are still more expensive than a thread pool and the large memory requirements of a thread still apply here, so things like per-client threads in internet daemons are still out of the question if you want to be massively scalable. More likely, UMS threads will be most useful for building thread pools. Most thread pools launch two or three threads per CPU to help stay busy when given blocking tasks, and UMS threads will at least help keep their time slice usage optimal.

From what I understand the team behind Microsoft’s Concurrency Runtime, to be included with Visual C++ 2010, was one of the primary forces behind UMS threads. They worked very closely with the kernel folks to find the most scalable way to enable the super-parallel code that will be possible with the CR.

My Windows Vista/7/8 Wishlist

These are some changes I’ve been trying to get made since Vista entered beta. Now 7’s beta has begun and the chances still look bleak. Maybe I’ll have more luck in 8?

Windows 7 Beta will be free to the public

Not part of one of the Windows 7 beta teams? On January 9th, the first 2.5 million people to visit the Windows 7 homepage will be able to download the beta for free.

I just got my copy installed a few hours ago. So far I’ve seen a few new features I like and a couple that I’m not sure about. I will blog about specifics as soon as I’m certain what I’m allowed to mention.

And so, the Windows 7 tech beta begins.

Got my invite to the Windows 7 tech beta today. The first beta won’t be out until early 2009, but from what I’ve heard the current internal copies of Windows 7 are already a pretty good improvement over Vista, in both performance and usability. Looking forward to working with all the fine folks I’ve met from the last few betas!