January 2011 Archives

Asynchronous page faults

With I/O, we’ve got some choices:

  1. Synchronous, copying from OS cache (fread). This is the simplest form of I/O, but isn’t very scalable: every call blocks the calling thread until the data has been copied.
  2. Synchronous, reading directly from OS cache (memory mapping). This is wicked fast and efficient once memory is filled, but aside from some cases with read-ahead, your threads will still block on page faults.
  3. Asynchronous, copying from OS cache (overlapped ReadFile). Much more scalable than fread, but each read still involves duplicating data from the OS cache into your buffer. Fine if you’re reading some data only to modify the buffer in place, but not great when you’re treating it as read-only (such as to send over a socket).
  4. Asynchronous, maintaining your own cache (overlapped ReadFile on a handle opened with FILE_FLAG_NO_BUFFERING). More scalable still than buffered ReadFile, but you need to do your own caching and it’s not shared with other processes.
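
To make the difference between options 1 and 2 concrete, here is a minimal sketch that sums a file's bytes both ways. It is an illustration under assumptions, not code from any real project: it uses POSIX mmap as a stand-in for the Windows calls (CreateFileMapping and MapViewOfFile), and sum_bytes, sum_with_fread and sum_with_mmap are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Stand-in for whatever read-only processing is done on the data. */
static unsigned long sum_bytes(const unsigned char *p, size_t n) {
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Option 1: every fread copies a chunk from the OS cache into buf.
 * Returns 0 on success, -1 on failure. */
int sum_with_fread(const char *path, unsigned long *out) {
    assert(path != NULL && out != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    unsigned char buf[4096];
    unsigned long sum = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        sum += sum_bytes(buf, n);
    int failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;
    *out = sum;
    return 0;
}

/* Option 2: map the file and read the cached pages in place, with no copy.
 * Any page that is not resident is faulted in while this thread waits. */
int sum_with_mmap(const char *path, unsigned long *out) {
    assert(path != NULL && out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;
    if (len == 0) {            /* mmap rejects zero-length mappings */
        close(fd);
        *out = 0;
        return 0;
    }
    void *view = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                 /* the mapping stays valid after close */
    if (view == MAP_FAILED)
        return -1;
    *out = sum_bytes(view, len);
    munmap(view, len);
    return 0;
}
```

Both functions produce the same result. The mapped version skips the intermediate buffer, but the loop inside sum_bytes is exactly where the blocking page faults happen.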

Note that there’s one important choice missing: memory mapping with asynchronous page faults. As far as I know there are no operating systems that actually offer this. It’s kind of a dream feature of mine. Two hypothetical APIs would be enough to support it:

HANDLE CreateMemoryManager();
BOOL MakeResident(HANDLE, LPVOID, SIZE_T, LPOVERLAPPED);

CreateMemoryManager would open a handle to the Windows memory manager, and MakeResident would fill the pages you specify, returning TRUE for synchronous completion and FALSE for an error or a pending asynchronous request, like every other overlapped call. The best of both worlds: fast, easy access through memory, a full asynchronous workflow, and shared cache usage. This would be especially useful on modern CPUs that offer gigantic address spaces, where mapping entire large files is practical.
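
Without OS support, the closest user-mode approximation I can think of is to take the page faults on a helper thread and signal when it is finished. The sketch below does that with POSIX threads. Everything in it is an assumption made for illustration: resident_request plays the role of the OVERLAPPED argument, and make_resident_async, make_resident_done and make_resident_wait are hypothetical names, not part of any real API. It only moves the blocking to another thread, and nothing stops the OS from evicting the pages again afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical request record; it stands in for the OVERLAPPED argument. */
typedef struct {
    const volatile unsigned char *addr; /* start of the range to fault in */
    size_t len;                         /* length of the range in bytes */
    atomic_int done;                    /* becomes 1 once the range was touched */
    pthread_t thread;                   /* helper thread that takes the faults */
} resident_request;

/* Runs on the helper thread: read one byte per page so that any page
 * fault blocks this thread instead of the caller. */
static void *touch_pages(void *arg) {
    resident_request *r = arg;
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    if (r->len > 0) {
        for (size_t off = 0; off < r->len; off += (size_t)page)
            (void)r->addr[off];         /* volatile read forces the access */
        (void)r->addr[r->len - 1];      /* covers a trailing partial page */
    }
    atomic_store(&r->done, 1);
    return NULL;
}

/* Start faulting in [addr, addr + len). Returns 0 if the helper started. */
int make_resident_async(resident_request *r, const void *addr, size_t len) {
    r->addr = addr;
    r->len = len;
    atomic_init(&r->done, 0);
    return pthread_create(&r->thread, NULL, touch_pages, r);
}

/* Non-blocking poll: 1 once the whole range has been touched, else 0. */
int make_resident_done(resident_request *r) {
    return atomic_load(&r->done);
}

/* Block until the helper finishes, then release it. Returns 0 on success. */
int make_resident_wait(resident_request *r) {
    return pthread_join(r->thread, NULL);
}
```

A caller would start a request, carry on with other work, and only read the mapped range once make_resident_done reports 1. Burning a thread per request is the price of faking it in user mode, which is why a kernel-level MakeResident would be so much nicer.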

The memory manager already has similar functionality in there somewhere, so it might not be difficult to pull into user mode. Just an educated guess. Maybe it’d be terribly difficult. Dream feature!