Optimizing IP range searching in PeerGuardian

I was working on something completely different last night, when an elegant idea came to mind on how to significantly speed up PeerGuardian’s IP searching. It’s funny how an idea can just pop into the mind about a problem that hasn’t been thought of in a long time.

Right now PeerGuardian uses a binary search to match IPs. This is already pretty efficient, running in ⌈log 2 N⌉—so for 100,000 IP ranges, about 16 tests need to be done. This has the additional advantage of having no memory overhead.

My idea is to use a structure similar to a B+tree, packing as many IPs and branch pointers into a cache line as possible. On today’s architectures, a cache line is typically 64 bytes, so 8 IPs and 9 branch pointers would fit on each node, making it only need to read about ⌈log 9 N⌉ nodes to find a match. So in order to find a match in 100,000 IP ranges, only about 5 nodes would need to be read.

CPUs always read and cache data in blocks (a cache line), so an algorithm that keeps this in mind to minimize memory reads and maximize cache usage should be incredibly fast. Even though this introduces significant overhead for branch pointers (about 2x the storage would be required), it should be far more efficient overall.

But this algorithm improves in another way too: branching. I’m talking in terms of branch instructions, not the branch pointers mentioned above. The fewer branches code takes, the faster a superscalar or pipelined CPU will be able to run your code. For this algorithm, an entire node could be processed (that is, comparing the IPs and determining which branch node to go into) with zero branches using integer SSE2 (PCMPGTD, PMOVMSKB), and bit-scan forward (BSF).

I can’t be sure how much of a speed difference this would make until I code it up, but I bet it would be at least 200% faster. I’ve been too busy to work on PeerGuardian for quite a while, so I don’t know if this will ever make it into PG. We’re looking for a new coder with more time on their hands.

Posted on April 16, 2009 in Coding, Optimizing, PeerGuardian, SIMD

Related Posts