C

Is C# the Boost of C-family languages?

For all the cons of giving a single entity control over C#, one pro is that it gives the language an unmatched agility to try new things in the C family of languages. LINQ—both its language integration and its backing APIs—is an incredibly powerful tool for querying and transforming data with very concise code. I really can’t express how much I’ve come to love it.

The new async support announced at PDC10 is basically the holy grail of async coding, letting you focus on what your task is rather than on how you're going to implement a complex async code path for it. It's an old idea that many async coders have come up with, but, as far as I know, it has never been successfully implemented, simply because it required too much language support.

The lack of peer review and a standards committee for .NET shows: there's a pretty high rate of turnover in its APIs as Microsoft tries to iron out the right way to tackle problems, and it results in a very large library with lots of redundant functionality. As much as this might hurt .NET, I'm starting to view C# as a sort of Boost for the C language family. Some great ideas are getting real-world use, and if other languages eventually feel the need for something similar, they will have a bounty of experience to pull from.

C++, at least, is a terrifyingly complex language. Getting new features into it is an uphill battle, even when they address a problem that everyone is frustrated with. Getting complex new features like these into it would be a very long process, with a lot of arguing and years of delay. Any extra incubation time we can give them is a plus.

strncpy is not your friend

Hang around IRC and every so often you will find someone heralding the use of strncpy for writing secure code. A lot of the time they are just going off what others have said, and can't even tell you what strncpy really does. strncpy is a problem for two reasons:

  1. If the source string doesn't fit, the destination is left without a null terminator, and anything that later treats it as a normal C string will walk right off the end of the buffer.
  2. If the source string is shorter than the destination, every remaining byte is filled with zeros, which is wasted work that adds up quickly on large buffers.
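
To make the first problem concrete, here's a minimal sketch (the buffer size and the strings are just illustration):

#include <stdio.h>
#include <string.h>

int main(void) {
  char dst[8];

  /* the source doesn't fit: strncpy copies exactly sizeof(dst) bytes
     and never writes a null terminator into dst */
  strncpy(dst, "a string that is too long", sizeof(dst));

  /* undefined behavior: printf reads past the end of dst looking
     for a terminator that was never written */
  printf("%s\n", dst);
  return 0;
}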

Bugs happen. Sometimes we build sanity checks into programs to combat unknown ones before they become a problem. But strncpy is not a sanity check or security feature: using it instead of resizing a buffer to accommodate the data, or just outright rejecting the data if it gets too big, is itself a bug.
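
As a rough sketch of that approach (copy_name() is a made-up helper, not anything standard): measure the data first, then either reject it outright or copy it whole, terminator included.

#include <string.h>

/* hypothetical helper: returns 0 on success, -1 if the data is rejected */
int copy_name(char *dst, size_t dstlen, const char *src) {
  size_t srclen = strlen(src);

  if(srclen >= dstlen)
    return -1; /* too big: reject it rather than silently truncating */

  memcpy(dst, src, srclen + 1); /* the +1 copies the null terminator too */
  return 0;
}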

Writing a good parser

From a simple binary protocol used over sockets to a complex XML document, many applications depend on parsing. Why, then, do the great majority of parsers out there just plain suck?

I’ve come to believe a good parser should have two properties:

Several parsers meet the first requirement, but almost none meet the second: they expect their input to come in without a break in execution. So how do you accomplish this? Lately I’ve been employing this fairly simple design:

#include <stddef.h>

struct buffer {
  struct buffer *next; /* next buffer in the queue/stack */
  char *buf;           /* the data itself */
  size_t len;          /* number of bytes in buf */
};

struct state; /* application-specific; filled in with parsed data */

struct parser {
  struct buffer *input;     /* queue of buffers waiting to be parsed */
  struct buffer *lastinput; /* tail of the input queue, for cheap appends */
  struct buffer *output;    /* stack of buffers that have been consumed */
  int (*func)(struct parser*, struct state*);
};

enum {
  CONTINUE, /* parse() should call the (possibly updated) function pointer again */
  NEEDMORE, /* the input queue is exhausted; add more buffers and call parse() again */
  GOTFOO,   /* a complete FOO has been parsed into the state */
  GOTBAR    /* a complete BAR has been parsed into the state */
};

int parse(struct parser *p, struct state *state) {
  int ret;

  /* run the current specialized function until it finds something or
     runs out of input; CONTINUE just means "call the new pointer" */
  while((ret = p->func(p, state)) == CONTINUE);

  return ret;
}

The idea should be easy to understand:

  1. Add buffer(s) to the input queue.
  2. If parse() returns NEEDMORE, add more input to the queue and call it again.
  3. If parse() returns GOTFOO or GOTBAR, the state has been filled with data.
  4. The function pointer is continually updated with a parser specialized for the current data: in this case, a FOO or a BAR, or even just bits and pieces of a FOO or a BAR. It returns CONTINUE if parse() should just call the new function pointer.
  5. As the specialized parser functions eat through the input queue, they move used buffers onto the output stack.

Other than meeting my two requirements above, the best thing about this design? It doesn’t sacrifice cleanliness, and it won’t cause your code size to increase.
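
To make the flow above concrete, here's a minimal sketch of how the pieces might fit together. parse_foo(), parser_addinput(), the struct state definition, and the one-byte "FOO" protocol are all made up for illustration; they aren't part of the design above, just one way to exercise it.

#include <stdio.h>

/* hypothetical app-specific state: remembers the byte that completed a FOO */
struct state {
  char last;
};

/* hypothetical specialized function: here, a FOO is simply the byte 'F' */
static int parse_foo(struct parser *p, struct state *s) {
  while(p->input) {
    struct buffer *b = p->input;

    /* consume the buffer byte by byte (a real implementation would keep
       the original pointer around so the buffer could be reclaimed) */
    while(b->len) {
      char c = *b->buf++;
      b->len--;

      if(c == 'F') {
        s->last = c;
        return GOTFOO;
      }
    }

    /* this buffer is spent: pop it from the queue, push it onto the output stack */
    p->input = b->next;
    if(!p->input)
      p->lastinput = NULL;
    b->next = p->output;
    p->output = b;
  }

  return NEEDMORE;
}

/* hypothetical helper: append a buffer to the end of the input queue */
static void parser_addinput(struct parser *p, struct buffer *b) {
  b->next = NULL;

  if(p->lastinput)
    p->lastinput->next = b;
  else
    p->input = b;

  p->lastinput = b;
}

int main(void) {
  char part1[] = "nothing here yet";
  char part2[] = "...F!";
  struct buffer b1 = { NULL, part1, sizeof(part1) - 1 };
  struct buffer b2 = { NULL, part2, sizeof(part2) - 1 };
  struct parser p = { NULL, NULL, NULL, parse_foo };
  struct state s = { 0 };
  int ret;

  /* 1. add a buffer to the input queue and run the parser */
  parser_addinput(&p, &b1);
  ret = parse(&p, &s);

  if(ret == NEEDMORE) {
    /* 2. the first chunk wasn't enough: queue more input and call it again */
    parser_addinput(&p, &b2);
    ret = parse(&p, &s);
  }

  /* 3. the state now holds the parsed data */
  if(ret == GOTFOO)
    printf("got a FOO: %c\n", s.last);

  return 0;
}

Since each call to parse() picks up exactly where the last one left off, the same loop works whether the data arrives in one read or in fifty.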