<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>int64.org &#187; optimizing</title>
	<atom:link href="http://int64.org/tag/optimizing/feed" rel="self" type="application/rss+xml" />
	<link>http://int64.org</link>
	<description>When 4GiB just isn&#039;t enough</description>
	<lastBuildDate>Thu, 24 Jun 2010 11:50:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Optimizing exceptions</title>
		<link>http://int64.org/2008/04/27/optimizing-exceptions</link>
		<comments>http://int64.org/2008/04/27/optimizing-exceptions#comments</comments>
		<pubDate>Sun, 27 Apr 2008 23:38:13 +0000</pubDate>
		<dc:creator>Cory</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[optimizing]]></category>

		<guid isPermaLink="false">http://blog.int64.org/?p=21</guid>
		<description><![CDATA[You might often hear about exceptions being slow. For this reason they are usually shunned in the embedded space, and sometimes even for regular desktop/server programming. What makes them slow? When one is thrown it needs to search through the call stack for exception handlers. I guess I don&#8217;t understand this line of thinking. For [...]]]></description>
			<content:encoded><![CDATA[<p>You might often hear about exceptions being slow.  For this reason they are usually shunned in the embedded space, and sometimes even for regular desktop/server programming.  What makes them slow?  When one is thrown, the runtime needs to search through the call stack for exception handlers.</p>
<p>I guess I don&#8217;t understand this line of thinking.  For one, exceptions are meant for <em>exceptional</em> situations: things you don&#8217;t expect to happen under normal operation.  Code that uses exceptions will run just as fast as (or maybe even faster than) code without them, until you throw one.  These exceptional situations are truly rare, so I usually don&#8217;t care if they do happen to run slower.</p>
<p>A compiler can actually use exceptions to optimize your code.  Consider this inefficient (but typical) pseudo-C:</p>
<pre class="prettyprint">int dosomething(void) {
   /* do something A */
   if(err) return -1;

   /* do something B */
   if(err) {
      /* cleanup previous work A */
      return -1;
   }

   /* do something C */
   if(err) {
      /* cleanup previous work B */
      /* cleanup previous work A */
      return -1;
   }

   return 0;
}</pre>
<p>Or even this more efficient (yes boys and girls, goto actually has a good use case in C, get over it) pseudo-C:</p>
<pre class="prettyprint">int dosomething(void) {
   /* do something A */
   if(err) return -1;

   /* do something B */
   if(err) goto err1;

   /* do something C */
   if(err) goto err2;

   return 0;

   err2:
   /* cleanup previous work B */

   err1:
   /* cleanup previous work A */

   return -1;
}</pre>
<p>Why are these bad?  Cache locality.  In the first example, you have error handling code inline with your regular code.  The second is slightly better, with the cleanup moved off to the end of the function.  Ideally the code you run will all be compacted into as few cache lines as possible, and handling errors this way wastes significant space on cleanup code that in the large majority of cases won&#8217;t be run.</p>
<p>But with exceptions, the compiler is free to take all the cleanup code in your entire app, and shove it into a single separate area of code.  All your normal code that you expect to run can be compact and closer together.  Of course, this <em>will</em> make exceptions run slower.  If your code is heavy on throwing exceptions (which would probably be an abuse) it will probably cause a significant overall slowdown.  But if they are used correctly&#8211;for <em>exceptional</em> situations&#8211;then the common case will be improved cache usage and therefore faster code.</p>
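<p>As a rough C++ counterpart to the two C versions above, here is a minimal sketch that assumes cleanup is expressed through destructors.  The Rollback guard and cleanup_log are hypothetical names made up for illustration, and whether the unwinding code really ends up away from the hot path depends on your compiler and ABI.</p>
<pre class="prettyprint">#include &lt;cstdio&gt;
#include &lt;stdexcept&gt;
#include &lt;string&gt;
#include &lt;utility&gt;
#include &lt;vector&gt;

// Hypothetical log so the cleanup order can be observed.
static std::vector&lt;std::string&gt; cleanup_log;

// Hypothetical guard: undoes a step unless commit() was called.
// (A production guard should avoid allocating in its destructor.)
class Rollback {
public:
   explicit Rollback(std::string name) : name_(std::move(name)) {}
   ~Rollback() {
      if (!committed_) cleanup_log.push_back("cleanup " + name_);
   }
   Rollback(const Rollback&amp;) = delete;
   Rollback&amp; operator=(const Rollback&amp;) = delete;
   void commit() { committed_ = true; }

private:
   std::string name_;
   bool committed_ = false;
};

void dosomething(bool fail_at_c) {
   Rollback a("A");  // do something A
   Rollback b("B");  // do something B

   // do something C
   if (fail_at_c) throw std::runtime_error("C failed");

   // Success: keep the work.  No cleanup code is written out here.
   a.commit();
   b.commit();
}

int main() {
   try {
      dosomething(true);
   } catch (const std::runtime_error&amp; e) {
      std::printf("error: %s\n", e.what());
   }
   // Guards are destroyed in reverse order: B first, then A.
   for (const std::string&amp; entry : cleanup_log)
      std::printf("%s\n", entry.c_str());
   return 0;
}</pre>
<p>The body of dosomething spells out no cleanup at all.  When step C throws, the guards are destroyed in reverse order, so B is cleaned up before A, just like the err2/err1 labels in the goto version.</p>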
]]></content:encoded>
			<wfw:commentRss>http://int64.org/2008/04/27/optimizing-exceptions/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scalability isn&#8217;t everything</title>
		<link>http://int64.org/2008/03/04/scalability-isnt-everything</link>
		<comments>http://int64.org/2008/03/04/scalability-isnt-everything#comments</comments>
		<pubDate>Wed, 05 Mar 2008 04:05:14 +0000</pubDate>
		<dc:creator>Cory</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[lock-free]]></category>
		<category><![CDATA[optimizing]]></category>

		<guid isPermaLink="false">http://blog.int64.org/2008/03/04/scalability-isnt-everything</guid>
		<description><![CDATA[In the beginning, you write threaded apps with great ignorance of scalability.&#160; That&#8217;s usually okay, since most apps don&#8217;t need it, but sooner or later you will come across a problem that demands it.&#160; With enough searching, you will come across lock-free algorithms.&#160; They are tricky to get right, but promise fantastic scalability if you do. Even trickier, though, [...]]]></description>
			<content:encoded><![CDATA[<p>In the beginning, you write threaded apps with great ignorance of scalability.&nbsp; That&#8217;s usually okay, since most apps don&#8217;t need it, but sooner or later you will come across a problem that demands it.&nbsp; With enough searching, you will come across lock-free algorithms.&nbsp; They are tricky to get right, but promise fantastic scalability if you do.</p>
<p>Even trickier, though, is knowing when not to use them.&nbsp; Lock-free algorithms come with a price: although they are indeed very scalable, their performance can be much worse than that of a well-designed algorithm for single-threaded applications.&nbsp; Do a little benchmarking and you might find something surprising: the performance hit can actually be so large that a simple single-threaded algorithm behind a lock, with no scalability at all, will give better overall performance than a 100% scalable lock-free version.</p>
<p>This is more common than you might think.&nbsp; Take a queue.&nbsp; A single-threaded version will typically have very little memory overhead: maybe a pointer for every <em>n</em> objects.&nbsp; A lock-free version will need two pointers for every object (or one, if you use a GC).&nbsp; How much that overhead matters depends greatly on what your object is.&nbsp; If your object is large, a lock-free queue will probably be a better choice.&nbsp; But if your object is small (say, one or two pointers), the overhead can be enough that cache misses will significantly affect your application.</p>
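<p>To put rough numbers on that overhead, here is a minimal C++ sketch, not the queue from this post.&nbsp; The Item, Node, and Chunk types are hypothetical, the second bookkeeping pointer in Node is only a stand-in (what each pointer is for depends on the algorithm), and <em>n</em> = 32 is an arbitrary choice.</p>
<pre class="prettyprint">#include &lt;cstddef&gt;
#include &lt;cstdio&gt;

// Hypothetical small payload: two pointers.
struct Item {
   void* a;
   void* b;
};

// Node-per-object layout with two bookkeeping pointers per object.
// The role of the second pointer is an assumption for illustration.
struct Node {
   Item  value;
   Node* next;
   void* extra;
};

// Chunked layout: one link pointer shared by N objects (N is assumed).
constexpr std::size_t N = 32;
struct Chunk {
   Item   items[N];
   Chunk* next;
};

// Bookkeeping bytes per stored object for each layout.
constexpr double node_overhead  = double(sizeof(Node) - sizeof(Item));
constexpr double chunk_overhead = double(sizeof(Chunk) - sizeof(Item) * N) / N;

int main() {
   std::printf("payload per object:        %zu bytes\n", sizeof(Item));
   std::printf("node overhead per object:  %.2f bytes\n", node_overhead);
   std::printf("chunk overhead per object: %.2f bytes\n", chunk_overhead);
   return 0;
}</pre>
<p>With 8-byte pointers, that is 16 bytes of bookkeeping for every 16-byte object in the node layout, against a quarter of a byte per object in the chunked one.</p>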
<p>I recently had to tackle this problem.&nbsp; My application needed a queue of small objects, and on a modern quad-core CPU the cache misses were hurting performance so much that although a lock-free queue did have nearly 100% scalability, the overall operation was completing 165% faster with a locked queue with zero scalability.</p>
<p>The next best thing is to combine the best of both worlds: design a queue with low overhead and medium scalability.&nbsp; Using a reader-writer lock with a combination of lock-free operations, I came up with a queue that only needs to take a full lock once every 32 or 64 operations.&nbsp; The result?&nbsp; Scalability 5% lower than a lock-free queue, with overall performance 210% better.</p>
<p>OK, I&#8217;ll admit it: I cheated, somewhat.&nbsp; Lock-free algorithms are good for more than just scalability.&nbsp; They also offer immunity to nasty effects like deadlock, livelock, and priority inversion.&nbsp; In my case I wasn&#8217;t in a situation to worry about these, but you might be.&nbsp; The lesson here is to know your situation and decide carefully, and don&#8217;t trust what others tell you: always try things yourself and profile.</p>
]]></content:encoded>
			<wfw:commentRss>http://int64.org/2008/03/04/scalability-isnt-everything/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
