Superposed_Cat said:
Hey all, I was writing this program just now and wondered whether it would be easier for the CPU if I stored a number before I divided it by 2, or if I multiplied by 2 later to get that number back. I know it won't really affect speed at all, but I was just wondering. Any help appreciated.
As Mark mentioned, the answer is not easy.
However, there is an easy answer, which is to do the obvious, at least initially.
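For the question as asked, "the obvious" might look something like the sketch below. The struct and function names are made up for illustration, and it assumes the number is an int. Note that with integers the two options are not even equivalent: C integer division truncates, so doubling the half of an odd number does not give the original back.

```c
/* Hypothetical helper: halve a value while keeping the original around. */
struct halved {
    int original; /* the value before the division */
    int half;     /* original / 2, truncated toward zero */
};

struct halved halve_keeping_original(int n)
{
    struct halved h = { n, n / 2 };
    return h;
}
```

With n = 7, half is 3, and 3 * 2 is 6, not 7. Storing the original is both the readable choice and the only correct one for odd integers.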
In modern software, the main bottleneck oftentimes is not performance. It's human understandability of your code. When you finish a task and turn it over to someone else, you want your code to be understandable to your stand-in. Is your stand-in going to be able to understand your cleverness? When you finish a task and don't turn it over to someone else, your code and your cleverness will come back to haunt you six months later when someone reports a problem. Are you yourself going to be able to understand your cleverness six months from now?
Do not obscure your code with what you think are performance optimizations. (Typically, what you think is an optimization is in fact a dis-optimization.) Do not even obscure your code with what you know from experience are performance optimizations. Optimize for performance only when testing and performance analyses say you need to optimize for performance.
I'll give a specific example. A numerical computing system I worked on involved a function with nested loops in which the loop indices were used both as indices into arrays (so ideally you would want integers) and as multipliers in floating point calculations (so ideally you would want floating point values). So which to use, integer or floating point as the base type of those indices?
The obvious answer was to use an integer because that's the natural thing to use for a loop index. Accessing an array (e.g., some_array[index]) and performing a floating point calculation (e.g., index*some_floating_point_value) both work because integers automatically convert to floating point. The conversion is performed in the CPU. This was the obvious approach because the math used in the source code looked very much like the math used in the underlying technical papers.
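As a rough illustration of that obvious version (not the actual code from that system; the function name, the single loop, and the weighted-sum calculation are all assumptions made up for this sketch):

```c
/* Hypothetical kernel: the int index i is used both as an array
 * subscript and as a multiplier in a floating point expression. */
double weighted_sum_int_index(const double *data, int n, double scale)
{
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        /* i is implicitly converted to double in (i * scale) */
        total += data[i] * (i * scale);
    }
    return total;
}
```

The loop body reads much like the formula it implements, which is the point of doing it this way first.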
Testing showed that our function was a performance bottleneck. More detailed analyses showed that one of the problems was in the conversion from integer to floating point. So we switched to using floating point values as our loop variables and computing the integer index. Now that conversion was a bottleneck. Next we tried making the loop variable and its floating point counterpart independent loop variables. Now the act of incrementing became a bottleneck.
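The two intermediate attempts could be sketched roughly as follows. Again, these are hypothetical reconstructions on a made-up weighted-sum kernel, not the original code, and nothing here says which one is faster on any particular machine.

```c
/* Variant A (hypothetical): floating point loop variable,
 * with the integer index computed from it. */
double weighted_sum_float_index(const double *data, int n, double scale)
{
    double total = 0.0;
    for (double x = 0.0; x < (double)n; x += 1.0) {
        int i = (int)x; /* double-to-int conversion on every pass */
        total += data[i] * (x * scale);
    }
    return total;
}

/* Variant B (hypothetical): integer index and floating point
 * counterpart kept as two independent loop variables. */
double weighted_sum_paired(const double *data, int n, double scale)
{
    double total = 0.0;
    double x = 0.0;
    for (int i = 0; i < n; ++i, x += 1.0) { /* two increments per pass */
        total += data[i] * (x * scale);
    }
    return total;
}
```

Both compute the same result as an integer-indexed loop; only the place where the cost lands changes.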
Finally we tried using a lookup table, where 0 (integer) mapped to 0.0 (double), 1 to 1.0, 2 to 2.0, etc. Then we arranged memory so that this table was very likely to be in the fastest cache. This was faster than any other approach.
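A minimal sketch of the lookup table idea, under the same made-up weighted-sum kernel. TABLE_SIZE and all names are assumptions for illustration. The cache arrangement is not shown, since that part is platform specific.

```c
#define TABLE_SIZE 1024 /* hypothetical upper bound on the loop index */

/* index_as_double[i] holds (double)i, so 0 -> 0.0, 1 -> 1.0, ... */
static double index_as_double[TABLE_SIZE];

void init_index_table(void)
{
    for (int i = 0; i < TABLE_SIZE; ++i) {
        index_as_double[i] = (double)i; /* conversion paid once, up front */
    }
}

/* The caller must call init_index_table() first and keep n <= TABLE_SIZE. */
double weighted_sum_table(const double *data, int n, double scale)
{
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        /* a memory read replaces the per-pass int-to-double conversion */
        total += data[i] * (index_as_double[i] * scale);
    }
    return total;
}
```

Whether this beats the plain integer loop depends entirely on the hardware and compiler, which is exactly why it should be a measured last resort and not a first draft.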
We would never have thought ahead of time that a memory lookup would be faster than CPU calculations. But it was. We also never would have used this approach ahead of time, even if we had known how much faster it is than the obvious solution.