Which is faster: reading from memory or arithmetic?

In summary: it turns out the conversion from integers to floating point was taking a few cycles, and that cost was easy to miss because the compiler inserts the conversion automatically while optimizing the code.
  • #1
Superposed_Cat
Hey all, I was writing a program just now and wondered whether it would be easier for the CPU if I stored a copy of a number before dividing it by 2, or if I divided it and then multiplied by 2 later to get the number back. I know it won't really affect speed, but I was just wondering. Any help appreciated.
 
  • #2
Superposed_Cat said:
Hey all, I was writing a program just now and wondered whether it would be easier for the CPU if I stored a copy of a number before dividing it by 2, or if I divided it and then multiplied by 2 later to get the number back. I know it won't really affect speed, but I was just wondering. Any help appreciated.
There are a lot of factors that determine the answer to your question.

For the division part, what sort of number are you dividing? Integer division is a lot faster than floating point division. Back in the 80s, many CPUs (Intel 8088/8086 through 80386) didn't have native hardware support for floating point, so floating point division was done in software rather than in the CPU. To combat the slowness of software division, people could buy a separate floating point unit (FPU), also called a math coprocessor. Most motherboards had an empty FPU socket that you could plug one of these units into.

For the division of an integer value by 2, a quick way to do this is to shift all of the bits one place to the right.
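The original post doesn't name a language, so here is a minimal C++ sketch of the shift-versus-divide point (the variable names are just for illustration):

```cpp
#include <cstdio>

int main() {
    int x = 42;

    // For non-negative integers, dividing by 2 and shifting right by one bit
    // give the same result; most compilers emit the shift for x / 2 anyway.
    int by_division = x / 2;   // 21
    int by_shift    = x >> 1;  // 21

    // Note: for negative signed values the two differ in rounding
    // (division truncates toward zero, the shift rounds toward negative infinity),
    // so it is usually best to write the division and let the compiler decide.
    std::printf("%d %d\n", by_division, by_shift);
    return 0;
}
```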

For storing a number, the time it takes depends on where the number is stored, which could be in an unused register (very fast), in one of the caches, in RAM, or on disk. Modern computers have several caches, which provide quicker access than regular RAM and much faster access than disk storage.

if I stored a number before I divided it by 2 or if I multiplied by 2 later to get that number back
If I understand what you're asking, I think it would be faster to store the number in a variable, divide it by 2 to get your halved value, and then retrieve the undivided number from the variable. In this scenario you have one store, one division, and one read. In your alternative, you have one division and one multiplication. In both scenarios there is a division, which, if done naively, takes a lot of processor cycles. What it boils down to is whether a write followed by a read takes fewer cycles than a multiplication. To answer that, you need to know a lot of details about the specific CPU involved.
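As a minimal C++ sketch of the two scenarios described above (assuming an integer value, since the original post doesn't say): note that for integers the divide-then-multiply route also loses the remainder when the value is odd, so the stored copy is not just potentially faster, it is the only way to get the exact original back.

```cpp
#include <cstdio>

int main() {
    int n = 7;

    // Scenario 1: keep a copy, then divide (one store, one division, one read).
    int original = n;     // store
    int half     = n / 2; // 3
    // ... later, read `original` to get 7 back.

    // Scenario 2: divide, then multiply by 2 to "recover" the value
    // (one division, one multiplication).
    int halved    = n / 2;      // 3
    int recovered = halved * 2; // 6 -- the remainder is gone for odd inputs!

    std::printf("half=%d original=%d recovered=%d\n", half, original, recovered);
    return 0;
}
```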
 
  • Like
Likes Superposed_Cat
  • #3
On an Intel i7, which takes more cycles?
 
  • #4
Superposed_Cat said:
Hey all, I was writing a program just now and wondered whether it would be easier for the CPU if I stored a copy of a number before dividing it by 2, or if I divided it and then multiplied by 2 later to get the number back. I know it won't really affect speed, but I was just wondering. Any help appreciated.
As Mark mentioned, the answer is not easy.

However, there is an easy answer, which is to do the obvious, at least initially.

In modern software, the main bottleneck oftentimes is not performance. It's human understandability of your code. When you finish a task and turn it over to someone else, you want your code to be understandable to your stand-in. Is your stand-in going to be able to understand your cleverness? When you finish a task and don't turn it over to someone else, your code and your cleverness will come back to haunt you six months later when someone reports a problem. Are you yourself going to be able to understand your cleverness six months from now?

Do not obscure your code with what you think are performance optimizations. (Typically, what you think is an optimization is in fact a dis-optimization.) Do not even obscure your code with what you know from experience are performance optimizations. Optimize for performance only when testing and performance analyses say you need to optimize for performance.

I'll give a specific example. A numerical computing system I worked on involved a function with nested loops in which the loop indices were used both as indices into arrays (so ideally you would want integers) and as multipliers in floating point calculations (so ideally you would want floating point values). So which to use, integer or floating point, as the base type of those indices?

The obvious answer was to use an integer because that's the natural thing to use for a loop index. Accessing an array (e.g., some_array[index]) and performing a floating point calculation (e.g., index*some_floating_point_value) both work because integers automatically convert to floating point. The conversion is performed in the CPU. This was the obvious approach because the math used in the source code looked very much like the math used in the underlying technical papers.
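The post doesn't show the actual code, so the following is only a hypothetical C++ reconstruction of the "obvious" version; the function name, arrays, and the dx factor are invented for illustration:

```cpp
// Hypothetical sketch of the "obvious" version: the integer loop index is used
// both to subscript the array and as a factor in floating-point math, so the
// CPU converts `i` from int to double on every iteration.
void scale(const double* coeff, double* out, int n, double dx) {
    for (int i = 0; i < n; ++i) {
        out[i] = coeff[i] * (i * dx);  // implicit int -> double conversion of i
    }
}
```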

Testing showed that our function was a performance bottleneck. More detailed analysis showed that one of the problems was the conversion from integer to floating point. So we switched to using floating point values as our loop variables and computing the integer index from them. Now the reverse conversion, floating point to integer, was a bottleneck. Next we tried making the integer loop variable and its floating point counterpart independent loop variables, incremented together. Now the extra incrementing became a bottleneck.
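Again purely as a hypothetical sketch (not the original code), the "independent loop variables" attempt might have looked like this; the conversion disappears, but two counters now have to be incremented and kept in sync every iteration:

```cpp
// Hypothetical sketch: independent integer and floating-point loop variables.
// No int -> double conversion in the loop body, but an extra increment per pass.
void scale(const double* coeff, double* out, int n, double dx) {
    double fi = 0.0;
    for (int i = 0; i < n; ++i, fi += 1.0) {
        out[i] = coeff[i] * (fi * dx);
    }
}
```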

Finally we tried using a lookup table, where 0 (integer) mapped to 0.0 (double), 1 to 1.0, 2 to 2.0, etc. Then we arranged memory so that this table was very likely to be in the fastest cache. This was faster than any other approach.
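A hedged sketch of the lookup-table idea (names are invented; the real system's details aren't given): the table maps each integer index to its double equivalent, so the hot loop does a cached load instead of an int-to-double conversion.

```cpp
#include <vector>

// Hypothetical sketch: int_to_double[i] holds (double)i. If the table is small
// enough to stay in the fastest cache, the load can beat doing the conversion
// inside the loop.
void scale(const double* coeff, double* out, int n, double dx,
           const std::vector<double>& int_to_double) {
    for (int i = 0; i < n; ++i) {
        out[i] = coeff[i] * (int_to_double[i] * dx);
    }
}

std::vector<double> make_table(int n) {
    std::vector<double> table(n);
    for (int i = 0; i < n; ++i) table[i] = static_cast<double>(i);
    return table;
}
```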

We would never have guessed ahead of time that a memory lookup could be faster than a CPU calculation. But it was. We also would never have chosen this approach up front, even if we had known roughly how much faster it was than the obvious solution.
 
Last edited:
  • Like
Likes ScottSalley and DrClaude
  • #5
You may not have as much control as you think. It's hard to know what an optimizing compiler will do and what effect that has on the given hardware. Long ago, I tried to avoid memory access; it sometimes had the opposite effect from what I expected. You might look at the assembly code of each version and see if anything obvious stands out. I would be surprised if your choice made any noticeable difference.
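One concrete way to follow that suggestion, assuming GCC or Clang is available, is to put the two variants in one file and compare the generated assembly (for example with g++ -O2 -S compare.cpp); the function names here are made up for illustration:

```cpp
// Two variants of the same computation, kept in one file so their generated
// assembly can be compared side by side, e.g.:  g++ -O2 -S compare.cpp
int store_then_divide(int n, int* saved) {
    *saved = n;    // keep the original
    return n / 2;
}

int divide_then_multiply(int n, int* recovered) {
    int half = n / 2;
    *recovered = half * 2;  // note: loses the remainder for odd n
    return half;
}
```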
 
  • #6
To understand just a tiny part of your question, look at Uli Drepper's paper on how CPU and memory design impact performance. An important keyword here is locality, one of the concepts FactChecker and Mark44 are indirectly referring to. This is the stuff that keeps software engineers up at night.

http://www.akkadia.org/drepper/cpumemory.pdf ("What Every Programmer Should Know About Memory")
 
  • #7
Storing an intermediate result is a good way to avoid redundant operations and thus reduce execution time, but it adds complexity. I'd also mention that you can download the Intel compiler and the Intel profiler for free as student versions. The profiler can profile each operation, counting cache misses and branch mispredictions, and is very helpful.
 

1. Which is faster, reading from memory or arithmetic?

It depends on the specific context and the type of operations being performed. In general, simple arithmetic on values already in registers is faster than reading from memory: a read that hits the fastest cache takes a few cycles, while one that has to go to main memory can take hundreds. As the thread above shows, however, a small, cache-resident lookup table can still beat repeatedly performing a more expensive operation such as an integer-to-floating-point conversion.

2. Why might reading from memory be faster than arithmetic?

It usually isn't: a load instruction also has to be fetched, decoded, and executed, and its latency depends on where the data lives. A memory read can win when the work it replaces is relatively expensive (a division, or an integer-to-floating-point conversion) and the data is small enough to stay resident in the fastest cache, which is exactly the situation in the lookup-table example above.

3. Can reading from memory ever be slower than arithmetic?

Yes, and it commonly is. A read that misses the caches and has to go out to main memory costs far more than most arithmetic, and simple operations such as addition or a bit shift complete in about a cycle, which is hard for any memory access to beat.

4. How does the speed of reading from memory and arithmetic affect computer performance?

Both play a crucial role in overall performance. Arithmetic throughput determines how fast the CPU can process data it already has, while memory latency and cache behavior determine how quickly that data arrives. On modern machines the memory side is frequently the limiting factor, which is why locality matters so much.

5. Is there a way to improve the speed of both reading from memory and arithmetic?

Yes. Advancements in hardware (faster memories, larger caches, faster processors) help both. At the code level, you can also improve performance by optimizing for locality and reducing the number of memory accesses, for example by reusing a value already held in a local variable or register rather than re-reading it from memory, as sketched below.
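As a small illustration of "reducing the number of memory accesses" (a hypothetical example, not taken from the thread): reading a value once into a local variable lets the compiler keep it in a register instead of potentially reloading it from memory on every loop iteration.

```cpp
#include <cstddef>

struct Config { double scale; };

// Before: config->scale may be re-read from memory on every iteration,
// because the compiler cannot always prove that the write to data[i]
// doesn't modify it.
void apply_noisy(const Config* config, double* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= config->scale;
    }
}

// After: the value is read once into a local, which the compiler can keep
// in a register for the whole loop.
void apply_hoisted(const Config* config, double* data, std::size_t n) {
    const double scale = config->scale;
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= scale;
    }
}
```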
