What are the approximate costs for operations on common x86 chips?


Discussion Overview

The discussion revolves around the approximate costs of operations on common x86 chips, with a focus on understanding the cycle counts associated with various arithmetic and logical operations. Participants explore the implications of these costs for optimizing code, particularly in the context of algorithmic improvements versus micro-optimizations.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant suggests that while algorithmic improvements are generally more beneficial than minor code optimizations, understanding the cost of operations is crucial for early planning in coding.
  • Another participant provides a link to Wikipedia for benchmarking information, noting that it may not directly address the original query but could offer useful references.
  • A third participant mentions that for older processors, it was straightforward to find cycle counts in the programmer's reference manual, but acknowledges that modern processors involve complexities such as prefetching.
  • This participant also shares links to Intel's processor manuals, indicating that they contain relevant cycle counting information.

Areas of Agreement / Disagreement

Participants do not reach a consensus on a specific list of operation costs, and multiple approaches to finding this information are discussed without resolving which is the most effective.

Contextual Notes

The discussion highlights the complexity of modern processor architectures and the variability in operation costs, which may depend on factors such as addressing modes and prefetching mechanisms.

Who May Find This Useful

Individuals interested in optimizing code for x86 architecture, particularly software developers and computer engineers focused on performance tuning and benchmarking.

CRGreathouse
As a first step to optimizing code, I like to think about faster ways to do things. Algorithmic improvements are, in general, far better than minute improvements to code, or even the use of inline assembly. But most optimizations beyond common subexpression elimination and its ilk substitute inexpensive operations for expensive ones rather than eliminating chunks entirely. When two solutions present themselves I can just code up both and time them, but for planning early in the coding it's good to have an idea of what costs more. For example, addition and negation are cheap, while square roots and transcendental functions are expensive.
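The "code up both and time them" approach can be sketched roughly as follows. This is only an illustration: the two functions (`within_radius_sqrt`, `within_radius_squared`) and the helper `best_time` are hypothetical names made up for this example, and the substitution shown (comparing squared distances instead of taking a square root) is just one instance of swapping an expensive operation for cheaper ones.

```python
import math
import timeit


def within_radius_sqrt(dx, dy, r):
    """Baseline: compute the Euclidean distance, which needs a square root."""
    return math.sqrt(dx * dx + dy * dy) <= r


def within_radius_squared(dx, dy, r):
    """Alternative: compare squared quantities instead (assumes r >= 0)."""
    return dx * dx + dy * dy <= r * r


def best_time(func, args, number=100_000, repeat=3):
    """Return the fastest of several timing runs, in seconds per call.

    Taking the minimum reduces noise from other activity on the machine.
    """
    runs = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    return min(runs) / number


if __name__ == "__main__":
    args = (3.0, 4.0, 5.0)
    for f in (within_radius_sqrt, within_radius_squared):
        print(f"{f.__name__}: {best_time(f, args) * 1e9:.1f} ns per call")
```

In an interpreted language the call overhead will likely swamp the cost of the arithmetic itself, so the numbers say more about the harness than about the chip; the same pattern in a compiled language gets closer to the hardware costs being asked about here.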

Toward that end, is there a decent list of approximate costs for operations (on some common x86 chip, perhaps a Pentium 4, Core 2 Duo, or Athlon 64)? I'm looking for some kind of chart with figures like "division: 28 cycles (max throughput 5 cycles)". Since I'm just looking for general ballparks, I'm not too sensitive about the particular chip it applies to -- although general notes about which operations particular chips handle well would be great as well.
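To show why a chart entry with both figures is useful, here is a rough back-of-the-envelope model. It assumes that "28 cycles" is the latency of one division and that "max throughput 5 cycles" means a new independent division can start every 5 cycles; both numbers are the illustrative ones from the example above, not measurements of any chip. The function names are hypothetical, and the model ignores everything else a real pipeline does.

```python
def dependent_chain_cycles(n, latency):
    """Each operation needs the previous result, so the latencies add up."""
    return n * latency


def independent_ops_cycles(n, latency, issue_interval):
    """Operations overlap: one starts every `issue_interval` cycles,
    and the last one still takes `latency` cycles to finish."""
    if n == 0:
        return 0
    return latency + (n - 1) * issue_interval


if __name__ == "__main__":
    # Illustrative figures from the example chart entry, not real data.
    LATENCY, ISSUE_INTERVAL, N = 28, 5, 100
    print(dependent_chain_cycles(N, LATENCY))                  # 2800
    print(independent_ops_cycles(N, LATENCY, ISSUE_INTERVAL))  # 523
```

Under these assumptions the same 100 divisions differ by more than a factor of five depending on whether each one waits on the previous result, which is why a single "cost" per operation is only a ballpark.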
 
This is not exactly what you are looking for, but it's pretty amazing. I checked wikipedia.org for computer benchmark info, and got a big page with lots of info and outside references:

http://en.wikipedia.org/wiki/Benchmark_(computing)

I know you want to compare different algorithms on the same processor, and that is different than benchmarking different processors against the same algorithm, but maybe poke around some of the links to see if you can find what you want, or see other terms to search on. Pretty interesting links that I followed...
 
For older processors this was fairly easy.
Look up the instruction in the programmer's reference manual.
Pick the cycle count for the addressing mode being used.

For the processors you list this gets very complex, with prefetch and whatnot.
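The lookup procedure for older processors amounts to summing table entries, roughly as sketched here. The table `CYCLES`, its keys, and every number in it are hypothetical placeholders for illustration and are not taken from any real reference manual. As noted, this simple addition does not carry over to the newer chips.

```python
# Hypothetical cycle table keyed by (mnemonic, addressing mode).
# The counts are made-up placeholders, NOT values from a real manual.
CYCLES = {
    ("mov", "reg,reg"): 2,
    ("mov", "reg,mem"): 8,
    ("add", "reg,reg"): 3,
    ("add", "reg,mem"): 9,
    ("mul", "reg"): 70,
}


def total_cycles(instructions, table=CYCLES):
    """Sum the per-instruction counts for a straight-line sequence."""
    return sum(table[ins] for ins in instructions)


if __name__ == "__main__":
    program = [("mov", "reg,mem"), ("add", "reg,reg"), ("mul", "reg")]
    print(total_cycles(program))  # 8 + 3 + 70 = 81 with the placeholder table
```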

This should be the page you want for Intel processors.

http://www.intel.com/products/processor/manuals/index.htm

At least some cycle-counting info appears here:

http://www.intel.com/design/processor/manuals/248966.pdf

You will have to search for other manufacturers.
 
Thanks, that looks like it just might serve.