- #1
- 2,844
- 0
Shifting a double-wide variable -- bit twiddling
I have a number too long to fit into the largest variable size (unsigned long long, 64 bits in this case) so I'm storing it in two variables, left and right. I want to shift it -- and to do it quickly, as I'll do this more than 1e12 times.
The 'obvious' way is
for a total of 5 operations and two assignments. This example is a right shift of 2, but right shifts by 1 and 4 are also important. In any case the size of the shift should be known at compile time.
1. Is a shift of 62 too large to be done efficiently? I know that older processors used to take O(n) time to shift n places (something like 3+n micro-ops), but I don't know if this holds on modern processors. Perhaps I should make a switch statement with constant to add (1 << 62, 2 << 62, and 3 << 62 precomputed)? Or would the branch misprediction outweigh this?
2. Is there a way to do this with fewer operations? I'm sure there are some people out there who have been doing bit manipulations far longer than I have.
For what it's worth I'm on a Core Duo, but I'm hoping for improvements to my program that work on most systems.
I have a number too long to fit into the largest variable size (unsigned long long, 64 bits in this case) so I'm storing it in two variables, left and right. I want to shift it -- and to do it quickly, as I'll do this more than 1e12 times.
The 'obvious' way is
Code:
unsigned long long right, left;
. . .
right = ((left&3) << 62) + (right >> 2);
left >>= 2;
for a total of 5 operations and two assignments. This example is a right shift of 2, but right shifts by 1 and 4 are also important. In any case the size of the shift should be known at compile time.
1. Is a shift of 62 too large to be done efficiently? I know that older processors used to take O(n) time to shift n places (something like 3+n micro-ops), but I don't know if this holds on modern processors. Perhaps I should make a switch statement with constant to add (1 << 62, 2 << 62, and 3 << 62 precomputed)? Or would the branch misprediction outweigh this?
2. Is there a way to do this with fewer operations? I'm sure there are some people out there who have been doing bit manipulations far longer than I have.
For what it's worth I'm on a Core Duo, but I'm hoping for improvements to my program that work on most systems.