# Modular Multiplication

1. Dec 29, 2006

### chesschi

I want to write a C/C++ program and encounter a problem.If I have three 64 bit numbers and need to manipulate
"a * b mod c"

Is there any well-known, efficient and simple algorithm to implement it?

I only know it can be calculated sth. like this..
__int64 a, b, c
a = x1 * 2^32 + x0
b = y1 * 2^32 + y0

then a * b = x1y1 * 2^64 + (x0y1 + x1y0) * 2^32 + x0y0

but I dunno what to do in the next step

Note that this cannot be calculated using some classes like BigInt, gmp

Thank you very much!!!

2. Dec 31, 2006

### Ouabache

Sounds like an interesting question. My C++ experience is limited. Sorry I cannot offer much help there. However, I think you will have much better luck (more software folks) posting this in our programming subforum. If you wish to do this, just ask any moderator or mentor for assistance. Infact there are a couple of excellent C++ tutorials in that subforum.

Last edited: Dec 31, 2006
3. Dec 31, 2006

### nmtim

Is c an arbitrary integer?

Does "a*b mod c" mean (a*b) mod c?

Also, what's with x0, x1, etc? Are you trying to do this with 32 bit integers instead of using 64 bit integers?

For the mod c, how about bitwise and with c? At least that should work if a, b, c >= 0. Edit: er, ok, maybe only if c is a power of 2. Edit 2: What's wrong with the modulo operator "%"? Forgot that was there...I usually use powers of two, where the & (c-1) trick works.

Cheers,
Tim

Last edited: Dec 31, 2006
4. Dec 31, 2006

### Hurkyl

Staff Emeritus
Er, gmp does allow you to compute such things.

Anyways, I presume you're either going to have to do a Montgomery multiply, or you're going to have to use a divide & conquer division algorithm. (which will require splitting c into a high and low word).

5. Dec 31, 2006

### chesschi

a, b and c both are 64 bit integers, c is arbitary number, may not be in power of 2.
but a * b will generate a 128 bit intermediate integers which will overflow. So I believe there is an algorithm to deal with such problem

I'm not sure if this is correct....I just read some websites talking about it but they didn't provide details implementation. a and b are 64-bit integers, it tries to represent a,b in the power of 2 ^ 32 with the corresponding x1, x0, y1, y0 coefficients If there is any good suggestion other than this method, I'm also interested to know it!

For the montgomery multiply, the output is o = ab R^(-1) mod c, where R = 2^64...if you need to recover it...you still need to manipulate o * R mod c...
It seems it cannot solve in this case because o, R and c are still 64-bit integers

Last edited: Dec 31, 2006
6. Jan 3, 2007

### Jheriko

I am probably missing something enormous, but what is wrong with the '%' operator?

At worst you can use the 64-bit CPU instruction with an __asm block... (I am assuming you are using windows with your highly MS-specific __int64)

Last edited: Jan 3, 2007
7. Jan 3, 2007

### verty

Jheriko, the (a*b) will overflow for large numbers.

8. Jan 3, 2007

### Jheriko

As for splitting the multiplication it has almost been done above:

"a * b = x1y1 * 2^64 + (x0y1 + x1y0) * 2^32 + x0y0"

The key is to realise that the xi, yi are all 32-bit integers (although we store them in 64-bits) and so their products are 64 bits. We can jiggle the bits around to make the correct answer in two 64-bit integers.

__int64 upper = x1*y1 + ((x0y1 + x1y0) >> 32);
__int64 lower = x0*y0 + ((x0y1 + x1y0) << 32);

This should work as far as I can see... you can then use the existing 64-bit modulus operator.

Last edited: Jan 3, 2007
9. Nov 5, 2007

### CRGreathouse

Apologies for the necromancy, but this is just the problem I was working on.

Include
#define llong __uint64
or
#define llong unsigned long long

Code (Text):
llong mod_mult (llong a, llong b, llong modulus) {
// Compute a * b = 2^64 * upper + lower
llong a1 = a & LOW_MASK;
llong a2 = a << 32;
llong b1 = b & LOW_MASK;
llong b2 = b << 32;
llong tmp = a1 * b2 + a2 * b1;  // (1)
llong upper = a1 * b1 + tmp >> 32;
llong lower = a2 * b2 + (tmp << 32);

// Reduce mod modulus
lower %= modulus;
tmp = (0 - modulus) % modulus;  // tmp = 2^64 % modulus
lower += upper * tmp;   // (2)

return lower % modulus;
}
The 'obvious' code here has a number of problems, aside from the potentially non-portable use of integer wrapping and the assumption that llong is exactly 64 bits.

First, the assignment (1) could be as large as 2^65 - 3^34 + 2 which will cause upper to be off by 1. Second, the multiplication in (2) could be huge, almost as large as 2^128 -- and solving this problem is in essence no different from solving the original problem. So the naive method suggested above doesn't seem to work. Remarks?

10. Nov 6, 2007

### Hurkyl

Staff Emeritus
CRG: are you familiar with Montgomery multiplication, Barrett reduction, and/or Newton division?

You could even go so far as to write a multiprecision division algorithm to compute the remainder.

11. Nov 6, 2007

### Hurkyl

Staff Emeritus
This makes no sense to me at all.

12. Nov 6, 2007

### rcgldr

Maybe I'm suffering from brain fade here thinking back on my finite field days and ecc, but isn't the following true (assuming integer math here):

(a x b) mod c = ((a mod c ) x (b mod c )) mod c

(a + b) mod c = ((a mod c ) + (b mod c )) mod c

(a - b) mod c = ((a mod c) - (b mod c)) mod c

a mod c = (a - (n x c)) mod c

If c < 2^32, there's no overflow issue.

If c >= 2^32, then there's an issue with potential overflow.

This would all be so easy in 64 bit mode assembly where there is a 128 product and 128 bit dividend to work with.

Anyway, to divide 128 bit number by a 64 bit number in 3 steps:

normalize both numbers (left shift until most significant bit is 1), keeping track of shift counts.
Divide normalized a x b by normalized c.
Shift the quotient right by the appropriate bit count.
Subtract 1 from this quotient to get est quotient #1 .
Multiply c times est quotient #1 (producing yet another 128 bit number).
Subtract c times est quotient #1 from the 128 bit product of a x b

Repeat this process again, with est quotient #2.

At this point, c will go into the result 1 or 0 times.

Last edited: Nov 6, 2007
13. Nov 6, 2007

### KTC

I just reread what I wrote, I was being special. I blame it on lack of sleep.

14. Nov 7, 2007

### CRGreathouse

Yes, yes, and yes. I don't think I could write them correctly and efficiently without a reference, but I understand the basic principles and may have coded some of them (not recently).

My question is essentially, "What's an efficient multiprecision modular reduction/division algorithm for two-word numbers?".

15. Nov 7, 2007

### CRGreathouse

Good point. Maybe I should just buckle down and try this in Intel assembly (gcc-style).