| New Reply |
Determining efficiency of multiplication algorithm |
Share Thread | Thread Tools |
| Jan20-13, 02:18 AM | #1 |
|
|
Determining efficiency of multiplication algorithm
Hi
I'm writing a BigNum library for Javascript and have come up with a multiplication algorithm which I think is pretty efficient (a modification of shift/add), but I don't understand O(log ...) notation, so don't really know how to compare my method with something like the Schönhage–Strassen algorithm which has a complexity of O(N log N log log N). I've determined, that in the worst case, numbers of bit width w require w/2 iterations with my method. Each iteration performs a number of individual steps, but I don't know whether to count those sub steps, or only the iteration count. Is w/2 enough to define efficiency? Or do I have to get more granular and define every bit manipulation? Thanks for any help |
| Jan20-13, 04:06 AM | #2 |
|
|
This explanation is from my text book for analyzing time efficiency of (non-recursive) algorithms.
1) Decide on a parameter indicating the input size 2) Identify the algorithm's basic operation 3) Setup a sum expressing the number of times the basic operation is executed 4) Using sum manipulations and standard formulas find a closed form formula for the count and/or establish it's order of growth. So, first you want to be able to measure the size of your input. For example, if you have an array of integers to represent your bignum, then your input size is the size of the array. Second, locate the basic operation, generally this is inside the inner loop of the algorithm. Third, express the number of times this basic operation is executed as a sum, i.e. [tex]\sum_{i=0}^{n} a_{i}[/tex] where n is the number of iterations and a is the basic operation. Four, manipulate the sum to obtain a nice formula. Here's an example of a matrix multiplication algorithm: Code:
MatrixMultiply: A*B=C
for (i = 0, i < n, i++)
{
for (j = 0; j < n; j++)
{
for (k = 0; k < n; k++)
{
C[i][j] += A[i][k] * B[k][j]
}
}
}
[tex]\sum_{k=0}^{n} 1[/tex] and the total number of basic operations can be represented by: [tex]\sum_{i=0}^{n} \sum_{j=0}^{n} \sum_{k=0}^{n} 1[/tex] It should be clear that the innermost sum: [tex]\sum_{k=0}^{n} 1[/tex] is equal to n, because we sum 1 from k = 0 to k = n. We can reduce the triple sum to: [tex]\sum_{i=0}^{n} \sum_{j=0}^{n} n[/tex] and with the same reasoning, reduce this double sum to: [tex]\sum_{i=0}^{n} n^{2}[/tex] and further reduce it to: [tex]n^{3}[/tex] So the cost of this algorithm, as a function of input size is C(n) = c*n3 where c is the cost of the basic operation, and you would say that the order of growth is O(n3) - the constant c is irrelevant for Big Oh notation because the size of n3 will eclipse any constant c for large n. This is a bit simplified from the text, but that should give you an example of how to go about calculating the complexity of your algorithm. |
| Jan20-13, 09:26 AM | #3 |
|
|
The primary advantage my algorithm imparts is due to the fact that in base 2 v << m + n = v << (m + 1) - n This is put into practice to reduce the number of add operations. For example, using traditional shift/add 15 * 15 1111 * 1111 ------------------------------------------- 11110 + 1111 = 101101 (30 + 15 = 45) 1011010 + 1111 = 1101001 (90 + 15 = 105) 11010010 + 1111 = 11100001 (210 + 15 = 225) ------------------------------------------- 3 shifts, 3 adds Using subtraction method 15 * 15 1111 * 1111 ------------------------------------------- 11110000 - 1111 = 11100001 (240 - 15 = 225) ------------------------------------------- 4 shifts, 1 subtract So, in general, for bitwidth w, simple shift/add yields worst case of w - 1 shifts, w - 1 adds subtraction method yields w shifts, 1 subtract Ratio between two methods equals (w + 1) / (2w - 2) As w approches infinity, (w + 1) / (2w - 2) approaches 1/2, or twice as fast as shift/add So, if bits are used as the unit of measure, would that mean shift/subtract would be O(w + 1)? |
| Jan20-13, 11:02 AM | #4 |
|
Recognitions:
|
Determining efficiency of multiplication algorithm
Normally shifting isn't used with extended precision multiplication, and the general goal is to reduce the number of multiplies at the expense of increased additions or subtractions (since addition or subtraction is usually faster than multiplication).
Wiki mentions The Karatsuba algorithm was the first multiplication algorithm asymptotically faster than the quadratic "grade school" algorithm. http://en.wikipedia.org/wiki/Karatsuba_multiplication Note that the full precision of each word in memory is not used in order to avoid having to deal with carry operations as often. In the wiki article, the numbers are broken up into 31 bit components for usage with 32 bit registers and memory. From wiki article: As a rule of thumb, Karatsuba is usually faster when the multiplicands are longer than 320–640 bits. Then there is Toom_Cook. The wiki article doesn't mention at what size numbers that this method is better than classic or Karatsuba. Algotirhms like Schönhage–Strassen further enhance this by using FFT (Fast Fourier Transform) combined with finite field (rings) math modulo 2n - 1, which is not only a prime number, but also has a "primitive" b, a number than when multiplied by itself will produce every number in the ring except zero, until bn, in which case bn mod (2n - 1) = 1. From wiki article: In practice the Schönhage–Strassen algorithm starts to outperform older methods such as Karatsuba and Toom–Cook multiplication for numbers beyond 2^(2^15) to 2^(2^17) (10,000 to 40,000 decimal digits). A couple of links for number theoretic transform: Number-theoretic_transform http://apfloat.org/ntt.html Normally division is avoided by using an optimized algorithm to invert a number (calculates 1/x), then doing an extended precision multiply. In the case of a series, if possible, the denominator is calculated separately via multiplications, so that inversion of the denominator product is only done once at the end of the algorithm. These methods get increasling complex, and only save time if the numbers you're working with are fairly large. If the numbers you're working with are not large, then a simpler approach may be more appropriate. |
| Jan20-13, 02:23 PM | #5 |
|
|
a = multiplicand b = multiplier i = b.length - 1; w = bitwidth of b while (i > -1) { r = GetShift(b); a = Subtract(Shift(a, r.shift), a); i = r.index; } From Adyssa's post, I'm guessing "GetShift" qualifies as a constant, because it only makes a single pass through b Worst case, the number of partial shifts in "Shift" equals w Worst case, the number of partial subtracts in "Subtract" equals w So, as I interpret Adyssa's post, that would give an "O" value of O(2 N). This is most likely wrong, though, since Electronic usage under Wikipedia multiply, states that shift add has a complexity of Θ(n^2). My second post above clearly shows that shift/subtract approaches 1/2 the number of steps as shift/add. So would that make it Θ(n^2 / 2)? Regardless of the "O" value, the actual code seems to be pretty zippy. Orders of magnitude faster than some other Javascript BigNum libraries. |
| Jan20-13, 04:21 PM | #6 |
|
Recognitions:
|
It's not clear to me how your shift subtract method will be used to implement an extended precision multiply. Assume you had 8 bit registers / memory and stored 4 bits or 1 "nibble" of a number per memory location or register. So if you have 32 bit numbers A and B, you split them up into 4 nibbles and you want to determine the product:
(A0 x 2^12 + A1 x 2^8 + A2 x 2^4 + A3) x (B0 x 2^12 + B1 x 2^8 + B2 x 2^4 + B3) where does your subtract method come into play, and is it significantly different than the "divide and conquer" approach as used by the Karatsuba method? |
| Jan21-13, 02:30 AM | #7 |
|
|
In a real world test using a 200,000 bit random number, SS performed 22.35 percent fewer calculations than SA. Don't know how that translates to "O" notation, but a 20+ percent efficiency boost, is respectable. |
| Jan21-13, 08:17 AM | #8 |
|
Recognitions:
|
(A0 x 2^4 + A1) x (B0 x 2^4 + B1): X0 = (A0 x B0) X1 = (A0 + B1) x (A1 + B0) X2 = (A1 x B1) (A0 x 2^4 + A1) x (B0 x 2^4 + B1) = X0 x 2^8 + (X1 - X0 - X2) x 2^4 + X2 For larger numbers, perform this algorithm recursively, where the initial A0, A1, ... B0, B1, ... terms are n bits, then on the recursive call, n/2 bits, then on the next layer of recursion n/4 bits, ... . Code:
A0 A1
x B0 B1
----------------------
A1 x B1
A0 x B1
A1 x B0
A0 x B0
----------------------
sum of the terms above
|
| New Reply |
| Thread Tools | |
Similar Threads for: Determining efficiency of multiplication algorithm
|
||||
| Thread | Forum | Replies | ||
| Determining efficiency of fuel cells | Chemistry | 0 | ||
| Efficiency and the Euclidean Algorithm | Linear & Abstract Algebra | 2 | ||
| Determining Fuel Efficiency | Classical Physics | 5 | ||
| Simple Hex Multiplication question from AES Algorithm | General Math | 1 | ||
| Determining the Efficiency of a Fuel | Introductory Physics Homework | 3 | ||