The worst case for trial division factorisation of n?

In summary, Fermat's method quickly finds factors that are very close in size, which is precisely the worst case for trial division.
  • #1
Phillips101
What's the worst case for the factorisation of n using trial division? Worst case in terms of arithmetic operations that is.

Many places tell me that it's n=pq with p and q prime and close to each other (and hence close to root(n) ), but I can't prove it.

Help would be appreciated.
 
  • #2
Assuming you test for divisibility by 2, 3, 5, 7, 11, ... or 2, 3, 4, 5, 6, ... or something similar (wheel mod 30?), the runtime is determined by the size of the second-largest prime factor. Thus the longest runtime is when the number has two prime factors and they're very close -- perhaps even equal. Of course a real test might check for powers first.
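To illustrate this point, here is a rough Python sketch (mine, not from the thread) that counts the trial divisions performed; the count tracks the second-largest prime factor, not the size of the number:

```python
def trial_division(n):
    """Factor n by trial division, counting how many divisions are tried."""
    factors, tried, d = [], 0, 2
    while d * d <= n:
        tried += 1
        if n % d == 0:
            factors.append(d)
            n //= d          # retry the same d on the cofactor
        else:
            d += 1
    if n > 1:
        factors.append(n)    # remaining cofactor is prime
    return factors, tried

# Two numbers of similar size (~10^8): a small factor ends the search early,
# while two close prime factors force the search all the way to ~sqrt(n).
print(trial_division(101 * 999983))   # small factor: roughly 1,000 divisions
print(trial_division(9967 * 9973))    # close factors: roughly 10,000 divisions
```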
 
  • #3
Obviously if the two numbers are close then starting with small factors will consume a lot of time, since the answer is near the square root of the product, as CRGreathouse explains.

Fermat developed a method to deal with close prime factors of N = pq. He looks for a difference of squares:
x^2 - y^2 = N.

He picks x just above the square root of N and computes x^2 - N. If that difference is a perfect square y^2, then N = (x - y)(x + y) and he has factored N; otherwise he increments x and tries again.

In the case of 97 x 103 = 9991, the first choice is x = 100: 100^2 - 9991 = 9 = 3^2. Having a difference of two squares, we have found the factors: (100 + 3)(100 - 3) = 9991.
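The method as described can be sketched in a few lines of Python (function name and structure are mine):

```python
from math import isqrt

def fermat_factor(n):
    """Fermat's method for odd composite n: find x with x^2 - n a perfect square."""
    x = isqrt(n)
    if x * x < n:
        x += 1               # start just above sqrt(n)
    while True:
        y2 = x * x - n
        y = isqrt(y2)
        if y * y == y2:      # x^2 - n is a perfect square
            return x - y, x + y
        x += 1

print(fermat_factor(9991))   # first try: x = 100, 100^2 - 9991 = 9 → (97, 103)
```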
 
  • #4
CRGreathouse said:
Assuming you test for divisibility by 2, 3, 5, 7, 11, ... or 2, 3, 4, 5, 6, ... or something similar (wheel mod 30?), the runtime is determined by the size of the second-largest prime factor. Thus the longest runtime is when the number has two prime factors and they're very close -- perhaps even equal. Of course a real test might check for powers first.

What do you mean by 'the runtime is determined by'? Do you mean that the factor it would take the longest to find would be the second largest? And if so, why exactly does this imply having only two factors would be quickest? I don't intuitively see why counting up to 17 some 5000 times will be quicker than counting up to a prime approximately equal to 17^5000. (In terms of purely arithmetic operations -- obviously there are practical considerations, but I'm ignoring them.)

robert Ihnot said:
Obviously if the two numbers are close then starting with small factors will consume a lot of time, since the answer is near the square root of the product, as CRGreathouse explains.

Fermat developed a method to deal with close prime factors of N = pq. He looks for a difference of squares:
x^2 - y^2 = N.

He picks x just above the square root of N and computes x^2 - N. If that difference is a perfect square y^2, then N = (x - y)(x + y) and he has factored N; otherwise he increments x and tries again.

In the case of 97 x 103 = 9991, the first choice is x = 100: 100^2 - 9991 = 9 = 3^2. Having a difference of two squares, we have found the factors: (100 + 3)(100 - 3) = 9991.

I've come across this in a couple of places now. I think I'll do something on this for comparison purposes.
 
  • #5
Phillips101 said:
What do you mean by 'the runtime is determined by'? Do you mean that the factor it would take the longest to find would be the second largest?

The second-largest factor would be the last to find, but what I meant was exactly what I wrote: the runtime is a function of the second-largest prime factor S. If you're sieving with all numbers (or odds, or numbers relatively prime to k for some k), the number of candidates tested is proportional to S; if with primes, it's proportional to S/log S.* In both cases the time needed for each division is roughly a function of S -- S^2 for base-case (schoolbook) division, better for more advanced methods.

Phillips101 said:
And if so, why exactly does this imply having only two factors would be quickest?

I would hope you're able to figure that out! Consider the case of N = pqr and N' = p'q' for p < q < r and p' < q' prime, where N and N' are of similar sizes.

* Literally, this requires oracle access to a primality test. But primality tests are generally cheap. If you're working with small numbers, you may have extra cost beyond finding the second-largest factor, but usually not too much.
 
  • #6
It's worth mentioning that there are numbers that are bad for trial division *and* Fermat. If p is about 2q, p and q prime, then you need something like q (or q/log q) steps to factor pq with trial division, and about q steps with Fermat. This is because the distance from p to q is about as far as the distance from q to 1, even though p is "only" bigger by one bit.
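A quick empirical check of this claim (a Python sketch of mine, with hypothetical helper names): for p roughly 2q, the number of Fermat iterations grows linearly in q, because x must climb from about sqrt(pq) ≈ 1.41q up to (p + q)/2 ≈ 1.5q.

```python
from math import isqrt

def fermat_steps(n):
    """Count increments of x before Fermat's method factors n."""
    x = isqrt(n)
    if x * x < n:
        x += 1
    steps = 0
    while True:
        y2 = x * x - n
        y = isqrt(y2)
        if y * y == y2:
            return steps, (x - y, x + y)
        x += 1
        steps += 1

q, p = 1009, 2027               # primes with p roughly 2q
steps, factors = fermat_steps(p * q)
print(steps, factors)            # steps is proportional to q, constant ≈ 3/2 - sqrt(2)
```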
 
  • #7
CRGreathouse said:
The second-largest factor would be the last to find, but what I meant was exactly what I wrote: the runtime is a function of the second-largest prime factor S. If you're sieving with all numbers (or odds, or numbers relatively prime to k for some k), the number of candidates tested is proportional to S; if with primes, it's proportional to S/log S.* In both cases the time needed for each division is roughly a function of S -- S^2 for base-case (schoolbook) division, better for more advanced methods.

I'm probably coming across as quite dim here, but I still don't see how this answers my question.

You have a number N. You try dividing it by every number less than sqrt(N) until you get a hit. If no hits, then it's prime. If you get a hit, redefine N as N/hit, and go again. This way, you have roughly p(1)+p(2)+p(3)+...+p(k-1) +sqrt(p(k)) trial divisions, where the p(i) are all the prime factors of N (and in ascending order), and the sqrt(p(k)) is added onto the end because you still have to make sure that p(k) is actually prime.

I don't see how the 'number of numbers tested', i.e. p(1)+p(2)+p(3)+...+p(k-1)+sqrt(p(k)), or your runtime function, is proportional to p(k-1). If you can explain that, then I agree it implies that, to get the most iterations, the runtime function needs to be as large as possible, so n=pq with p and q close!

EDIT: Unless of course you don't reset your count every time, and say - 7 is a factor. Is 7 a factor of n/7? And then continue counting upwards afterwards. In this case, you would get a prime factor, and redefine your stopping point at sqrt(n/prime factor). So, imagine you have all but one prime factors, say p(k), but you need to verify that it's prime. To do this, you would count upwards from p(k-1) to sqrt(p(k)), testing each number. Of course, p(k) is prime so you will indeed count all the way up to sqrt(p(k)). If however you have already passed it, you will stop. This happens if p(k-1)>sqrt(p(k)) => p(k)<p(k-1)^2.

So, in this case, to give us as many things to try as possible, we should pick p(k) such that it is greater than p(k-1)^2! But in this case we have p(k) << n => sqrt(p(k)) << sqrt(n), whereas if p(k-1) is near sqrt(n) (say well within a factor of 2, so p(k-1) is much closer to sqrt(n) than sqrt(p(k)) was in the previous case), then the number of things tried will be near sqrt(n), and hence greater than if p(k) was large. Hence p(k-1) ~= p(k) ~= sqrt(n). And as a consequence of this, we need there to be no other prime factors.

Is my very warped and probably not very legible logic correct?

(of course, if you had n as a prime number, then it would take sqrt(n) divisions anyway - but I suppose I'm ignoring this case).
 
  • #8
Phillips101 said:
I'm probably coming across as quite dim here, but I still don't see how this answers my question.

You have a number N. You try dividing it by every number less than sqrt(N) until you get a hit. If no hits, then it's prime. If you get a hit, redefine N as N/hit, and go again. This way, you have roughly p(1)+p(2)+p(3)+...+p(k-1) +sqrt(p(k)) trial divisions, where the p(i) are all the prime factors of N (and in ascending order), and the sqrt(p(k)) is added onto the end because you still have to make sure that p(k) is actually prime.

Suppose you start with a number around 10^10. You start looking for factors, and you find one around 10^2, leaving you with a cofactor around 10^8. Now you search up to about 10^4 and then stop, having hit the square root of the cofactor. If there hadn't been a factor around 10^2, you would have searched up to 10^5 -- ten times higher.
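This shrinking search bound is easy to demonstrate with concrete numbers (a Python sketch of mine, mirroring the 10^10 example above):

```python
def trial_division_verbose(n):
    """Trial division that also reports the largest divisor actually tested."""
    factors, d, largest_tested = [], 2, 2
    while d * d <= n:
        largest_tested = d
        if n % d == 0:
            factors.append(d)
            n //= d          # the bound sqrt(n) shrinks with the cofactor
        else:
            d += 1
    if n > 1:
        factors.append(n)
    return factors, largest_tested

# 101 * (a prime near 10^8) is about 10^10, but after finding the factor
# near 10^2 the search stops near sqrt(cofactor) ≈ 10^4, not sqrt(n) ≈ 10^5.
print(trial_division_verbose(101 * 99999989))
```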
 
  • #9
CRGreathouse said:
Suppose you start with a number around 10^10. You start looking for factors, and you find one around 10^2, leaving you with a cofactor around 10^8. Now you search up to about 10^4 and then stop, having hit the square root of the cofactor. If there hadn't been a factor around 10^2, you would have searched up to 10^5 -- ten times higher.

I believe this is one reason why you would search for smaller factors first. But doesn't the Fermat search (looking for the difference of two squares) find the factors closest to the square root first?
 
  • #10
ramsey2879 said:
But doesn't the Fermat search (looking for the difference of two squares) find the factors closest to the square root first?

Yes. But that's a bad way to search -- you're much more likely to find a small factor than a large one. If your number is around 10^30 and you test a billion candidates near its square root, you have roughly a one-in-a-million chance of finding a factor. If instead you trial-divide by the first billion numbers, the chance is about 97.3%.

Of course there are improvements to Fermat's method that vastly improve it.
 

Related to The worst case for trial division factorisation of n?

1. What is trial division factorisation and how does it work?

Trial division factorisation is a method used to find the prime factors of a given number n. It involves dividing n by every prime number less than or equal to the square root of n; whenever the remainder is 0, that prime is a factor of n, and the process continues on the quotient until all prime factors are found.

2. What is the worst case scenario for trial division factorisation?

The worst case scenario for trial division factorisation is when n is the product of two primes that are both close to the square root of n (or when n is itself a large prime). In either case roughly sqrt(n) trial divisions are needed before the factorisation is complete, which makes the method very inefficient for large n.

3. How does the efficiency of trial division factorisation compare to other factorisation methods?

Trial division factorisation is one of the simplest and most basic methods of factorisation, but it is also one of the least efficient. Other methods, such as Pollard's rho algorithm and the quadratic sieve, are much faster and more efficient for larger numbers.

4. Can trial division factorisation be improved in any way?

Yes, there are some ways to improve the efficiency of trial division factorisation. For example, you only need to test divisors up to the square root of n, and you can skip candidates sharing a factor with a small wheel modulus (e.g. skip multiples of 2, 3, and 5). Additionally, using a list of pre-computed primes as trial divisors can also speed up the process.
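One such improvement, the mod-30 wheel mentioned earlier in the thread, can be sketched as follows (a Python sketch of mine; after stripping factors of 2, 3, and 5, only the eight residues coprime to 30 are tried):

```python
def trial_division_wheel(n):
    """Trial division skipping multiples of 2, 3, 5 (a mod-30 wheel)."""
    factors = []
    for p in (2, 3, 5):
        while n % p == 0:
            factors.append(p)
            n //= p
    # Gaps between successive numbers coprime to 30: 7, 11, 13, 17, 19, 23, 29, 31, ...
    gaps = (4, 2, 4, 2, 4, 6, 2, 6)
    d, i = 7, 0
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += gaps[i]
        i = (i + 1) % 8
    if n > 1:
        factors.append(n)
    return factors

print(trial_division_wheel(2 * 3 * 7 * 9991))   # → [2, 3, 7, 97, 103]
```

The wheel tests only 8 of every 30 candidates, cutting the number of divisions by nearly a factor of four compared with testing every integer.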

5. Are there any real-world applications of trial division factorisation?

Trial division factorisation is mostly used in mathematics and computer science for studying and understanding the properties of prime numbers. Its inefficiency on large numbers also illustrates why cryptographic schemes such as RSA, which rely on the difficulty of factorising large numbers, are considered secure.
