# Fermat Factoring (Zeda's Method)

1. Dec 26, 2013

### Zeda

Hey everybody, I thought I would share an algorithm for factoring that I designed a few years ago while working on a Z80 project. I noted the same thing Fermat must have:
$a^{2}-b^{2}=(a+b)(a-b)$
After a few moments of looking at it, I realised the translation of that simple expression: If I can find a difference of two squares to equal a given number $c$, then I have factors of c. So for example, 102-52=100-25=75, so 75=(10+5)(10-5)=15*5. Now I also knew that the difference of consecutive squares produced the odd numbers, so 22-12=3, 32-22=5, 42-32=7, ... so I knew that for all odd numbers, there had to be a trivial solution. As well, I was interested in the specific application of factoring RSA keys, so my inputs were always going to be odd and the product of two distinct primes (inputs of 4n,4n+1, and 4n+3 work, but inputs of 4n+2 do not).

So now, here is the problem I was looking at:
Given a natural number $c=pq$ with $p,q\in \mathbb{p}$, then find integer solutions for the equation $a^{2}-b^{2}=c$.

So from here, I rewrote what I had, arriving to:
$b^{2}=a^{2}-c$

So then we get a restriction that $a^{2}\gt c$ and we need $a^{2}-c$ to be a perfect square. A more naive approach, then, is:
Code (Text):

ceil(√(c))→a
a*a-c→b
while int(√(b))≠√(b)
a+1→a
a*a-c→b
endwhile
return {a-√(b),a+√(b)}

This works, but there is an obvious optimisation we can use with a little math. We note that if we change the order of the code so that a is incremented after b is adjusted, then we need use (a+1)(a+1):
Code (Text):

ceil(√(c))→a
a*a-c→b
while int(√(b))≠√(b)
(a+1)(a+1)-c→b
a+1→a
endwhile
return {a-√(b),a+√(b)}

However, this is not the optimisation. The optimisation is that we can see that we are just adding 2a+1 to the original value of b. This replaces a multiplication with an addition and on top of that, it is in the main loop, reducing complexity:
Code (Text):

ceil(√(c))→a
a*a-c→b
while int(√(b))≠√(b)
b+1+2a→b
a+1→a
endwhile
return {a-√(b),a+√(b)}

This is a rather standard approach, but those square roots being part of the main loop is ugly to me. I am used to integer add/sub/rotate so square roots or very slow. Now is where I come in with a different approach.

Instead of testing if b is a perfect square at each iteration, we can just test how far away from a perfect square b is. Note that the next perfect square after x2 is x2+2x+1. If we find that b is 2 higher than a perfect square initially, then after the first iteration, it is increased to 2+2a+1=2a+3. if this number is greater than or equal to 2b+1, then we know that it is larger than the next perfect square, so subtract of the 2b+1 and increment b. If this is still larger than 2b+1, repeat. Once this hits 0, we know that b has become a perfect square.

To put this all together:
Code (Text):

ceil(√(c))→a
a*a-c→b
int(√(b))→e
b-e*e→e       ;e is how much larger b is than the largest perfect square less than or equal to b.
while e≠0
e+2a+1→e
a+1→a
while e>=(2b+1)
e-(2b+1)→e
b+1→b
endwhile
endwhile
return {a-√(b),a+√(b)}

Now the two square root operations are the most complicated routiens required, and they are just overhead. The loops only perform addition and subtraction, which is what I prefer. However, to remove some of the operations needed:
Code (Text):

ceil(√(c))→a
a*a-c→e
int(√(e))→b
e-b*b→e
(a<<1)+1→a
(b<<1)+1→b
while e≠0
e+a→e
a+2→a
while e>=b
e-b→e
b+2→b
endwhile
endwhile
(a-1)>>1→a
(b-1)>>1→b
return {a-b,a+b}

And if we assume all integer operations:
Code (Text):

1+√c→a
a*a-c→e
√e→b
e-b*b→e
1+a<<1→a
1+b<<1→b
while e≠0
e+a→e
a+2→a
while e>=b
e-b→e
b+2→b
endwhile
endwhile
return {a>>1-b>>1,a>>1+b>>1}

The following C code took 11 seconds to factor the 64 bit number:
Code (Text):

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <math.h>

int main()
{
uint64_t x=0xFFFFFFF6E0172B75LL;
uint64_t r=ceil(sqrt(x));
uint64_t err=r*r-x;
uint64_t s=sqrt(err);
err-=s*s;
s+=s+1;
r+=r+1;
while(err>0)
{
err+=r;
r+=2;
while(err>=s)
{
err-=s;
s+=2;
}
}
printf("%u,",r/2-s/2);
printf("%u",r/2+s/2);
return 0;
}

It was done in 32-bit mode on a 1.6GHz AMD processor. Also, I am not fluent enough in C to apply aggressive optimisations like I would in Z80. Maybe somebody could port this to other assembly languages to test benchmarks?

2. Dec 31, 2013

### Zeda

To expand a little more on the efficiency comparison, while this is on the same time complexity scale as trial division, Fermat factoring is a bit more elegant. That said, the speed savings of the algorithm I proposed is that, asymptotically, one of the square roots computed is replaced by one addition and one subtraction. In fact, on the chip level, the addition can be replaced by an increment (invert bit 1, if that returns a 0, invert bit 2, ...) so that can be even faster and nearly trivial being on a smaller time complexity than addition/subtraction.

In my experience, square roots are on the same time complexity as division and multiplication using a restoring squares algorithm (elegantly coded, of course). So we are replacing an O(M(n)) operation with an O(n) subtract and an O(1) increment which essentially comes out to O(n).

It still requires the same number of iterations of the main loop as a typical Fermat factoring routine, but it can be done faster. What I am saying is, if you are using this technique to factor, you can replace it with the above algorithm to increase the efficiency a bit.