# PE 75 primitive Pythagorean triples

• Arman777
In summary: Second, we can use a faster function in python to find primitive triples called gcd(). It takes O(n*log(n)) time. Third, we can use a different algorithm called sieving. In summary, The question asks for primitive Pythagorean triples such that S=a+b+c<15×105. The first way is to find all the primitive triples and it takes 1.6 seconds in Python. The second way is to use a faster function called gcd(). It takes O(n*log(n)) time. The third way
Arman777
Gold Member
Homework Statement
It turns out that 12 cm is the smallest length of wire that can be bent to form an integer sided right angle triangle in exactly one way, but there are many more examples.

12 cm: (3,4,5)
24 cm: (6,8,10)
30 cm: (5,12,13)
36 cm: (9,12,15)
40 cm: (8,15,17)
48 cm: (12,16,20)

In contrast, some lengths of wire, like 20 cm, cannot be bent to form an integer sided right angle triangle, and other lengths allow more than one solution to be found; for example, using 120 cm it is possible to form exactly three different integer sided right angle triangles.

120 cm: (30,40,50), (20,48,52), (24,45,51)

Given that L is the length of the wire, for how many values of L ≤ 1,500,000 can exactly one integer sided right angle triangle be formed?
Relevant Equations
none
The question simply asks primitive pythagorean triples ##(a,b,c)## such that ##S = a + b + c <15\times 10^{5}##
Python:
import time
import math

start = time.perf_counter()

pythagorean_triples = {(3, 4, 5) : 12}

for m in range(0, 10**3, 2):
for n in range(1, m, 2):
if math.gcd(m, n) == 1:
pythagorean_triples.update( {(m**2 - n**2, 2*m*n, m**2 + n**2): 2*(m*(m+n))} )

for m in range(1, 10**3, 2):
for n in range(0, m, 2):
if math.gcd(m, n) == 1:
pythagorean_triples.update( {(m**2 - n**2, 2*m*n, m**2 + n**2): 2*(m*(m+n))} )

pythagorean_triples_copy = pythagorean_triples.copy()
for key, value in pythagorean_triples.items():
if value >= 15 * 10 ** 5:
del pythagorean_triples_copy[key]

print(len(pythagorean_triples_copy))

end = time.perf_counter()

print(end - start, "sec")

According to the rule ##gcd(m,n) = 1## and if ##m## is even ##n## must be odd or vice verse to create primitive pythagorean triples. So here is my code however it seems something is wrong.

Last edited:
Arman777 said:
The question simply asks primitive pythagorean triples (a,b,c)(a,b,c) such that S=a+b+c<15×105

Does it? (3,4,5) and (5,12,13)

Arman777
oh yes sorry. I understand it now. Let me try like that.

Why do you need primitive Pythagorean triples? As you wrote, 6-8-10 is a perfectly good solution.
Why is looping over m and n better than looping over a and b?
Why are you storing triples and not lengths?

Why do you need primitive Pythagorean triples? As you wrote, 6-8-10 is a perfectly good solution.
Why is looping over m and n better than looping over a and b?
Why are you storing triples and not lengths?
I solved a similar problem (just the topic) like this on another site and I did not much pay attention to the PE part of the question. As you said ints unnecessery to write triples. I ll try to solve it and reply back as soon as I can.
Yes as you said 6-8-10 are also good solutions.
Well this method driectly gives us primatives and by multiplying each term with constant value we can get other solutions. Maybe we can do some sort of a sieve method.

Why is looping over m and n better than looping over a and b?

You seem to want to rush into writing code without thinking about what you want to write. This is likely to be as successful as building a house by cutting wood before drawing up plans.

jim mcnamara
For example, here is where I might start:

Code:
create array of ints from 12 to 1500000 called tally initialized to zero
loop over a from 3 to 459340 // largest a can be
loop over b from a+1 to 459340
c = sqrt(a^2 + b^2)
if c is an integer increment tally[a+b+c]
end loop over a and b

int n = 0
loop over i from 12 to 1500000
if tally[i] is 1, increment n
end loop over i

print n

The problem is that there are 100 billion square roots here. That may cause things to be slower than we would like. (PS there is also a bug - can you find it?)

Last edited:
Why is looping over m and n better than looping over a and b?
Well, initially I thought that we are just searching for primitive triples. In that case, using m and n is much better than the a and b ( I know that because I tried )

(PS there is also a bug - can you find it?)
probably here
loop over i from 12 to 1500000
if tally is 1, increment n

I am not sure I understand your code but you are referring tally[12]to tally[1500000] and that's the bug? and why that should be equal to 1?

The problem is that there are 100 billion square roots here. That may cause things to be slower than we would like.
To increase the speed of the code. As you said this approach cannot be good.

I thought some approaches and I would like to share them.

I thought 3 ways.

First, we can find all the primitive triples which its really easy by using ##m## and ##n## ( takes 1.6 seconds in python). And we know that any ##S##, that is multiple of 2 primitive triples, will not satisfy the condition.

For instance, ##(3,4,5) : 12## is a primitive triple and ##(5,12,13) : 30## another one. So any multiple of 60 (since 60 is the first number which divisible by 12 and 30) should be excluded. So the problem reduces to catch all these multiple numbers are eliminate them.

Second, every triple can be found by an even ##m## and odd ##n## (or vice versa) and if ##gcd(m, n) = 1## then every, ##(m^2 - n^2 , 2mn, m^2 + n^2)## creates a primitive triple.

So

$$S = m^2 - n^2 + 2mn + m^2 + n^2 = 2m(m+n)$$

This implies 2 things. First every ##S## (primitive or not) should be divisible by ##2##. Second, every ##S## should be divisible by ##m## and ##(m+n)##. we also know that ##m## must be larger than ##n##. By using this trick we might get somewhere. ( low probability? )

And also all non primitive triples are can be found like ##S = C\times 2m(m+n)## where C is some constant.The third way, We can factor the ##S##. If the ##S## contains at least two of the primitive triple ##S## ( For instance 60 contains 12 and 30 in its factors so it should be excluded). Or 120 contains ( 12, 30 and 40)

Arman777 said:
In that case, using m and n is much better than the a and b ( I know that because I tried )

It is true that it is better, but why? This is what I mean by thinking and planning before jumping into the writing. If nothing else, it's inefficient to code things up N different ways and then picking.

The issue with using m and n instead of a, b, and c is you need three things.
1. It needs to be necessary. It produces only triples, with no non-triples.
2. It needs to be sufficient. It produces all the triples, with none left behind.
3. It needs to be unique. It produces each triple exactly once.
You need to understand why these are true - or more precisely, what you need to do in your code to ensure these are true.

Arman777 said:
I thought some approaches and I would like to share them.

Remember "profile your code"? I bet you are spending almost all of your time in math.gcd. (My code spends most of its time checking for coprimality.) So unless you can drastically speed up math.gcd or substantially reduce the number of times you call it, nothing will help much.

For what it's worth, my code runs at 10 ms. (The granularity of my timer) I didn't code up the algorithm I posted, but suspect it will take around an hour.

It is true that it is better, but why? This is what I mean by thinking and planning before jumping into the writing. If nothing else, it's inefficient to code things up N different ways and then picking.

I did not code actually I am still in the thinking process. The code that I shared was code for another problem. I just modified a bit. But its totally useless for this problem.

I am profiling when I need. For instance,

bet you are spending almost all of your time in math.gcd

In the original post if I loop over ##a## and ##b##, float(c).is_integer() method takes much more time.

So it's better to use math.gcd(), ##m's## and ##n's## to save time. However, it finds only the primitive triples and we don't need them for this problem so its most likely useless.

As you said I ll loop over ##a## and ##b## and try to find a way an algorithm and then code.

In these days I am trying to code in C. I 'll take algorithm and data structure course next semester (which the course instructor will use C and I need to learn C). Since I have no idea about those things it might help me to come up with more clever solutions in general.

In the meantime I ll try to think about the problem and maybe I write my code in C instead of python :)

Arman777 said:
The code that I shared was code for another problem. I just modified a bit. But its totally useless for this problem.

I'm not trying to give you a hard time. I am trying to get you to think logically and methodically.

Arman777 said:
it finds only the primitive triples and we don't need them for this problem so its most likely useless.

It's not true that it only finds primitive triples. It can be made to do so, but so can other generators. It's also true that primitive triples are not the answer, but does that mean it can't be part of an answer?

Make a plan. You need to make some decisions, so list the pros and cons of these decisions. Only then start writing code.

Well it doesn't particularly for this problem however it shows that if I try to loop over a and b and try do float(c).is_integer() method its not going to work well. As you mentioned it takes time since there are lots of numbers. I can loop over a and b maybe or not ( I am not sure ) but in any case I need something clever.
but does that mean it can't be part of an answer?
It can be I think yes.
Make a plan. You need to make some decisions, so list the pros and cons of these decisions. Only then start writing code.
Yes I ll try to make an algorithm and decide to do loop over which variable or how to shorten things.. and then try to work on it. Maybe I mix things up Idk.

I'm not trying to give you a hard time. I am trying to get you to think logically and methodically.
Yes I know. Thanks for helping me out.

The code is so fast now I am having a hard time timing it. As I said, the granularity is 10 ms, so "10 ms" can mean anything between 5 and 15. By running it 100x, it looks like it was a hair under 15 and with my latest attempts to speed it up, it's a hair under 12. So about a 300,000x improvement over the original hour.

I still want you to code it up yourself, but once you have done it (and not before), there are three lines of my code I want you to think about. Why would I write this?

C++:
 return (a<b) ? coprime(b,a) : !(a%b) ? (b==1) : coprime(b, a%b);
Code:
for(n=1+(m%2); n < m; n+=2)
Code:
halflength=msquared+m*n;

Last edited:
Arman777
I could not come up with a good algorithm. I tried to think about some approaches however they are too complex or useless. Meanwhile I am also trying to solve problems in CodeAbbey (which have great questions for proggraming).

Thats really fast, impressive.

I don't know what to do. I ll try to think again but I don't have much hope. Maybe you can give some hints ? Otherwise I can try own my own but it will take time.

Arman777 said:
I could not come up with a good algorithm.

Okay then I ll write some code and share by tomorrow.

Better still...write a design for some code.

Better still...write a design for some code.
Okay then ...

I thought something like this

Code:
Create an array (primitive_triples) that contains the sum of the all primitive triple values under 1.5 million.
Take a number from the primitive_triple and multiply it by c (for c in range(2, 1.5 * 10**6/num)) and put these numbers in the array (num_array)
Take the factors of each element in the num_array and then check if two of the factors are in the primitive_triple array.
If it contains, remove all of its multiples from the num_array.
Take the length of the num_array
Repeat this process for every number in the primitive_triples.
Sum the length's of each num_array

Its terrible I guess. The problem is I cannot think any mathematical trick to simplify the process

OK, now let's go through your algorithm by hand not for n = 1,500,000 but for n = 60.

1. Primitive_triples is 12, 30, 40, 56. How did you get that, you say? That's a sign that you need more detail in step one.

2. "Take a number from the primitive_triple..." okay, I'll pick 30, "and multiply it by c (for c in range 2..60)." I get 30, 60, 90, 120...up to 1800.

3. Take the factors of each element in num_array. {2,3,5,6,10,15,30}, {2,3,4,5,6,10,12,15,30,60}... and remove the ones where two factor are in the primitive triple array.

By this point, even doing this by hand should show you a couple of things.
1. You need an algorithm to generate Pythagorean triples.
2. You don't need to be multiplying by c up to 60.
3. It will take forever to get a list of factors of your elements, especially if that list is too long.
4. The thing you are working with, the list of factors of num_array, is not an array.
5. It's not yet obvious that this gets the right answer.
So, why don't you work out by hand the answer for n = 60. I believe the answer is 7. (60 itself is doubled) Then try to write down the algorithm you used.

I guess I wrote my algorithm wrong. Let me explain it again by using n = 60.

primitive_triples_sum = [12, 30, 40 ,56]
create arrays for each number such that
num_array_1 = [12, 24,36,48]
num_array_2 = [30, 60]
num_array_3 = [48]
num_array_4 = [56]

So, why don't you work out by hand the answer for n = 60. I believe the answer is 7. (60 itself is doubled) Then try to write down the algorithm you used.
Actually it was better for trying by hand

Python:
import math

max_sum = 15 * 10**5
pythagorean_triples = []

for m in range(0, 10**3, 2):
for n in range(1, m, 2):
if math.gcd(m, n) == 1 and 2*(m*(m+n)) <= max_sum:
pythagorean_triples.append (2*(m*(m+n)))

for m in range(1, 10**3, 2):
for n in range(0, m, 2):
if math.gcd(m, n) == 1 and 2*(m*(m+n)) <= max_sum:
pythagorean_triples.append (2*(m*(m+n)))

pythagorean_triples.sort()
pythagorean_triples.remove(2) # removing 2 from the list

num_0 = pythagorean_triples[-1]
num_set_0 = {num_0*c for c in range(1, max_sum // num_0 + 1)}

for num in pythagorean_triples[:-1]:
num_set = {num*c for c in range(1, max_sum // num + 1)}
num_set_0 = (num_set_0 | num_set) - (num_set & num_set_0)

print(len(num_set_0))

Maybe you can check for small max_num and if the code is correct then we might try to improve the time..?

Arman777 said:

Why? Or do you mean creating so many named arrays is bad?

Arman777 said:
Maybe you can check for small max_num a

Maybe you could?

Why? Or do you mean creating so many named arrays is bad?

Maybe you could?

Size of the arrays are huge so its bad I think.

Time value table
 max_sum run time result 15 0.18 sec 1 150 0.19 sec 17 1500 0.16 ec 197 15 000 0.45 sec 2332 / 2308 150 000 30.12 sec 23759 / 23521
I am running two different versions of my code and getting different answers. Can you tell me which one is true ? For instance 2332 or 2308 at max_sum = 15 000

This shows that it will find the solution about an hour

My new code

Python:
import math
import time

start = time.perf_counter()
max_sum = 100000
pythagorean_triples = set()

for m in range(0, 10**3, 2):
for n in range(1, m, 2):
if math.gcd(m, n) == 1 and 2*(m*(m+n)) <= max_sum:

for m in range(1, 10**3, 2):
for n in range(0, m, 2):
if math.gcd(m, n) == 1 and 2*(m*(m+n)) <= max_sum:
pythagorean_triples.add(2*(m*(m+n)))pythagorean_triples.remove(2) # removing 2 from the list

num_set_0 = set()
for num in pythagorean_triples:
num_set = {num*c for c in range(1, max_sum // num + 1)}
num_set_0 = (num_set_0 | num_set) - (num_set & num_set_0)

end = time.perf_counter()
print(len(num_set_0))
print("Time:", end - start)

Maybe you could?

How can I ? for small numbers it seems working but for like max_sum ≈ 10 000 I cannot be sure. Maybe both versions of my code is wrong.

Arman777 said:
Can you tell me which one is true ?

The usual way to debug these is to look at solutions that are not in common. But I think you would learn more by listing all 17 less than 150 and checking them by hand.

It is also wise to get the right answer before worrying about speed. Getting the wrong answer quickly is of little help.

Python:
import math
import time

start = time.perf_counter()
max_sum = 15 * 10 ** 5
pythagorean_triples = set()

for m in range(0, 10**3, 2):
for n in range(1, m, 2):
if math.gcd(m, n) == 1 and 2*(m*(m+n)) <= max_sum:

for m in range(1, 10**3, 2):
for n in range(0, m, 2):
if math.gcd(m, n) == 1 and 2*(m*(m+n)) <= max_sum:

pythagorean_triples.remove(2) # removing 2 from the list

num_set_0 = set()
common_num_set = set()
for num in pythagorean_triples:
num_set = {num*c for c in range(1, max_sum // num + 1)}
common_num_set = common_num_set | (num_set & num_set_0)
num_set_0 = (num_set_0 | num_set) - common_num_set

num_set_0.remove(max_sum)

end = time.perf_counter()
print(len(num_set_0))
print("Time:", end - start)

I think I understand what the problem is. For max_sum = 1500 answer is 161 ?

Arman777 said:
For max_sum = 1500 answer is 161 ?
I can't check at the moment, but it's close. (The answer is near max_sum/9)

I can't check at the moment, but it's close. (The answer is near max_sum/9)
I run for the #max_sum = 15\times 10^5# it took 72 min but solved the problem correctly. We can optimize now

Arman777 said:
We can optimize now

OK. You know my first question: "where is your program spending its time?"

OK. You know my first question: "where is your program spending its time?"

Code:
         277502 function calls in 110.125 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:103(release)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:143(__init__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:147(__enter__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:151(__exit__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:157(_get_module_lock)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:176(cb)
2    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:222(_verbose_message)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:232(_requires_builtin_wrapper)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:307(__init__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:311(__enter__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:318(__exit__)
4    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:321(<genexpr>)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:369(__init__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:416(parent)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:424(has_location)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:433(spec_from_loader)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:504(_init_module_attrs)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:576(module_from_spec)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:58(__init__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:663(_load_unlocked)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:719(find_spec)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:740(create_module)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:748(exec_module)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:765(is_package)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:78(acquire)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:855(__enter__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:859(__exit__)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:882(_find_spec)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:948(_find_and_load_unlocked)
1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:978(_find_and_load)
1  109.993  109.993  110.125  110.125 P75.py:1(<module>)
13388    0.078    0.000    0.078    0.000 P75.py:23(<setcomp>)
3    0.000    0.000    0.000    0.000 {built-in method _imp.acquire_lock}
1    0.000    0.000    0.000    0.000 {built-in method _imp.create_builtin}
1    0.000    0.000    0.000    0.000 {built-in method _imp.exec_builtin}
1    0.000    0.000    0.000    0.000 {built-in method _imp.is_builtin}
3    0.000    0.000    0.000    0.000 {built-in method _imp.release_lock}
2    0.000    0.000    0.000    0.000 {built-in method _thread.allocate_lock}
2    0.000    0.000    0.000    0.000 {built-in method _thread.get_ident}
1    0.000    0.000    0.000    0.000 {built-in method builtins.any}
1    0.000    0.000  110.125  110.125 {built-in method builtins.exec}
4    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
4    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
2    0.000    0.000    0.000    0.000 {built-in method builtins.print}
250000    0.052    0.000    0.052    0.000 {built-in method math.gcd}
2    0.000    0.000    0.000    0.000 {built-in method time.perf_counter}
14044    0.002    0.000    0.002    0.000 {method 'add' of 'set' objects}
1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
2    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
1    0.000    0.000    0.000    0.000 {method 'remove' of 'set' objects}
2    0.000    0.000    0.000    0.000 {method 'rpartition' of 'str' objects}

What is this represents ?

P75.py:1(<module>)---

I guess just running takes the time and using sets which contains large numbers.

Is there some other way than "set" to keep track of what lengths have been used and how often?

Is there some other way than "set" to keep track of what lengths have been used and how often?

Well without using set I cannot do my operations. So my algorithm fails. So I don't think there's within the current algorithm structure. But I am thinking.

Arman777 said:
Well without using set I cannot do my operations.

Nonsense.

What's wrong with a plain old array of 1 to maximum length?

Replies
3
Views
2K
Replies
80
Views
8K
Replies
7
Views
2K
Replies
6
Views
1K
Replies
27
Views
4K
Replies
7
Views
2K
Replies
6
Views
1K
Replies
1
Views
1K
Replies
18
Views
2K
Replies
39
Views
3K