Project Euler Question 60

  • Python
  • Thread starter Arman777
  • Start date
  • #1
1,777
139

Main Question or Discussion Point

https://projecteuler.net/problem=60

I wrote a piece of code to solve the problem but it still takes too much time to solve it Any ideas how can I improve my code, like adjustments that I can make ?

Code:
import itertools
import time

start = time.perf_counter()
def prime(N):
    if N == 0 or N == 1 or N == 5:
        return False
    for i in range(2, int(N ** 0.5) + 1):
        if N % i == 0:
            return False
    return True


def dig_sum(j, n):  # j is the number, n is the element in the list as tuple
    d = 0
    if j == 0:   
        for k in n:
            d += sum(int(digit) for digit in str(k))
        if d%3 == 0:
            return False
        return True
    else:
        s = sum(int(digit) for digit in str(j))
        for k in n:
            w = s + sum(int(digit) for digit in str(k))
            if w % 3 == 0:
                return False
        return True


def con_test(j, n):  # j is the number, n is the element in the list as tuple
    if j == 0:
        w = int(str(n[0]) + str(n[1]))
        w1 = int(str(n[1]) + str(n[0]))
        if prime(w) == False or prime(w1) == False:
            return False
        return True
    else:
        for i in n:
            w = int(str(j) + str(i))
            w1 = int(str(i) + str(j))
            if prime(w) == False or prime(w1) == False:
                return False
        return True


def st(n):            #to get rid of the repeated elements
    H = [tuple(sorted(i)) for i in n]
    G = set(H)
    return G


Primes = [i for i in range(3,8000) if prime(i) == True]   #Prime range 

Com_two = list(itertools.combinations(Primes,2))

G2 = [i for i in Com_two if dig_sum(0,i) == True and con_test(0,i) == True]

G2r = st(G2)

G3 = [(i,*j) for i in Primes for j in G2r if dig_sum(i,j) == True and con_test(i,j) == True ]

G3r = st(G3)

G4 = [(i,*j) for i in Primes for j in G3r if dig_sum(i,j) == True and con_test(i,j) == True ]

G4r = st(G4)

G5 = [(i,*j) for i in Primes for j in G4r if dig_sum(i,j) == True and con_test(i,j) == True ]

F = [(sum(j),j) for j in G5]

end = time.perf_counter()
Time = end-start
print(Time)
print(F)
When I set Primes range to 8000 it took 18 min to give an answer I checked the solution for the 4 set of primes and it gives me the right result.(If you want to check print (G4r))

Any ideas how can I make my program faster ?
 

Answers and Replies

  • #2
11,674
5,247
You should try timing each function individually to see how much time they take and then from there optimize the specific functions.

Its often said that program execution spends 80% executing 20% of the code. Try to find that 20% code and optimize it.
 
  • #3
1,777
139
You should try timing each function individually to see how much time they take and then from there optimize the specific functions.

Its often said that program execution spends 80% executing 20% of the code. Try to find that 20% code and optimize it.
I find the answer at by setting prime limit 10.000 . Yey for that it took 41 min. I ll try to I guess it mostly does it on the G2 level and then decreases cause the options are decreasing but G2 level is the starting point. I also think that its probably due to the algorithm . I mean I am not sure you understand what I am trying to do but I think there must be a mathematical trick.
 
  • #4
11,674
5,247
Yeah, I’m not sure what the code is doing but when faced with something like this that has performance issues, I try to identify what function is being the time hog and go from there.
 
  • Like
Likes Arman777
  • #5
jim mcnamara
Mentor
3,870
2,252
Consider profiling, ex: cProfile. This shows you how much time is spent by each function/method. Trying to guess is not the best choice unless you are very experienced.

How to profile, options - https://devopedia.org/profiling-python-code
 
  • Like
Likes Arman777
  • #6
verty
Homework Helper
2,164
198
Use a sieve to sieve primes up to say 10 million. I think your time hog is that prime check procedure.
 
  • Like
Likes Arman777
  • #7
1,777
139
Yeah, I’m not sure what the code is doing
Its my bad that I didnt explain it but I guess I dont need to from now on ( unless you want to).


Consider profiling, ex: cProfile. This shows you how much time is spent by each function/method. Trying to guess is not the best choice unless you are very experienced.

How to profile, options - https://devopedia.org/profiling-python-code
I finally manage to run it . I am saying finally cause it was hard for me to find how to run it. I add the image.

Functions have names and you see the time that took to complete them.

Line 63 took 168 sec and thats the G3 = [] line other then that I run this code for prime range 5000.

Use a sieve to sieve primes up to say 10 million. I think your time hog is that prime check procedure.
İt seems that I need a better prime tester and a better con_test function.

227.948833517
[]
Code:
         96268183 function calls in 227.950 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.018    0.018  227.950  227.950 pps.py:1(<module>)
  5775553    9.231    0.000   34.385    0.000 pps.py:16(dig_sum)
  2093904    0.437    0.000    0.437    0.000 pps.py:20(<genexpr>)
 26176944    6.086    0.000    6.086    0.000 pps.py:25(<genexpr>)
 40268129    9.080    0.000    9.080    0.000 pps.py:27(<genexpr>)
  2891920    5.823    0.000  191.264    0.000 pps.py:33(con_test)
        3    0.001    0.000    0.007    0.002 pps.py:49(st)
        3    0.004    0.001    0.006    0.002 pps.py:50(<listcomp>)
        1    0.001    0.001    0.006    0.006 pps.py:55(<listcomp>)
        1    0.068    0.068    9.512    9.512 pps.py:59(<listcomp>)
        1    1.642    1.642  168.858  168.858 pps.py:63(<listcomp>)
        1    0.549    0.549   48.872   48.872 pps.py:67(<listcomp>)
  3886258  185.447    0.000  185.447    0.000 pps.py:7(prime)
        1    0.009    0.009    0.675    0.675 pps.py:71(<listcomp>)
        1    0.000    0.000    0.000    0.000 pps.py:73(<listcomp>)
        1    0.000    0.000  227.950  227.950 {built-in method builtins.exec}
        2    0.001    0.000    0.001    0.000 {built-in method builtins.print}
    12885    0.002    0.000    0.002    0.000 {built-in method builtins.sorted}
 15162571    9.552    0.000   25.154    0.000 {built-in method builtins.sum}
        2    0.000    0.000    0.000    0.000 {built-in method time.perf_counter}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
 

Attachments

Last edited by a moderator:
  • #8
1,777
139
I changed my prime code to Miller–Rabin primality test ( I didnt write the code myself I copied ) and I find the result in 8 min 30 sec. I also need to change the con_test() function for improved result ..
 
Last edited:
  • #9
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
2019 Award
24,305
7,127
I think your time hog is that prime check procedure.
This runs counter to the advice earlier in the thread that the OP should learn to profile. He won't learn that if we tell him what the answer is. This is especially true if we guess.
 
  • #10
11,674
5,247
That's great!

One rule of thumb that you should consider is adding a comment about where you got the copied code. In a professional environment, this may not be allowed and you might need to rewrite it (or get someone in your company untouched by seeing the code) in order to make it original code to protect your company from copyright laws.
 
  • Like
Likes Vanadium 50 and Arman777
  • #11
verty
Homework Helper
2,164
198
This runs counter to the advice earlier in the thread that the OP should learn to profile. He won't learn that if we tell him what the answer is. This is especially true if we guess.
True. He should learn to profile his code and I see he did just that. So I'll leave it up to him now.
 
  • #12
rcgldr
Homework Helper
8,682
520
One issue is that Python is interpretative, so that will consume a lot of time compared to a compiled language like C or C++. I'm wondering how much faster the same algorithm would be in C or C++.
 
  • Like
Likes Arman777
  • #13
1,777
139
One issue is that Python is interpretative, so that will consume a lot of time compared to a compiled language like C or C++. I'm wondering how much faster the same algorithm would be in C or C++.
thats really good question Sadly I dont know C...And I dont think anyone here will try to convert it. I couldnt find a faster way to do the con_test(), so thats also another thing that can make the program run faster.
 
  • #14
11,674
5,247
You could try converting it to Julia. Its hot right now in Data Science and is arguably much faster than Python.

Python compiles its code into byte codes and follow on executions use that saving some conversion time. I'd run your algorithm a couple of times using the 'time' command if you were on Linux or MacOS.
 
  • #15
1,777
139
This is the last version with
You could try converting it to Julia. Its hot right now in Data Science and is arguably much faster than Python.

Python compiles its code into byte codes and follow on executions use that saving some conversion time. I'd run your algorithm a couple of times using the 'time' command if you were on Linux or MacOS.
I dont know it too.But can it make a really huge difference ? Like minutes
 
  • #18
11,674
5,247
You have to take it with a grain of salt. Performance tests are notoriously difficult to get right. Developers will do anything to get bragging rights for speed especially in situations where money is involved. In this case, there’s a Julia consulting company being formed by the original creators so of course they’ll want Julia to win.

As an example, the developers might optimize the underlying code to do especially well for specific test cases in order to beat a competitor. The database engine wars were like that. It meant that one database product would outperform a competitor one year and the competitor would fire back on the next release. To the ordinary user both products would perform well enough to not really matter but products are first selected based on these numbers and so there’s a sales opportunity and followon upgrades that turns into big money especially for enterprise databases.
 
  • Like
Likes QuantumQuest
  • #19
fluidistic
Gold Member
3,662
104
You guys are missing an important thing. Project Euler is notoriously known for problems that aren't approachable via brute force (though you certainly can solve a few of them that way). What I mean is that, it is both a mathematical and programming challenge. Usually most problems can be solved under 2 minutes using Python. If this isn't the case, then the approach is to be reconsidered. So before going blindly into profiling and things of the sort, the very first thing to think about, is whether the mathematical approach is reasonably well suited, keeping in mind that brute force approaches are to be discarded.

Another thing. Since the OP actually did get the correct result, he has access to the solution to this problem posted by others. And since Python is very popular, he can directly compare his code and therefore his method, to others.
 
  • #20
1,777
139
You guys are missing an important thing. Project Euler is notoriously known for problems that aren't approachable via brute force (though you certainly can solve a few of them that way). What I mean is that, it is both a mathematical and programming challenge. Usually most problems can be solved under 2 minutes using Python. If this isn't the case, then the approach is to be reconsidered. So before going blindly into profiling and things of the sort, the very first thing to think about, is whether the mathematical approach is reasonably well suited, keeping in mind that brute force approaches are to be discarded.

Another thing. Since the OP actually did get the correct result, he has access to the solution to this problem posted by others. And since Python is very popular, he can directly compare his code and therefore his method, to others.
I think that my code would work under 1 min in C/C++ . I didnt use brute force , but I also couldnt find a better alogrithm under then this. My code works like this.

First I made a set of 2 primes that satisfies two conditions.The first condition was the digit sum cannot be 3. The other condition was the concentrating test. After finding these all 2 prime sets, I tried to add the third prime to the set and again I checked these 2 conditions are satisfied. After that, I tried to add the fourth prime and so on.

The best python solution that I noticed is by ximera2 :and he uses prime dictionary and then using it (and also other sort of thing he manages to reduces time 26.5sec)

İn C and Java I saw 2 sec solution codes. I saw another python solution using graph theory and finding it in seconds.

I dont think my algorithm is bad but I am not a good coder so this is best I can do.
 
  • #21
1,777
139
You have to take it with a grain of salt. Performance tests are notoriously difficult to get right. Developers will do anything to get bragging rights for speed especially in situations where money is involved. In this case, there’s a Julia consulting company being formed by the original creators so of course they’ll want Julia to win.

As an example, the developers might optimize the underlying code to do especially well for specific test cases in order to beat a competitor. The database engine wars were like that. It meant that one database product would outperform a competitor one year and the competitor would fire back on the next release. To the ordinary user both products would perform well enough to not really matter but products are first selected based on these numbers and so there’s a sales opportunity and followon upgrades that turns into big money especially for enterprise databases.
I see your point ..the code looks like python actually but the speed is near C thats amazing actually. I searched a bit and actually its normal to be too slow compared to C or Java etc.
 
  • Like
Likes jedishrfu
  • #22
fluidistic
Gold Member
3,662
104
I think that my code would work under 1 min in C/C++ . I didnt use brute force , but I also couldnt find a better alogrithm under then this. My code works like this.

First I made a set of 2 primes that satisfies two conditions.The first condition was the digit sum cannot be 3. The other condition was the concentrating test. After finding these all 2 prime sets, I tried to add the third prime to the set and again I checked these 2 conditions are satisfied. After that, I tried to add the fourth prime and so on.

The best python solution that I noticed is by ximera2 :and he uses prime dictionary and then using it (and also other sort of thing he manages to reduces time 26.5sec)

İn C and Java I saw 2 sec solution codes. I saw another python solution using graph theory and finding it in seconds.

I dont think my algorithm is bad but I am not a good coder so this is best I can do.
I do not find your code very Pythonic in that it doesn't really deal with lists (you do use list() and list comprehension, but that's about it). Then the redundant if ... == True. Which can be replaced by if ...
Then, way too many loops and if conditions, this looks like C or something like that, which Python is not terribly fast at.
Plus, I find the code hard to understand, I do not have the time to understand it even though it would be nice.
Now, by reading the EP question, the first thing that comes to mind is... should I start by taking 5 primes from 1,1,1,1,1 and check whether they satisfy the condition? Or can I use a math trick and take 4 of them as 3, 7, 109, 673, and the 5th one from 1 (or M >1) ?
I do not know what your code is doing there, but if it starts from 5 low prime numbers and gradually increase them as to find the answer, this is clearly not what is intended. IMO, you should dissect and fully understand the quickest ways people have found to tackle the problem. Then it should be obvious to you, what you were doing wrong. And I'm sure it's related to the math of the problem, not that much about the programming, which undoubtedly could be improved.

Edit before posting: After rereading your post, I do not think starting by a list of 2 primes that satisfies the condition is the way to go, especially when the problem statement provided 4 of them. But that's just the tip of the iceberg here. There are hidden mathematical tricks that you did not use, that you could have used, for sure.
 
  • #23
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
2019 Award
24,305
7,127
Or can I use a math trick and take 4 of them as 3, 7, 109, 673, and the 5th one from 1 (or M >1)
It is true that the smallest 5-plet includes a 4-plet (five of them, in fact). Is it true that the smallest 5-plet includes the smallest 4-plet? I don't see how that immediately follows.
 
  • Like
Likes Arman777
  • #24
1,777
139
Edit before posting: After rereading your post, I do not think starting by a list of 2 primes that satisfies the condition is the way to go, especially when the problem statement provided 4 of them. But that's just the tip of the iceberg here. There are hidden mathematical tricks that you did not use, that you could have used, for sure.
I didnt understand it. Provided 4 numbers are just example and have nothing to do with the solution.

I do not know what your code is doing there, but if it starts from 5 low prime numbers and gradually increase them as to find the answer, this is clearly not what is intended.
I dont think that would work. Its just simply brute froce with 5 loops which, it would take much more time to test ( if they are prime or not and the concatenate part).

Think about it, 5 numbers and you have to test each two pair thats like 10 combination then you have 5 loops and each range should be 3000. which thats like 3000**5 ?? Theres no possible way that it will give the result under 1 min.

Then it should be obvious to you, what you were doing wrong. And I'm sure it's related to the math of the problem, not that much about the programming, which undoubtedly could be improved.
Yes I know this is not the perfect algorithm but thats okay...I am not a math professor or a computer scientist. I am doing this for hobby and this algorithm was the only one that I could have think of.
 
Last edited:
  • #25
StoneTemplePython
Science Advisor
Gold Member
2019 Award
1,151
560
OP is still a very new programmer-- he's just quite the Project Euler enthusiast. I am one of the biggest boosters of Julia on this forum, but it's an immature language (version ##\lt 1.0##) and sometimes limited support for questions. This is quite difficult if you are new to programming. I wouldn't recommend it to OP at this time.

I saw another python solution using graph theory and finding it in seconds.
This is what I'd suggest OP focuses on. Learning more powerful algorithmic techniques as part of solving these problems... will go a long way. And yes techniques related to graph theory are in this realm. It'll be a struggle to learn these algorithms, but a very productive one.

The nice thing about Project Euler is you can put in a lot of work to get a solution and as a result you get some insights for the problem. Then you see how other people solved it, and as a result get considerably more insights.

edit:
Julia 1.0 was just released:
http://news.mit.edu/2018/mit-developed-julia-programming-language-debuts-juliacon-0827

(still too an immature a language for me to recommend to OP at this time, but I hadn't realized this and need to upgrade on my machine.)
 
Last edited:
  • Like
Likes Arman777

Related Threads on Project Euler Question 60

  • Last Post
Replies
5
Views
461
  • Last Post
Replies
4
Views
763
  • Last Post
Replies
14
Views
765
  • Last Post
Replies
8
Views
4K
Replies
13
Views
1K
Replies
31
Views
3K
Replies
7
Views
509
  • Last Post
Replies
8
Views
976
  • Last Post
Replies
13
Views
18K
Top