Python Optimizing Prime Pair Combinations for Project Euler Problem 60

  • Thread starter Thread starter Arman777
  • Start date Start date
  • Tags Tags
    Euler Project
Click For Summary
The discussion revolves around optimizing a Python solution for Project Euler Problem 60, which involves finding sets of prime numbers that can concatenate to form new primes. The original code is inefficient, taking over 18 minutes to execute, prompting suggestions for profiling to identify time-consuming functions. Recommendations include using a more efficient prime-checking algorithm, such as the Miller-Rabin test, and employing a sieve method to generate primes. Participants emphasize the importance of mathematical approaches over brute force methods, suggesting that the problem may require a reevaluation of the algorithm used. Overall, the conversation highlights the balance between coding efficiency and mathematical strategy in solving complex problems.
  • #31
Arman777 said:
what do you mean ? I am finding two pair then adding third prime by applying the rules.

Okay, I have to be honest and say I don't think it'll save you much time, but I think you are reading the primes from P1 multiple times. It could be that you want to do that, I don't know. By the way, does your code prove that you have found the group with the lowest sum?
 
Technology news on Phys.org
  • #32
verty said:
but I think you are reading the primes from P1 multiple times.
Before it was the "Prime" array and now its the "P1" array . When I use the P1 array I no longer need to use sum function which reduces the time 8.5 min to 2 min.
verty said:
By the way, does your code prove that you have found the group with the lowest sum?
It does not prove no, it just shows the half of it but this half gives me the answer. So I wanted to share this half. And also other half gives us nothing ( no 5 prime pair that satisfy the condition)
 
  • #33
Arman777 said:
I changed my prime code to Miller–Rabin primality test ( I didnt write the code myself I copied )

You also didn't check the code yourself. :smile:

If you simply print(len(Primes)) after finding them, you will discover it varies (!) from 1240 to 1250 or so for 10,000 primes. 1228 is the correct answer.
 
  • #34
Vanadium 50 said:
You also didn't check the code yourself. :smile:

If you simply print(len(Primes)) after finding them, you will discover it varies (!) from 1240 to 1250 or so for 10,000 primes. 1228 is the correct answer.
I didnt understand what you mean.

Edit: Yes I did, there's a reason for that. It take random numbers and tests that accordingly to determine if the number is prime or not.

Python:
_mrpt_num_trials = 5

This part is shows us how many random number do we want. If its higher the solution gets more accurate In my code it was 1 so its normal that my code will give different results each time. I select it 1 to make it much faster and still it gives the solution.
 
Last edited:
  • #35
Arman777 said:
It take random numbers and tests that accordingly to determine if the number is prime or not.

No, it doesn't. If it did, it would give 1228 primes.

Arman777 said:
I select it 1 to make it much faster and still it gives the solution.

Yes, but it's not testing to see if the pairs actually are prime or not. So it's not doing either what you say it is doing or what the question asks. Yes, it gives the correct answer, but so does print("26033"). Do you consider that a solution?
 
  • #36
Vanadium 50 said:
No, it doesn't. If it did, it would give 1228 primes.
How so ?
Vanadium 50 said:
Yes, but it's not testing to see if the pairs actually are prime or not. So it's not doing either what you say it is doing or what the question asks. Yes, it gives the correct answer, but so does print("26033"). Do you consider that a solution?
Yes I do.
 
  • #37
Well, if you consider print(26033) a solution, you're not going to find anything faster.
 
  • #38
Then I am the one who solved the problem fastest, that's great!
 
  • #39
Arman, you should write your code to solve both parts at once, so that you run it and it proves the result as well. Only then can you say you have solved it.
 
  • #40
Even that doesn't do it. It shows that he's found a solution with all the primes less than 8389. He then needs to test all the way up to 26000 to verify that there are no solutions with a smaller sum which have four small numbers and one big one.
 
  • #41
Vanadium 50 said:
Even that doesn't do it. It shows that he's found a solution with all the primes less than 8389. He then needs to test all the way up to 26000 to verify that there are no solutions with a smaller sum which have four small numbers and one big one.
Yes you are right. But I didnt get and answer about why my prime function is wrong ? Also can you solve and prove it under 1 min ? I am really curious about it Since all of you think " I didnt solve the problem under 1 min"
 
  • #42
I solved the problem. If my last code is wrong my previous one can solve it in 8.5 min . For proving part I think it will take more time or maybe I could have done smthing else. I solved the problem in any case just not in 1 min or in 2 min but maybe in an hour or in 30 min.

As I said if you say I didnt solve it cause it says under min 1. Then I am expecting you guys to solve (and prove) it under 1 min without using graph theory (which we know its the main solution) otherwise no one have right to say I didnt solve it.

I see many people on the PE forum that shares solution which took more then 1 min or they are saying " my code without prove " . I am not the only one
 
Last edited:
  • #43
Another thing is If you guys wanted to help you could have just copied my code and show where I am doing wrong or how can I improve it but no. I only see how did I do wrong without particulary showing where did I do wrong. Which it wasnt helpful at all. You told me I did a wrong on prime function and I asked " how so" but all I got was " you didnt prove it you are doing wrong ". If that's helping, you are doing great job.
 
Last edited:
  • #44
Arman777 said:
I solved the problem. ... As I said if you say I didnt solve it cause it says under min 1. Then I am expecting you guys to solve (and prove) it under 1 min without using graph theory (which we know its the main solution) otherwise no one have right to say I didnt solve it.

Why do you say graph theory is the main solution? I wrote some pseudo-code a few days ago to solve this, it runs in O(n^2). I think that'd be quite hard to beat. So I wouldn't say that's the main solution. The way I wrote it, I consider it to be quite a mainstream solution. Anyway, your code should show that no other combination exists, is what I meant.
 
  • #45
verty said:
? I wrote some pseudo-code a few days ago to solve this, it runs in O(n^2).

Is it truly O(n^2) or is there a dominant O(n^2) piece but the code is formally O(n^5)? That describes my code, which takes 1 or 2 seconds to run, the vast majority of time spent ensuring there is not a better solution. I used STL's unordered_map for the O(n^2) piece. Also, unlike Arman777's non-solution, this uses a primality tester that actually works.

verty said:
Anyway, your code should show that no other combination exists, is what I meant.

Actually, other combinations do exist. 7, 1237, 2341, 12409 and 18433 is one, although it sums to 34427. My original code finds it before the true solution, 13, 5197, 5701, 6733 and 8389. (My modified code only checks pairs that are less than 26033). The reason Arman777 doesn't find it is he stops checking at 9000. That does not guarantee he's got the right solution. To my mind, a valid solution a) needs to find 26033, b) would find any smaller solution if it exists, and c) not find it by accident - i.e. use a fast-but-wrong primality tester that happens to give the right answer in this particular case.
 
  • #46
Vanadium 50 said:
Is it truly O(n^2) or is there a dominant O(n^2) piece but the code is formally O(n^5)?

I don't know the formalities that well to be honest. It's certainly dominated by an O(n^2) piece.
 
  • #47
  • #48
verty said:
I don't know the formalities that well to be honest. It's certainly dominated by an O(n^2) piece.

If your code is like mine, it has a O(n^2) piece followed by five nested loops, so is formally O(n^5), If n is large enough, eventually that part dominates.

sysprog said:
The problem of determining whether a given number is prime was proven to be in P (not NP-hard or NP-complete) in 2002.

That would be helpful in this case if the numbers were very very large, but as you can see, we're talking 4 or 5 digits. Plenty of algorithms exist that are plenty fast. The key to take away from the profile is not that the code spends 82% of its time checking primes, but that it does it 3,886,258 times.
 
  • #49
No, okay it's a little bit different. It runs until it has proven the result and has one hook for a check procedure. I didn't actually write the check procedure, I call it constant but it does a varying amount of work. I think I see now where O(n^5) comes from. But the check procedure wouldn't be slow at all. Anyhow, with modern computers it isn't such a concern because they have tons of memory and are super fast as well. The old mentality of saving a few cycles here and there is not so important anymore.
 
  • #50
I made a mistake in my pseudo-code. The way I wrote it, the check procedure would be duplicating work. It was thematic but it was wasteful. It wasn't the solution I was hoping for.
 
  • #51
As pointed out, the thing that takes the longest is the prime checking. My algorithm would do about 310,000 checks where Arman777's did 3.9M. So that alone would save 75% of the time.and bring it to under a minute. However, to verify that there's not a second solution requires about 900,000. Two steps forward, one step back.
 
  • #52
Vanadium 50 said:
which takes 1 or 2 seconds to run

Let me be clear - it's 1 or 2 seconds to find the solution, and a few minutes to verify that it is indeed the smallest.

Code:
#include <iostream>
#include <cstdio>
#include <forward_list>
#include <unordered_map>
#include <string>

using namespace std;

/* This is all taken from Rosetta Code's deterministic Miller-Rabin test */

// calcul a^n%mod
unsigned long long power(unsigned long a, unsigned long n, unsigned long mod)
{
    unsigned long long power = a;
    unsigned long long result = 1;
    while (n)
    {
        if (n & 1)
            result = (result * power) % mod;
        power = (power * power) % mod;
        n >>= 1;
    }
    return result;
}
// n−1 = 2^s * d with d odd by factoring powers of 2 from n−1
bool witness(unsigned long long n, unsigned long long s, unsigned long long d, unsigned long long a)
{
    unsigned long long x = power(a, d, n);
    unsigned long long y;
    while (s) {
        y = (x * x) % n;
        if (y == 1 && x != 1 && x != n-1)
            return false;
        x = y;
        --s;
    }
    if (y != 1)
        return false;
    return true;
}
/*
 * if n < 1,373,653, it is enough to test a = 2 and 3;
 * if n < 9,080,191, it is enough to test a = 31 and 73;
 * if n < 4,759,123,141, it is enough to test a = 2, 7, and 61;
 * if n < 1,122,004,669,633, it is enough to test a = 2, 13, 23, and 1662803;
 * if n < 2,152,302,898,747, it is enough to test a = 2, 3, 5, 7, and 11;
 * if n < 3,474,749,660,383, it is enough to test a = 2, 3, 5, 7, 11, and 13;
 * if n < 341,550,071,728,321, it is enough to test a = 2, 3, 5, 7, 11, 13, and 17.
 */
bool is_prime_mr(unsigned int n)
{
    if (((!(n & 1)) && n != 2 ) || (n < 2) || (n % 3 == 0 && n != 3))
        return false;
    if (n <= 3)
        return true;
    unsigned long d = n / 2;
    unsigned long s = 1;
    while (!(d & 1)) {
        d /= 2;
        ++s;
    }
    if (n < 1373653)
        return witness(n, s, d, 2) && witness(n, s, d, 3);
    if (n < 9080191)
        return witness(n, s, d, 31) && witness(n, s, d, 73);
    if (n < 4759123141)
        return witness(n, s, d, 2) && witness(n, s, d, 7) && witness(n, s, d, 61);
    if (n < 1122004669633)
        return witness(n, s, d, 2) && witness(n, s, d, 13) && witness(n, s, d, 23) && witness(n, s, d, 1662803);
    if (n < 2152302898747)
        return witness(n, s, d, 2) && witness(n, s, d, 3) && witness(n, s, d, 5) && witness(n, s, d, 7) && witness(n, s, d, 11);
    if (n < 3474749660383)
        return witness(n, s, d, 2) && witness(n, s, d, 3) && witness(n, s, d, 5) && witness(n, s, d, 7) && witness(n, s, d, 11) && witness(n, s, d, 13);
    return witness(n, s, d, 2) && witness(n, s, d, 3) && witness(n, s, d, 5) && witness(n, s, d, 7) && witness(n, s, d, 11) && witness(n, s, d, 13) && witness(n, s, d, 17);
}/* This was all taken from Rosetta Code's deterministic Miller-Rabin test */

bool is_prime(unsigned int n)
{
   if((n >  2) && !(n %  2)) return false;
   if((n >  3) && !(n %  3)) return false;
   if((n >  5) && !(n %  5)) return false;
   if((n >  7) && !(n %  7)) return false;
   if((n > 11) && !(n % 11)) return false;
   if((n > 13) && !(n % 13)) return false;
   if((n > 17) && !(n % 17)) return false;
   if((n > 19) && !(n % 19)) return false;
   if((n > 23) && !(n % 23)) return false;
   if((n > 29) && !(n % 29)) return false;
   if((n > 31) && !(n % 31)) return false;
   if((n > 37) && !(n % 37)) return false;
   if((n > 41) && !(n % 41)) return false;
   if((n > 43) && !(n % 43)) return false;
   if((n > 47) && !(n % 47)) return false;
   if((n > 53) && !(n % 53)) return false;
   if((n > 59) && !(n % 59)) return false;
   if((n > 61) && !(n % 61)) return false;
   if((n > 67) && !(n % 67)) return false;
   if((n > 71) && !(n % 71)) return false;
   if((n > 73) && !(n % 73)) return false;
   if((n > 79) && !(n % 79)) return false;
   return is_prime_mr(n);
}

bool is_valid_pair(unsigned int a, unsigned int b)
{
  if(a >= b) return false;
  if(a+b > 26021) return false;

  // Only prime numbers should be past this point, so no need to test a and b for primality
  if((a+b)%3 == 0) return false;
  if (is_prime(stol(to_string(a)+to_string(b)))==false) return false;
  if (is_prime(stol(to_string(b)+to_string(a)))==false) return false;
  return true;
}

main() {

double duration;
int nprimes;
forward_list<unsigned int> Primes;
unordered_map<unsigned int,bool> Pairs;

// Find the primes

for(unsigned int i=2;i<26000;i++) if(is_prime(i)) Primes.push_front(i);
Primes.reverse();// Find the prime pairs

for ( auto it = Primes.begin(); it != Primes.end(); ++it ) {
   for ( auto jt = next(it,1); jt != Primes.end(); ++jt ) {
      if(is_valid_pair(*it, *jt)) Pairs.emplace (stol(to_string(*it)+to_string(*jt)),true);
      else Pairs.emplace (stol(to_string(*it)+to_string(*jt)),false);
   }
}// Now build the tuples

unsigned int smallest = 26034;
for ( auto it = Primes.begin(); it != Primes.end(); ++it ) {
   for ( auto jt = next(it,1); jt != Primes.end(); ++jt ) {
      if(Pairs[stol(to_string(*it)+to_string(*jt))]) {
         for ( auto kt = next(jt,1); kt != Primes.end(); ++kt ) {
             if(Pairs[stol(to_string(*it)+to_string(*kt))] &&
                Pairs[stol(to_string(*jt)+to_string(*kt))]) {
                  for(auto lt = next(kt,1); lt != Primes.end(); ++lt ) {
                    if(Pairs[stol(to_string(*it)+to_string(*lt))] &&
                       Pairs[stol(to_string(*jt)+to_string(*lt))] &&
                       Pairs[stol(to_string(*kt)+to_string(*lt))]) {
                         for(auto mt = next(lt,1); mt != Primes.end() && *it+*jt+*kt+*lt+*mt < smallest; ++mt ) {
                           if(Pairs[stol(to_string(*it)+to_string(*mt))] &&
                              Pairs[stol(to_string(*jt)+to_string(*mt))] &&
                              Pairs[stol(to_string(*kt)+to_string(*mt))] &&
                              Pairs[stol(to_string(*lt)+to_string(*mt))]) {
                                if(*it+*jt+*kt+*lt+*mt < smallest) {
                                   smallest = *it+*jt+*kt+*lt+*mt;
                                   cout << *it << " " << *jt << " " << *kt <<" " << *lt << "  " << *mt << " = ";
                                   cout << smallest << endl;
                                 } // smallest thus far
                            } // 5-plet
                         } // m-loop
                    } // 4-plet
                  } // l-loop
                } // 3-plet
          } // k-loop
      } // 2-plet
   } // j-loop
} //i-loop

return 0;

}

What's good? The fact that not only is the primality test accurate, it's faster.
What's bad? Keying the unordered_map on the concatenation of the two numbers. It would have been wiser to have used the product. It's also not clear to me that I wouldn't do as well or better using an array and my own hash function.

Oh, and I abandoned the hypothesis that numbers in multiple tuples are sparse. They aren't.
 
  • Like
Likes Arman777
  • #53
How's this for an interesting approach? This is not how I was doing it but it is an interesting idea. I sieve out the primes that concatenate with at least 4 other primes. So for example, I sieve primes to 30000, then check how many each prime concatenates with. This is O(n^2). I take only those primes that have 4 or more partners. Of those primes, I find the smallest mutually concatenating set (quick), then I see if I sieved enough numbers to be sure a smaller sum is not possible. For example, if I sieved to sum - (3+7+11+13), I sieved enough numbers.
 
  • #54
verty said:
I find the smallest mutually concatenating set (quick)

It sounds interesting, but I'm not quite sure what you mean by this. Can you elaborate?
 
  • #55
Its good code. What did you use in algorithm ? Like Its faster due to the prime number checking ? And you said you are not checking 3.9 million primes but 310k how did you do that ?

In the PE forum I saw 5 line python code (using graph theory) and solved the problem in seconds but I am not sure it proves it or not. If it also proves that's amazing.
 
  • #56
Arman777 said:
And you said you are not checking 3.9 million primes but 310k how did you do that ?

I check all the potential pairs for primality first. Then if I need to know if a number is prime, I simply look it up. Your code checks the same number on average 12-1/2 times. If 11 was prime a second ago, it's probably still prime. :wink:

There are two ways to speed up code. You can do the same work faster, or you can do less work. Most people focus on the former, but big gains usually come from the latter.
 
  • Like
Likes Arman777
  • #57
Vanadium 50 said:
It sounds interesting, but I'm not quite sure what you mean by this. Can you elaborate?

I was thinking that one could reduce the number of primes to be tested by sieving the primes that had at least 4 partners. So they would be primes that were very popular, let's say, with regard to the concatenation. But it turns out all primes are popular. So that won't work.

Anyway, I've written up some PHP code and it runs in about 1 minute but it is buggy, it has some spurious behaviour. But the way I wrote it is sound. I didn't look at your code but I see you did something similar but not quite the same. I could give a rundown but I don't know if I should because it kind of defeats the object for anyone trying to solve this. They have one approach explained above but they should follow their own approach, whatever that is.
 
  • #58
Vanadium 50 said:
I check all the potential pairs for primality first. Then if I need to know if a number is prime, I simply look it up. Your code checks the same number on average 12-1/2 times. If 11 was prime a second ago, it's probably still prime. :wink:

There are two ways to speed up code. You can do the same work faster, or you can do less work. Most people focus on the former, but big gains usually come from the latter.
Thanks for the explanation. I understand it now.
 
  • #59
I tried to find another approach so I looked the concatenate rule for numbers. And I find this in Wolfram

Python:
import math
a = 12
b = 136
y= a*(10**(int(math.log10(b))+1))+b

We know that y must be prime and also a and b. I thought maybe we can find a mathematical trick or fast way to separate primes to find 5 set numbers, but I couldn't find something useful.
 
  • #60
verty said:
I was thinking that one could reduce the number of primes to be tested by sieving the primes that had at least 4 partners. So they would be primes that were very popular, let's say, with regard to the concatenation. But it turns out all primes are popular. So that won't work.

I was thinking the same thing. I think out of the whole set four primes are unpopular.

One thing I was pondering was the reverse. I already have found valid pairs. If there were a fast way to select the disjoint set of pairs, I could quickly make quadruplets.

Arman777 said:
I tried to find another approach so I looked the concatenate rule for numbers. And I find this in Wolfram

That's going to be very slow, since you are doing a logarithm and an exponentiation. But more importantly, it shows you have missed the entire point of profiling your code. If you are spending less than 6 seconds concatenating numbers, what's the maximum gain you can have by speeding up that piece. For instance...

Vanadium 50 said:
What's bad? Keying the unordered_map on the concatenation of the two numbers. It would have been wiser to have used the product.

Switching from the concatenation to the product is an order of magnitude improvement. The code provably finds the correct solution in 11 seconds.
 

Similar threads

Replies
55
Views
6K
  • · Replies 5 ·
Replies
5
Views
2K
  • · Replies 5 ·
Replies
5
Views
5K
Replies
4
Views
2K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 5 ·
Replies
5
Views
2K
  • · Replies 13 ·
Replies
13
Views
4K
  • · Replies 3 ·
Replies
3
Views
5K
  • · Replies 80 ·
3
Replies
80
Views
10K