Optimizing Prime Pair Combinations for Project Euler Problem 60

In summary: I found the answer by setting the prime limit to 10,000, but it took 41 minutes. I guess the code spends most of its time at the G2 level and then speeds up as the options decrease, but the G2 level is the starting point. I also think it is probably slow because the list is not sorted. In short, the code takes too long to solve the problem.
  • #71
Vanadium 50 said:
It's not. Your code would claim 13 is not prime because it's divisible by 13.

Can't believe I missed that.

Anyway, it seems like this is a working, faster version of that block of code.
C:
    /* Reject any n with a small prime factor: !(n % p) is 1 exactly when p
       divides n. Guarded by n > 79 so the listed primes themselves pass. */
    if( n > 79 &&
       ( ! ( n %  2 ) |
         ! ( n %  3 ) |
         ! ( n %  5 ) |
         ! ( n %  7 ) |
         ! ( n % 11 ) |
         ! ( n % 13 ) |
         ! ( n % 17 ) |
         ! ( n % 19 ) |
         ! ( n % 23 ) |
         ! ( n % 29 ) |
         ! ( n % 31 ) |
         ! ( n % 37 ) |
         ! ( n % 41 ) |
         ! ( n % 43 ) |
         ! ( n % 47 ) |
         ! ( n % 53 ) |
         ! ( n % 59 ) |
         ! ( n % 61 ) |
         ! ( n % 67 ) |
         ! ( n % 71 ) |
         ! ( n % 73 ) |
         ! ( n % 79 ) ) )
    {
       return false;
    }
 
  • #72
Vanadium 50 said:
It probably would, but this is a case of "speed up the work" as opposed to "do less work". The problem says that 3, 7, 109 and 673 form a 4-plet. 673 is the 122nd prime, so you might expect 1 in 30 or so numbers to participate in a 4-plet. That suggests the strategy of brute-forcing the 4-plets, which should take tens of milliseconds, and then testing for 5-plets based only on numbers in the 4-plets.

True, but I was responding to this previous post.

Vanadium 50 said:
Can 11 seconds be improved upon? It's pretty good for what is essentially brute force.
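In sketch form, the 4-plet-first strategy quoted above might look like this (illustrative Python, not anyone's actual code from the thread; is_pair is an assumed helper that tests both concatenations for primality):

Python:
def find_min_quintuplet(primes, is_pair):
    # partner[p]: set of larger primes forming a prime pair with p, so the
    # nested loops below enumerate a < b < c < d automatically.
    partner = {p: {q for q in primes if q > p and is_pair(p, q)} for p in primes}
    quads = []
    for a in primes:
        for b in partner[a]:
            for c in partner[a] & partner[b]:
                for d in partner[a] & partner[b] & partner[c]:
                    quads.append((a, b, c, d))
    # Only numbers already in some 4-plet can occur in a 5-plet.
    members = {p for quad in quads for p in quad}
    best = None
    for a, b, c, d in quads:
        for e in members & partner[a] & partner[b] & partner[c] & partner[d]:
            total = a + b + c + d + e
            best = total if best is None else min(best, total)
    return best

Since every 5-plet contains the 4-plet of its four smallest members, restricting the fifth element to be larger than the other four loses nothing.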
 
  • #73
Jarvis323 said:
Anyway, it seems like this is a working, faster version of that block of code.

I don't think so. 13 is not greater than 79, so that evaluates to false, so the whole expression evaluates to false, so therefore 13 is not prime. :wink:

More to the point, though, the code is not spending its time here. Only 10% of the time is spent on primality tests and the bulk of this is in the Miller-Rabin phase, not the small number checking phase. If you spend 0.1% of your time testing small primes, and you make it infinitely faster, how much do you gain?
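(This is just Amdahl's law: if a fraction ##p## of the runtime is in the part you speed up by a factor ##s##, the overall speedup is ##1/((1-p)+p/s)##; with ##p = 0.001##, even ##s \to \infty## gains only a factor of about 1.001.)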
 
  • #74
What's the fastest deterministic prime tester in Python?

My code is something like this:
Python:
def prime(n):
    if n < 2:
        return False
    elif n == 2:
        return True
    else:
        if n % 2 == 0:
            return False
        for i in range(3,int(n**0.5)+1,2):
            if n%i == 0:
                return False
        return True
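Trial division like this costs ##O(\sqrt n)## per call. For this problem's range, Miller-Rabin with a fixed witness set is also deterministic: with the twelve bases below, the test is known to be exact for all n below roughly ##3 \times 10^{24}##. A minimal sketch (not tuned code):

Python:
# Deterministic Miller-Rabin: exact for all n < ~3e24 with these fixed bases,
# far beyond anything this problem needs.
_WITNESSES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)

def is_prime(n):
    if n < 2:
        return False
    for p in _WITNESSES:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:     # write n - 1 as d * 2**s with d odd
        d //= 2
        s += 1
    for a in _WITNESSES:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True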
 
  • #75
With the non-deterministic prime test: Time = 230 sec (with 5 random witnesses)

Python:
import time

start = time.perf_counter()

import random

_mrpt_num_trials = 5

def prime(n):
    if n == 0 or n == 1 or n == 5:  # 5 (and 2, below) are rejected on purpose:
        return False                # they cannot belong to a concatenation set
    if n % 2 == 0:
        return False
    s = 0
    d = n-1
    while True:
        quotient, remainder = divmod(d, 2)
        if remainder == 1:
            break
        s += 1
        d = quotient
    assert(2**s * d == n-1)
    def try_composite(a):
        if pow(a, d, n) == 1:
            return False
        for i in range(s):
            if pow(a, 2**i * d, n) == n-1:
                return False
        return True # n is definitely composite
    for i in range(_mrpt_num_trials):
        a = random.randrange(2, n)
        if try_composite(a):
            return False
    return True

# Since 10 == 1 (mod 3), a concatenation of p and q is == p + q (mod 3), so
# mixing residues 1 and 2 gives a multiple of 3; all set members must share
# the same residue class mod 3.
P1 = [i for i in range(26040) if prime(i) and i % 3 == 1]
P2 = [i for i in range(26040) if prime(i) and i % 3 == 2]

def con_test(j, n):  # j is the number, n is the element in the list as tuple
    if j == 0:
        w = int(str(n[0]) + str(n[1]))
        w1 = int(str(n[1]) + str(n[0]))
        if prime(w) == False or prime(w1) == False:
            return False
        return True
    else:
        for i in n:
            w = int(str(j) + str(i))
            w1 = int(str(i) + str(j))
            if prime(w) == False or prime(w1) == False:
                return False
        return True

def st(n):            #to get rid of the repeated elements
    H = [tuple(sorted(i)) for i in n]
    G = set(H)
    return G

for i in P1:
    if i < 26040//5:
        G1 = [ j for j in P1 if con_test(0,(i,j)) == True]
        G2 = [(j,k) for j in G1 for k in G1 if con_test(0,(k,j)) == True ]
        if G2 != []:
            G2r = st(G2)
            G3 = [(p,*q) for q in G2r for p in G1 if con_test(p,q) == True ]
            if G3 != []:
                G3r = st(G3)
                G4 = [(m,*q) for q in G3r for m in G1 if con_test(m,q) == True]
                if G4 != []:
                    G4r = st(G4)
                    for j in G4r:
                        f = sum(j)
                        print(G4r, "with", i, "and sum is", f+i)

for i in P2:
    if i < 26040//5:
        G1 = [ j for j in P2 if con_test(0,(i,j)) == True]
        G2 = [(j,k) for j in G1 for k in G1 if con_test(0,(k,j)) == True ]
        if G2 != []:
            G2r = st(G2)
            G3 = [(p,*q) for q in G2r for p in G1 if con_test(p,q) == True ]
            if G3 != []:
                G3r = st(G3)
                G4 = [(m,*q) for q in G3r for m in G1 if con_test(m,q) == True]
                if G4 != []:
                    G4r = st(G4)
                    for j in G4r:
                        f = sum(j)
                        print(G4r, "with", i, "and sum is", f+i)

end = time.perf_counter()
print("passsed time:", end-start)

With the normal prime tester: Time = 40 sec (but the range is only 9000; if I increased the range to 26040, it would probably take 10 minutes or so)
Python:
import time

start = time.perf_counter()

def prime(n):
    if n == 0 or n == 1 or n == 2 or n == 5:  # 2 and 5 rejected on purpose:
        return False                          # they cannot belong to a concatenation set
    if n % 2 == 0:
        return False
    for i in range(3,int(n**0.5)+1,2):
        if n%i == 0:
            return False
    return True

P1 = [i for i in range(9000) if prime(i) and i % 3 == 1]   # same residue-class
P2 = [i for i in range(9000) if prime(i) and i % 3 == 2]   # split as above

def con_test(j, n):  # j is the number, n is the element in the list as tuple
    if j == 0:
        w = int(str(n[0]) + str(n[1]))
        w1 = int(str(n[1]) + str(n[0]))
        if prime(w) == False or prime(w1) == False:
            return False
        return True
    else:
        for i in n:
            w = int(str(j) + str(i))
            w1 = int(str(i) + str(j))
            if prime(w) == False or prime(w1) == False:
                return False
        return True

def st(n):            #to get rid of the repeated elements
    H = [tuple(sorted(i)) for i in n]
    G = set(H)
    return G

for i in P1:
    if i < 9000//5:
        G1 = [ j for j in P1 if con_test(0,(i,j)) == True]
        G2 = [(j,k) for j in G1 for k in G1 if con_test(0,(k,j)) == True ]
        if G2 != []:
            G2r = st(G2)
            G3 = [(p,*q) for q in G2r for p in G1 if con_test(p,q) == True ]
            if G3 != []:
                G3r = st(G3)
                G4 = [(m,*q) for q in G3r for m in G1 if con_test(m,q) == True]
                if G4 != []:
                    G4r = st(G4)
                    for j in G4r:
                        f = sum(j)
                        print(G4r, "with", i, "and sum is", f+i)

for i in P2:
    if i < 9000//5:
        G1 = [ j for j in P2 if con_test(0,(i,j)) == True]
        G2 = [(j,k) for j in G1 for k in G1 if con_test(0,(k,j)) == True ]
        if G2 != []:
            G2r = st(G2)
            G3 = [(p,*q) for q in G2r for p in G1 if con_test(p,q) == True ]
            if G3 != []:
                G3r = st(G3)
                G4 = [(m,*q) for q in G3r for m in G1 if con_test(m,q) == True]
                if G4 != []:
                    G4r = st(G4)
                    for j in G4r:
                        f = sum(j)
                        print(G4r, "with", i, "and sum is", f+i)

end = time.perf_counter()
print("passsed time:", end-start)

With the non-deterministic prime tester: Time = 22 sec (range is 9000, with 5 random witnesses)

Conclusion: without the proof I can solve the problem in half a minute, but I would need a faster prime checker to find the solution quickly over the 26000 range. However, I can find the solution and verify it in 3 minutes with the non-deterministic prime tester.

I don't understand why it's "bad" to use a non-deterministic prime tester? I think it's useful and right for this situation.

Vanadium 50 said:
Optimization 2: Checking to see if a number is definitely prime is slow. Checking to see if a number is probably prime, e.g. by checking small divisors or any of a number of more sophisticated tests will be faster, and I don't really care if a percent or two of composites gets mixed in.

So you are right about it.

Here is the full list that I found:
Code:
{(1237, 2341, 12409, 18433)} with 7 and sum is 34427
{(5197, 5701, 6733, 8389)} with 13 and sum is 26033
{(7, 2341, 12409, 18433)} with 1237 and sum is 34427
{(7, 1237, 12409, 18433)} with 2341 and sum is 34427
{(13, 5701, 6733, 8389)} with 5197 and sum is 26033
{(941, 2099, 19793, 25253)} with 467 and sum is 48553
{(467, 2099, 19793, 25253)} with 941 and sum is 48553
{(467, 941, 19793, 25253)} with 2099 and sum is 48553

 
  • #76
Arman777 said:
With the non-deterministic prime test: Time = 230 sec (with 5 random witnesses)
With the non-deterministic prime tester: Time = 22 sec (range is 9000, with 5 random witnesses)

Conclusion: without the proof I can solve the problem in half a minute, but I would need a faster prime checker to find the solution quickly over the 26000 range. However, I can find the solution and verify it in 3 minutes with the non-deterministic prime tester.

I think Vanadium means the prime checker isn't the main thing causing it to be slow. In my code, the Miller-Rabin checker is about 3 times quicker, which isn't much overall. Other areas of improvement are much more important. For example, I know how to solve it without caching. Then that unordered_map stuff goes away and it is much quicker. I will say more about this tomorrow.

I don't understand why its "bad" to use non- deterministic prime tester ? I think its useful and right for the situation

Well, I trust the deterministic one because it is very simple. For example, the one Vanadium used is very complicated and hard to understand. So if I'm using PHP, where it's hard to debug stuff, it's easier with trusted code. I was still getting spurious results though. I'm now using C#, where I can debug and profile much more easily. And it isn't as maniacal as C++ is.
 
  • #77
verty said:
I think Vanadium means the prime checker isn't the main thing causing it to be slow.

It's half true and half not. For a smaller range it doesn't matter, but for large numbers it starts to affect the time (we can see that in the timing results). I also think the language affects the speed, so Python can be slow at this kind of computation compared to C++. The time also depends on how efficiently you use the language.

verty said:
Then that unordered_map stuff goes away and it is much quicker. I will say more about this tomorrow.

I actually didn't understand the unordered_map thing. I tried to read the code, but I don't know C++.

verty said:
Well, I trust the deterministic one because it is very simple. For example, the one Vanadium used is very complicated and hard to understand.
Yes, it's simple. It just still takes a lot of time when you want a proof (unless we use some tricks). I think it's fine to use shortcuts.

Edit: I checked the non-deterministic prime test and it gives me accurate results up to 10**7 (100% accurate).
I also did some more tests, but interestingly, I get the results faster with the normal prime checker.
In general my code is much better than the non-det test, but in this case (this problem) it's not, and I don't know why.

For example, I generated 300000 random numbers between 10**9 and 10**14; my own code gave me the result in 0.9 sec and the answer was 25997, while the non-det code gave the answer as 25597 in 2 sec.
 
Last edited:
  • #78
Arman777 said:
I don't understand why it's "bad" to use a non-deterministic prime tester

Because it gets the wrong answer - it is telling you that some composite numbers are prime, and possibly that some prime numbers are composite. Yes, it's fast, and yes, it's simple - but so is "if odd, call it prime; if even, call it composite", and we don't use that, do we?

Arman777 said:
I actually didn't understand the unordered_map thing.

http://www.google.com/search?q=unordered_map
 
  • #79
Vanadium 50 said:
Because it gets the wrong answer
In general it's not wrong. It's as accurate as the normal prime checker but slower?? That's kind of odd. And in this problem it's much worse.

Check these 2 codes and see where I am going wrong, because it's amazing.

Python:
import time
start = time.perf_counter()
import random
_mrpt_num_trials = 5

def prime(n):
    if n == 0 or n == 1:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    s = 0
    d = n-1
    while True:
        quotient, remainder = divmod(d, 2)
        if remainder == 1:
            break
        s += 1
        d = quotient
    assert(2**s * d == n-1)
    def try_composite(a):
        if pow(a, d, n) == 1:
            return False
        for i in range(s):
            if pow(a, 2**i * d, n) == n-1:
                return False
        return True # n is definitely composite
    for i in range(_mrpt_num_trials):
        a = random.randrange(2, n)
        if try_composite(a):
            return False
    return True

g =[]
for i in range(10**6):
    k = random.randint(10**9, 10**14)
    g.append(k)

# NB: this tests the indices 0..len(g)-1, not the sampled numbers g[i];
# prime(g[i]) was presumably intended.
L = [i for i in range(len(g)) if prime(i)]
print(len(L))
end = time.perf_counter()
print(end-start)
and
Python:
import time
import random
start = time.perf_counter()

def prime(n):
    if n < 2:
        return False
    elif n == 2:
        return True
    else:
        if n % 2 == 0:
            return False
        for i in range(3,int(n**0.5)+1,2):
            if n%i == 0:
                return False
        return True
g =[]
for i in range(10**6):
    k = random.randint(10**9, 10**14)
    g.append(k)

# NB: same index bug as above - this tests i, not g[i].
L = [i for i in range(len(g)) if prime(i)]
print(len(L))
end = time.perf_counter()
print(end-start)
 
  • #80
Arman777 said:
Check these 2 codes and see where I am going wrong, because it's amazing.

You are using 5 trials but I think it is 3x quicker with 1 trial. So 5 trials is slower.
 
  • #81
verty said:
You are using 5 trials but I think it is 3x quicker with 1 trial. So 5 trials is slower.
Yes, that's what I thought. So I am doing nothing wrong, and as you can see it's accurate, right?

For example, I ran another test checking the primes up to 10**7: non-det gives me the result of 664579 in 85 sec, but the normal one gives the same answer in 64 sec.

Sorry, nothing is wrong... but in any case we can see that non-det is as accurate as the normal prime checker, yet not faster.
 
  • #82
Arman777 said:
In general it's not wrong.

That's the same as "Sometimes it's right."
 
  • #83
Vanadium 50 said:
That's the same as "Sometimes it's right."
If "sometimes" extends up to 10**12, then it's safe to use in this problem.
I made a couple of runs and it's pretty safe to use up to 10**7. I am sure it can go up to 10**14, maybe even more; see post #79 for an example.

In any case, I think the normal one is better than the non-det one. It's just faster in my code, or it seems that way; I don't know, I'll look into it in more detail tomorrow. Also the unordered_map thing.
 
  • #84
My new code runs in 20 s in C# and 45 s in PHP. The main difference is that I avoid the need for caching. To put it simply, a lot can be learned from the pairs collection. In particular, if (i, j) is a pair and (j, k) is a pair, we need to know if (i, k) is a pair.
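A minimal sketch of that idea (assuming partner maps each prime to the set of larger primes it forms a prime pair with; every concatenation test then happens exactly once, when partner is built):

Python:
def triples(partner):
    # Extend each pair (i, j) by a k paired with j, keeping the triple only
    # when (i, k) is also a pair - no concatenation retests needed.
    for i, larger in partner.items():
        for j in larger:
            for k in partner[j]:
                if k in partner[i]:
                    yield (i, j, k)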
 
  • #85
Vanadium 50 said:
http://www.google.com/search?q=unordered_map

Is it like a dictionary in Python?

verty said:
In particular, if (i, j) is a pair and (j, k) is a pair, we need to know if (i, k) is a pair.

Yes, you are right. That's what I did in my last code: if (i,j) is a pair and (i,k) is a pair, then if (j,k) is also a pair, (i,j,k) must be a mutually-paired triple.
 
  • #86
My C# executable takes 18 seconds, and when I compile it with ngen, it takes 14. That's pretty close to the 11 of the C++. Some of the things I do are: sieving primes to get a list of them, storing the index of the first pair corresponding to a prime and the number of pairs containing that prime, using a binary search within that range to find a pair, and caching the random number that Miller-Rabin uses (I only run one trial anyway).

This is all small potatoes time-wise, because the big time hog is generating the pairs (96% of the total), with Miller-Rabin taking 60% of the total. I think I can perhaps save another 2 seconds, but that will be the limit of what is possible, I reckon.
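"Sieving primes to get a list of them" can look like this in Python (a generic Sieve of Eratosthenes for illustration, not verty's C#):

Python:
def primes_below(limit):
    # Sieve of Eratosthenes: O(n log log n) overall, far cheaper than
    # trial-dividing each candidate when building a long list of primes.
    sieve = bytearray([1]) * limit
    sieve[:2] = b"\x00\x00"                  # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [i for i in range(limit) if sieve[i]]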
 
  • #87
Jarvis323 said:
You might be able to use tables instead of hash maps.

While in general I agree, if you index on the number you're testing - very fast - you will need a huge array: 676 MB if you do it naively. (The unordered_map is probably 8-9 MB in comparison.) When you think about physical computers, as opposed to language abstractions, you realize that most or all of the unordered_map is in cache, but very little of an array would be.

One thing to try would be to take advantage of the sparseness of the array - for example, we don't need to test even numbers, we don't need to test pairs whose sum is congruent to 0 mod 3, and we can pack 8 bools into a byte. That gets us to just a hair over 10 MB.
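A sketch of the packed-table idea (Python for illustration; it assumes primes below 26000 and stores only odd numbers, about 21 MB before the mod-3 filtering roughly halves it again):

Python:
N = 26000
HALF = N // 2                  # one slot per odd number below N

# 13000 x 13000 bits, packed 8 to a byte: about 21 MB.
table = bytearray(HALF * HALF // 8 + 1)

def _bit(p, q):                # p, q odd; call consistently with p < q
    i = (p // 2) * HALF + (q // 2)
    return i >> 3, 1 << (i & 7)

def set_pair(p, q):
    byte, mask = _bit(p, q)
    table[byte] |= mask

def is_pair_cached(p, q):
    byte, mask = _bit(p, q)
    return bool(table[byte] & mask)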

Another idea I have been toying with is to take advantage of the fact the keys aren't random - they are prime numbers, and approximate functions exist for the n-th prime. One can use this fact to speed up navigation over, say, binary search trees.
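(A standard such approximation is ##p_n \approx n(\ln n + \ln\ln n - 1)##, already within about a percent for n in the thousands.)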

Finally, one could take the profiling literally and see what can be done about the hash function. It's MurmurHash2, which is quite fast, but we're probably not doing it any favors by presenting it with only odd numbers. A faster but slightly worse hash could be an improvement.
 
  • #88
My code now runs in 5s. I made two modifications from what I wrote before.

1. When generating prime pairs, if two large primes are used, the sum may already be too high, so those pairs can be skipped.
2. Concatenating the numbers was slow. I now use the following procedure:

Code:
private static long custom_concat(int p1, int p2)
{
    if (p2 < 10) return p2 + (long)p1 * 10;
    else if (p2 < 100) return p2 + (long)p1 * 100;
    else if (p2 < 1000) return p2 + (long)p1 * 1000;
    else if (p2 < 10000) return p2 + (long)p1 * 10000;
    else if (p2 < 100000) return p2 + (long)p1 * 100000;
    else throw new AssertionException("custom_concat, too large");
}
 
Last edited:
  • #89
verty said:
My code now runs in 5s. I made two modifications from what I wrote before.

1. When generating prime pairs, if two large primes are used, the sum may already be too high, so those pairs can be skipped.
2. Concatenating the numbers was slow. I now use the following procedure:

Code:
private static long custom_concat(int p1, int p2)
{
    if (p2 < 10) return p2 + (long)p1 * 10;
    else if (p2 < 100) return p2 + (long)p1 * 100;
    else if (p2 < 1000) return p2 + (long)p1 * 1000;
    else if (p2 < 10000) return p2 + (long)p1 * 10000;
    else if (p2 < 100000) return p2 + (long)p1 * 100000;
    else throw new AssertionException("custom_concat, too large");
}

Would something like this be faster?

C:
long cat( long a, long b )
{
   long s = log2( b ) + 1;                /* bit length of b */
   long t = floor( log10( ( 2 << s ) ) ); /* decimal digit estimate; off by one
                                             for some b, e.g. b = 100 or 1000 */
   return a*pow( 10, t ) + b;
}
 
  • #90
Jarvis323 said:
Would something like this be faster?

C:
long cat( long a, long b )
{
   long s = log2( b ) + 1;                /* bit length of b */
   long t = floor( log10( ( 2 << s ) ) ); /* decimal digit estimate; off by one
                                             for some b, e.g. b = 100 or 1000 */
   return a*pow( 10, t ) + b;
}

Those math functions call out to a DLL, which is expensive for some reason. I was going to see how quick it was, but it won't compare for that reason. Also, there is no ##log_2## function.
 
  • #91
Here's my solution. I also used the Miller-Rabin prime tester from Rosetta Code (which still turns out to be the bottleneck). The times I am getting are:

getting primes took: 1 ms
getting primes pairs took: 286 ms
computing all solution sums took: 5 ms
26033

"computing all solution sums" could potentially be optimized a bit more, i.e. in testing that a newly added prime is also in a pair with each other prime in the current solution. I tried binary search, and unorderd_set (also uses more memory), but it seams the brute force is faster for this part anyway.

C:
// Excerpt: assumes <vector>, <cmath>, <chrono>, <iostream>, <limits>,
// the std and std::chrono namespaces, and an alias such as
// using IntType = int64_t;
struct Pair
{
    IntType a;
    IntType b;

    Pair( IntType _a, IntType _b ) : a( _a ), b( _b ) {}
};

inline void fillPrimes( vector< IntType > & primes, const IntType n )
{
   for( IntType i = 3, j = 0; j < n; ++i )
   {
        if( prime( i ) )
        {
           primes[ j++ ] = i;
        }
   }
}

inline IntType cat(
   const IntType a,
   const IntType b )
{
    return a*( std::pow( static_cast< IntType >( 10 ),
        ( IntType( std::log10( b ) ) + 1 ) ) ) + b;
}

inline bool catTest( IntType a, IntType b )
{
   return prime( cat( a, b ) )
        && prime( cat( b, a ) );
}

inline bool binarySearch(
   const IntType v,
   const vector< Pair > & pairs,
   IntType start,
   IntType last )
{
    while( start <= last )
    {
        int64_t m = ( start + last ) / 2;
        if( pairs[ m ].b < v )
        {
            start = m + 1;
        }
        else if( pairs[ m ].b > v )
        {
            last = m - 1;
        }
        else
        {
            return true;
        }
    }
    return false;
}

pair< bool, IntType > solutionFrom(
   vector< IntType > & roots,
    IntType & minSum,
   const IntType & sum,
   const IntType & NPP,
   const vector< Pair >    & pairs,
   const vector< IntType > & starts,
   const vector< IntType > & counts,
   const vector< IntType > & primePosition,
   const vector< IntType > & primes )
{
   const IntType root  = roots.back();
   const IntType level = roots.size();
   const IntType NPL   = counts[ root ] - ( NPP - level );

    for( IntType i = 0; i <= NPL; ++i )
    {
       IntType subPrime = pairs[ starts[ root ] + i ].b;
       IntType tempSum = sum + subPrime;
 
       if( tempSum + subPrime >= minSum )
       {
           return { false, 0 };
       }

        bool intersectsAll = true;
        for( IntType k = level - 2; k >= 0; --k )
        {

            bool intersectsSubTree = false;    
           for( IntType j = 0; j < counts[ roots[ k ] ]
               && subPrime >= pairs[ starts[ roots[ k ] ] + j ].b; ++j )
           {
               if( pairs[ starts[ roots[ k ] ] + j ].b == subPrime )
               {
                   intersectsSubTree = true;
                   break;
               }
           }
           if( ! intersectsSubTree )
           {
               intersectsAll = false;
               break;
           }

           // if( binarySearch(
           //    subPrime,
           //    pairs,
           //    starts[ roots[ k ] ],
           //    starts[ roots[ k ] ] + counts[ roots[ k ] ] - 1 ) )
           // {
           //    continue;
           // }

           // intersectsAll = false;
           // break;
        }
        if( ! intersectsAll )
        {
            continue;
        }

        if( level == NPP - 1 )
        {
            return { true, tempSum };
        }
        else
        {
           roots.push_back( primePosition[ subPrime / 2 ] );
           auto s = solutionFrom(
               roots,
               minSum,
               tempSum,
               NPP,
               pairs,
               starts,
               counts,
               primePosition,
               primes );

           if( s.first )
           {
               minSum = s.second;
           }

           roots.pop_back();
       }
    }

    return { false, 0 };
}

int main()
{
    const IntType MAX_N_PRIMES = 1200;
    const IntType NPP = 5;

    vector< IntType > primes( MAX_N_PRIMES );

    auto t1 = high_resolution_clock::now();
    fillPrimes( primes, MAX_N_PRIMES );
    cout << "getting primes took: " <<
            duration_cast< milliseconds >( high_resolution_clock::now() - t1 ).count()
            << " ms" << endl;

    vector< Pair > pairs;
    pairs.reserve( MAX_N_PRIMES );
    vector< IntType > primePairCounts(  MAX_N_PRIMES, 0 );
    vector< IntType > primePairStarts(  MAX_N_PRIMES, 0 );
    vector< IntType > validPairs;
    vector< IntType > primePosition( primes.back() / 2, 0 );

    t1 = high_resolution_clock::now();
    for( IntType i = 0; i < MAX_N_PRIMES; ++i )
    {
       primePosition[ primes[ i ] / 2 ] = i;
       primePairStarts[ i ] = pairs.size();
       for( IntType j = i+1; j < MAX_N_PRIMES; ++j )
       {
            if( catTest( primes[ i ], primes[ j ] ) )
            {
                pairs.push_back( { primes[ i ], primes[ j ] } );
                ++primePairCounts[ i ];
            }
       }
       if( primePairCounts[ i ] >= NPP )
        {
            validPairs.push_back( i );
        }
    }
    cout << "getting primes pairs took: " <<
            duration_cast< milliseconds >( high_resolution_clock::now() - t1 ).count()
            << " ms" << endl;

    t1 = high_resolution_clock::now();
    IntType minSolution = numeric_limits<IntType>::max();
    vector< IntType > roots( NPP );
    roots.shrink_to_fit();

    for( auto root : validPairs )
    {
       IntType sum  = primes[ root ];

       roots.clear();
       roots.push_back( root );

       pair< bool, IntType > p
         = solutionFrom(
          roots,
          minSolution,
          sum,
          NPP,
          pairs,
          primePairStarts,
          primePairCounts,
          primePosition,
          primes );

       if( p.first )
       {
           minSolution = min( minSolution, p.second );
       }    
       else
       {
           continue;
       }
   }    
   cout << "computing all solution sums took: " <<
        duration_cast< milliseconds >( high_resolution_clock::now() - t1 ).count()
        << " ms" << endl;

   cout << minSolution << endl;
}

Seems like a pretty efficient solution. Most of the time is spent testing primes to construct the list of prime pairs.

1) List the prime pairs ##P## in sorted order with no duplicates.
2) Given the ##i^{th}## prime ##x_i##, its pairs occupy a contiguous block of the list, from index ##start(x_i)## up to the last pair ##(x_i, x_j)## containing ##x_i##.
3) Pick the partner in the first such pair as ##x_i'##, the first candidate to include in a solution using ##x_i##, and recurse with ##x_i = x_i'##.
4) Repeat step 3 until you have a solution (while making sure that each prime picked is in a pair with each previous pick); record it as the new minimum if it is smaller, then go back one level and continue where you left off.
 
Last edited:
  • #92
@Jarvis323, very well done. Having vectors for everything seems to be the way to go! I did a similar thing, though mine was more like a database: I had a prime record with everything in it. It was pretty thematic.

Ok, I think we've killed this problem now. @Arman777, would you like to pick another?
 
  • #93
verty said:
Ok, I think we've killed this problem now. @Arman777, would you like to pick another?
Lol, I agree, we talked a lot about it. I am trying to solve the next problem, #61. If I can't solve it, I'll share it here.
 
  • #94
verty said:
Ok, I think we've killed this problem now

Actually, I think the real value is in discussing the pros and cons of the various approaches, not whether someone can drag their code across the finish line.

Interestingly, I modified the code to find pairs, loop over the pairs, and look for numbers that form triplets. Once there are triplets, make sure there are at least 3, and then build up the quintuplet from just that restricted list of numbers. Turns out that didn't help - it takes slightly longer.

What does help is a fast "probable prime" algorithm (100% of the primes are identified as probable primes and a tiny fraction of composites sneak in), and only when there is a quintuplet do the full prime test. It trades time in the primality testing for a slightly longer set of loops. That takes me to 8 seconds, 3 of which are spent caching prime pairs, and the remainder in unordered_map. So it's clear to me that to do better I need to drop the unordered_map.
 
  • #95
Vanadium 50 said:
Actually, I think the real value is in discussing the pros and cons of the various approaches, not whether someone can drag their code across the finish line.

Interestingly, I modified the code to find pairs, loop over the pairs, and look for numbers that form triplets. Once there are triplets, make sure there are at least 3, and then build up the quintuplet from just that restricted list of numbers. Turns out that didn't help - it takes slightly longer.

What does help is a fast "probable prime" algorithm (100% of the primes are identified as probable primes and a tiny fraction of composites sneak in), and only when there is a quintuplet do the full prime test. It trades time in the primality testing for a slightly longer set of loops. That takes me to 8 seconds, 3 of which are spent caching prime pairs, and the remainder in unordered_map. So it's clear to me that to do better I need to drop the unordered_map.

Below is the essence of my solution, written as if it were iterative. I think time is saved by knowing when to quit, and by going over pairs directly (in order) so smaller solutions are found sooner. The pairs look like this (assuming only b and c are not paired):

Code:
primes = [ a, b, c, d ]
pairs =
(a,b)
(a,c)
(a,d)
(b,d)
(c,d)

starts = [  0,  3,  4,  5]
counts = [ 3, 1, 1, 0]

Then

Code:
for( each p_j paired with p_i )

becomes

Code:
for( int j = starts[ i ]; j < starts[ i ] + counts[ i ]; ++j )
    p_j = pairs[ j ].second

Now in the next loop you want to skip ahead to the primes paired with p_j, so somehow you need to know the index of p_j in primes. You could use a hash map or something,

Code:
for( int j = starts[ i ]; j < starts[ i ] + counts[ i ]; ++j )
    p_j = pairs[ j ].second
    for( int k = starts[ index_map( p_j ) ]; k < starts[ index_map( p_j )]  + counts[ index_map(p_j ) ]; ++k )
        ...

Or better yet, only keep track of the indices in the first place,

Code:
for i = 0 to N-1
    for j = i+1 to N-1
        if( is_pair( primes[ i ], primes[ j ] ) )
            pairs.push( i, j )
e.g.
pairs =
(0,1)
(0,2)
(0,3)
(1,3)
(2,3)

for( int i = 0; i < N; ++i )
    p1 = primes[ i ]
    for( int j = starts[ i ]; j < starts[ i ] + counts[ i ]; ++j )
        p2 = primes[ pairs[ j ].second ]
        for( int k = starts[ pairs[ j ].second ]; k < starts[ pairs[ j ].second ] + counts[ pairs[ j ].second ]; ++k )
            p3 = primes[ pairs[ k ].second ]
            for( int m = starts[ pairs[ k ].second ]; m < starts[ pairs[ k ].second ] + counts[ pairs[ k ].second ]; ++m )
                p4 = primes[ pairs[ m ].second ]
                ...

Pseudo code for an iterative solution with fixed dimensions and quitting early then looks like this.

Code:
for( p1 in primes ):
    for( p2 in pair with p1 ):
        if( sum( p2, p1 ) >= minSolution ): break
        for( p3 in pair with p2 ):
            if( sum( p3, p2, p1 ) >= minSolution ): break
            if( p3 in pair with p1 ):
                for( p4 in pair with p3 ):
                    if( sum( p4, p3, p2, p1 ) >= minSolution ): break
                    if( p4 in pair with p2 and p4 in pair with p1 ):
                        for( p5 in pair with p4 ):
                            if( sum( p5, p4, p3, p2, p1 ) >= minSolution ): break
                            if( p5 in pair with p3 and p5 in pair with p2 and p5 in pair with p1 ):
                                minSolution = sum( p5, p4, p3, p2, p1 )

Checking whether ##p_i## is paired with ##p_{i-2}, p_{i-3}, ... ,## can be done using brute force (no additional memory), binary search (no additional memory), hash maps (additional memory), or direct tables (a ton of additional memory), depending on how much memory you have. I was surprised that brute force was actually faster than binary search and as fast as hash tables; with the pairs sorted by value, you can stop searching as soon as you see a prime larger than the one you're searching for.

C:
inline bool binarySearch(
   const IntType v,
   const vector< Pair > & pairs,
   const vector< IntType > & primes,
   IntType start,
   IntType last  )
{
    while( start <= last )
    {
        IntType m = ( start + last ) / 2;
        if( primes[ pairs[ m ].b ] < v )
        {
            start = m + 1;
        }
        else if( primes[ pairs[ m ].b ] > v )
        {
            last = m - 1;
        }
        else
        {
            return true;
        }
    }
    return false;
}

if( binarySearch(  p3, pairs, primes, starts[ i ], starts[ i ] + counts[ i ] -1 ))
    // then p3 is paired with p1=primes[i]

An additional optimization is that you can quit early at any level if there are not enough subsequent pairs. If you have p1 chosen, you need 4 more, so no need to consider any of the last 3 primes paired with p1 as a choice for p2.

e.g.
Code:
for( int i = 0; i < N; ++i )
    p1 = primes[ i ]
    for( int j = starts[ i ]; j < starts[ i ] + counts[ i ] - 3; ++j )
        p2 = primes[ pairs[ j ].second ]
        if( p1 + p2 >= minSolution ) break
        for( int k = starts[ pairs[ j ].second ];
            k < starts[ pairs[ j ].second ] + counts[ pairs[ j ].second ] - 2;
            ++k )
            p3 = primes[ pairs[ k ].second ]
            if( p1 + p2 + p3 >= minSolution ) break
            if( in_pair( p1, p3 ))
                for( int m = starts[ pairs[ k ].second ];
                     m < starts[ pairs[ k ].second ] + counts[ pairs[ k ].second ] - 1;
                     ++m )
                    p4 = primes[ pairs[ m ].second ]
                    ...

An additional insight is that this solution is essentially a depth-first search of the tree induced by the pairs, e.g.

Code:
(a,b), (a,c), (a,d), (b,d), (c,d)

           a
       /   |   \
      b    c    d
      |    |
      d    d

I think it would be interesting and very challenging to do this efficiently for the general case, without pre-generating all of the primes/prime pairs, instead introducing new ones only as needed, up to the point where it's clear that any solution with a larger prime can't be minimal (and also using big numbers and parallelization). I wonder how long it would take to find solutions for sets of 6 or 7 primes?
 
Last edited:
  • #96
The problem I see with "just in time" determination of primes or prime pairs is "where do you put them?" Specifically, how do you find them - if you're doing searches, you're going to be replacing O(1) operations with O(log n) operations. An efficient algorithm needs a good answer to that question.

On parallelization, I was thinking about that as well, partly because I have access to some highly parallel systems. The actual search parallelizes quite nicely, with large outer loops. Unfortunately, my code spends almost half its time finding prime pairs, and the structure I chose to store them in, unordered_map, is going to have a slow reduction time. It's not clear to me how best to reduce the pair precomputation efficiently if it is done in parallel.

I'm not going to have time to fiddle with this, but I am interested in what the gains would be if I replaced unordered_map by an array. When I was spending minutes doing primality testing, a few seconds of unordered_map navigation was a good thing. Now that I have discovered the key is to use probable primes and only do a full primality test on candidate solutions, it looks like less of a bargain. In the 8 s version of my code, 90% or more of the time is spent in unordered_map.

The other advantage is that it's more obvious how to do the reduction.
 
  • #97
Vanadium 50 said:
I'm not going to have time to fiddle with this, but I am interested in what the gains would be if I replaced unordered_map by an array.

That gets the time down to under a second (940 milliseconds). Essentially all of the time is in primality testing.

Given that, looping over probable primes is no longer helpful. You have to test it for real eventually, so you might as well do it at the beginning. I have repurposed the probable prime tester to do a fast reject before calling Miller-Rabin. If it were necessary to get this well below one second, I would push harder on this front.
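The fast reject can be as simple as a gcd against a primorial (a sketch, not Vanadium 50's code; miller_rabin is assumed to be whatever full test you already have, e.g. the Rosetta Code version used earlier):

Python:
from math import gcd

# Primes up to 53 and their product; sharing a factor with the product
# proves n composite for any n > 53.
_SMALL = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53)
_PRIMORIAL = 1
for _p in _SMALL:
    _PRIMORIAL *= _p

def is_prime_fast(n):
    if n <= 53:
        return n in _SMALL          # handles n < 2 and the small primes
    if gcd(n, _PRIMORIAL) != 1:     # fast reject: n has a small prime factor
        return False
    return miller_rabin(n)          # full test only for the survivors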
 
  • #98
I got a chance to play again. It's at half a second. To do this I:
  • Replaced the string concatenation with if/else and multiplies
  • Replaced the iterators with indices. My intent was to allow parallelism, but it did not help - the overhead wiped out the gains
  • More carefully set loop limits to avoid searching for quintuplets that could not possibly be smaller
  • Tweaked the fast reject to re-optimize
As it stands, 20% of the time is spent looking at the cached pair results and 80% of the time is doing primality testing, the majority in modular exponentiation.

This is on an i5-6500; faster chips would take less time obviously.
 
