Sum of All Positive Integers n where d+n/d is Prime to 100M

  • Thread starter Thread starter Arman777
  • Start date Start date
  • Tags Tags
    Euler
AI Thread Summary
The discussion focuses on solving Project Euler Problem 357, which involves finding the sum of all positive integers n not exceeding 100 million such that for every divisor d of n, the expression d + n/d is prime. Participants share code snippets for generating divisors and testing for primality, noting that the current implementation is slow for larger numbers. Suggestions include using the Sieve of Eratosthenes to generate a set of primes for faster membership testing, as set lookups are significantly quicker than list lookups. Profiling the code is emphasized to identify bottlenecks and optimize performance. The conversation highlights the importance of efficient algorithms in computational problems.
Arman777
Insights Author
Gold Member
Messages
2,163
Reaction score
191
[Project Euler Problem 357]

Consider the divisors of 30: 1,2,3,5,6,10,15,30.
It can be seen that for every divisor d of 30, $$d+30/d$$ is prime.

Find the sum of all positive integers n not exceeding 100 000 000
such that for every divisor d of n, $$d+n/d$$ is prime.


Python:
import math
import time

start = time.perf_counter()

def divisor_generator(n):
    '''Generates the divisiors of input num'''
    large_divisors = []
    for i in range(1, int(math.sqrt(n) + 1)):
        if n % i == 0:
            yield i
            if i*i != n:
                large_divisors.append(n / i)
    for divisor in reversed(large_divisors):
        yield divisor

def is_prime(n):
    if n < 2:
        return False
    if n == 2:
        return True
    if n != 2 and n % 2 == 0:
        return False
    for i in range(3,int(n**(1/2))+1,2):
        if n % i == 0:
            return False
    return True

def test_condition (divisor_array, num):

    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''

    if len(divisor_array) %2 != 0: # the divisors of num = d3 is [1,d2,d3], and d2*d2=d3 hence d2+d3/d2 = 2d2 which is not prime
        return False
    if sum(divisor_array) %2 != 0: # the divisors of num = d4, [1,d2,d3,d4] (1+d4)=Prime and (d2+d3)=Prime hence sum([1,d2,d3,d4]) must be even
        return False
    for i in range(len(divisor_array)//2):
        if is_prime(divisor_array[i] + divisor_array[-i-1]) == False:
            return False
    return True

Sum = 0
for num in range(2,10**6,2): #since for odd num we do not have prime numbers
    divisor_list_num = list(divisor_generator(num))
    if test_condition(divisor_list_num,num) == True:
        Sum += numprint(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")

It's good up to ##\text{num} = 10^6## but then it gets slow after that. For instance it takes 16 min to solve ##10^7## case.

How can I inrease the speed of my code ?
 
Physics news on Phys.org
Trade space for speed. Can you create a list of primes once and use it in your prime test instead of that for loop test you have now? By using more space to store the list you can decrease the time of computation...
 
jedishrfu said:
Trade space for speed. Can you create a list of primes once and use it in your prime test instead of that for loop test you have now? By using more space to store the list you can decrease the time of computation...
If you mean something like
Python:
  if divisor_array[i] + divisor_array[-i-1] not in list_of_primes:
            return False
    return True

Thats 16 times slower than my original code
 
Arman777 said:
Thats 16 times slower than my original code
I think you need a set or a dictionary to make it faster. The program is now doing a linear search.
 
  • Like
Likes scottdave, Klystron and StoneTemplePython
willem2 said:
I think you need a set or a dictionary to make it faster. The program is now doing a linear search.
What do you mean by that. How I can use them to make my code faster
 
Arman777 said:
How can I inrease the speed of my code ?

The Sieve of Eratosthenes is a fast way to generate primes. Use that to initialize a Python set containing all the primes up to the largest you will need; then just check for set membership to test numbers for being prime.
 
  • Like
Likes Klystron
jedishrfu said:
Can you create a list of primes once and use it in your prime test

Membership lookup in Python lists is very slow as the list gets larger. Set membership testing is constant time, so a Python set should be used for this. (Or a dict with dummy values for each key in versions of Python too old to have the set builtin, but those are pretty rare these days.)
 
  • Like
Likes jedishrfu
PeterDonis said:
The Sieve of Eratosthenes is a fast way to generate primes. Use that to initialize a Python set containing all the primes up to the largest you will need; then just check for set membership to test numbers for being prime.
Okay here is my code and its still slow.. ?

Python:
import math
import time
from itertools import compress

start = time.perf_counter()

def divisor_generator(n):
    '''Generates the divisiors of input num'''
    large_divisors = []
    for i in range(1, int(math.sqrt(n) + 1)):
        if n % i == 0:
            yield i
            if i*i != n:
                large_divisors.append(n / i)
    for divisor in reversed(large_divisors):
        yield divisor

def prime_number(n):
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return {2,*compress(range(3,n,2), sieve[1:])}list_of_primes = prime_number(10**5)

def test_condition (divisor_array, num):

    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''

    if len(divisor_array) %2 != 0: # the divisors of num = d3 is [1,d2,d3], and d2*d2=d3 hence d2+d3/d2 = 2d2 which is not prime
        return False
    if sum(divisor_array) %2 != 0: # the divisors of num = d4, [1,d2,d3,d4] (1+d4)=Prime and (d2+d3)=Prime hence sum([1,d2,d3,d4]) must be even
        return False
    for i in range(len(divisor_array)//2):
        if (divisor_array[i] + divisor_array[-i-1]) not in list_of_primes:
            return False
    return True

Sum = 0
for num in range(2,10**5,2): #since for odd num we do not have prime numbers
    divisor_list_num = list(divisor_generator(num))
    if test_condition(divisor_list_num,num) == True:
        Sum += numprint(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")

By the way, this prime lister can list primes up to ##10^5## in ##0.00164## sec and up to ##10^8## in 2 sec. I think that's quite fast.

If this is what you mean that its still 16 times slow. My original code finishes in 0.9 sec.
So to search up to ##10^8## we should finish this ##10^5##test in 0.01 sec or so
If you mean I should create a list of primes for each num that even more slow.

However if you mean something entirely different then I wrote then it would be better ıf you give me an example code.
 
Last edited:
Arman777 said:
it would be better ıf you give me an example code.

We can't give a complete solution here since this is an Euler problem and solutions are not supposed to be posted publicly. In fact I am going to move this thread to the homework forum for that reason.
 
  • #10
Arman777 said:
here is my code and its still slow.. ?

The list_of_primes should not be a list. It should be a set. List lookups are slow in Python for large lists. Set lookups are constant time (since Python sets use the same underlying lookup code as Python dict keys, i.e., a hash table).
 
  • #11
PeterDonis said:
The list_of_primes should not be a list. It should be a set. List lookups are slow in Python for large lists. Set lookups are constant time (since Python sets use the same underlying lookup code as Python dict keys, i.e., a hash table).
Oh I see okay. I did not know that list and set can alter the solution time that much.

However the difference is not much from my original code. I tested this time for ##10^6## case and the difference is only 2-3 sec so still slow.
 
  • #12
Arman777 said:
still slow

Since you have a generated set of primes, you might also consider generating the divisors of a number from its prime factors (since prime factorization of numbers in the given range should be pretty fast with a pre-computed set of primes), instead of testing every single number from 1 to sqrt(n).
 
  • #13
Perhaps instead of using the not in listofprimes you used a binsearch since its a sorted list.

I think you need to see where its spending time first after having tried these suggestions as you humans often have misconceptions about what takes too long to run.
 
  • Like
Likes Klystron
  • #14
jedishrfu said:
Perhaps instead of using the not in listofprimes you used a binsearch since its a sorted list.

This is still slower than a hash table lookup, which you get by using a Python set instead of a list.
 
  • #15
Thanks I’m rather new to python and not completely aware of its many features and their relative speeds. I tend to make sure my program works, try to find the speed bottlenecks and fix those based on the 80/20 rule ie 80% of the work is done in %20 of the code.
 
  • #16
I
Arman777 said:
However I checked my answer and it is wrong. Any idea why ?
You get the wrong answer for 2.

I think I see the problem with this algorithm. You use a generator function and yield, so you would think this would have the effect of not computing more divisors than you need. To do this, you must use the generator function in a for loop directly and not assign it to a variable first.

If you put for d in divisor_generator[num]: in your test_condition function, you won't be generating any extra divisors after the first one has produced a non-prime. This made the original problem 4 times as fast, so at least 75% of the time was spent on generating divisors.
If you want to speed up the divisor generator further you can look at any technique for integer factorization.

All divisors need to be square free, if the number is divisible by p it can't be divisible by p^2 . If not p+n/p won't be a prime.

r.
 
  • #17
Profile your code. If you want your code to run faster, you need to know where it is spending its time now. Profile your code. Don't go looking for inefficient code first. Profile your code. Code may be inefficient, but not the most inefficient, so fixing it will not help as much as you want. Profile your code.
 
  • Like
Likes Tom.G and jedishrfu
  • #19
You don't think I was being too subtle, do you?
 
  • Like
Likes jedishrfu
  • #20
My recent code was producing wrong results as divisors.

Vanadium 50 said:
Profile your code. If you want your code to run faster, you need to know where it is spending its time now. Profile your code. Don't go looking for inefficient code first. Profile your code. Code may be inefficient, but not the most inefficient, so fixing it will not help as much as you want. Profile your code.
Code:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:1009(_handle_fromlist)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:103(release)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:143(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:147(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:151(__exit__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:157(_get_module_lock)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:176(cb)
        2    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:222(_verbose_message)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:232(_requires_builtin_wrapper)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:307(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:311(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:318(__exit__)
        4    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:321(<genexpr>)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:369(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:416(parent)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:424(has_location)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:433(spec_from_loader)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:504(_init_module_attrs)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:576(module_from_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:58(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:663(_load_unlocked)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:719(find_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:740(create_module)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:748(exec_module)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:765(is_package)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:78(acquire)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:855(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:859(__exit__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:882(_find_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:948(_find_and_load_unlocked)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:978(_find_and_load)
        1    1.451    1.451   24.929   24.929 smthing.py:1(<module>)
        1    0.019    0.019    0.019    0.019 smthing.py:18(prime_number)
   499999    0.540    0.000    0.610    0.000 smthing.py:28(test_condition)
10630890   22.430    0.000   22.850    0.000 smthing.py:7(divisor_generator)
        3    0.000    0.000    0.000    0.000 {built-in method _imp.acquire_lock}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.create_builtin}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.exec_builtin}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.is_builtin}
        3    0.000    0.000    0.000    0.000 {built-in method _imp.release_lock}
        2    0.000    0.000    0.000    0.000 {built-in method _thread.allocate_lock}
        2    0.000    0.000    0.000    0.000 {built-in method _thread.get_ident}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.any}
        1    0.000    0.000   24.929   24.929 {built-in method builtins.exec}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
        5    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
   999499    0.070    0.000    0.070    0.000 {built-in method builtins.len}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.print}
   499999    0.094    0.000    0.094    0.000 {built-in method math.sqrt}
        2    0.000    0.000    0.000    0.000 {built-in method time.perf_counter}
  5065196    0.326    0.000    0.326    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        2    0.000    0.000    0.000    0.000 {method 'rpartition' of 'str' objects}
I did this test for ##10^6## case and I used the code in post #8 (with changing something).
 
Last edited:
  • #21
willem2 said:
I

You get the wrong answer for 2.

I think I see the problem with this algorithm. You use a generator function and yield, so you would think this would have the effect of not computing more divisors than you need. To do this, you must use the generator function in a for loop directly and not assign it to a variable first.

If you put for d in divisor_generator[num]: in your test_condition function, you won't be generating any extra divisors after the first one has produced a non-prime. This made the original problem 4 times as fast, so at least 75% of the time was spent on generating divisors.
If you want to speed up the divisor generator further you can look at any technique for integer factorization.

All divisors need to be square free, if the number is divisible by p it can't be divisible by p^2 . If not p+n/p won't be a prime.

r.
I ll try this now, but I think I did not quite understand it
 
  • #22
Python:
import math
import time
from itertools import compress

start = time.perf_counter()

def prime_number(n):
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return  {2,*compress(range(3,n,2), sieve[1:])}

list_of_primes = prime_number(10**8)def test_condition (num):
    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''
    if num % 4 == 0:
        return False
    for i in range(1, int(num**0.5) + 1):
        if num % i == 0:
            if (i + num /i) not in list_of_primes:
                return False
    return True

Sum = 0
for num in range(2, 10**8, 2): #since for odd num we do not have prime numbers
    if test_condition(num) == True:
        Sum += num

print(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")
Sum = 1739023853136
Time : 54.514943536000004 sec

Where I am doing wrong, now ??
 
  • #23
1 also should be included.
 
  • Like
Likes Arman777
  • #24
willem2 said:
1 also should be included.
Yes thanks...I solved it now.I never thought to add 1.
willem2 said:
you won't be generating any extra divisors after the first one has produced a non-prime.
Also thanks for this tip.
 
  • #25
If you want your code to run faster, you not only need to profile it, but to look at the results of the profile.

It says you are spending the majority of your time in divisor_generator. So you have three options:

  1. Speed up divisor_generator. Advice on changing your data structures fall into this category.
  2. Call divisor_generator less often. If instead of looping over odd numbers you looped over prime numbers (d is always one less than a prime. Why?) your code would run ~8 times faster.
  3. Don't call divisor_generator at all. Nothing says d is the right variable to loop over.
That third point is very important. Three times in three threads you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem, and twice (so far) you've gotten upset with people who have had the temerity to suggest this. While I haven't coded this up, I am certain my algorithm could do this in under a second.
 
  • Like
Likes scottdave
  • #26
Vanadium 50 said:
d is always one less than a prime. Why?)

Because d+1 always have to be prime
Vanadium 50 said:
your code would run ~8 times faster.

It actually runs 2 times faster.

I did not notice this. I wish I did tho.

Vanadium 50 said:
While I haven't coded this up, I am certain my algorithm could do this in under a second.

Of course. I don't think the time is that much important (for this type of problems, such as problems in PE) unless it takes more than minutes or hours. For instance, I saw an algorithm in the forum page that finds the solution in 0.70 sec in C++.
Vanadium 50 said:
That third point is very important. Three times in three threads you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem, and twice (so far) you've gotten upset with people who have had the temerity to suggest this.

Which three thread? That's why I am asking here to improve the code. I am not exactly using the brute force to solve these problems but most of the times I am missing some stuff. Next time, I 'll profile my code and try to see where the problem is.

In PE problems, mathematical tricks speeds up the code most of the times rather than the code itself. And again most of the times I miss those points ( such as d + 1 must be prime case)
 
  • #27
What you call "mathematical tricks" other people call "programming". It's a vital skill to understand the framing of a problem to allow for an accurate and speedy solution. Also, the point of these problems is seldom "can I write an exhaustive test?" It's "can I learn how to think about these problems to do better than an exhaustive test"?

In this case, factorizing is slow but multiplication is fast. Furthermore, looking at a list of factors - even prime factors - will tell you a lot about whether this number works or not. For example, I can tell that 156 will not work, but 190 will. (More precisely, 156 does not need to be tested, but 190 does - and it passes) So looping over candidates may not be the way you want to do things.

Can you think of a property of the prime factorization that would help you? And given that, what should one loop over?
 
  • #28
Vanadium 50 said:
Can you think of a property of the prime factorization that would help you? And given that, what should one loop over?
I couldn't find it maybe more hint ?
 
  • #29
Why doesn't 18 work? 19 is prime, and it's not divisible by 4.
 
  • #30
Its a multiple of a square number. So it will not work.

I edited my code according to it. But I see no improvment on time.
 
  • #31
You need to pay more attention to what people are writing.

Vanadium 50 said:
Can you think of a property of the prime factorization that would help you?
Vanadium 50 said:
Nothing says d is the right variable to loop over
Vanadium 50 said:
you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem
 
  • #32
As I said I couldn't find it. What should I loop over then ?
 
  • #33
I'll tell you in a few days. It is important that you think for yourself. You have plenty of hints.
 
  • #34
Another hint. How many Sophie Germain primes are there under 50,000,000? Why do I care? Why do I care how fast I can find them?
 
Last edited:
  • #35
Vanadium 50 said:
Another hint. How many Sophie Germain primes are there under 50,000,000? Why do I care? Why do I care how fast I can find them?
I find 229568 primes.

I looked at the definiton etc. However I did not quite understand how can we use them. For instance the solution of the number + 1 must be prime. These primes are,

(2,3,7,11,23,31,43,59,71,79,83)

However the Sophie German primes,

(2, 3, 5, 11, 23, 29, 41, 53, 83)

I couldn't make the connection. Some of them are in the solution but some of them are not.

Vanadium 50 said:
Why do I care? Why do I care how fast I can find them?
Its important to optimize the codes..?

I added the condition of num / 2 + 2 = prime. My time now reduced to half, 24 second.
 
Last edited:
  • #36
If the only factors of your number are 2 and p, then p is prime and 2p+1 is prime. How does that relate to Sophie Germain primes. How does that relate to your solution?
 
  • #37
Vanadium 50 said:
If the only factors of your number are 2 and p, then p is prime and 2p+1 is prime. How does that relate to Sophie Germain primes. How does that relate to your solution?
I see it now.
[1 , 2, p , 2* p]

However in this case its not certain that p+2 should be prime ? Also It's the same as num/2 + 2 = prime condition .
 
  • #38
Arman777 said:
However in this case its not certain that p+2 should be prime ?

That is correct. This is a necessary condition, not a sufficient condition. It is also only necessary when there are exactly two factors. (Why?)

Looking back - if you are spending your time factorizing, how would you rewrite your code so you weren't?
 
  • #39
Python:
import math
import time
from itertools import compress

start = time.perf_counter()

def prime_number(n):
    ''' Prime numbers up to n'''
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return  {2,*compress(range(3,n,2), sieve[1:])}   #Using set to increase the search time

list_of_primes = prime_number(10**8)  # listing prime numbers up to 10**8
def test_condition (num):
    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''
    for i in range(1, int(num**0.5) + 1):
        if num % i == 0:
            if (i + num /i) not in list_of_primes:
                return False
    return True

Sum = 0
for num in list_of_primes:
    if num % 4 != 0 or num % 9 != 0:
        if (num - 1) / 2 + 2 in list_of_primes:
            if test_condition (num-1) == True:
                Sum += num - 1

print(Sum+1)
end = time.perf_counter()
print("Time : ", end-start, "sec")

In this code, I am testing for, (num-1) /2 + 2 cases and if it's not true then the loop will break.

It's certain that (num -1) should be even number and all even numbers are divisible by 2. Hence all numbers that will satisfy the condition (given by the problem) should also satisfy (num - 1)/2 + 2

Vanadium 50 said:
if you are spending your time factorizing, how would you rewrite your code so you weren't?

This is the only thing that I can up with. If the (num-1)/2 + 2 is false then the loop will break. For instance, let's take the num = 19 case. [1,2,3,6,9,18] are the divisors of num-1. And (num-1)/2+2 = 11. However, it has 6 and 3 so we cannot add it to the Sum.

After checking (num-1)/2 + 2 == prime condition, we should definately factor the num-1. Same as for num = 31.

Without factorizing, we cannot be certainly sure about if num-1 should be added to Sum or not.

This condition helps me to improve the run time. Now by code works 2x faster. But I don't think I can further improve the run time.

I am also bored by the problem..
 
  • #40
The point I was trying to get you to see was that you can avoid factoring by not looping over the number, but looping over prime factors. All solutions have no repeated prime factors, which allows you to cut candidate solutions way, way down. If there are exactly two prime factors, one mmust be 2 (for n>1), so you are dealing with Sophie Germain primes as candidates.

The 100,000,000 limit means you need to only consider up to 8 prime factors.

Arman777 said:
I am also bored by the problem..

That's a good way to get people to not help you in the future.
 
  • #41
Vanadium 50 said:
If there are exactly two prime factors, one must be 2 (for n>1), so you are dealing with Sophie Germain primes as candidates.

I write code to see how many of the Sophie German Primes adds up to the sum.
Here are my results

Total number of Sophie German primes below 50 million, 229568
Sophie German primes who passed the test_condition, 3808


Also, I looked for the total number of primes that passes the test and for below 50 million that is 23025 and for 100 million 39626

For the only 2 factor case, we can find the solutions in a short amount of time. However, as we can see they only cover a small percentage of the total solution number.

Vanadium 50 said:
All solutions have no repeated prime factors, which allows you to cut candidate solutions way, way down.

I think I understand your point. However, I think that the time will not change much even we try that. First, we need to find all the factors of a number and then we should check somehow it repeats or not. However, I already did that by dividing 4 and 9, etc without needing to factor the number. Also even in that case you still need to check if the sum of d + num/d = prime or not.

If you can write a python code for your algorithm then maybe we can compare times and see the result.
 
  • #42
I thought you were bored.

I get 229568, but the number who pass is 30301. The first few I get are: 6 10 22 58 82 358 382 478 562 838 862 1282 1318 1618 2038 2062 2098 2458 2578 2902 2962 3862 4258 4282 4678 5098 5938 6598 6658 6718 6778 7078 7642 7702 8038 8542 8962

It takes 20 ms to cover all the answers with exactly two factors.
 
  • #43
Vanadium 50 said:
I get 229568, but the number who pass is 30301
I am getting the same answer as you do now. It seems like I made a mistake on my previous code.
Vanadium 50 said:
It takes 20 ms to cover all the answers with exactly two factors.
Thats nice and normal.. since you have to only check num + 2 case for num in sophie_german_primes. How about the whole test. How it takes to run ?
Vanadium 50 said:
I thought you were bored.
Kind of but its still fun to do somehow.
 
  • #44
You're going to need to decide - are you bored or not? I'm not `willing to put any effort into a conversation if the other party will just say "Too bad. I'm bored. Goodbye."

Here's why looping over factors will be faster - and you have to do two factors (related to Sophie Germain primes) all the way up to 8 factors. Looping over n, you need to test 100,000,000 numbers. I only need to test nonsquare numbers, i.e. numbers with no repeated factors, and there are about 61,000,000 of them. So I am already ahead.

You can say "I don't need to test all the numbers. Only the even ones, only the ones that are one less than a prime, etc." True. But neither do I. That factor of 0.61 persists.

But that's peanuts. The real gain is that while you have to factor numbers, I don't. My numbers are already factored. You're spending 90% of your tine factoring, and I don't have to do it at all. So my code (when I finish writing it) should be ~15x faster than yours.
 
  • #45
Vanadium 50 said:
~15x faster than yours

That's not possible in python. I am using one of the fastest prime generators and that only itself takes 2.5 sec to produce to prime numbers up to 10**8... you can mostly be 5-3 times faster. Which I am not sure that even possible. I am curious about your algorithm. Also, you should start your time before generating the primes not after and I only accept if its faster in python not in C. C is already faster than the python in the core.

Vanadium 50 said:
Here's why looping over factors will be faster - and you have to do two factors (related to Sophie Germain primes) all the way up to 8 factors. Looping over n, you need to test 100,000,000 numbers. I only need to test nonsquare numbers, i.e. numbers with no repeated factors, and there are about 61,000,000 of them. So I am already ahead.

Python:
        if (num - 1) / 2 + 2 in list_of_primes:
line in my code already checks for the 2 factor case and if its wrong it immidetly exits the loop.

I think this is much faster or equally faster than the generating some primes and then testing each of them that num + 2 satisfies or not.

Vanadium 50 said:
I only need to test nonsquare numbers,

I can also add that condition to my code. And I did before but it does not change much.

Vanadium 50 said:
You're going to need to decide - are you bored or not? I'm not `willing to put any effort into a conversation if the other party will just say "Too bad. I'm bored. Goodbye."

We can continue to discuss
 
  • #46
I am of course only talking about the algorithmic time, not the set-up time (like finding all the primes).

Arman777 said:
I think this is much faster or equally faster than the generating some primes

You see, this is why they have classes on algorithm design. It's not about your feelings. (Or my feelings, but as an inexperienced programmer your feelings should carry less weight) I have shown why this is faster: I am replacing factorization with multiplication (and avoiding 39% of the work). Factorization is slow, multiplication is fast.
 
  • Like
Likes StoneTemplePython and Arman777
  • #47
Okay then what can I say .. when you finish writing your code you can share it.
 
  • #48
Vanadium 50 said:
I am of course only talking about the algorithmic time, not the set-up time (like finding all the primes).
Okay then. I wrote a new code by implementing your idea on sophie-german primes now my code runs in 8.35 sec

Python:
import math
import time
from itertools import compressdef prime_number(n):
    ''' Prime numbers up to n'''
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return  {2,*compress(range(3,n,2), sieve[1:])}   #Using set to increase the search time

list_of_primes = prime_number(10**8)  # listing prime numbers up to 10**8
square_numbers = {i**2 for i in range(2, 10**4)}
for i in square_numbers:
    for j in list_of_primes:
        if (j-1) % i == 0 and j in list_of_primes:
            list_of_primes.remove(j)
        else:
            break

list_of_primes = list_of_primes - square_numbers
sophie_german_primes = prime_number(50 * 10**6)
sg_prime = {i for i in sophie_german_primes if 2*i + 1 in list_of_primes}

def test_condition (num):
    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''
    for i in range(1, int(num**0.5) + 1):
        if num % i == 0:
            if (i + num /i) not in list_of_primes:
                return False
    return True

start = time.perf_counter()

Sum = 0
for num in sg_prime:
    if num + 2 in list_of_primes:
        Sum += 2*num

new_list_primes = list_of_primes - {2*i+1 for i in sg_prime}

for num in new_list_primes:
    if (num - 1) / 2 + 2 in list_of_primes:
        if test_condition (num-1) == True:
            Sum += num - 1

print(Sum+1)
end = time.perf_counter()
print("Time : ", end-start, "sec")
Vanadium 50 said:
your feelings should carry less weight)
:/

It seems that, this type of algorithm makes my code faster by an amount of 2.5x
 
  • #49
A nonsquare number is not a number that is not a square. It's a number that has no repeated prime factors. So 8 is not a square, but it's also not a nonsquare.

My code has a bug in it; it finds about 1% too many solutions (not too few!). Most likely, I am doing one test twice and another test not at all. Even so, it takes 20 ms to test all the numbers with 2 prime factors, 100 ms to test all numbers with 3 and 10 ms to test all with 4. That's most of them, so I confident the testing time will end up under a quarter of a second.

If I replace my dumb sieve with something like primegen, I should be able to sieve in under half a second. (It sieves a billion primes in a second or two on a modern CPU).

So while I admit I don't yet have a solution in under one second, it's pretty clear that there is one, and looping over factors gets one there.
 
  • #50
Vanadium 50 said:
and looping over factors gets one there.
It seems so yes.
Vanadium 50 said:
Even so, it takes 20 ms to test all the numbers with 2 prime factors, 100 ms to test all numbers with 3 and 10 ms to test all with 4. That's most of them, so I confident the testing time will end up under a quarter of a second.
Thats really great actually. So I think we can to an end. It seems that you have the fastest algorithm
Vanadium 50 said:
My code has a bug in it; it finds about 1% too many solutions (not too few!).
And I am sure you can fix it.
 
Back
Top