Sum of All Positive Integers n where d+n/d is Prime to 100M

  • Thread starter Arman777
  • Start date
  • Tags
    Euler
In summary: Testing the condition of d+n/d by taking the input as array of divisor and num''' if len(divisor_array) %2 != 0: # the divisors of num = d3 is [1,d2,d3], and d2*d2=d3 hence d2+d3/d2 = 2d2 which is not prime return False if sum(divisor_array) %2 != 0: # the div
  • #1
Arman777
Insights Author
Gold Member
2,168
193
[Project Euler Problem 357]

Consider the divisors of 30: 1,2,3,5,6,10,15,30.
It can be seen that for every divisor d of 30, $$d+30/d$$ is prime.

Find the sum of all positive integers n not exceeding 100 000 000
such that for every divisor d of n, $$d+n/d$$ is prime.


Python:
import math
import time

start = time.perf_counter()

def divisor_generator(n):
    '''Generates the divisiors of input num'''
    large_divisors = []
    for i in range(1, int(math.sqrt(n) + 1)):
        if n % i == 0:
            yield i
            if i*i != n:
                large_divisors.append(n / i)
    for divisor in reversed(large_divisors):
        yield divisor

def is_prime(n):
    if n < 2:
        return False
    if n == 2:
        return True
    if n != 2 and n % 2 == 0:
        return False
    for i in range(3,int(n**(1/2))+1,2):
        if n % i == 0:
            return False
    return True

def test_condition (divisor_array, num):

    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''

    if len(divisor_array) %2 != 0: # the divisors of num = d3 is [1,d2,d3], and d2*d2=d3 hence d2+d3/d2 = 2d2 which is not prime
        return False
    if sum(divisor_array) %2 != 0: # the divisors of num = d4, [1,d2,d3,d4] (1+d4)=Prime and (d2+d3)=Prime hence sum([1,d2,d3,d4]) must be even
        return False
    for i in range(len(divisor_array)//2):
        if is_prime(divisor_array[i] + divisor_array[-i-1]) == False:
            return False
    return True

Sum = 0
for num in range(2,10**6,2): #since for odd num we do not have prime numbers
    divisor_list_num = list(divisor_generator(num))
    if test_condition(divisor_list_num,num) == True:
        Sum += numprint(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")

It's good up to ##\text{num} = 10^6## but then it gets slow after that. For instance it takes 16 min to solve ##10^7## case.

How can I inrease the speed of my code ?
 
Physics news on Phys.org
  • #2
Trade space for speed. Can you create a list of primes once and use it in your prime test instead of that for loop test you have now? By using more space to store the list you can decrease the time of computation...
 
  • #3
jedishrfu said:
Trade space for speed. Can you create a list of primes once and use it in your prime test instead of that for loop test you have now? By using more space to store the list you can decrease the time of computation...
If you mean something like
Python:
  if divisor_array[i] + divisor_array[-i-1] not in list_of_primes:
            return False
    return True

Thats 16 times slower than my original code
 
  • #4
Arman777 said:
Thats 16 times slower than my original code
I think you need a set or a dictionary to make it faster. The program is now doing a linear search.
 
  • Like
Likes scottdave, Klystron and StoneTemplePython
  • #5
willem2 said:
I think you need a set or a dictionary to make it faster. The program is now doing a linear search.
What do you mean by that. How I can use them to make my code faster
 
  • #6
Arman777 said:
How can I inrease the speed of my code ?

The Sieve of Eratosthenes is a fast way to generate primes. Use that to initialize a Python set containing all the primes up to the largest you will need; then just check for set membership to test numbers for being prime.
 
  • Like
Likes Klystron
  • #7
jedishrfu said:
Can you create a list of primes once and use it in your prime test

Membership lookup in Python lists is very slow as the list gets larger. Set membership testing is constant time, so a Python set should be used for this. (Or a dict with dummy values for each key in versions of Python too old to have the set builtin, but those are pretty rare these days.)
 
  • Like
Likes jedishrfu
  • #8
PeterDonis said:
The Sieve of Eratosthenes is a fast way to generate primes. Use that to initialize a Python set containing all the primes up to the largest you will need; then just check for set membership to test numbers for being prime.
Okay here is my code and its still slow.. ?

Python:
import math
import time
from itertools import compress

start = time.perf_counter()

def divisor_generator(n):
    '''Generates the divisiors of input num'''
    large_divisors = []
    for i in range(1, int(math.sqrt(n) + 1)):
        if n % i == 0:
            yield i
            if i*i != n:
                large_divisors.append(n / i)
    for divisor in reversed(large_divisors):
        yield divisor

def prime_number(n):
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return {2,*compress(range(3,n,2), sieve[1:])}list_of_primes = prime_number(10**5)

def test_condition (divisor_array, num):

    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''

    if len(divisor_array) %2 != 0: # the divisors of num = d3 is [1,d2,d3], and d2*d2=d3 hence d2+d3/d2 = 2d2 which is not prime
        return False
    if sum(divisor_array) %2 != 0: # the divisors of num = d4, [1,d2,d3,d4] (1+d4)=Prime and (d2+d3)=Prime hence sum([1,d2,d3,d4]) must be even
        return False
    for i in range(len(divisor_array)//2):
        if (divisor_array[i] + divisor_array[-i-1]) not in list_of_primes:
            return False
    return True

Sum = 0
for num in range(2,10**5,2): #since for odd num we do not have prime numbers
    divisor_list_num = list(divisor_generator(num))
    if test_condition(divisor_list_num,num) == True:
        Sum += numprint(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")

By the way, this prime lister can list primes up to ##10^5## in ##0.00164## sec and up to ##10^8## in 2 sec. I think that's quite fast.

If this is what you mean that its still 16 times slow. My original code finishes in 0.9 sec.
So to search up to ##10^8## we should finish this ##10^5##test in 0.01 sec or so
If you mean I should create a list of primes for each num that even more slow.

However if you mean something entirely different then I wrote then it would be better ıf you give me an example code.
 
Last edited:
  • #9
Arman777 said:
it would be better ıf you give me an example code.

We can't give a complete solution here since this is an Euler problem and solutions are not supposed to be posted publicly. In fact I am going to move this thread to the homework forum for that reason.
 
  • #10
Arman777 said:
here is my code and its still slow.. ?

The list_of_primes should not be a list. It should be a set. List lookups are slow in Python for large lists. Set lookups are constant time (since Python sets use the same underlying lookup code as Python dict keys, i.e., a hash table).
 
  • #11
PeterDonis said:
The list_of_primes should not be a list. It should be a set. List lookups are slow in Python for large lists. Set lookups are constant time (since Python sets use the same underlying lookup code as Python dict keys, i.e., a hash table).
Oh I see okay. I did not know that list and set can alter the solution time that much.

However the difference is not much from my original code. I tested this time for ##10^6## case and the difference is only 2-3 sec so still slow.
 
  • #12
Arman777 said:
still slow

Since you have a generated set of primes, you might also consider generating the divisors of a number from its prime factors (since prime factorization of numbers in the given range should be pretty fast with a pre-computed set of primes), instead of testing every single number from 1 to sqrt(n).
 
  • #13
Perhaps instead of using the not in listofprimes you used a binsearch since its a sorted list.

I think you need to see where its spending time first after having tried these suggestions as you humans often have misconceptions about what takes too long to run.
 
  • Like
Likes Klystron
  • #14
jedishrfu said:
Perhaps instead of using the not in listofprimes you used a binsearch since its a sorted list.

This is still slower than a hash table lookup, which you get by using a Python set instead of a list.
 
  • #15
Thanks I’m rather new to python and not completely aware of its many features and their relative speeds. I tend to make sure my program works, try to find the speed bottlenecks and fix those based on the 80/20 rule ie 80% of the work is done in %20 of the code.
 
  • #16
I
Arman777 said:
However I checked my answer and it is wrong. Any idea why ?
You get the wrong answer for 2.

I think I see the problem with this algorithm. You use a generator function and yield, so you would think this would have the effect of not computing more divisors than you need. To do this, you must use the generator function in a for loop directly and not assign it to a variable first.

If you put for d in divisor_generator[num]: in your test_condition function, you won't be generating any extra divisors after the first one has produced a non-prime. This made the original problem 4 times as fast, so at least 75% of the time was spent on generating divisors.
If you want to speed up the divisor generator further you can look at any technique for integer factorization.

All divisors need to be square free, if the number is divisible by p it can't be divisible by p^2 . If not p+n/p won't be a prime.

r.
 
  • #17
Profile your code. If you want your code to run faster, you need to know where it is spending its time now. Profile your code. Don't go looking for inefficient code first. Profile your code. Code may be inefficient, but not the most inefficient, so fixing it will not help as much as you want. Profile your code.
 
  • Like
Likes Tom.G and jedishrfu
  • #19
You don't think I was being too subtle, do you?
 
  • Like
Likes jedishrfu
  • #20
My recent code was producing wrong results as divisors.

Vanadium 50 said:
Profile your code. If you want your code to run faster, you need to know where it is spending its time now. Profile your code. Don't go looking for inefficient code first. Profile your code. Code may be inefficient, but not the most inefficient, so fixing it will not help as much as you want. Profile your code.
Code:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:1009(_handle_fromlist)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:103(release)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:143(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:147(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:151(__exit__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:157(_get_module_lock)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:176(cb)
        2    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:222(_verbose_message)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:232(_requires_builtin_wrapper)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:307(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:311(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:318(__exit__)
        4    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:321(<genexpr>)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:369(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:416(parent)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:424(has_location)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:433(spec_from_loader)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:504(_init_module_attrs)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:576(module_from_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:58(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:663(_load_unlocked)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:719(find_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:740(create_module)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:748(exec_module)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:765(is_package)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:78(acquire)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:855(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:859(__exit__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:882(_find_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:948(_find_and_load_unlocked)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:978(_find_and_load)
        1    1.451    1.451   24.929   24.929 smthing.py:1(<module>)
        1    0.019    0.019    0.019    0.019 smthing.py:18(prime_number)
   499999    0.540    0.000    0.610    0.000 smthing.py:28(test_condition)
10630890   22.430    0.000   22.850    0.000 smthing.py:7(divisor_generator)
        3    0.000    0.000    0.000    0.000 {built-in method _imp.acquire_lock}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.create_builtin}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.exec_builtin}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.is_builtin}
        3    0.000    0.000    0.000    0.000 {built-in method _imp.release_lock}
        2    0.000    0.000    0.000    0.000 {built-in method _thread.allocate_lock}
        2    0.000    0.000    0.000    0.000 {built-in method _thread.get_ident}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.any}
        1    0.000    0.000   24.929   24.929 {built-in method builtins.exec}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
        5    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
   999499    0.070    0.000    0.070    0.000 {built-in method builtins.len}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.print}
   499999    0.094    0.000    0.094    0.000 {built-in method math.sqrt}
        2    0.000    0.000    0.000    0.000 {built-in method time.perf_counter}
  5065196    0.326    0.000    0.326    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        2    0.000    0.000    0.000    0.000 {method 'rpartition' of 'str' objects}
I did this test for ##10^6## case and I used the code in post #8 (with changing something).
 
Last edited:
  • #21
willem2 said:
I

You get the wrong answer for 2.

I think I see the problem with this algorithm. You use a generator function and yield, so you would think this would have the effect of not computing more divisors than you need. To do this, you must use the generator function in a for loop directly and not assign it to a variable first.

If you put for d in divisor_generator[num]: in your test_condition function, you won't be generating any extra divisors after the first one has produced a non-prime. This made the original problem 4 times as fast, so at least 75% of the time was spent on generating divisors.
If you want to speed up the divisor generator further you can look at any technique for integer factorization.

All divisors need to be square free, if the number is divisible by p it can't be divisible by p^2 . If not p+n/p won't be a prime.

r.
I ll try this now, but I think I did not quite understand it
 
  • #22
Python:
import math
import time
from itertools import compress

start = time.perf_counter()

def prime_number(n):
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return  {2,*compress(range(3,n,2), sieve[1:])}

list_of_primes = prime_number(10**8)def test_condition (num):
    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''
    if num % 4 == 0:
        return False
    for i in range(1, int(num**0.5) + 1):
        if num % i == 0:
            if (i + num /i) not in list_of_primes:
                return False
    return True

Sum = 0
for num in range(2, 10**8, 2): #since for odd num we do not have prime numbers
    if test_condition(num) == True:
        Sum += num

print(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")
Sum = 1739023853136
Time : 54.514943536000004 sec

Where I am doing wrong, now ??
 
  • #23
1 also should be included.
 
  • Like
Likes Arman777
  • #24
willem2 said:
1 also should be included.
Yes thanks...I solved it now.I never thought to add 1.
willem2 said:
you won't be generating any extra divisors after the first one has produced a non-prime.
Also thanks for this tip.
 
  • #25
If you want your code to run faster, you not only need to profile it, but to look at the results of the profile.

It says you are spending the majority of your time in divisor_generator. So you have three options:

  1. Speed up divisor_generator. Advice on changing your data structures fall into this category.
  2. Call divisor_generator less often. If instead of looping over odd numbers you looped over prime numbers (d is always one less than a prime. Why?) your code would run ~8 times faster.
  3. Don't call divisor_generator at all. Nothing says d is the right variable to loop over.
That third point is very important. Three times in three threads you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem, and twice (so far) you've gotten upset with people who have had the temerity to suggest this. While I haven't coded this up, I am certain my algorithm could do this in under a second.
 
  • Like
Likes scottdave
  • #26
Vanadium 50 said:
d is always one less than a prime. Why?)

Because d+1 always have to be prime
Vanadium 50 said:
your code would run ~8 times faster.

It actually runs 2 times faster.

I did not notice this. I wish I did tho.

Vanadium 50 said:
While I haven't coded this up, I am certain my algorithm could do this in under a second.

Of course. I don't think the time is that much important (for this type of problems, such as problems in PE) unless it takes more than minutes or hours. For instance, I saw an algorithm in the forum page that finds the solution in 0.70 sec in C++.
Vanadium 50 said:
That third point is very important. Three times in three threads you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem, and twice (so far) you've gotten upset with people who have had the temerity to suggest this.

Which three thread? That's why I am asking here to improve the code. I am not exactly using the brute force to solve these problems but most of the times I am missing some stuff. Next time, I 'll profile my code and try to see where the problem is.

In PE problems, mathematical tricks speeds up the code most of the times rather than the code itself. And again most of the times I miss those points ( such as d + 1 must be prime case)
 
  • #27
What you call "mathematical tricks" other people call "programming". It's a vital skill to understand the framing of a problem to allow for an accurate and speedy solution. Also, the point of these problems is seldom "can I write an exhaustive test?" It's "can I learn how to think about these problems to do better than an exhaustive test"?

In this case, factorizing is slow but multiplication is fast. Furthermore, looking at a list of factors - even prime factors - will tell you a lot about whether this number works or not. For example, I can tell that 156 will not work, but 190 will. (More precisely, 156 does not need to be tested, but 190 does - and it passes) So looping over candidates may not be the way you want to do things.

Can you think of a property of the prime factorization that would help you? And given that, what should one loop over?
 
  • #28
Vanadium 50 said:
Can you think of a property of the prime factorization that would help you? And given that, what should one loop over?
I couldn't find it maybe more hint ?
 
  • #29
Why doesn't 18 work? 19 is prime, and it's not divisible by 4.
 
  • #30
Its a multiple of a square number. So it will not work.

I edited my code according to it. But I see no improvment on time.
 
  • #31
You need to pay more attention to what people are writing.

Vanadium 50 said:
Can you think of a property of the prime factorization that would help you?
Vanadium 50 said:
Nothing says d is the right variable to loop over
Vanadium 50 said:
you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem
 
  • #32
As I said I couldn't find it. What should I loop over then ?
 
  • #33
I'll tell you in a few days. It is important that you think for yourself. You have plenty of hints.
 
  • #34
Another hint. How many Sophie Germain primes are there under 50,000,000? Why do I care? Why do I care how fast I can find them?
 
Last edited:
  • #35
Vanadium 50 said:
Another hint. How many Sophie Germain primes are there under 50,000,000? Why do I care? Why do I care how fast I can find them?
I find 229568 primes.

I looked at the definiton etc. However I did not quite understand how can we use them. For instance the solution of the number + 1 must be prime. These primes are,

(2,3,7,11,23,31,43,59,71,79,83)

However the Sophie German primes,

(2, 3, 5, 11, 23, 29, 41, 53, 83)

I couldn't make the connection. Some of them are in the solution but some of them are not.

Vanadium 50 said:
Why do I care? Why do I care how fast I can find them?
Its important to optimize the codes..?

I added the condition of num / 2 + 2 = prime. My time now reduced to half, 24 second.
 
Last edited:

Similar threads

Back
Top