P357-Project Euler

  • Thread starter Arman777
  • Start date
  • #1
2,170
189
[Project Euler Problem 357]

Consider the divisors of 30: 1,2,3,5,6,10,15,30.
It can be seen that for every divisor d of 30, $$d+30/d$$ is prime.

Find the sum of all positive integers n not exceeding 100 000 000
such that for every divisor d of n, $$d+n/d$$ is prime.


Python:
import math
import time

start = time.perf_counter()

def divisor_generator(n):
    '''Generates the divisiors of input num'''
    large_divisors = []
    for i in range(1, int(math.sqrt(n) + 1)):
        if n % i == 0:
            yield i
            if i*i != n:
                large_divisors.append(n / i)
    for divisor in reversed(large_divisors):
        yield divisor

def is_prime(n):
    if n < 2:
        return False
    if n == 2:
        return True
    if n != 2 and n % 2 == 0:
        return False
    for i in range(3,int(n**(1/2))+1,2):
        if n % i == 0:
            return False
    return True

def test_condition (divisor_array, num):

    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''

    if len(divisor_array) %2 != 0: # the divisors of num = d3 is [1,d2,d3], and d2*d2=d3 hence d2+d3/d2 = 2d2 which is not prime
        return False
    if sum(divisor_array) %2 != 0: # the divisors of num = d4, [1,d2,d3,d4] (1+d4)=Prime and (d2+d3)=Prime hence sum([1,d2,d3,d4]) must be even
        return False
    for i in range(len(divisor_array)//2):
        if is_prime(divisor_array[i] + divisor_array[-i-1]) == False:
            return False
    return True

Sum = 0
for num in range(2,10**6,2): #since for odd num we do not have prime numbers
    divisor_list_num = list(divisor_generator(num))
    if test_condition(divisor_list_num,num) == True:
        Sum += num


print(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")

It's good up to ##\text{num} = 10^6## but then it gets slow after that. For instance it takes 16 min to solve ##10^7## case.

How can I inrease the speed of my code ?
 

Answers and Replies

  • #2
14,288
8,316
Trade space for speed. Can you create a list of primes once and use it in your prime test instead of that for loop test you have now? By using more space to store the list you can decrease the time of computation...
 
  • #3
2,170
189
Trade space for speed. Can you create a list of primes once and use it in your prime test instead of that for loop test you have now? By using more space to store the list you can decrease the time of computation...
If you mean something like
Python:
  if divisor_array[i] + divisor_array[-i-1] not in list_of_primes:
            return False
    return True

Thats 16 times slower than my original code
 
  • #4
willem2
2,111
372
Thats 16 times slower than my original code
I think you need a set or a dictionary to make it faster. The program is now doing a linear search.
 
  • Like
Likes scottdave, Klystron and StoneTemplePython
  • #5
2,170
189
I think you need a set or a dictionary to make it faster. The program is now doing a linear search.
What do you mean by that. How I can use them to make my code faster
 
  • #6
41,295
18,928
How can I inrease the speed of my code ?

The Sieve of Eratosthenes is a fast way to generate primes. Use that to initialize a Python set containing all the primes up to the largest you will need; then just check for set membership to test numbers for being prime.
 
  • #7
41,295
18,928
Can you create a list of primes once and use it in your prime test

Membership lookup in Python lists is very slow as the list gets larger. Set membership testing is constant time, so a Python set should be used for this. (Or a dict with dummy values for each key in versions of Python too old to have the set builtin, but those are pretty rare these days.)
 
  • #8
2,170
189
The Sieve of Eratosthenes is a fast way to generate primes. Use that to initialize a Python set containing all the primes up to the largest you will need; then just check for set membership to test numbers for being prime.
Okay here is my code and its still slow.. ?

Python:
import math
import time
from itertools import compress

start = time.perf_counter()

def divisor_generator(n):
    '''Generates the divisiors of input num'''
    large_divisors = []
    for i in range(1, int(math.sqrt(n) + 1)):
        if n % i == 0:
            yield i
            if i*i != n:
                large_divisors.append(n / i)
    for divisor in reversed(large_divisors):
        yield divisor

def prime_number(n):
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return {2,*compress(range(3,n,2), sieve[1:])}


list_of_primes = prime_number(10**5)

def test_condition (divisor_array, num):

    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''

    if len(divisor_array) %2 != 0: # the divisors of num = d3 is [1,d2,d3], and d2*d2=d3 hence d2+d3/d2 = 2d2 which is not prime
        return False
    if sum(divisor_array) %2 != 0: # the divisors of num = d4, [1,d2,d3,d4] (1+d4)=Prime and (d2+d3)=Prime hence sum([1,d2,d3,d4]) must be even
        return False
    for i in range(len(divisor_array)//2):
        if (divisor_array[i] + divisor_array[-i-1]) not in list_of_primes:
            return False
    return True

Sum = 0
for num in range(2,10**5,2): #since for odd num we do not have prime numbers
    divisor_list_num = list(divisor_generator(num))
    if test_condition(divisor_list_num,num) == True:
        Sum += num


print(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")

By the way, this prime lister can list primes up to ##10^5## in ##0.00164## sec and up to ##10^8## in 2 sec. I think that's quite fast.

If this is what you mean that its still 16 times slow. My original code finishes in 0.9 sec.
So to search up to ##10^8## we should finish this ##10^5##test in 0.01 sec or so
If you mean I should create a list of primes for each num that even more slow.

However if you mean something entirely different then I wrote then it would be better ıf you give me an example code.
 
Last edited:
  • #9
41,295
18,928
it would be better ıf you give me an example code.

We can't give a complete solution here since this is an Euler problem and solutions are not supposed to be posted publicly. In fact I am going to move this thread to the homework forum for that reason.
 
  • #10
41,295
18,928
here is my code and its still slow.. ?

The list_of_primes should not be a list. It should be a set. List lookups are slow in Python for large lists. Set lookups are constant time (since Python sets use the same underlying lookup code as Python dict keys, i.e., a hash table).
 
  • #11
2,170
189
The list_of_primes should not be a list. It should be a set. List lookups are slow in Python for large lists. Set lookups are constant time (since Python sets use the same underlying lookup code as Python dict keys, i.e., a hash table).
Oh I see okay. I did not know that list and set can alter the solution time that much.

However the difference is not much from my original code. I tested this time for ##10^6## case and the difference is only 2-3 sec so still slow.
 
  • #12
41,295
18,928
still slow

Since you have a generated set of primes, you might also consider generating the divisors of a number from its prime factors (since prime factorization of numbers in the given range should be pretty fast with a pre-computed set of primes), instead of testing every single number from 1 to sqrt(n).
 
  • #13
14,288
8,316
Perhaps instead of using the not in listofprimes you used a binsearch since its a sorted list.

I think you need to see where its spending time first after having tried these suggestions as you humans often have misconceptions about what takes too long to run.
 
  • #14
41,295
18,928
Perhaps instead of using the not in listofprimes you used a binsearch since its a sorted list.

This is still slower than a hash table lookup, which you get by using a Python set instead of a list.
 
  • #15
14,288
8,316
Thanks I’m rather new to python and not completely aware of its many features and their relative speeds. I tend to make sure my program works, try to find the speed bottlenecks and fix those based on the 80/20 rule ie 80% of the work is done in %20 of the code.
 
  • #16
willem2
2,111
372
I
However I checked my answer and it is wrong. Any idea why ?
You get the wrong answer for 2.

I think I see the problem with this algorithm. You use a generator function and yield, so you would think this would have the effect of not computing more divisors than you need. To do this, you must use the generator function in a for loop directly and not assign it to a variable first.

If you put for d in divisor_generator[num]: in your test_condition function, you won't be generating any extra divisors after the first one has produced a non-prime. This made the original problem 4 times as fast, so at least 75% of the time was spent on generating divisors.
If you want to speed up the divisor generator further you can look at any technique for integer factorization.

All divisors need to be square free, if the number is divisible by p it can't be divisible by p^2 . If not p+n/p won't be a prime.

r.
 
  • #17
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
Profile your code. If you want your code to run faster, you need to know where it is spending its time now. Profile your code. Don't go looking for inefficient code first. Profile your code. Code may be inefficient, but not the most inefficient, so fixing it will not help as much as you want. Profile your code.
 
  • Like
Likes Tom.G and jedishrfu
  • #19
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
You don't think I was being too subtle, do you?
 
  • #20
2,170
189
My recent code was producing wrong results as divisors.

Profile your code. If you want your code to run faster, you need to know where it is spending its time now. Profile your code. Don't go looking for inefficient code first. Profile your code. Code may be inefficient, but not the most inefficient, so fixing it will not help as much as you want. Profile your code.
Code:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:1009(_handle_fromlist)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:103(release)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:143(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:147(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:151(__exit__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:157(_get_module_lock)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:176(cb)
        2    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:222(_verbose_message)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:232(_requires_builtin_wrapper)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:307(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:311(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:318(__exit__)
        4    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:321(<genexpr>)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:369(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:416(parent)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:424(has_location)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:433(spec_from_loader)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:504(_init_module_attrs)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:576(module_from_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:58(__init__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:663(_load_unlocked)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:719(find_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:740(create_module)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:748(exec_module)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:765(is_package)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:78(acquire)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:855(__enter__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:859(__exit__)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:882(_find_spec)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:948(_find_and_load_unlocked)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:978(_find_and_load)
        1    1.451    1.451   24.929   24.929 smthing.py:1(<module>)
        1    0.019    0.019    0.019    0.019 smthing.py:18(prime_number)
   499999    0.540    0.000    0.610    0.000 smthing.py:28(test_condition)
10630890   22.430    0.000   22.850    0.000 smthing.py:7(divisor_generator)
        3    0.000    0.000    0.000    0.000 {built-in method _imp.acquire_lock}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.create_builtin}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.exec_builtin}
        1    0.000    0.000    0.000    0.000 {built-in method _imp.is_builtin}
        3    0.000    0.000    0.000    0.000 {built-in method _imp.release_lock}
        2    0.000    0.000    0.000    0.000 {built-in method _thread.allocate_lock}
        2    0.000    0.000    0.000    0.000 {built-in method _thread.get_ident}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.any}
        1    0.000    0.000   24.929   24.929 {built-in method builtins.exec}
        4    0.000    0.000    0.000    0.000 {built-in method builtins.getattr}
        5    0.000    0.000    0.000    0.000 {built-in method builtins.hasattr}
   999499    0.070    0.000    0.070    0.000 {built-in method builtins.len}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.print}
   499999    0.094    0.000    0.094    0.000 {built-in method math.sqrt}
        2    0.000    0.000    0.000    0.000 {built-in method time.perf_counter}
  5065196    0.326    0.000    0.326    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        2    0.000    0.000    0.000    0.000 {method 'rpartition' of 'str' objects}


I did this test for ##10^6## case and I used the code in post #8 (with changing something).
 
Last edited:
  • #21
2,170
189
I

You get the wrong answer for 2.

I think I see the problem with this algorithm. You use a generator function and yield, so you would think this would have the effect of not computing more divisors than you need. To do this, you must use the generator function in a for loop directly and not assign it to a variable first.

If you put for d in divisor_generator[num]: in your test_condition function, you won't be generating any extra divisors after the first one has produced a non-prime. This made the original problem 4 times as fast, so at least 75% of the time was spent on generating divisors.
If you want to speed up the divisor generator further you can look at any technique for integer factorization.

All divisors need to be square free, if the number is divisible by p it can't be divisible by p^2 . If not p+n/p won't be a prime.

r.
I ll try this now, but I think I did not quite understand it
 
  • #22
2,170
189
Python:
import math
import time
from itertools import compress

start = time.perf_counter()

def prime_number(n):
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return  {2,*compress(range(3,n,2), sieve[1:])}

list_of_primes = prime_number(10**8)


def test_condition (num):
    ''' Testing the condition of d+n/d by taking the input as array of divisor and num'''
    if num % 4 == 0:
        return False
    for i in range(1, int(num**0.5) + 1):
        if num % i == 0:
            if (i + num /i) not in list_of_primes:
                return False
    return True

Sum = 0
for num in range(2, 10**8, 2): #since for odd num we do not have prime numbers
    if test_condition(num) == True:
        Sum += num

print(Sum)

end = time.perf_counter()
print("Time : ", end-start, "sec")
Sum = 1739023853136
Time : 54.514943536000004 sec

Where I am doing wrong, now ??
 
  • #23
willem2
2,111
372
1 also should be included.
 
  • #24
2,170
189
1 also should be included.
Yes thanks...I solved it now.I never thought to add 1.
you won't be generating any extra divisors after the first one has produced a non-prime.
Also thanks for this tip.
 
  • #25
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
If you want your code to run faster, you not only need to profile it, but to look at the results of the profile.

It says you are spending the majority of your time in divisor_generator. So you have three options:

  1. Speed up divisor_generator. Advice on changing your data structures fall into this category.
  2. Call divisor_generator less often. If instead of looping over odd numbers you looped over prime numbers (d is always one less than a prime. Why?) your code would run ~8 times faster.
  3. Don't call divisor_generator at all. Nothing says d is the right variable to loop over.
That third point is very important. Three times in three threads you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem, and twice (so far) you've gotten upset with people who have had the temerity to suggest this. While I haven't coded this up, I am certain my algorithm could do this in under a second.
 
  • #26
2,170
189
d is always one less than a prime. Why?)

Because d+1 always have to be prime



your code would run ~8 times faster.

It actually runs 2 times faster.

I did not notice this. I wish I did tho.

While I haven't coded this up, I am certain my algorithm could do this in under a second.

Of course. I don't think the time is that much important (for this type of problems, such as problems in PE) unless it takes more than minutes or hours. For instance, I saw an algorithm in the forum page that finds the solution in 0.70 sec in C++.
That third point is very important. Three times in three threads you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem, and twice (so far) you've gotten upset with people who have had the temerity to suggest this.

Which three thread? That's why I am asking here to improve the code. I am not exactly using the brute force to solve these problems but most of the times I am missing some stuff. Next time, I 'll profile my code and try to see where the problem is.

In PE problems, mathematical tricks speeds up the code most of the times rather than the code itself. And again most of the times I miss those points ( such as d + 1 must be prime case)
 
  • #27
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
What you call "mathematical tricks" other people call "programming". It's a vital skill to understand the framing of a problem to allow for an accurate and speedy solution. Also, the point of these problems is seldom "can I write an exhaustive test?" It's "can I learn how to think about these problems to do better than an exhaustive test"?

In this case, factorizing is slow but multiplication is fast. Furthermore, looking at a list of factors - even prime factors - will tell you a lot about whether this number works or not. For example, I can tell that 156 will not work, but 190 will. (More precisely, 156 does not need to be tested, but 190 does - and it passes) So looping over candidates may not be the way you want to do things.

Can you think of a property of the prime factorization that would help you? And given that, what should one loop over?
 
  • #28
2,170
189
Can you think of a property of the prime factorization that would help you? And given that, what should one loop over?
I couldn't find it maybe more hint ?
 
  • #29
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
Why doesn't 18 work? 19 is prime, and it's not divisible by 4.
 
  • #30
2,170
189
Its a multiple of a square number. So it will not work.

I edited my code according to it. But I see no improvment on time.
 
  • #31
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
You need to pay more attention to what people are writing.

Can you think of a property of the prime factorization that would help you?
Nothing says d is the right variable to loop over
you have posted a solution based on exhaustive tests, and three times that's not the most efficient way to solve the problem
 
  • #33
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
I'll tell you in a few days. It is important that you think for yourself. You have plenty of hints.
 
  • #34
Vanadium 50
Staff Emeritus
Science Advisor
Education Advisor
29,939
15,618
Another hint. How many Sophie Germain primes are there under 50,000,000? Why do I care? Why do I care how fast I can find them?
 
Last edited:
  • #35
2,170
189
Another hint. How many Sophie Germain primes are there under 50,000,000? Why do I care? Why do I care how fast I can find them?
I find 229568 primes.

I looked at the definiton etc. However I did not quite understand how can we use them. For instance the solution of the number + 1 must be prime. These primes are,

(2,3,7,11,23,31,43,59,71,79,83)

However the Sophie German primes,

(2, 3, 5, 11, 23, 29, 41, 53, 83)

I couldn't make the connection. Some of them are in the solution but some of them are not.

Why do I care? Why do I care how fast I can find them?
Its important to optimize the codes..?

I added the condition of num / 2 + 2 = prime. My time now reduced to half, 24 second.
 
Last edited:

Suggested for: P357-Project Euler

  • Last Post
Replies
2
Views
773
  • Last Post
Replies
5
Views
1K
  • Last Post
Replies
27
Views
2K
Replies
3
Views
432
Replies
11
Views
511
  • Last Post
Replies
3
Views
586
Replies
25
Views
1K
Replies
4
Views
305
Replies
3
Views
597
Top