Efficiency: prime test vs prime generator

  1. Hi everyone,

    This is a simple question which has probably already been solved by someone, but I couldn't find it in the research anywhere, so maybe somebody here can help.

    First of all, say we have a program which uses a new algorithm that outputs the nth prime when you input n, I call it a prime generator. I don't (quite) have such a program, but say we did.
    say you input 10000 and it gives you the 10000th prime
    then you input 10001 and it gives you 10001th prime
    then you input 10002 and it gives you 10002th prime
    then you input 10003 and it gives you 10003th prime
    It continues in like fashion until it finds all primes in a given range, each of which requires some time to compute.

    now this program is racing against the fastest test program that we have so far, and this test program checks ALL ODD INTEGERS after the 9999th prime to SEE IF THEY ARE PRIME. After checking alot of composites, it finds the 10000th prime. then it checks more odds until it finds the 10001th prime. In like fashion it continues checking each odd until it finds all the primes in the given range.

    My question is, how efficient must the first program be to surpass the second program in efficiency?

    Obviously, this is a non-trivial question, because the generator doesn't need to test all of the composites in between each prime.

    Thanks for the help.
  2. jcsd
  3. NateTG

    NateTG 2,537
    Science Advisor
    Homework Helper

  4. clarification

    I guess that is exactly the question I have, how fast must it run to find ALL OF the primes in a range at the same rate that the second program finds them, given that the second program checks every odd between the nth prime and the n+1th prime, and the first program just computes the nth prime and the n+1th prime directly from input values n and n+1, not needing to test anything at all - particularly not even considering values between the two primes. No test, just input and output. Input 42, output the fourty-second prime. Maybe next you input 100 and output the one hundredth prime, it can skip around that way.

    Put another way, given that you know 2,3,5...31, find primes through 53.
    The second program tests 33,35,37,39,41,43,45,47,49,51, and 53 for primality. The first program inputs 12 and gets out the 12th prime, 37,
    inputs 13 and gets out 41, inputs 14 and gets out 43, inputs 15 and gets out 47, inputs 16 and gets out 53, and is done. 5 inputs, 5 primes output.
    The second program input 11 odd numbers to test for primality, and output 5 primes.
    For higher ranges, the second program might have to test hundreds of consecutive odds to find 5 consecutive primes, while the first program will still only have 5 inputs to output the 5 primes.
  5. Hurkyl

    Hurkyl 16,090
    Staff Emeritus
    Science Advisor
    Gold Member

    There are much faster ways of finding primes in a range than testing each number in the range individually... I don't know if that matters to you, though.
  6. shmoe

    shmoe 1,994
    Science Advisor
    Homework Helper

    The number of inputs doesn't really matter considering the work you're going to have to put into going from the nth prime to the (n+1)st.

    Do you think you can find the nth prime faster than a basic sieve algorithm can find the first n primes?
  7. EVERY prime in a range? If so, I didn't know that, and I'd be interested in a simplified version, plz. If not, then my question stands.

    Shmoe, I'll post my algorithm soon. It doesn't really find the nth prime given n, but it does find all primes in a range (somewhat out-of-order) without considering the composites in between, though it does have some repeats in the output. I expect less repeats as the primes get larger. I'll post it when I have a larger continuous block of time, hopefully monday.

  8. Hurkyl

    Hurkyl 16,090
    Staff Emeritus
    Science Advisor
    Gold Member

    It's called a sieve. Cross out every even number, then every number divisible by 3, then every number divisible by 5, and so on. What's left will be the prime numbers in the range.

    You can continue until you've gone through all primes up to the square root of the high end of your range, or you could just do it with "small" primes to eliminate most of the numbers in your range, then use a different test on the few that remain.
  9. shmoe

    shmoe 1,994
    Science Advisor
    Homework Helper

    Sieve of Eratosthenes:

    Start with a list of the numbers from 2 to n. Cross of every multiple of 2 (except 2). Go to next uncrossed off number, which is 3. and cross off every multiple of 3. Repeat. When the next uncrossed off number is larger than sqrt(n) stop. All the non-crossed off numbers left are prime.


    2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

    knock off multiples of 2:

    2 3 5 7 9 11 13 15

    knock off multiples of 3:

    2 3 5 7 11 13

    All that's left are primes. This is much faster than taking each number in turn and checking if it's prime or not.

    This is very simple and can be sped up somewhat. For example, you can precross off the 2's by storing your list of numbers as a binary string that represents odd numbers (the string 11101101 would mean the numbers 3 5 7 11 13 17 are prime but 9 and 15 are not). (you could also precross off 3, 5, etc..)
  10. Last clarification:

    I do know about the sieve, but you are still "looking at" composites when you eliminate each product of smaller primes, one composite at a time. I'm talking about a sequence of equations which intrinsically eliminates those primes, without striking out each multiple of them one by one. As an example, look at the absolute value of 15 plus or minus 2^x, for outputs less than 7^2=49. No such output can have a factor of 2,3, or 5. This is one of a sequence of equations which gets all primes less than 49. I write this as abs( 15 +/- 2^x ) = q(x), where q(x)<49. Actually, if you want the entire theory in a nutshell, look at
    abs( 3^y * 5^z +/- 2^x ) = q1(x,y,z) where q1(x,y,z)<49,
    then look at q2(x,y,z)= abs( 2^x*3^y +/- 5^z ) and
    q3(x,y,z) = abs( 2^x*5^z +/- 3^y), where qi<49. You can see that I've partitioned the set {2,3,5} into two disjoint sets, one on each side of the +/-. If you work this out, you will get every prime less than 49, depending on the integer exponents >0 that you input. When q1 gets "tired", move to q2, etc. For the next step in the algorithm, partition the set {2,3,...43} into two disjoint subsets, one on each side of the +/-, with each prime independently exponentiated with positive integers, and the output will be prime when it is less than 47^2=2209. To get all primes in that range, repartition the sets as necessary. I haven't actually gotten that high so far, but I have found every single prime less than 37^2=1369 using this method.

    Now, my question wasn't exactly about this algorithm, since the primes it finds come in no linear order and my algorithm does have repeated outputs i.e. it might produce 29 six times (though I expect the repeats to dwindle at high prime values). Also notice that, like the sieve, you can get a set "enriched" with primes simply by ignoring the less-than test, for example, 15 +/- 2^x will eliminate all composites with 2,3, or 5 as factors, even if we look at outputs higher than 49. My question was about an "ideal" program that produces the 49th prime when we input the number 49.

    Here's a simplification of what my question is about:

    Let's pick a range, say all x between 10^9 and 10^10.
    Now, let e=the efficiency of standard programs in proving x is either prime or composite, in terms of the time it takes.
    Let E=the time it takes to find (and prove the primality of) a prime number

    Then E = e * (the number of integers you tested before finding the prime)

    So, if you can prove a number is prime in 100 hours, and it takes my program 1000 hours to generate a prime, then e=100 for you and 1000 for me, yours is ten times more efficient than mine at PROVING PRIMALITY.

    However, if you examined 100 integers before finding the prime, then it took you 10000 hours to find the prime, so E=10000 for you and still only 1000 for me, and mine is ten times more efficient than yours for FINDING A PRIME.

    Now, I think my question is more clear, i.e. how efficient must a program of the second type be, in order to be as fast as the first type?

    I welcome all coments, both positive and negative criticism, although really, how can you criticize a question, when I'm not making any claim? I've already stated my algorithm has a setback or two, such at duplicate prime outputs. I just want to know, in theory, if we had an "oracle" program of this type, how fast would it have to be, in terms of the size of the primes, to top known techniques? Thanks for the comments, and I hope this helps clarify a question that I think is worth studying.
    Last edited: Oct 15, 2004
  11. NateTG

    NateTG 2,537
    Science Advisor
    Homework Helper

    If you want to be precise, if [tex]\pi_n[/tex] is the [tex]n^{th}[/tex] prime, then the second algorithm must run [tex]\frac{\pi_m - \pi_n}{m-n}[/tex] times faster, possibly with some constant factor thrown in.

    If we can approximate [tex]\pi_m[/tex] with [tex]\pi_n+(m-n)ln(x)[/tex]. Then first algorithm is going to run [tex]m-n[/tex] times, and the second algorithm is going to be run approximately [tex]ln(\pi_n) (m-n)[/tex] times, so the second algorithm would have to be roughly [tex]ln(\pi_n)[/tex] times faster.
  12. shmoe

    shmoe 1,994
    Science Advisor
    Homework Helper

    We're interested in finding all primes between 10^9 and 10^10?

    There are pi(10^10)-pi(10^9)=455,052,511-50,847,534=404,204,977~4*10^8 of them.
    If you were to naively check 10^9+1, 10^9+3,... for primality, you'd be doing 4.5*10^9 checks, this is pretty close to 11 times number of primes in this range. No suprise since the density of primes around here is about 1/21.

    In short, your oracle program will have to be able to find the nth prime in pretty close to the time it takes to test eleven 9 digit numbers for primality. I have no idea the actual number of seconds it will take for various primality tests, I've only ever looked at big-O bounds, which can be decieving in practice.

    About your proposed algorithm- it looks familiar. Have you posted it on this or another website before? A big problem with it is how do you guarantee you've found all primes in the specified range? For example, how do you guarantee that there are no values of x, y, z that will give 2^x*3^y-5^z=21? What range of x, y, z do you propose it's sufficient to check? Obviously we know this will never be the case, but that's because we can factor 21 easily, for larger instances it won't be a trivial question. I'm getting deja vu.
  13. Nate,
    Thanks for the specifics, I'm going to print this out and study its implications.
    Yes, I posted it here long ago on the old physics/math site. The only reason I brought it back was because I realized that, inefficient though it might be, it may make up for that inefficiency by intrinsically skipping the composites, particularly for mega-sized primes, especially if we can find an "intelligent" algorithm for picking partitions and exponents that output the smallest possible outputs.

    You have seemingly asked 2 questions:
    1. how do I know all outputs are prime and
    2. how do I know I haven't skipped any primes i.e. all primes are outputs.

    (2.) I don't. I still haven't proven that my procedure will find ALL primes. Any suggestions on how to do this?

    (1.) Using the distributive principle, if the output has a factor that is on one side of the +/-, then that factor must also be on the other side of the +/-. Since this is not the case, by design, we must conclude that the output has no factor in common with either side of the +/-. When we used 15+/-2^x=q(x), q(x) can never have 2,3, or 5 as factors.
    Also, since we are looking only at q(x)<{the next prime squared} i.e. 49 in the example right above, then the square root of q(x) is less than the next prime. Since all primes less than the next prime are on one side or the other of the +/-, and q(x) can't have them as factors, and they are the only primes less than the square root of q(x), then every prime less than sqr root of q(x) doesn't divide q(x).
    Therefore, q(x) is prime.
    Thanks to everyone again for the help.
    Last edited: Oct 15, 2004
  14. Hurkyl

    Hurkyl 16,090
    Staff Emeritus
    Science Advisor
    Gold Member

    You can't neglect the sieve because it is so useful. Crossing out all multiples of, say, the primes less than 1000, then running a primality test on what remains will be faster than running a primality test on all odd numbers in the range.

    On your sequence, I'm curious why you chose that particular form. Once you've partitioned your small primes into two sets, all you need to do is consider things of the form:

    p * P1 + q * P2 where
    P1 is the product of all the primes from the first set
    P2 is the product of all the primes from the second set
    p is an integer with no factors from the second set
    q is an integer with no factors from the first set

    Also, since gcd(P1, P2) = 1, it's fairly straightforward to show that all other primes will be of this form. (even if you never change your partitioning!)

    Also, note that each of your expressions are also of this form.

    However, the problem is, and you've noticed this too, is that you only guarantee that the resulting number will not be divisible by any small prime. So, in terms of "enriching" your search space, you do no better than a simple sieve.

    More explicitly, if you use all the primes < 1000 as your small primes, then numbers generated by your method only have the guarantee that they aren't divisible by any prime under 1000. In terms of "enriching" your search space, you've done no better than simply sieving by all primes under 1000.
  15. shmoe

    shmoe 1,994
    Science Advisor
    Homework Helper

    I was just wondering about 2, 1 is obvious. I used 21 in my example because you claimed (and I don't doubt this) that your method generated all primes less than 49, so 21 would be on your list of numbers not generated and therefore you would hope to conclude it was not prime. In retrospect I should have chosen a more abstract or less obvious example.

    For 2, it may turn out to be false. You do have a staggering number of variables to play with, but an exception is still possible, I don't know. One of the joys of number theory is that this exception may be enormous and impossible to find by experimentation. Even if this turns out to be true, you'd need some sort of reasonable bounds on your exponents (which guarantee each prime will have a representation within those bounds) to have any shot at all of making this practical and beating sieve methods.
  16. I've been on fall break, so haven't had a chance to answer - sorry.
    To answer an earlier question more accurately, I have posted this before but I'm not sure it was here. The physics/math forum I referred to was on sciforums.com, not here.
    I believe, based only on the experience of having solved so many of these, that it will be possible to put upper bounds on the exponents. Let me illustrate what seems to be a "natural phenomenon" when doing these computations.
    13,11,7,1,17,49stop subtracting
    17,19,23,31,47,79stop adding
    Now increment exponent on 3:
    and keep going, incrementing exponents on the 3 and 5 as needed.
    At some point, you will increment an exponent on 3 or 5 to get the next larger size on the left, and you will get NO OUTPUTS less than 49, particularly with subtraction. I call this the beginning of a "desert", where no outputs are to be found.
    At that point, you could stubbornly try higher exponents on the left, until you find a power of 2 on the right that produces more solutions within 49. It might be possible to find all primes in this fashion, BUT...
    The interesting thing that I've discovered is that, so far, this isn't necessary. At the point you reach this desert, simply repartition the set and continue:
    abs(10+/-3^x) and abs(6+/-5^x)
    incrementing powers on the left as needed. So far, this has been sufficient to find all primes in the given range.
    As far as the staggering number of variables, very true. The fact that I haven't found an exception so far is encouraging, particularly when you consider how unlikely it seems that I would get any sufficiently small solutions at all for some of them. The highest I've gone so far included partitioning and exponentiating primes from the set {2,3,5,7,11,13,17,19,23,29, and 31} and I found every prime from 37 to 37^2=1369, without going beyond the natural "desert" that I mentioned.
    One way I've thought about proving no primes are skipped is the following:
    29, 30, 31, 32, 33, 34, 35, ...... ,58, 59, 60, 61, 62
    0, 1, 2, 3, 4, 5, 6, ........,29, 30, 31, 32, 33
    subtract a bottom number from the corresponding top number to get 29. Now, to fit my abs( +/- ) scheme, every top/bottom pair must have a 2, a 3, and a 5 in it, along with no other primes (though as Hurlyl noticed, putting in an extra prime here or there changes nothing, I prefer keeping things "pure" until I can decide if it works or not for every prime). Now, 2 is in every pair. 3 is in two-thirds of the pairs, and 5 is in two-fifths of the pairs. So to find a pair that has 2,3, and 5 in it, we only need to look at 2/3 * 2/5 = 4/15 so out of 15 such pairs, four of them will have 2,3, and 5 as factors of one number or the other. It get's trickier when we try to eliminate higher primes from the pairs, but notice that five-sevenths of them will NOT have a seven as a factor, nine-elevenths will NOT have 11, eleven-thirteenths will NOT have 13, fifteen-seventeenths will NOT have 17, etc. So multiply all of these by the 4/15 that we found above. Where do we cut it off? I'm still working out the details, but it looks like if our fraction is greater than 1/n, and n is greater than any of the primes considered so far, we can say we've found a representation of the prime that fits the form given. Or something like that, like I said I'm still looking at it.
    As far as enriched sets goes, I agree that the sieve does the same thing. All I'm saying is that I don't have to waste time striking out every composite that has all of the low primes as factors, every output is automatically relatively prime to these.
    To all interested, I suggest working out all of the primes less than 121 this way, doing the process may answer questions more efficiently than I can, and if you still don't have "faith" in this process after trying it out, I'll be very, very surprised. To start out, look at f(x)=abs(105+/-2^x), it is a fascinating exponential expression, with reflection where it becomes negative, and every prime that it puts out occurs on integer values of x. Graphing some of these things in three dimensions is interesting, also, though adjustments are needed because the exponents are all positive, and the "plus" gets graphed separate from the "minus", etc. If you all have a spare minute, please try a couple, you'll be surprised at how easy it is to fill the whole range from 11 to 121. Remember, I did all primes to 1369 in my head, first checking odds by division to see if they were prime, and then finding a form that fits the procedure given.
    You will also get a single output that isn't usually considered prime, the number 1. 1=3-2=10-9=15-14 etc. all come from the correct form. I believe it wouldn't be too hard to prove that you can always get the number 1. Also, I note the similarity with the theorem that says you can get every multiple of the gcd of two numbers by using linear combinations of them. I wonder if you can get all RELATIVELY PRIME multiples of the gcd of the left and right sides, i.e. all multiples of 1 that are relatively prime to the left and right, using linear combinations of the left and right with the coefficients restricted to factors already present in that side. Ex: Left->15, Right->2 now use a coefficient having only 1,3,5 as factors, to multiply by the left, and a coefficient having only 1,2 as factors to multiply by the right, then subtract the two results. Can we find all multiples of 1, relatively prime to 15 and 2, with this restricted linear combination? Good question, I think the answer is "yes", at least if we use the other two partitions in the same way.
Know someone interested in this topic? Share a link to this question via email, Google+, Twitter, or Facebook