
Modular Hypergeometric Sieve and Parallel Processing

  1. Dec 2, 2006 #1
    [Note: P and p throughout are probabilities]

    Fact: The hypergeometric distribution gives the exact probability of the number of successes when drawing without replacement from a dichotomous (success-failure) population.

    A family of probabilistic sieves can be constructed using this fact as follows.

    Step 1: Suppose we know all of the primes less than the square root of some natural number N; there are [itex]\pi(\sqrt{N})[/itex] of them.

    Step 2: Draw one prime at random from this set to be tested as a factor of N. If there are M prime divisors of N below its square root (N is composite), the probability of getting a divisor on the first draw is given by

    (1) [tex]P(X=1)=h(1;1,M,\pi(\sqrt{N})) = \frac{C(M,1)C(\pi(\sqrt{N})-M,0)}{C(\pi(\sqrt{N}),1)}[/tex]

    Where C(x,y) is the number of combinations of size y that can be taken from x.

    Generalization 1: Generalize this to sample size n by using n draws without replacement and get

    (2) [tex]P(X\geq 1)=\sum_{x=1}^n h(x;n,M,\pi(\sqrt{N}))[/tex]
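Equations (1) and (2) can be sanity-checked with stdlib combinatorics. A minimal Python sketch; the values of M and K (standing in for π(√N)) are illustrative toy numbers, not from the post:

```python
from math import comb

def hypergeom_pmf(x, n, M, K):
    """P(exactly x successes) when n primes are sampled without
    replacement from K = pi(sqrt(N)) primes, M of which divide N."""
    return comb(M, x) * comb(K - M, n - x) / comb(K, n)

def p_at_least_one(n, M, K):
    """Equation (2): P(X >= 1), summed over x = 1..min(n, M)."""
    return sum(hypergeom_pmf(x, n, M, K) for x in range(1, min(n, M) + 1))

# Toy numbers: N = 210 has the four prime divisors 2, 3, 5, 7 below
# sqrt(210), drawn from the six primes below sqrt(210): {2, 3, 5, 7, 11, 13}.
M, K = 4, 6
print(p_at_least_one(2, M, K))        # chance a sample of 2 contains a divisor
print(1 - hypergeom_pmf(0, 2, M, K))  # same value via the complement
```

Computing the complement 1 − P(X = 0) gives the same number with a single term, which is the shortcut Hurkyl mentions further down the thread.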

    Generalization 2: Use parallel processing by partitioning the set of primes less than root N between two computers (S1 and S2), where S1 is given the whole set minus one prime peculiar to S2, and S2 is given the whole set minus one prime peculiar to S1. The probability of finding a divisor is then

    (3) [tex]P_{S1}(X\geq 1)+P_{S2}(X\geq 1)=\sum_{x=1}^n h(x;n,M,\pi(\sqrt{N})-1)+\sum_{x=1}^n h(x;n,M,\pi(\sqrt{N})-1)[/tex]

    Another discrete distribution that can be used to try to maximize the likelihood of finding a divisor in a random search is the negative binomial distribution. In this case we start with the set of primes less than root N and assume there are M divisors in this set (N is not prime). Now after each trial we replace the number we removed and draw again at random. The negative binomial distribution gives the probability that we will find a divisor in n trials.

    (4) [tex]nb(x;r,p)=C(x+r-1,r-1)p^r(1-p)^x[/tex]

    Apply this to our problem with [itex]p = \frac{M}{\pi(\sqrt{N})}[/itex], r = 1, and x the number of failures (non-divisors) that precede the first success (finding a divisor). Since we are only looking for one success, we get

    (5) [tex]nb(x;1,\frac{M}{\pi(\sqrt{N})})=C(x,0)\frac{M}{\pi(\sqrt{N})}\left(1-\frac{M}{\pi(\sqrt{N})}\right)^x[/tex]
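Equations (4) and (5) take only a few lines to sketch, and with r = 1 the negative binomial collapses to the geometric distribution, which is a quick consistency check. The value of p below is an illustrative placeholder for M/π(√N):

```python
from math import comb

def neg_binomial_pmf(x, r, p):
    """Equation (4): P(exactly x failures before the r-th success),
    sampling with replacement, success probability p per draw."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# With r = 1 (equation (5)) this is the geometric distribution:
# x non-divisors drawn before the first divisor is found.
p = 4 / 6   # illustrative placeholder for M / pi(sqrt(N))
for x in range(5):
    assert abs(neg_binomial_pmf(x, 1, p) - p * (1 - p)**x) < 1e-12
print(neg_binomial_pmf(0, 1, p))   # chance the very first draw is a divisor
```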

    This is intuitive in that the probability of getting x failures before finding the first divisor decreases if there are more divisors (M) below root N.

    Now that (barring any mistakes in the details) describes a method for using a simple draw-and-replace or draw-without-replacement process for finding divisors of N. But it relies on prior knowledge of the set of primes less than root N. We can get around that somewhat by using modular sieves to produce proxy sets. For example, using base six, you first have to clear the register by showing that N is relatively prime to six, but then for 36<N<49 the primes less than root N are proxied well by the set {1,5}.
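The proxy set can be generated mechanically as the reduced residue system mod the base. A hedged sketch (proxy_set and coprime_to_base are hypothetical helper names, not from the post):

```python
from math import gcd

def proxy_set(a):
    """Reduced residue system mod a: residues in [1, a) coprime to a."""
    return [r for r in range(1, a) if gcd(r, a) == 1]

def coprime_to_base(N, a):
    """'Clearing the register': check that N shares no factor with the base."""
    return gcd(N, a) == 1

print(proxy_set(6))     # [1, 5] -- the proxy cited for 36 < N < 49
print(proxy_set(30))    # [1, 7, 11, 13, 17, 19, 23, 29]
print(coprime_to_base(41, 6))   # True: 41 survives the base-6 check
```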

    We will never know the number of divisors less than root N in advance, but there are ways of estimating the size and number of divisors. Even without that, we can look at the discrete distributions given above and choose the experiments that have the best chance of finding a divisor in the fewest steps.

    Philosophically it is interesting that these discrete distributions get used all the time to model real-world phenomena. Furthermore, I skipped the binomial experiment, which is also possibly useful here, and that distribution limits out as a Poisson. In fact the condition for that limit is that

    (6) [tex] b(x;n,p) \rightarrow p(x;\lambda) [/tex]

    as [itex]n\rightarrow\infty[/itex] and [itex]p\rightarrow 0[/itex] in such a way that [itex]np = \lambda[/itex].

    Well, in the above experiments n does go to infinity as N goes to infinity, and if we are using replacement (we must for this setup) the probability p of success in finding a divisor during any trial is precisely [itex]\frac{M}{\pi(\sqrt{N})}[/itex]. So the product

    (7) [tex] P(X=1)=b(1;n,p)=C(n,1)p(1-p)^{n-1}=np(1-p)^{n-1}[/tex]

    (8) [tex] P(X=1) = \frac{nM}{\pi(\sqrt{N})}\left(1-\frac{M}{\pi(\sqrt{N})}\right)^{n-1}[/tex]
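Equation (8) is a one-liner. A small sketch with illustrative toy numbers (K stands in for π(√N), and the single-divisor case M = 1 anticipates the RSA-style example below):

```python
def p_one_success(n, M, K):
    """Equation (8): binomial P(X = 1) with p = M / pi(sqrt(N)) = M / K,
    when n primes are sampled with replacement."""
    p = M / K
    return n * p * (1 - p) ** (n - 1)

# Illustrative: a single divisor (M = 1) among K = 100 primes.
K = 100
for n in (1, 50, 100, 200):
    print(n, p_one_success(n, 1, K))   # rises with n, then falls off
```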

    The Poisson approximation requires

    (9) [tex]\lim_{n\rightarrow\infty,\,p\rightarrow 0} np = \lim \frac{nM}{\pi(\sqrt{N})}=\lambda>0[/tex]

    Is there a way within this context that we could allow the limit to proceed in this manner? Suppose that we know there is only one divisor less than the square root of N (some RSA numbers are like this); then we have

    (10) [tex] p = \frac{1}{\pi(\sqrt{N})}[/tex]

    This clearly goes to zero as N goes to infinity. Then, since [itex]n\leq\pi(\sqrt{N})[/itex],

    (11) [tex] np = \frac{n}{\pi(\sqrt{N})}\leq\frac{\pi(\sqrt{N})}{\pi(\sqrt{N})}=1[/tex]

    So there is a lambda that works, and we could say that the binomial distributions limit out at a Poisson distribution with lambda less than one.

    This limiting distribution must be related to the prime number theorem in some way.

    The distribution is

    (12) [tex]p(x;\lambda^*)<p(x;1)=\frac{e^{-1}}{x!}\qquad(x\geq 1)[/tex]
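A quick numerical sketch of this comparison. Note that the inequality holds for x ≥ 1, where the Poisson pmf is increasing in lambda on [0, x]; at x = 0 it reverses, since there the pmf is just e^(-lambda). The lambda* value below is a placeholder, since the post leaves it undetermined:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson pmf p(x; lambda) = e^(-lambda) * lambda^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

# For x >= 1, any 0 < lambda* < 1 gives p(x; lambda*) < p(x; 1) = e^(-1)/x!.
# (At x = 0 the comparison reverses: e^(-lambda*) > e^(-1).)
lam_star = 0.5   # placeholder; the post leaves lambda* undetermined
for x in range(1, 6):
    assert poisson_pmf(x, lam_star) < poisson_pmf(x, 1.0)
print(poisson_pmf(1, 1.0))   # e^(-1), about 0.3679
```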

    lambda* must be a fundamental constant describing the distribution of primes.
     
    Last edited: Dec 2, 2006
  3. Dec 2, 2006 #2

    Hurkyl

    Staff Emeritus
    Science Advisor
    Gold Member

    I have a guess as to why it's so hard to read your writing; I think it's simply disorganized. You don't have any opening remarks that tell the reader what you intend to show, you don't have any closing remarks that tell the reader what you have shown, and the text of your post doesn't have any syntactic cues to make the important stuff stand out.


    It almost looks like you're trying to suggest that randomly picking numbers and doing trial division is a good way to find factors of N! Even if you intend to use trial division, the optimal way is to trial divide by the most likely prime, then the next most likely prime, and so forth. For a "random" number, that means 2, then 3, then 5, and so forth.

    The average number of (nontrivial) factors smaller than sqrt(N) that a number of size N has is easy to compute. 2 divides half of the numbers of size N, 3 divides a third, 4 divides a quarter, et cetera. The expected number of factors is approximately (1/2) log N - 0.422...
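Hurkyl's estimate can be checked numerically: the expected count of nontrivial divisors up to sqrt(N) of a "random" integer near N is the sum of 1/d for 2 <= d <= sqrt(N), which is approximately (1/2) log N − (1 − γ), and 1 − γ ≈ 0.4228 matches the 0.422... above. A sketch (the helper name is hypothetical):

```python
from math import log, isqrt

EULER_GAMMA = 0.5772156649015329

def expected_small_factors(N):
    """Sum of 1/d over 2 <= d <= sqrt(N): the expected number of
    nontrivial divisors up to sqrt(N) for a random integer near N."""
    return sum(1 / d for d in range(2, isqrt(N) + 1))

N = 10**12
exact = expected_small_factors(N)
approx = 0.5 * log(N) - (1 - EULER_GAMMA)   # (1/2) log N - 0.4228...
print(exact, approx)   # the two agree to several decimal places
```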
     
  4. Dec 2, 2006 #3
    Come on now, studly. I like that: dismiss everything written there by saying it is not clear, and then put forth some simple little pedantic fact to make yourself seem beyond it all. Your description of what you think I said is hardly close. Do you even know what the hypergeometric or the negative binomial distributions are? Your attitude is what bothers me about you; you do not meet the character qualifications for being a mentor, even if your knowledge might suggest otherwise. If you have comments, comment on the specifics of the post. In fact my post says nothing about trial division as a way of factoring N! N! never gets mentioned. As for my writing style, here I am a bit more lax, but my general ability was good enough to get my master's degree. What degree do you have again? Have you had to write any documents over ten pages, or at all, for review by a panel of PhDs? What a dweeb!
     
    Last edited: Dec 2, 2006
  5. Dec 2, 2006 #4

    Hurkyl


    Fine.


    Nowhere in your post do you construct a sieving algorithm. In fact, the only mention of a sieve at all is an aside for producing a "proxy" for the set of primes.


    Here would be a good time to describe in which N you're interested. Very small N? Small N? "Large" N? Random N? RSA N?

    Fine. But since I'm pointing out the problems, I might as well nitpick and point out that you're forcing the reader to guess what X means, and you meant to use {} instead of () for your square roots.

    Three typos: surely you meant "Generalization 1:" and

    [tex]P(X \geq \mathbf{1})=\sum_{x=1}^nh(\mathbf{x},n,M,\pi(\sqrt{N}))[/tex]

    Incidentally, it's usually easier to compute the odds of "at least one success" by first computing the odds of "exactly zero successes".


    That's not a partition. Partitioning the primes would make sense, but your later writing suggests that you really do mean to use the unusual scheme that you describe. You never give any hint as to why you would use it.

    (Assuming you made the same typos as before) [itex]P_{S1}(X \geq 1) + P_{S2}(X \geq 1)[/itex] is not

    P(S1 finds a divisor or S2 finds a divisor).

    In fact, it can even be greater than 1. (e.g. N = 6) I suppose this is a reasonable approximation if you think it very unlikely they could both find a divisor, but since you haven't described N yet, that's not a reasonable assumption.

    How are you planning on using any sort of distribution to maximize any sort of likelihood? You certainly didn't do it above, nor do you do it immediately below... you do make a nod to it later, and my comment is there.

    If N is "random", I've told you how to maximize the likelihood of finding a divisor of N, if your procedure is to test a sample of n primes to see if they divide N.

    Fine. (I didn't check the math) Why are you now interested in getting exactly one success, when you were previously interested in at least one success?

    This is confusing. It's technically correct, because "Generalize this to sample size n by using n draws without replacement" is technically a description of sampling without replacement, but the bulk of what you wrote previously is an analysis of the probability of success. (with the implicit assumption that you're sampling uniformly)


    This is a major source of my confusion. The words you have written here make it sound like you are seriously proposing randomly picking primes and doing trial division as a method for finding factors of large N, and that we use the above probabilities of success to optimize this approach. (Incidentally, knowing probability of success is entirely different than number of steps)

    But you seem learned enough to know that trial division is asymptotically bad, and that it always takes more tries on average to find something by sampling with replacement than by sampling without replacement. So I don't know what to make of this at all.

    I'm tired, so I'm going to stop at this point.


    Unfortunately, part of my job as mentor is also to be moderator, and I have to balance my suspicion that you are knowledgeable, and could write interesting stuff if you cleaned it up, against the reality of what you actually post. Despite this last outburst, I still plan on giving you the benefit of the doubt. Alas, I have a history of extending that for far too long. :frown:


    P.S. sometimes, exclamation points are used for emphasis, rather than to denote a factorial. :tongue:
     
  6. Dec 2, 2006 #5
    Ok, I do admit that I could try to write a small introduction; that would help. But I have looked at a lot of your posts, and I rarely see you do anything except post some formula that anyone could get from a textbook. You want my respect? Show me that you don't need to make excuses for attacking things you don't understand, and start being a mentor. Moderator is supposed to be a position of resolving disputes, not the opportunity to dispose of those who challenge you.

    BTW, you did not answer the question: do you or do you not know what the hypergeometric and negative binomial probability distributions are? Quick, call up Wikipedia! I've reported you this time, hoping that you are just a rogue with an ego complex, but maybe you are actually the admin too. Sad if that's true.

    Now if you want to come back at this post and start talking like someone who wants to understand but doesn't know everything, I would be happy to do the same.

    As for your observation that N is more likely to be divisible by small primes: once you understand the experiments being formulated above, we can factor that in. It is not needed to conceive of the experiments, but it will be useful in fine-tuning them. And please tell me you have enough peripheral vision to see that in fact all of those experiments generalize to sieves.

    Though I really shouldn't have to, I apologize for losing my temper with you. I am going to do my best to ignore the attitude that resonates through your posts. It only infuriates because I have seen it so many times and in so many young mathematicians before. You show me that you know what it means to be humble, and admit that you did not understand the post without attacking me, and we can talk. But I'll be damned to hell if I am gonna sit back here and let you grandstand a post that you clearly did not understand but are not humble enough to ask about.

    Odds? We are talking probability here. Odds are related to probability, but they are for the race track and logistic regression. Do you know how to formulate a probability space? Anyways, I'm getting worked up again.

    Using a reduced residue system mod a as a proxy for the set of primes less than root N requires first showing that N and the base a are coprime. Think about this now: it makes sense to use the primorials for the sequence of possible bases, because that clears out the highly likely small factors. It is true that a randomly chosen N will be divisible by 2 half the time, by 3 a third of the time, by both 2 and 3 a sixth of the time, etc. That means the simple base-six modular factoring algorithm accounts for 1/2 + 1/6 = 2/3 of all natural numbers just by using the Euclidean Algorithm to show N coprime to 6, and base 30 accounts for 22/30 = 11/15 > 66%. The ideas above, however, are looking at the set of possible x's that come from setting up the equation

    [tex]N = az+r[z]=(ax+r[x])(ay+r[y])[/tex]

    [tex]z = axy + r[x]y+r[y]x+\lfloor r[x]r[y]/a\rfloor[/tex]

    That can be solved for x, and then rather than searching the possible x's sequentially you can just randomly pick x's and use the above (well-known) discrete probability distributions to estimate how long it will take. True, the same principle holds: the small x's are possibly proxies for small primes larger than the base. The question that comes up is when it is advisable to keep doing trial division and when you gain more by stepping into the modular algorithm. I say working via the algorithm is better because I do not need to know the primes, and the set of possible x's here is an approximation to the number of primes less than root N. (Note the x's here are not the same as the x that appears in the distributions in the first post.)
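The identity behind that pair of equations can be checked numerically. A hedged sketch (check_identity and split are hypothetical helper names), writing each factor X as ax + r[x] in base a and verifying the quotient formula for z = floor(N/a):

```python
def split(n, a):
    """Write n = a*q + r with 0 <= r < a; returns (q, r)."""
    return divmod(n, a)

def check_identity(N, X, Y, a):
    """Verify z = a*x*y + r[x]*y + r[y]*x + floor(r[x]*r[y]/a),
    where N = X*Y, z = floor(N/a), and (x, r[x]) splits X in base a."""
    assert N == X * Y
    x, rx = split(X, a)
    y, ry = split(Y, a)
    z, _rz = split(N, a)
    return z == a * x * y + rx * y + ry * x + (rx * ry) // a

# Example: N = 77 = 7 * 11 in base a = 6.
print(check_identity(77, 7, 11, 6))   # True
```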

    As for why the hypergeometric looks for at least one success and the other looks for only one: the hypergeometric is based upon drawing a sample of size n from the set of possibilities. We want to understand not just how to get one divisor, but what happens if, say, we drew a set of five and there were two divisors in it; this could happen, so for the hypergeometric we want P(X>=1) to account for our chances of success. Calculating the probability of zero means nothing here. Now the negative binomial distribution corresponds to an experiment where we are drawing one item at a time, replacing it, and then counting how many failures happen before a certain number of successes. With that one we only want the probability of how many failures precede the first success.

    Now here is my admission, and this is real painful: perhaps some of the details of the equations need to be ironed out. But the way to handle that is to have a discussion. You see, that is how mathematicians work together, because no matter what the mid-level dweebies have pounded into you, in order to work with another human being you have to accept that many errors are going to be made on the way to realizing a good idea. Please do me a favor and lighten up a little bit. This is not a professional journal; at best it is a scratchpad for the sharing of ideas (hopefully good ones, but certainly many not so good). If we wanted someone to regurgitate Euler's best, I know dozens of PhDs who pride themselves on being technicians like that.

    Don't be afraid to think for yourself and use your abilities in math processes to seek out answers to whatever you are curious about. The fundamentals of how to do math, of how to prove theorems, are not going to let you down. Sometimes we all miss stuff, and I for one am not going to jump all over you if I see that you missed something.

    The most important skill you can get in this world is not how to solve an ODE; it is how to treat people (whether they are good, bad, right, or wrong) in a way that dignifies yourself and in humility calls them to something better. Because I know this, I am going to apologize if I have lost my temper with you in the past. You've offered your opinion of me in a tone that I find insulting, so I will offer mine of you. I think you are young (possibly adolescent, though maybe in your twenties) and have been around people who allow you to act poorly and have maybe treated you in the same way you now often treat others. It will be better for you to forget all the math you ever knew and to learn how to love people. That is not a justification for saying things that are false, but grace is part of the public forum, and this is a public forum. Instead of attacking people who post things you don't like, try being a real mentor and coming alongside those people if they will allow it. If not, simply post your objection, taking time to make your case, and then be patient.

    If you would like to have a dialog about one of those experiments I mentioned above let me know, PM me whatever. I'm here for you. But please don't act like you got it and then tear them down when it is clear you don't have a clue.

    Here is a reference for those distributions, but any good stats book will have them in it.

    Devore, Jay L. "Probability and Statistics for Engineering and the Sciences". Brooks/Cole, Monterey, 1982.

    Jay Devore was a professor at Cal Poly back in those days. I don't know what he is doing now.
     
    Last edited: Dec 2, 2006
  7. Dec 3, 2006 #6

    Hurkyl


    You completely miss the point of my objection. You appear to consider two algorithms:

    (1) Randomly choose a prime (without replacement) and trial divide.
    (2) Randomly choose a prime (with replacement) and trial divide.

    Then, you analyze (1) by considering the likelihood of getting at least one success in n trials, but you analyze (2) by considering the likelihood of getting your first success on the n-th trial. Why? Why not analyze both algorithms in the same way?

    More importantly, why make this analysis at all? The only hint as to why you might be interested in these distributions is

    But even without that we can look at the discrete distributions given above and choose experiments that have the best chance of finding a divisor in the shortest number of steps.​

    but the above distributions aren't (directly) comparable since they measure different things. And as I have already suggested, there's a more direct proof that on average, (1) will find a divisor before (2). (And that looking at primes in sequential order will beat (1))


    Allow me to remind you that, for this random variable,

    [tex]P(X = 0) + P(X \geq 1) = P(X \geq 0) = 1,[/tex]

    so knowing P(X = 0) allows you to compute [itex]P(X \geq 1)[/itex].



    This is a good example of why I urge you to organize and otherwise clean up your posts. The ideas above suggest that you were about to say "instead of using primes, we could use things relatively prime to a". But you don't say that, and instead talk about looking at solutions to some equation. Thus, I'm left wondering whether or not that's your intent.

    Now, your equation is problematic. You introduce several new variables, and write down an equation that doesn't make sense in any interpretation of those variables that I've imagined. My best guess is that you meant N = xy to be a factorization of N, and that you meant to write

    [tex]
    N = \left(a \left\lfloor \frac{N}{a} \right\rfloor + r[N] \right)
    = \left(a \left\lfloor \frac{x}{a} \right\rfloor + r[x] \right)
    \left(a \left\lfloor \frac{y}{a} \right\rfloor + r[y] \right)
    [/tex]

    where

    r[n] = the residue of n modulo a lying in the range [itex]0 \leq r[n] < a[/itex].

    But you talk about solving for x, which doesn't make sense.

    Now, if you had said "If N is relatively prime to a, then instead of using primes, we could use things relatively prime to a", then I could safely assume this was meant to be a proof and ignore it and the problems contained therein. But I don't have that option.

    I don't mean this to be an attack, or an insult, or anything along those lines: I really think it's a good piece of advice. If you made your ideas clear, then these little errors and omissions wouldn't matter so much: the reader can simply say "Yah, I believe that" and move onto the next point. When possible, it's bad to make the reader have to follow the details to get the overview, especially when the details have gaps and errors!




    *moderator hat on* For the record, I consider this to be one of the worst attitudes one can have. Rejecting criticism because you're the superior being is a quick way to be shown the door.
     