1. Not finding help here? Sign up for a free 30min tutor trial with Chegg Tutors
    Dismiss Notice
Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Probability of Random Numbers

  1. Jun 24, 2017 #1
    1. The problem statement, all variables and given/known data
    • Let 0≤p≤1.
    • Let there be k distinct numbers (they can be natural numbers) a1, a2, ... , ak, each repeating respectively b1, b2, ... , bk times.
    • Let q < ∑r=1k br
    Determine the minimal values of b1 ... bk such that the probability of q numbers chosen out of ∑r=1k br numbers being exactly the same is p.

    Example : consider 1,1,1,1,2 . The probability that 2 numbers chosen out of the 5 is exactly the same is 6/10

    2. Relevant equations

    The only ones I can come up with is
    • h = ∑r=1k br
    and
    • p = ∑r=1k qCb_r where q < br / qCh
    C stands for Combination , in contrast to permutation.

    3. The attempt at a solution

    Technically this is not a homework assignment, but needed for a project where a software needs to generate pseudo-random numbers, with some properties.

    I notice that we have too many variables and just two equations. But a minimal solution should be possible. I personally am stuck, but it looks like I am missing / forgetting some basic properties of probability - with which this problem can be easily solved.. I could apply brute force. The example was generated via brute force. But I am sure a more formal solution is available.

    This is why I classify this as precalculus homework..

    Please help. My background is Physics MSc - so don't hesitate to apply more advanced notation on the solution than precalculus, if needed,.
     
  2. jcsd
  3. Jun 24, 2017 #2

    FactChecker

    User Avatar
    Science Advisor
    Gold Member

    It seems as though the probability will always be a rational number. If p is irrational, the probability can never equal p. If you assume that p is rational, then maybe you can use its rational representation, p = m/n, to figure something out.
     
  4. Jun 24, 2017 #3

    Ray Vickson

    User Avatar
    Science Advisor
    Homework Helper

    For sampling without replacement (i.e., once a number is picked it becomes unavailable to be picked again) you need to have at least one of the ##b_i \geq q## in order to have any chance that you can get ##q## numbers the same. For sampling with replacement, this need not be the case, because with replacement you "put back" a number after it has been selected.

    I get 6/10 in your 11112 example, but only if the sampling is without replacement.

    There is a way of performing your calculations, but it is very computer-intensive. Let's say that all your ##b_i## are ##\geq q##. It was not stated clearly how many numbers you are choosing---you only said you wanted ##q## the same, so the number ##n## picked must be at least ##q##. So, we pick ##n## numbers without replacement. For each ##i = 1,2, \ldots, k##, let ##E_i## be the event that number ##i## is repeated exactly ##q## times in your sample of size ##n##. You want to know something like the probability that at least one of the events ##E_1, E_2, \ldots, E_k## occurs. However, this is still not sufficiently well-defined. Do you want just one of ##E_1, E_2, \ldots, E_k## to occur (so if ##q = 3## and you have 111, you cannot have 222 also, etc.), or do you allow several occurrences of ##q## identical numbers (so it would be OK to have both 111 and 222, etc.)? Each of these two cases can be treated using versions of the so-called inclusion-exclusion principle, but the final details are different in the two versions.

    Let
    $$\begin{array}{rcl}
    S_1&=& \sum_{i=1}^j P(E_i) \\
    S_2 &=& \sum_{i,j: \: i < j} P(E_i E_j) \\
    S_3 &=&\sum_{i,j,l: \: i < j < l} P(E_i E_j E_l) \\
    &\vdots& \\
    S_k&=& P(E_1 E_2 \cdots E_n)
    \end{array}
    $$
    (Here, ##P(AB)## means ##P(A\: \text{and}\: B)##, etc.)
    For example, if ##k=4## we have
    $$\begin{array}{rcl}
    S_1 &=& P(E_1) + P(E_2) + P(E_3) + P(E_4)\\
    S_2&=&P(E_1 E_2) + P(E_1 E_3) + P(E_1 E_4) + P(E_2 E_3) + \cdots + P(E_3 E_4) \\
    S_3 &=& P(E_1 E_2 E_3) + P(E_1 E_2 E_4) + \cdots + P(E_2 E_3 E_4)\\
    S_4 &=& P(E_1 E_2 E_3 E_4)
    \end{array}
    $$

    If we let ##E_{{}\geq 1}## denote the event that at least one of the ##E_i## occur and ##E_{=1}## the event that exactly one of the ##E_i## occur, we have:
    $$P(E_{\geq 1}) = S_1 - S_2 + S_3 - \cdots \pm S_k $$
    and
    $$P(E_{=1}) = S_1 - 2 S_2 + 3 S_3 - \cdots \pm k S_k$$

    OK, so how do we compute the ##S_1, S_2, \ldots, S_k##?

    To compute ##P(E_1)## we categorize the outcomes as "1" and "not 1". The probability of getting exactly ##q## 1s in ##n## picks is given by the hypergeometric probability
    $$P(E_1) = \frac{C(b_1,q) C(B_1,n-q)}{C(N,n)}$$
    Here, ##N = b_1 + b_2 + \cdots + b_k##, ##B_i = N - b_i## for each ##i##, and ##C(a,b)## is the binomial coefficient ##{}^aC_b##. Similarly for the other ##P(E_i)##, so we have
    $$S_1 = \sum_{i=1}^k \frac{C(b_i,q)C(B_i,n-q)}{C(N,n)} $$
    To compute ##P(E_1 E_2)##, categorize the outcomes into three classes, "1", "2" and "neither 1 nor 2", which contain ##b_1, b_2## and ##B_{12} = N - b_1 - b_2## members. Now we have a 3-class hypergeometric distribution, which is less familiar than the classic hypergeometric, but is still easy enough to get (from conditional probabilities, for example):
    $$P(E_1 E_2 E_3) = \frac{C(b_1,q) C(b_2,q) C(B_{12}, n - 2q)}{C(N,n)},$$
    etc.
    So, in principle we can compute ##S_2##.

    In a similar way we have
    $$P(E_i E_j E_l) = \frac{C(b_i,q) C(b_j,q) C(b_l,q) C(N-b_i-b_k-b_l, n-3q)}{C(N,n)}, $$
    etc.

    Of course, in any practical example, some of the ##P(E_i E_j E_l \cdots E_r)## may be zero, just because ##qr > n## for example.

    So, it is all computable in principle, but it will not be pleasant. Best of luck.
     
    Last edited: Jun 24, 2017
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook

Have something to add?
Draft saved Draft deleted



Similar Discussions: Probability of Random Numbers
  1. Random Number (Replies: 2)

Loading...