
Homework Help: Statistics help needed

  1. Nov 17, 2005 #1
    Hey guys I've got several questions about statistics.

    Here's the first one.

    1. Suppose X is a discrete random variable that takes on the three values x1, x2, x3 with probabilities p1, p2, p3 respectively. Describe how you could generate a random sample from X if all you had access to were a list of numbers generated at random from the interval [0, 1].

    2. Suppose X is a continuous random variable with c.d.f. F_X(x) = P(X <= x). To make things easier, suppose further that X takes on values in an interval [a, b] (do allow for a to be −∞ and b to be +∞).

    Let Y be the random variable defined by Y = F_X(X). This looks strange, but is perfectly
    valid since F_X is just a function, and you are allowed to take functions of random variables.

    Your problem: show that Y is distributed U[0, 1].

    Method: calculate the c.d.f. of Y, F_Y(y) = P(Y <= y), for all real y. Replace Y with F_X(X),
    and consider when you can take the inverse of F_X.
    Taking the inverse of F_X is not always possible, in particular, for x < a and x > b. Consider
    those cases separately.
    You will find that the c.d.f. of Y is 0 for y < 0, y for 0 < y < 1, and 1 for y > 1, so indeed
    Y is distributed U[0, 1].
    An important application of this fact is that if u is a random selection from the interval [0, 1], F_X^(-1)(u) is a random selection from X.
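
    The claim that Y = F_X(X) is uniform can be checked numerically before proving it. The following is only a rough sketch, assuming X is exponential with rate 1 (an illustrative choice, not part of the problem); the helper names exp_cdf and empirical_cdf are made up for this example.

```python
import math
import random

def exp_cdf(x, rate=1.0):
    """CDF of an Exponential(rate) variable: 1 - exp(-rate * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0

def empirical_cdf(values, y):
    """Fraction of the values that are <= y."""
    return sum(1 for v in values if v <= y) / len(values)

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.expovariate(1.0) for _ in range(20000)]  # samples of X (assumed Exp(1))
ys = [exp_cdf(x) for x in xs]                      # Y = F_X(X)

# If Y is U[0, 1], then P(Y <= y) should be close to y for each y in [0, 1].
for y in (0.1, 0.5, 0.9):
    print(y, round(empirical_cdf(ys, y), 3))
```

    A simulation like this is not a proof, but it shows what the c.d.f. calculation in the method should end up producing.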
  3. Nov 17, 2005 #2

    Tom Mattson

    Staff Emeritus
    Science Advisor
    Gold Member

    Regarding the part in blue:

    My browser is showing that as "X square x", as in there is a square-shaped symbol between the X and the x. Is that what you meant? Does anyone else see it differently? Your post is riddled with those squares.
  4. Nov 17, 2005 #3
    Concerning part 1: find the pdf of X and call it f(t). Consider the integral int(f(t), t, 0, x) = y, where y is one of the numbers generated randomly from (0, 1), and solve for x. Repeating this for each y generates a quasi-random sequence of samples of X.
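
    As a rough illustration of solving int(f(t), t, 0, x) = y for x, here is a minimal sketch. It assumes the made-up pdf f(t) = 2t on [0, 1], so the integral is x**2, and it solves the equation by bisection; the names cdf and solve_for_x are hypothetical.

```python
import random

def cdf(x):
    """Integral of the assumed pdf f(t) = 2t from 0 to x, i.e. x**2 on [0, 1]."""
    return x * x

def solve_for_x(y, lo=0.0, hi=1.0, tol=1e-10):
    """Find x in [lo, hi] with cdf(x) = y by bisection (cdf is increasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [solve_for_x(rng.random()) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)  # should be near E[X] = 2/3 for this pdf
```

    For this particular pdf the equation can be solved by hand (x = sqrt(y)); bisection is used only to show the idea when no closed form is available.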
  5. Nov 20, 2005 #4
    The squares are "less than or equal to" signs: P(X <= x)
    and P(Y <= y),

    and the other two are strict inequalities <.

    Sorry for the confusion.
  6. Nov 20, 2005 #5
    Hmm... what do you mean by int(f(t), t, 0, x) = y?

    Are 0 and x the limits of integration?
  7. Nov 20, 2005 #6
    My familiarity with part 1 comes from Monte Carlo methods for integration, in which a definite integral is approximated by the expectation value of a random variable (whose pdf matches the argument of the integral). To sample the random variable, call it x, one sets up the equation int(pdf(x), x, 0, y) = z, where pdf(x) is the probability density function of the random variable x, y is the variable to be solved for, and z is a random sequence chosen from an arbitrary distribution. As you can see, my example deals with the continuous case, but I would imagine that the discrete case, such as yours, may be dealt with in the same manner.
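
    A minimal sketch of the Monte Carlo idea described above, under stated assumptions: the pdf is taken to be exp(-x) on [0, inf), the random numbers z are taken to be uniform on [0, 1) as in post #3, and the integral being estimated is that of x * exp(-x), whose exact value is 1. The function names are made up for illustration.

```python
import math
import random

def sample_exponential(z):
    """Solve int(exp(-x), x, 0, y) = z for y: 1 - exp(-y) = z gives y = -ln(1 - z)."""
    return -math.log(1.0 - z)

def monte_carlo_integral(g, n, rng):
    """Estimate the integral of g(x) * exp(-x) over [0, inf) as the mean of g(X)."""
    return sum(g(sample_exponential(rng.random())) for _ in range(n)) / n

rng = random.Random(2)  # fixed seed so the run is repeatable
estimate = monte_carlo_integral(lambda x: x, 20000, rng)
print(estimate)  # the exact value of the integral of x * exp(-x) is 1
```

    The estimate is just a sample mean, so its error shrinks roughly like 1/sqrt(n).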
  8. Nov 26, 2005 #7
    Whoa... that's a little hard for me to understand. I don't think I've learned that yet.
  9. Nov 26, 2005 #8
    And here's another question:

    2. Suppose X is a discrete random variable that takes on values from {1, 2, 3, . . .} with probabilities
    {p1, p2, p3, . . .}. If u is a number selected at random from [0, 1] explain why

    min { n : p_1 + p_2 + ... + p_n >= u }

    can be considered as a random selection from X.
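
    One way to see why this works: the rule returns n exactly when p_1 + ... + p_(n-1) < u <= p_1 + ... + p_n, and for u uniform on [0, 1] that interval has length p_n. Below is a minimal sketch of the rule, assuming a made-up three-value distribution; the function name sample_discrete is hypothetical.

```python
import random
from collections import Counter

def sample_discrete(values, probs, u):
    """Return the value at the smallest n with p_1 + ... + p_n >= u."""
    cumulative = 0.0
    for value, p in zip(values, probs):
        cumulative += p
        if cumulative >= u:
            return value
    return values[-1]  # guards against floating-point round-off in the running sum

values = [1, 2, 3]
probs = [0.2, 0.5, 0.3]  # made-up probabilities, for illustration only

rng = random.Random(3)  # fixed seed so the run is repeatable
n_draws = 20000
counts = Counter(sample_discrete(values, probs, rng.random()) for _ in range(n_draws))
freqs = {v: counts[v] / n_draws for v in values}
print(freqs)  # relative frequencies should be close to probs
```

    The same function handles the three-value case x1, x2, x3 from the first question by passing those values and p1, p2, p3.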
  10. Nov 26, 2005 #9


    Science Advisor
    Homework Helper

    Don't you have to know according to which random distribution that list of numbers was generated? I think at the very least you have to assume that you know their distribution even if you don't assume what it is (e.g. uniform).