Homework Help: Probability Distribution and Confidence Interval

  1. Oct 27, 2011 #1
    1. The problem statement, all variables and given/known data

    Let X1, X2,....Xn be a random sample from the distribution with probability density function

    [itex]f_X(x;\theta) = (\theta+1)(1-x)^{\theta}[/itex], 0 < x < 1, θ > -1

    a) What is the probability distribution of [itex]Y = -\sum_{i=1}^{n} \ln(1-X_i)[/itex]?
    b) Suggest a (1-α)100% confidence interval for θ based on [itex]Y = -\sum_{i=1}^{n} \ln(1-X_i)[/itex].

    2. Relevant equations

    3. The attempt at a solution

    a) I began by transforming the equation.
    [itex]Y = -\sum_{i=1}^{n} \ln(1-X_i) = -\ln((1-X_i)^n)[/itex]
    [itex]e^Y = 1/(1-X_i)^n[/itex]
    [itex]X_i = 1 - e^{-Y/n}[/itex]

    [itex]f_Y(y) = f_X(g^{-1}(y)) \frac{dx}{dy}[/itex]
    [itex]= \frac{\theta+1}{n} e^{-\frac{Y}{n}(\theta+1)}[/itex]

    I don't think this is the correct answer. I'm not sure whether my math or my method for solving this is correct.

    b) I'm not sure what this part of the question is asking for. Could someone please clarify what I'm supposed to do after I find the probability distribution?

    Thanks in advance.
  3. Oct 27, 2011 #2

    Stephen Tashi

    Science Advisor

    What is that notation supposed to mean? The [itex] X_i [/itex] are different sample values. You can't treat them as if they were all the same unknown quantity.

    [tex] e^Y = ( \frac{1}{1-X_1})(\frac{1}{1-X_2})....(\frac{1}{1-X_n}) [/tex]

    If we take a routine approach to this problem we would find the distribution of the random variable [itex] h = -\ln ( 1 - X) [/itex] and then compute the distribution of Y as the n-fold convolution of the distribution of [itex] h [/itex]. But perhaps you are studying some topic that makes doing this problem easier. What definitions and theorems are explained in the chapter where this problem occurs?

    Part b) suggests that you can use Y to find an estimator of [itex] \theta [/itex]. Have you studied what estimators are? After you find the estimator, you can worry about the confidence interval.
  4. Oct 27, 2011 #3


    Science Advisor
    Gold Member

    If n is sufficiently large to invoke the Central Limit Theorem, then Y, as a sum of many independent RVs, can be assumed to be approximately normally distributed. If this "shortcut" is valid, then it is simply a matter of calculating the mean and st. dev. of Y based on the mean and standard deviation of [itex] -\ln(1-X)[/itex].

    The mean and variance should be calculable expectation values of functions of X.

    I don't know if that's what is wanted in the problem.
  5. Oct 27, 2011 #4
    We're basically doing the Central limit theorem and confidence intervals. I think we just have to do it the long way.

    So I first solved for H = -ln(1-X) and I got [itex]1 - e^{-(\theta+1)h}[/itex]. I think that's an exponential with mean 1/(θ+1).

    I'm assuming the next step would be to find the probability distribution of [itex]\sum_{i=1}^{n} H_i[/itex].

    How would I go about doing that? Can it look something like Y=Wi bar?
  6. Oct 27, 2011 #5

    Stephen Tashi

    Science Advisor

    Since you are studying the Central Limit Theorem, I think jimbaugh's approach is correct.
    So you need to find the variance of H. Then approximate Y as a normal distribution.
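
For part b), the normal approximation can be turned into an interval roughly as follows. This is only a sketch under stated assumptions: it takes H to be exponential with rate θ+1 (as found in post #4), so that (θ+1)Y has mean n and variance n, and the function name `clt_interval` is made up for illustration.

```python
import math
from statistics import NormalDist

def clt_interval(y, n, alpha=0.05):
    """Approximate (1 - alpha) confidence interval for theta.

    y is the observed Y = -sum(ln(1 - x_i)).
    Assumes (theta + 1) * Y is approximately N(n, n) for large n.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 normal quantile
    half = z * math.sqrt(n)
    if n - half <= 0:
        raise ValueError("n is too small for the normal approximation")
    # n - half <= (theta + 1) * y <= n + half, solved for theta
    return (n - half) / y - 1, (n + half) / y - 1

lo, hi = clt_interval(y=25.0, n=100)
print(lo, hi)
```

The interval is only as good as the normal approximation, so it should not be trusted for small n.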
  7. Oct 28, 2011 #6


    Science Advisor
    Gold Member

    If you use CLT...

    Firstly I suggest you transform from X to T=1-X, (t = 1-x) and since |dt| = |dx| the pdf is unchanged except for the change of variable.

    Call [itex]W=-\ln(1-X) = -\ln(T)[/itex]
    [tex] \mu_W = E[W] = -E[\ln(T)] = -\int_0^1 \ln(t) f_T(t) dt[/tex]

    [itex]\sigma^2_W = E[W^2] - \mu_W^2 [/itex]

    [tex]E[W^2] = \int_0^1 \ln(t)\ln(t)f_T(t)dt[/tex]

    Ugly integrals but integration by parts [itex]dv = t^\theta dt[/itex] will do it I believe.

    Get mean and stdev of W.

    Y = W1+W2+... + Wn.

    Since Y is the sum (not the average) of the W's: [itex]\mu_Y = n\mu_W[/itex], [itex]\sigma_Y = \sqrt{n}\,\sigma_W[/itex]

    You have Y's distribution assumed to be normal via C.L.T. with now known mean and st.dev.
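
A quick simulation can sanity-check the moments of the sum Y = W1 + ... + Wn. The sketch below assumes W = -ln(1-X) is exponential with rate θ+1 (as found in post #4), so the sample mean of Y should land near n/(θ+1) and its standard deviation near √n/(θ+1). The values θ = 2 and n = 50 and the helper names are arbitrary choices for illustration.

```python
import math
import random
import statistics

def sample_y(theta, n, rng):
    """Draw one Y = -sum(ln(1 - X_i)), X_i with density (theta+1)(1-x)^theta."""
    total = 0.0
    for _ in range(n):
        u = rng.random()
        # Inverse CDF: F(x) = 1 - (1-x)^(theta+1)  =>  x = 1 - (1-u)^(1/(theta+1))
        x = 1.0 - (1.0 - u) ** (1.0 / (theta + 1.0))
        total += -math.log(1.0 - x)
    return total

def simulate(theta=2.0, n=50, reps=5000, seed=0):
    """Return the sample mean and sample st. dev. of `reps` draws of Y."""
    rng = random.Random(seed)
    ys = [sample_y(theta, n, rng) for _ in range(reps)]
    return statistics.fmean(ys), statistics.stdev(ys)

mean_y, sd_y = simulate()
print(mean_y, 50 / 3.0)             # compare with n/(theta+1)
print(sd_y, math.sqrt(50) / 3.0)    # compare with sqrt(n)/(theta+1)
```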
  8. Oct 28, 2011 #7

    Ray Vickson

    Science Advisor
    Homework Helper

    I posted this once but it did not appear. Here it is again.

    The random variables Z_i = -ln(1-X_i) have a simple, well-known distribution (although that fact may not, itself, be well known). The easiest way to get the distribution is to compute the Laplace transform E[exp(-s*Z_i)]. Once you have the distribution of -ln(1-X_i), the rest is easy---or, at least, familiar and easily found in books, notes, etc.
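
As a numerical illustration of the Laplace-transform route: since exp(-s*Z) = (1-X)^s, the transform is the integral of (θ+1)(1-x)^(θ+s) over (0,1), which works out to (θ+1)/(θ+1+s), the transform of an exponential density with rate θ+1. The sketch below checks this with a plain midpoint rule. The function names are hypothetical, and the check assumes θ+s ≥ 0 so the integrand stays bounded.

```python
def laplace_transform_numeric(s, theta, steps=200_000):
    """Midpoint-rule estimate of E[exp(-s*Z)] with Z = -ln(1 - X)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        # exp(-s * Z) = (1-x)^s, weighted by the density (theta+1)(1-x)^theta
        total += (1.0 - x) ** s * (theta + 1.0) * (1.0 - x) ** theta
    return total * h

def laplace_transform_exponential(s, rate):
    """Laplace transform of an exponential density with the given rate."""
    return rate / (rate + s)

theta, s = 2.0, 1.5
print(laplace_transform_numeric(s, theta))
print(laplace_transform_exponential(s, theta + 1.0))
```

The two printed values should agree to several decimal places for any s ≥ 0 you try.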

  9. Oct 28, 2011 #8
    I think we can do it some shorter way. I looked at some previous examples and Y=-Ʃln(1-Xi) has a Gamma distribution with α=2n and β=1/(θ+1)

    However I'm stumped on how to get there. I know how to find the probability distribution of W=-ln(1-X), but I don't know what to do from there.

    For part b, would I be able to do something like:
    u = 2n/(θ+1), var = 2n/(θ+1)^2

    Zn = (xbar - u)/√var, which converges in distribution to N(0,1)

    P[-z_{α/2} < Zn < z_{α/2}]
    P[xbar - z_{α/2}√var < u < xbar + z_{α/2}√var]
  10. Oct 28, 2011 #9

    Ray Vickson

    Science Advisor
    Homework Helper

    That is not how I would do it. First of all, I would get the distribution of a single term -log(1-X) and find it to be exponential with some easily-computed rate parameter r related to theta. Then I would note that because Y is a sum of n exponential RVs, it is an n-Erlang random variable with parameters r and n (mean = n/r, variance = n/r^2); of course, this is a special case of a Gamma distribution, but is more convenient to work with. Finally, I would note that -log(1-X) = V/r, where V is exponentially distributed with rate 1 (mean = 1), so Y is (1/r)*W, where W is n-Erlang with rate 1--call it E(n,1). I would work out a 100a% probability interval for E(n,1), then use the fact that Y = (1/r)*E(n,1) to get a confidence interval for r, and hence for theta. That would work even if n is *not* large. Doing this is almost like converting to a standard N(0,1) distribution in problems involving normal distributions.
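
The steps above can be sketched in code. This is an illustration rather than a definitive implementation: it assumes r = θ+1 (the exponential rate found in post #4), uses an equal-tailed interval, and finds Erlang quantiles by simple bisection. All function names are made up for the example.

```python
import math

def erlang_cdf(x, n):
    """P(E <= x) for E ~ Erlang(n, rate 1), n a positive integer."""
    if x <= 0:
        return 0.0
    term, tail = 1.0, 1.0            # k = 0 term of sum_{k<n} x^k / k!
    for k in range(1, n):
        term *= x / k
        tail += term
    return 1.0 - math.exp(-x) * tail

def erlang_quantile(p, n, tol=1e-10):
    """Invert erlang_cdf by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, n) < p:     # grow the bracket until it contains p
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_interval(y, n, alpha=0.05):
    """Equal-tailed (1 - alpha) CI for theta, using r*Y ~ Erlang(n, 1), r = theta + 1."""
    q_lo = erlang_quantile(alpha / 2, n)
    q_hi = erlang_quantile(1 - alpha / 2, n)
    # q_lo <= r*y <= q_hi  =>  q_lo/y - 1 <= theta <= q_hi/y - 1
    return q_lo / y - 1.0, q_hi / y - 1.0

print(exact_interval(y=25.0, n=100))
```

Because the interval comes from the exact distribution of r*Y, it does not rely on n being large.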
