Probability Distribution and Confidence Interval

stevenham · Oct 27, 2011

Homework Statement

Let X₁, X₂, ..., X_n be a random sample from the distribution with probability density function

[tex]f_X(x;\theta) = (\theta+1)(1-x)^\theta, \quad 0<x<1, \quad \theta>-1[/tex]

a) What is the probability distribution of [itex]Y = -\sum_{i=1}^{n} \ln(1-X_i)[/itex]?
b) Suggest a (1-α)100% confidence interval for θ based on [itex]Y = -\sum_{i=1}^{n} \ln(1-X_i)[/itex].

Homework Equations

The Attempt at a Solution

a) I began by transforming the equation.
Y = -Ʃ ln(1-X_i) from i=1 -> n = -ln((1-X_i)ⁿ)
e^Y = 1/(1-X_i)ⁿ
X_i = 1 - e^(-Y/n)

[tex]f_Y(y) = f_X(g^{-1}(y)) \frac{dx}{dy} = \frac{\theta+1}{n} e^{-\frac{y}{n}(\theta+1)}[/tex]

I don't think this is the correct answer. I'm not even sure whether my math or my method for solving this is correct.

b) I'm not sure what this part of the question is asking. Could someone please clarify what I'm supposed to do after I find the probability distribution?

Thanks in advance.

Stephen Tashi · Oct 27, 2011

stevenham said:

e^Y = 1/(1-X_i)ⁿ

What is that notation supposed to mean? The [itex]X_i[/itex] are different sample values. You can't treat them as if they were all the same unknown quantity.

[tex]e^Y = ( \frac{1}{1-X_1})(\frac{1}{1-X_2})...(\frac{1}{1-X_n})[/tex]

If we take a routine approach to this problem we would find the distribution of the random variable [itex]h = -\ln ( 1 - X)[/itex] and then compute the distribution of Y as the n-fold convolution of the distribution of [itex]h[/itex]. But perhaps you are studying some topic that makes doing this problem easier. What definitions and theorems are explained in the chapter where this problem occurs?

Part b) suggests that you can use Y to find an estimator of [itex]\theta[/itex]. Have you studied what estimators are? After you find the estimator, you can worry about the confidence interval.
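For anyone who wants to check the first step of this routine approach numerically before doing the algebra, here is a minimal simulation sketch. It assumes inverse-CDF sampling from F(x) = 1 - (1-x)^(θ+1); the function names, the seed, and the value θ = 2 are illustrative choices, not part of the problem. The comparison values 1/(θ+1) and 1/(θ+1)² are what an exponential with rate θ+1 would give, which is the result reached later in the thread.

```python
import math
import random

def sample_x(theta, rng):
    """Draw X with pdf (theta+1)(1-x)^theta on (0,1) by inverting the CDF
    F(x) = 1 - (1-x)^(theta+1). Inverse-CDF sampling is an assumption of this sketch."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so the log below never sees 0
    return 1.0 - u ** (1.0 / (theta + 1.0))

def simulate_h(theta, size, seed=0):
    """Simulate h = -ln(1 - X) for `size` independent draws of X."""
    rng = random.Random(seed)
    return [-math.log(1.0 - sample_x(theta, rng)) for _ in range(size)]

theta = 2.0  # illustrative value only
h = simulate_h(theta, 100_000)
mean_h = sum(h) / len(h)
var_h = sum((v - mean_h) ** 2 for v in h) / (len(h) - 1)

# Compare with the exponential(rate = theta + 1) moments
print(mean_h, 1.0 / (theta + 1.0))
print(var_h, 1.0 / (theta + 1.0) ** 2)
```

If the sample mean and variance sit close to the two reference values, that supports treating each term of Y as exponential before moving on to the sum.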

jambaugh · Oct 27, 2011

If n is sufficiently large to invoke the Central Limit Theorem, then Y, as a sum of many independent RVs, can be assumed to be normally distributed. If this "shortcut" is valid, then it is simply a matter of calculating the mean and st. dev. of Y based on the mean and standard deviation of [itex]-\ln(1-X)[/itex].

The mean and variance should be calculable expectation values of functions of X.

I don't know if that's what is wanted in the problem.

stevenham · Oct 27, 2011

We're basically doing the Central limit theorem and confidence intervals. I think we just have to do it the long way.

So I first solved for H = -ln(1-X) and got the CDF 1 - e^(-(θ+1)h). I think that's an exponential with mean 1/(θ+1).

I'm assuming the next step would be to find the probability distribution of ƩH_i from i=1 to n

How would I go about doing that? Can it look something like Y=W_i bar?

Stephen Tashi · Oct 27, 2011

Since you are studying the Central Limit Theorem, I think jambaugh's approach is correct.
So you need to find the variance of H. Then approximate Y as a normal distribution.

jambaugh · Oct 28, 2011

If you use CLT...

Firstly I suggest you transform from X to T=1-X, (t = 1-x) and since |dt| = |dx| the pdf is unchanged except for the change of variable.

Call [itex]W=-\ln(1-X) = -\ln(T)[/itex]
[tex]\mu_W = E[W] = -E[\ln(T)] = -\int_0^1 \ln(t) f_T(t) dt[/tex]

[itex]\sigma^2_W = E[W^2] - \mu_W^2[/itex]

[tex]E[W^2] = \int_0^1 \ln(t)\ln(t)f_T(t)dt[/tex]

Ugly integrals but integration by parts [itex]dv = t^\theta dt[/itex] will do it I believe.

Get mean and stdev of W.

Y = W1+W2+... + Wn.

Since Y is a sum (not an average) of n independent copies of W: [itex]\mu_Y = n\mu_W[/itex], [itex]\sigma_Y = \sqrt{n}\,\sigma_W[/itex]

You have Y's distribution assumed to be normal via C.L.T. with now known mean and st.dev.
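The two integrals for W can also be checked numerically before grinding through the integration by parts. The following is a rough sketch only: it assumes f_T(t) = (θ+1)t^θ on (0,1), uses a plain midpoint rule, and picks θ = 2 purely for illustration. The helper name `moment_w` is made up for this example. The closed forms printed for comparison, 1/(θ+1) for the mean and 1/(θ+1)² for the variance, are what the integration by parts should produce.

```python
import math

def moment_w(theta, power, steps=200_000):
    """Midpoint-rule estimate of E[W^power] for W = -ln(T),
    with f_T(t) = (theta+1) * t^theta on (0, 1)."""
    dt = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt  # midpoints never touch t = 0, where ln(t) diverges
        total += (-math.log(t)) ** power * (theta + 1.0) * t ** theta
    return total * dt

theta = 2.0  # illustrative value only
mu_w = moment_w(theta, 1)
var_w = moment_w(theta, 2) - mu_w ** 2
sigma_w = math.sqrt(var_w)

print(mu_w, 1.0 / (theta + 1.0))        # mean of W vs closed form
print(var_w, 1.0 / (theta + 1.0) ** 2)  # variance of W vs closed form
```

For θ close to -1 the integrand becomes sharply peaked near t = 0 and a plain midpoint rule converges slowly, so this check is only meant for moderate values of θ.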

Ray Vickson · Oct 28, 2011

I posted this once but it did not appear. Here it is again.

The random variable Z_i = -ln(1-X_i) has a simple, well-known distribution (although that fact may not, itself, be well known). The easiest way to get the distribution is to compute the Laplace transform E[exp(-s*Z_i)]. Once you have the distribution of -ln(1-X_i), the rest is easy, or at least familiar and easily found in books, notes, etc.

RGV
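As a worked sketch of the Laplace transform hint, using only the density from the problem statement (the shorthand r = θ + 1 is introduced here for convenience):

```latex
\begin{align*}
E\!\left[e^{-sZ_i}\right]
  &= E\!\left[(1-X_i)^{s}\right]
   = \int_0^1 (1-x)^{s}\,(\theta+1)(1-x)^{\theta}\,dx \\
  &= (\theta+1)\int_0^1 (1-x)^{\theta+s}\,dx
   = \frac{\theta+1}{\theta+1+s}
   = \frac{r}{r+s}, \qquad s > -(\theta+1).
\end{align*}
% r/(r+s) is the Laplace transform of an exponential density r e^{-rz}, z > 0.
% For independent Z_1, ..., Z_n the transform of the sum is the product:
\begin{equation*}
E\!\left[e^{-sY}\right] = \left(\frac{r}{r+s}\right)^{n},
\end{equation*}
% which is the transform of a gamma (n-Erlang) density with shape n and rate r.
```

Matching the transform against a table of known transforms is what identifies the distribution; no convolution integral is needed.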

stevenham · Oct 28, 2011

I think we can do it some shorter way. I looked at some previous examples and Y=-Ʃln(1-X_i) has a Gamma distribution with α=2n and β=1/(θ+1)

However I'm stumped on how to get there. I know how to find the probability distribution of W=-ln(1-X), but I don't know what to do from there.

For part b, would I be able to do something like:
u = 2n/(θ+1), var = 2n/(θ+1)²

Zn = (xbar - u)/√var, which converges in distribution to N(0,1)

P[ -z_a/2 < Zn < z_a/2 ]
P[ xbar - z_a/2 √var < u < xbar + z_a/2 √var ]

Ray Vickson · Oct 28, 2011

That is not how I would do it. First of all, I would get the distribution of a single term -log(1-X) and find it to be exponential with some easily computed rate parameter r related to theta. Then I would note that because Y is a sum of n independent exponential RVs, it is an n-Erlang random variable with parameters r and n (mean = n/r, variance = n/r^2); of course, this is a special case of a Gamma distribution, but it is more convenient to work with. Finally, I would note that -log(1-X) = V/r, where V is exponentially distributed with rate 1 (mean = 1), so Y is (1/r)*W, where W is n-Erlang with rate 1; call it E(n,1). I would work out a 100a% probability interval for E(n,1), then use the fact that Y = (1/r)*E(n,1) to get a confidence interval for r, and hence for theta. That would work even if n is *not* large. Doing this is almost like converting to a standard N(0,1) distribution in problems involving normal distributions.

RGV
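A minimal sketch of the interval described above, under some stated assumptions: r = θ + 1, so that r*Y is distributed as E(n,1); the interval is equal-tailed (the post does not say which probability interval to use); and the quantiles are found by bisection on the closed-form Erlang CDF rather than with a statistics library. All function names here are hypothetical.

```python
import math

def erlang_cdf(x, n):
    """P(E(n,1) <= x) = 1 - exp(-x) * sum_{k<n} x^k / k! for the rate-1 n-Erlang."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0  # k = 0 term of the sum
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total

def erlang_quantile(p, n, tol=1e-12):
    """Invert erlang_cdf by bisection (a simple choice for this sketch)."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, n) < p:  # grow the bracket until it contains the quantile
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def theta_interval(y, n, alpha=0.05):
    """Equal-tailed (1 - alpha) interval for theta, using r*Y ~ E(n,1) with r = theta + 1.
    From q_lo <= r*y <= q_hi it follows that q_lo/y - 1 <= theta <= q_hi/y - 1."""
    q_lo = erlang_quantile(alpha / 2.0, n)
    q_hi = erlang_quantile(1.0 - alpha / 2.0, n)
    return q_lo / y - 1.0, q_hi / y - 1.0

# Illustrative numbers only: n = 10 observations with observed Y = 5.0
print(theta_interval(5.0, 10))
```

Because the Erlang quantiles are exact, the interval keeps its stated coverage for small n, which is the advantage over the normal approximation mentioned in the post.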
