Probability Distribution and Confidence Interval

In summary: Let h = -ln(1-x). The pdf of h is (θ+1)e^{-(θ+1)h}, so the pdf of Y = -ln(1-X_1) - ln(1-X_2) - ... - ln(1-X_n) is the n-fold convolution of this function with itself. I think that is ((θ+1)^n/(n-1)!) Y^{n-1} e^{-(θ+1)Y}, so the distribution of Y is a gamma distribution. I guess that means that the mean of Y is n/(θ+1) and its variance is n/(θ+1)^2.
  • #1
stevenham

Homework Statement



Let [itex]X_1, X_2, \ldots, X_n[/itex] be a random sample from the distribution with probability density function

[itex]f_X(x;\theta) = (\theta+1)(1-x)^\theta,\quad 0<x<1,\ \theta>-1[/itex]

a) What is the probability distribution of [itex]Y = -\sum_{i=1}^{n} \ln(1-X_i)[/itex]?
b) Suggest a (1-α)100% confidence interval for θ based on [itex]Y = -\sum_{i=1}^{n} \ln(1-X_i)[/itex].

Homework Equations





The Attempt at a Solution



a) I began by transforming the equation.
[itex]Y = -\sum_{i=1}^{n} \ln(1-X_i) = -\ln\left((1-X_i)^n\right)[/itex]
[itex]e^Y = 1/(1-X_i)^n[/itex]
[itex]X_i = 1 - e^{-Y/n}[/itex]

[itex]f_Y(y) = f_X(g^{-1}(y)) \frac{dx}{dy}[/itex]
[itex]= \frac{\theta+1}{n} e^{-\frac{Y}{n}(\theta+1)}[/itex]

I don't think this is the correct answer. I'm not even sure whether my math or my method for solving this is correct.

b) I'm not even sure what they're asking for in this part of the question. Could someone please clarify what I'm supposed to do after I find the probability distribution?

Thanks in advance.
 
  • #2
stevenham said:
[itex]e^Y = 1/(1-X_i)^n[/itex]

What is that notation supposed to mean? The [itex] X_i [/itex] are different sample values. You can't treat them as if they were all the same unknown quantity.

[tex] e^Y = ( \frac{1}{1-X_1})(\frac{1}{1-X_2})...(\frac{1}{1-X_n}) [/tex]

If we take a routine approach to this problem we would find the distribution of the random variable [itex] h = -\ln ( 1 - X) [/itex] and then compute the distribution of Y as the n-fold convolution of the distribution of [itex] h [/itex]. But perhaps you are studying some topic that makes doing this problem easier. What definitions and theorems are explained in the chapter where this problem occurs?

Part b) Suggests that you can use Y to find an estimator of [itex] \theta [/itex]. Have you studied what estimators are? After you find the estimator, you can worry about the confidence interval.
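
The routine approach can be sanity-checked numerically before doing the convolution by hand. The sketch below is not from the thread: it assumes θ = 2 and n = 5 purely for illustration, draws 1 - X by inverting the CDF F(x) = 1 - (1-x)^(θ+1), and estimates the mean and variance of Y. The function names are made up for this example.

```python
import math
import random
import statistics


def sample_one_minus_x(theta, rng):
    """Draw T = 1 - X, where X has pdf (theta+1)(1-x)^theta on (0, 1).

    Inverting F(x) = 1 - (1-x)^(theta+1) gives 1 - X = U^(1/(theta+1))
    for U uniform on (0, 1].
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(T) below is finite
    return u ** (1.0 / (theta + 1.0))


def sample_y(theta, n, rng):
    """One realization of Y = -sum(ln(1 - X_i)) for a sample of size n."""
    return -sum(math.log(sample_one_minus_x(theta, rng)) for _ in range(n))


def simulate(theta=2.0, n=5, reps=100_000, seed=0):
    """Return the sample mean and sample variance of Y over `reps` replications."""
    rng = random.Random(seed)
    ys = [sample_y(theta, n, rng) for _ in range(reps)]
    return statistics.fmean(ys), statistics.variance(ys)


mean_y, var_y = simulate()
# Compare with n/(theta+1) = 5/3 and n/(theta+1)^2 = 5/9.
print(mean_y, var_y)
```

If h is exponential with rate θ+1, the two estimates should land near n/(θ+1) and n/(θ+1)^2, which is what a gamma distribution with shape n would give.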
 
  • #3
If n is sufficiently large to invoke the Central Limit Theorem, then Y, as a sum of many independent random variables, can be assumed to be approximately normally distributed. If this "shortcut" is valid, then it is simply a matter of calculating the mean and standard deviation of Y from the mean and standard deviation of [itex] -\ln(1-X) [/itex].

The mean and variance should be calculable expectation values of functions of X.

I don't know if that's what is wanted in the problem.
 
  • #4
Stephen Tashi said:
What is that notation supposed to mean? The [itex] X_i [/itex] are different sample values. You can't treat them as if they were all the same unknown quantity.

[tex] e^Y = ( \frac{1}{1-X_1})(\frac{1}{1-X_2})...(\frac{1}{1-X_n}) [/tex]

If we take a routine approach to this problem we would find the distribution of the random variable [itex] h = -\ln ( 1 - X) [/itex] and then compute the distribution of Y as the n-fold convolution of the distribution of [itex] h [/itex]. But perhaps you are studying some topic that makes doing this problem easier. What definitions and theorems are explained in the chapter where this problem occurs?

Part b) Suggests that you can use Y to find an estimator of [itex] \theta [/itex]. Have you studied what estimators are? After you find the estimator, you can worry about the confidence interval.

We're basically doing the Central limit theorem and confidence intervals. I think we just have to do it the long way.

So I first solved for [itex]H = -\ln(1-X)[/itex] and I got the CDF [itex]1 - e^{-(\theta+1)h}[/itex]. I think that's an exponential with mean [itex]1/(\theta+1)[/itex].

I'm assuming the next step would be to find the probability distribution of [itex]\sum_{i=1}^{n} H_i[/itex].

How would I go about doing that? Can it look something like Y=Wi bar?
 
  • #5
Since you are studying the Central Limit Theorem, I think jambaugh's approach is correct.
So you need to find the variance of H. Then approximate Y as a normal distribution.
 
  • #6
If you use CLT...

Firstly I suggest you transform from X to T=1-X, (t = 1-x) and since |dt| = |dx| the pdf is unchanged except for the change of variable.

Call [itex]W=-\ln(1-X) = -\ln(T)[/itex]
[tex] \mu_W = E[W] = -E[\ln(T)] = -\int_0^1 \ln(t) f_T(t)\, dt[/tex]

[itex]\sigma^2_W = E[W^2] - \mu_W^2 [/itex]

[tex]E[W^2] = \int_0^1 \ln(t)\ln(t)f_T(t)dt[/tex]

Ugly integrals but integration by parts [itex]dv = t^\theta dt[/itex] will do it I believe.

Get mean and stdev of W.

[itex]Y = W_1 + W_2 + \cdots + W_n[/itex].

Since Y is a sum of n independent terms (not their average), [itex]\mu_Y = n\mu_W[/itex], [itex]\sigma_Y = \sqrt{n}\,\sigma_W[/itex]

You have Y's distribution assumed to be normal via C.L.T. with now known mean and st.dev.
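
A rough sketch of this CLT route, under assumptions that are not in the thread: θ = 2 for the moment check, and y = 10.0 with n = 30 as made-up data. The first function estimates the two integrals by the midpoint rule with f_T(t) = (θ+1)t^θ. The second uses the closed forms μ_W = σ_W = 1/(θ+1), so that ((θ+1)Y - n)/√n is approximately N(0,1) and can be solved for θ. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def moments_w(theta, steps=100_000):
    """Midpoint-rule estimates of E[W] and Var[W] for W = -ln(T).

    Uses f_T(t) = (theta+1) * t**theta on (0, 1); intended for theta >= 0,
    where the integrand is well behaved near t = 0.
    """
    h = 1.0 / steps
    m1 = 0.0  # running estimate of E[W]
    m2 = 0.0  # running estimate of E[W^2]
    for i in range(steps):
        t = (i + 0.5) * h
        w = -math.log(t)
        weight = (theta + 1.0) * t ** theta * h
        m1 += w * weight
        m2 += w * w * weight
    return m1, m2 - m1 * m1


def clt_interval(y, n, alpha=0.05):
    """Approximate (1 - alpha) interval for theta from the normal approximation.

    Solves -z < ((theta+1)*y - n) / sqrt(n) < z for theta.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = (n - z * math.sqrt(n)) / y - 1.0
    upper = (n + z * math.sqrt(n)) / y - 1.0
    return lower, upper


mu_w, var_w = moments_w(2.0)
print(mu_w, var_w)  # compare with 1/(theta+1) and 1/(theta+1)^2
print(clt_interval(y=10.0, n=30))
```

For small n the lower limit can fall below -1, outside the allowed range of θ, which is one sign that the normal approximation is being pushed too far.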
 
  • #7
I posted this once but it did not appear. Here it is again.

The random variables Z_i = -ln(1-X_i) have a simple, well-known distribution (although that fact may not, itself, be well known). The easiest way to get the distribution is to compute the Laplace transform E[exp(-s*Z_i)]. Once you have the distribution of -ln(1-X_i), the rest is easy, or at least familiar and easily found in books, notes, etc.

RGV
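
For reference, the Laplace-transform hint can be worked out as follows. This is a sketch of the computation rather than RGV's own write-up; it uses the density from the problem statement and the independence of the X_i.

```latex
\begin{align*}
E\left[e^{-sZ_i}\right]
  &= E\left[(1-X_i)^{s}\right]
   = \int_0^1 (1-x)^{s}\,(\theta+1)(1-x)^{\theta}\,dx
   = \frac{\theta+1}{\theta+1+s}, \qquad s > -(\theta+1), \\
E\left[e^{-sY}\right]
  &= \prod_{i=1}^{n} E\left[e^{-sZ_i}\right]
   = \left(\frac{\theta+1}{\theta+1+s}\right)^{n}, \\
f_Y(y)
  &= \frac{(\theta+1)^{n}}{(n-1)!}\, y^{n-1} e^{-(\theta+1)y}, \qquad y > 0.
\end{align*}
```

The first line is the transform of an exponential distribution with rate θ+1, and the second is the transform of a gamma (Erlang) distribution with shape n and the same rate, which gives the density in the third line.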
 
  • #8
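I think we can do it some shorter way. I looked at some previous examples and [itex]Y = -\sum \ln(1-X_i)[/itex] has a Gamma distribution with [itex]\alpha = 2n[/itex] and [itex]\beta = 1/(\theta+1)[/itex].

However, I'm stumped on how to get there. I know how to find the probability distribution of [itex]W = -\ln(1-X)[/itex], but I don't know what to do from there.

For part b, would I be able to do something like:
[itex]\mu = 2n/(\theta+1)[/itex], [itex]\mathrm{var} = 2n/(\theta+1)^2[/itex]

[itex]Z_n = (\bar{x} - \mu)/\sqrt{\mathrm{var}}[/itex], which converges in distribution to N(0,1)

[itex]P[-z_{\alpha/2} < Z_n < z_{\alpha/2}][/itex]
[itex]P[\bar{x} - z_{\alpha/2}\sqrt{\mathrm{var}} < \mu < \bar{x} + z_{\alpha/2}\sqrt{\mathrm{var}}][/itex]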
 
  • #9
That is not how I would do it. First of all, I would get the distribution of a single term -log(1-X) and find it to be exponential with some easily computed rate parameter r related to theta. Then I would note that because Y is a sum of n independent exponential RVs, it is an n-Erlang random variable with parameters r and n (mean = n/r, variance = n/r^2); of course, this is a special case of a Gamma distribution, but it is more convenient to work with. Finally, I would note that -log(1-X) = V/r, where V is exponentially distributed with rate 1 (mean = 1), so Y is (1/r)*W, where W is n-Erlang with rate 1 (call it E(n,1)). I would work out a 100a% probability interval for E(n,1), then use the fact that Y = (1/r)*E(n,1) to get a confidence interval for r, and hence for theta. That would work even if n is *not* large. Doing this is almost like converting to a standard N(0,1) distribution in problems involving normal distributions.

RGV
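
The pivot argument above can be sketched in code. This is an illustration rather than RGV's own working: it assumes r = θ+1 (the exponential rate found earlier in the thread), so that (θ+1)Y is Erlang with shape n and rate 1. The helper names, and the values y = 2.0 and n = 5, are hypothetical. Quantiles are found by bisection on the closed-form Erlang CDF, using only the standard library.

```python
import math


def erlang_cdf(x, n):
    """CDF of the sum of n independent rate-1 exponentials, E(n, 1).

    Uses F(x) = 1 - exp(-x) * sum_{k=0}^{n-1} x^k / k!.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0   # x^k / k! for k = 0
    total = 1.0
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def erlang_quantile(p, n, tol=1e-10):
    """Solve erlang_cdf(q, n) = p for q by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, n) < p:  # grow the bracket until it contains q
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_interval(y, n, alpha=0.05):
    """(1 - alpha) interval for theta from a <= (theta+1)*y <= b."""
    a = erlang_quantile(alpha / 2.0, n)
    b = erlang_quantile(1.0 - alpha / 2.0, n)
    return a / y - 1.0, b / y - 1.0


print(exact_interval(y=2.0, n=5))
```

Because the lower quantile is positive, the lower limit always stays above -1, consistent with θ > -1, and nothing here requires n to be large.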
 

FAQ: Probability Distribution and Confidence Interval

1. What is a probability distribution?

A probability distribution is a mathematical function that shows the likelihood of different outcomes occurring in a given situation. It can be used to describe the possible values and their probabilities for a random variable.

2. How is a probability distribution related to a confidence interval?

A probability distribution is used to calculate the confidence interval, which is a range of values that is likely to contain the true value of a population parameter with a certain level of confidence.

3. What is the difference between a discrete and a continuous probability distribution?

A discrete probability distribution is used for variables that can only take on a finite or countable number of values, while a continuous probability distribution is used for variables that can take on any value within a given range.

4. How is the normal distribution related to probability distribution and confidence interval?

The normal distribution is a commonly used probability distribution whose density is a bell-shaped curve. It is used to construct confidence intervals when a statistic is normally distributed, exactly or approximately, and to model many real-world phenomena.

5. How is the central limit theorem related to probability distribution and confidence interval?

The central limit theorem states that, for independent and identically distributed observations with finite variance, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases. This is important in calculating confidence intervals, as it allows us to make approximate inferences about population parameters based on sample data.
