Proof that an interval is a confidence interval for Geom(q)

Alex_Doge · Jan 28, 2016

Hello Physicsforum

Homework Statement

I have a problem proving this:
Given [itex]C(x)=[0, 3/x][/itex] for all [itex]x\in\chi[/itex], with [itex]\chi=\Omega[/itex] being the sample space and [itex]P_q=Geom(q)[/itex] being the geometric distribution.

I have to show that C(x) is a confidence interval for q but I don't know how to get started.

I've been given the tip [itex]P_q([0,3/q])=P_q(x\in[0,3/q])=P_q(\{1,2,\ldots,\lfloor3/q\rfloor\})[/itex] and then use the geometric series. It also says that the function won't be continuous and that I should nest it between two continuous ones.

Homework Equations

The definition of a confidence interval: [itex]P_q(u(X)<q<v(X))=\gamma[/itex] for all [itex]q\in(0,1][/itex], with [itex]\gamma[/itex] close to 1.
Geometric summation formula and sigma additivity for disjoint sets.

The Attempt at a Solution

I tried using the definition but don't know how to continue. I think I have to prove the equalities:
[itex]P_q(u(X)<q)=\gamma[/itex] and [itex]P_q(q<v(X))=\gamma[/itex] but I don't know what I'm supposed to use for X. And I don't know what they mean by functions. I can't see any dependence on a variable anywhere.
Any tips are very welcome!

Kind regards
Alex

Ray Vickson · Jan 28, 2016

This is most definitely a calculus problem, so does not belong in the precalculus forum.

Here is how I would approach it. I would operate as a Bayesian, and suppose the Geometric parameter ##q## is governed by a prior distribution ##f_0(q)## for ##0 <q<1##. Let ##X = 1,2,3, \ldots## be the Geometric random variable under observation. The probability of seeing ##X = k##, given a value of ##q##, is
[tex] P(X = k\,| \, q) = q\,(1-q)^{k-1}, \: k = 1,2,3, \ldots [/tex]
The posterior probability density of ##q##, given the observation ##X = k##, is
[tex] f(q \,| \, k) = \frac{f_0(q) P(X=k\, | \, q)}{P(k)}, [/tex]
where
[tex] P(k) = \int_0^1 f_0(q)P(X = k\,| \, q) \, dq = \int_0^1 f_0(q) \, q\,(1-q)^{k-1} \, dq [/tex]
Note that ##P(k)## is the prior probability of observing ##X=k##.

Things become much easier if we use the so-called uninformative prior, which in this case means that ##f_0(q) = 1## is the uniform distribution on ##(0,1)##; that is, we assume initially that ##q## is equally likely to take any value between 0 and 1. Basically, we know nothing at all about ##q##, except that it must be between 0 and 1.

In this case we can do the integrals:
[tex] P(k) = \int_0^1 q (1-q)^{k-1} dq = \frac{1}{k(k+1)} , [/tex]
so the posterior probability density of ##q## is
##f(q|k) = k(k+1) q (1-q)^{k-1}, \; 0 < q < 1##.
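For reference, the value of ##P(k)## can be checked with the Beta integral (a standard identity, spelled out here only as a check):
[tex] \int_0^1 q\,(1-q)^{k-1} \, dq = B(2,k) = \frac{\Gamma(2)\,\Gamma(k)}{\Gamma(k+2)} = \frac{1!\,(k-1)!}{(k+1)!} = \frac{1}{k(k+1)} . [/tex]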

You can now look at the interval ##(0,3/k)##. Clearly, the probability that the (random quantity) ##q## lies in ##(0,3/k)## is 1 for ##k = 1, 2, 3##. For ##k \geq 4## the probability that ##q## lies in ##(0,3/k)## is
[tex] P(0 < q < 3/k) = \int_0^{3/k} k(k+1) q (1-q)^{k-1} \, dq [/tex]
You can evaluate this as a function of ##k## and plot it out for ##k = 3, 4, 5, 6, \ldots ## to see if it is near 1 or not.
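A minimal numerical sketch of that evaluation in Python. The function names and the midpoint rule are illustrative assumptions, not something prescribed in the thread; any quadrature method would do.

```python
def posterior_density(q, k):
    """Posterior density k(k+1) q (1-q)^(k-1) under the uniform prior."""
    return k * (k + 1) * q * (1.0 - q) ** (k - 1)


def posterior_prob(k, steps=20_000):
    """Midpoint-rule estimate of P(0 < q < 3/k | X = k)."""
    upper = min(1.0, 3.0 / k)  # q cannot exceed 1, so cap the upper limit
    h = upper / steps
    total = sum(posterior_density((i + 0.5) * h, k) for i in range(steps))
    return total * h


# Tabulate the posterior probability for a few observed values k.
for k in (3, 4, 5, 10, 50, 200):
    print(k, round(posterior_prob(k), 4))
```

Plotting these values against ##k## shows whether the posterior probability stays near 1 as ##k## grows.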

Alex_Doge · Jan 28, 2016

First of all, thanks for the detailed answer. Sorry, I didn't know it would turn out to be a calculus problem.
The integral gives
[itex]P(0 < q < 3/k) = \int_0^{3/k} k(k+1) q (1-q)^{k-1} \, dq=\frac{12\cdot3^k k^{-k} (-1)^k(1-k/3)^{k+1}+k-3}{k-3}[/itex]
It converges towards 0.8 if I'm not mistaken.

What does this mean?

Ray Vickson · Jan 28, 2016

Why are you plotting it for negative values of ##k##? We need ##k = 1,2,3,4, \ldots##, so plotting it for ##k \geq 4## has meaning. Negative values of ##k## have no meaning at all in this problem.

Alex_Doge · Jan 28, 2016

Oh sorry, I'm getting tired, been stuck on this problem all day now.

That's a strange plot. What does this mean then? :)
Is it something similar to a delta function, or have I made a mistake plotting it?

Ray Vickson · Jan 28, 2016

Part of your problem is that you have a result for your integral that seems to work for all ##k##, but when you specify that ##k## is a positive integer, it simplifies a lot; in particular, the pesky factors ##(-1)^k## disappear, giving you a formula that works well for all integers ##k \geq 3## (no division by 0 anymore). Then it plots out nicely.
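One way to see the simplification, as a worked check: the integrand has the antiderivative ##-(1-q)^k (1+kq)##, so for integer ##k \geq 3##
[tex] \int_0^{3/k} k(k+1)\, q\, (1-q)^{k-1} \, dq = \Big[ -(1-q)^k (1+kq) \Big]_0^{3/k} = 1 - 4\left(1-\frac{3}{k}\right)^k . [/tex]
This equals 1 at ##k = 3## and tends to ##1 - 4e^{-3} \approx 0.801## as ##k \to \infty##, consistent with the limit of about 0.8 reported above.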

Alex_Doge · Jan 28, 2016

Yea it looks better now:

But how do I continue from here?

Alex_Doge · Jan 28, 2016

So because q lies in C with probability 1, C is a confidence interval? For [itex]k \geq 4[/itex] the probability is not 1 for large k. Is that a problem?

Ray Vickson · Jan 29, 2016

A confidence interval (with confidence ##p \in (0,1)##) is an interval for which the probability is at least ##p## that it contains the unknown parameter of interest. So, if the parameter we want to estimate is ##q##, we want an interval that has a probability of at least ##p## to overlap the unknown ##q##.

In most problems there is not much difference between the Bayesian approach (with non-informative prior) that I outlined above, and the classical (non-Bayesian) confidence-interval method; the interpretations are different, but usually the computations are almost the same. However, that is not the case in your problem (because the alleged confidence interval is a bit unusual). So: the confidence-interval method will deliver different results in your problem.

In your case (without yet specifying ##p##) the claim is that for observation ##\{X=k\}## the interval ##[0,3/k]## contains ##q## with a probability of ##p## or more. Note that the interval contains ##q## if and only if ##q \leq 3/k##, so the probability is ##P(3/X \geq q) = P(X \leq 3/q)##. For a geometric random variable ##X## with parameter ##q## this probability is
[tex] P(X \leq 3/q) = \sum_{k=1}^{\lfloor 3/q \rfloor} q (1-q)^{k-1} [/tex]
where ##\lfloor u \rfloor## is the greatest integer ##\leq u##.

The problem is asking you to figure out a value of ##p## (hopefully, near 1.0) that is a lower bound on that probability (so that you can be at least ##100 p\%## sure the interval contains the true parameter value).

Alex_Doge · Jan 29, 2016

I calculated that probability:
[itex]P(X \leq 3/q) = \sum_{k=1}^{\lfloor 3/q \rfloor} q (1-q)^{k-1}=q\sum_{k=1}^{\lfloor3/q\rfloor}(1-q)^{k-1}=(1-q)^{\lfloor3/q\rfloor}[/itex]
How do I calculate this lower bound? Like this: [itex](1-q)^{\lfloor3/q\rfloor}=1[/itex] and then solve for q?
Or do I minimize and maximize to find lower and upper bound?

Ray Vickson · Jan 29, 2016

Actually, ##\sum_{k=1}^n q (1-q)^{k-1} = 1 - (1-q)^n##, NOT ##(1-q)^n##.
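Spelled out with the finite geometric series, for ##0 < q \leq 1##:
[tex] \sum_{k=1}^n q\,(1-q)^{k-1} = q \cdot \frac{1-(1-q)^n}{1-(1-q)} = 1-(1-q)^n . [/tex]
This is just ##P(X \leq n) = 1 - P(X > n)##, since ##X > n## means the first ##n## trials all failed.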

You are not "solving for ##q##"; you do not know the value of ##q##, but want to know a value ##\alpha## (called ##p## before), such that
[tex] \sum_{k=1}^{\lfloor3/q\rfloor} q (1-q)^{k-1} \geq \alpha [/tex]
for all ##q \in (0,1)##. If it happens that ##\alpha## is "large" (near 1) then you have a useful ##100 \alpha \%## confidence interval.
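A rough numerical way to get a feel for such an ##\alpha## is to scan a grid of ##q## values and look at the smallest probability found. This is only a sketch: the grid, the function name `coverage`, and the use of exact fractions for the floor are assumptions made for illustration, and a grid scan suggests a bound but does not prove it.

```python
import math
from fractions import Fraction


def coverage(q):
    """Sum of q(1-q)^(k-1) for k = 1..floor(3/q), with q given as a Fraction."""
    n = math.floor(3 / q)  # exact floor, no floating-point rounding issues
    qf = float(q)
    return sum(qf * (1.0 - qf) ** (k - 1) for k in range(1, n + 1))


# Scan q = 1/2000, 2/2000, ..., 1999/2000 and find the smallest value.
grid = [Fraction(i, 2000) for i in range(1, 2000)]
worst_q = min(grid, key=coverage)
print(float(worst_q), coverage(worst_q))
print(1.0 - math.exp(-3.0))  # value the sum approaches as q -> 0
```

The smallest values occur for small ##q##, which is where the comparison with ##1 - e^{-3}## becomes relevant.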

Alex_Doge · Jan 29, 2016

That means I have this:
[itex]\sum_{k=1}^{\lfloor3/q\rfloor} q (1-q)^{k-1} =1-(1-q)^{\lfloor3/q\rfloor}\geq \alpha[/itex] and need to find alpha.
Can I look at [itex]1-(1-q)^{\lfloor3/q\rfloor}[/itex] and see where its minimum is, for [itex]q \in (0,1)[/itex], and then define alpha to be just lower? What do you mean by "large"?
Thanks for the help so far

Alex_Doge · Jan 29, 2016

Thanks a lot for the help. I solved it now.

In my plot, the red graph is the probability. The other two are the bounds obtained from the floor function. The level is then 0.05, i.e. the confidence is 0.95.
The plots helped me understand it and made sense of the hint.
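The nesting from the hint can be written out as follows. This is a sketch reconstructed from the formulas in the thread, not necessarily the exact argument used for the plots. For ##q \in (0,1)## we have ##3/q - 1 < \lfloor 3/q \rfloor \leq 3/q## and ##0 < 1-q < 1##, so
[tex] 1-(1-q)^{3/q-1} \;<\; 1-(1-q)^{\lfloor 3/q \rfloor} \;\leq\; 1-(1-q)^{3/q} . [/tex]
Both outer functions are continuous in ##q##. For the lower one, using ##\ln(1-q) \leq -q - q^2/2##,
[tex] \left(\frac{3}{q}-1\right)\ln(1-q) \;\leq\; -\left(\frac{3}{q}-1\right)\left(q+\frac{q^2}{2}\right) = -3 - \frac{q(1-q)}{2} \;\leq\; -3 , [/tex]
so ##(1-q)^{3/q-1} \leq e^{-3}## and the probability is at least ##1 - e^{-3} \approx 0.9502 > 0.95## for every ##q \in (0,1)##. At ##q = 1## the probability equals 1 directly.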
