Probability for a binomially distributed variable X

TheSodesa · Sep 23, 2016

Homework Statement

Let ##X \sim Bin(n, p)## where ##n=20## and ##p=0.1##. Calculate ##P(|X-\mu| \leq \sigma)##.

Give your answer up to three decimal places.

Homework Equations

For a binomially distributed random variable, using moment generating functions we have:
\begin{equation}
\mu= E(X) = np
\end{equation}

\begin{equation}
\sigma^{2} = Var(X) = np(1-p)
\end{equation}

The probability density function is
\begin{equation}
f(x) = {n \choose x} p^{x}(1-p)^{n-x}
\end{equation}

The Attempt at a Solution

Now the requested probability looked a lot like Chebyshev's inequality, but that just gave me a zero, and the electronic return system complained about it. It also gave me a hint:

First solve ##|X - \mu| \leq \sigma## and then calculate
$$P(|X - \mu| \leq \sigma) = P(X=x_1 \text{ OR } X = x_2 \ldots)$$

I started out by solving for ##X##:

\begin{align*}
|X-\mu| \leq \sigma\\
\iff\\
-\sigma \leq X-\mu \leq \sigma\\
\iff\\
\mu-\sigma \leq X \leq \mu + \sigma\\
\iff\\
np - \sqrt{np(1-p)} \leq X \leq np + \sqrt{np(1-p)}\\
\iff\\
\stackrel{\approx 0.658}{2 - \sqrt{2(0.9)}} \leq X \leq \stackrel{\approx 3.342}{2 + \sqrt{2(0.9)}}\\
\end{align*}

Alright. Now I have some numerical values. But now what? I don't know what ##x_1##, ##x_2## etc. are in the hint. Are they referring to ##X=1##, ##X=2## and so on?
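
As a quick numerical check of the bounds above, here is a minimal Python sketch (the variable names are illustrative and not from the thread). It computes ##\mu## and ##\sigma## for ##n=20##, ##p=0.1## and lists the integers that lie in ##[\mu-\sigma, \mu+\sigma]##.

```python
import math

n, p = 20, 0.1                      # parameters from the problem statement

mu = n * p                          # mean of Bin(n, p)
sigma = math.sqrt(n * p * (1 - p))  # standard deviation of Bin(n, p)

lower, upper = mu - sigma, mu + sigma

# X only takes integer values 0..n, so keep the integers inside [lower, upper].
candidates = [k for k in range(n + 1) if lower <= k <= upper]

print(round(lower, 3), round(upper, 3))  # 0.658 3.342
print(candidates)                        # [1, 2, 3]
```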

LCKurtz · Sep 23, 2016

If you are asking what I think you are, the answer is yes. What values can ##X## take that satisfy that last inequality? What is their probability?

TheSodesa · Sep 23, 2016

Well, for a discrete variable ##X##, the point probabilities are given by the distribution function ##f(x)##. If I wanted to find the probability of a certain interval, I would have to calculate the cumulative function in the interval ##0 < X \leq 4##. Or should it be ##0 < X \leq 3##? According to my course handout, for a binomially distributed variable the cumulative function is
\begin{equation}
F(X) = P(X \leq x) = \sum_{t=0}^{\lfloor x \rfloor}b(t; n,p)
\end{equation}
I do not know what the small ##b## is in the definition, and the handout doesn't say how to calculate these values. It simply says to look them up in a table or a computer program.

The floor function ##{\lfloor x \rfloor}## is supposedly the largest whole number ##\leq x##, so I guess my cumulative function would be
$$F(X) = P(1 \leq X \leq 3) = \sum_{t=0}^{\lfloor x \rfloor}b(t; n,p)$$

TheSodesa · Sep 23, 2016

Alright, so I just summed the probabilities ##f(1)##, ##f(2)## and ##f(3)## together, which apparently gave me the right answer. I'm still a bit baffled (might have something to do with the fact that it's almost 4am here), but I'm going to have to let this one go for now.

Thanks for the assistance.
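
The sum of ##f(1)##, ##f(2)## and ##f(3)## mentioned in this post can be reproduced with a short Python sketch. It assumes the mass function from the Homework Equations section, ##f(x) = {n \choose x} p^{x}(1-p)^{n-x}##; the helper name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.1

# Integer values of X inside [mu - sigma, mu + sigma], roughly [0.658, 3.342]
values = [1, 2, 3]

prob = sum(binom_pmf(k, n, p) for k in values)
print(f"{prob:.3f}")  # 0.745
```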

Ray Vickson · Sep 23, 2016

The notation ##b(k)## or ##b(k;n,p)## is used for the probability mass function (NOT density function!):
$$b(k;n,p) = {n \choose k} p^k (1-p)^{n-k}.$$
So the formula in the handout is just ##\sum_k P(X=k) ##, where the sum is over all non-negative integers ##0 \leq k \leq x##.

Your sum ##F(X) = P(1 \leq X \leq 3)## is incorrect (if you meant to write ##F(3)## instead of ##F(X)##). Can you see why?
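
To make the handout's notation concrete, here is a hedged Python sketch of ##b(k;n,p)## and of the distribution function ##F(x) = \sum_{t=0}^{\lfloor x \rfloor} b(t;n,p)##. The function names `b` and `F` simply mirror the symbols in the thread and are not taken from any library.

```python
from math import comb, floor

def b(k: int, n: int, p: float) -> float:
    """Binomial probability mass function b(k; n, p) = P(X = k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def F(x: float, n: int, p: float) -> float:
    """Distribution function P(X <= x): sum of b(t; n, p) for t = 0..floor(x)."""
    if x < 0:
        return 0.0
    top = min(floor(x), n)           # X never exceeds n
    return sum(b(t, n, p) for t in range(top + 1))

n, p = 20, 0.1
print(round(F(3, n, p), 4))          # 0.867
print(F(3.342, n, p) == F(3, n, p))  # True: only floor(x) matters
```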

TheSodesa · Sep 24, 2016

I guess I should have written ##F(1 \leq X \leq 3)## instead? ##F(3)## would imply ##P(X\leq 3)##, if I've understood the notation correctly. I have to say it is kind of annoying they switch symbols for the mass functions between distributions. Can't they just stick to ##f##?
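
For reference, the interval probability can be written as a difference of two values of the distribution function, since ##F(3) = P(X \leq 3)## also counts the outcome ##X = 0##. With ##n=20## and ##p=0.1## (values rounded to five decimals):

$$P(1 \leq X \leq 3) = F(3) - F(0) = \sum_{k=1}^{3} b(k;n,p) \approx 0.86705 - 0.12158 = 0.74547.$$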

Ray Vickson · Sep 24, 2016

They should NOT use ##f## routinely for the probability mass function of a discrete random variable; it should be used primarily for the probability density function of a continuous random variable. However, ##F## is often used for the distribution function for both types. (Older sources routinely called this the cumulative distribution, but nowadays the adjective "cumulative" is being dropped more and more often.)

It would be much more annoying if they used the same symbol ##f## for different mass/density functions, but perhaps with subscripts and other modifiers. It is often more convenient to use something like ##b(k)## for the binomial probability ##P(X_{\text{binomial}}=k)##, something like ##po(k)## for ##P(X_{\text{poisson}}= k)##, etc. This adds to communication clarity, as long as the source (books, notes, or whatever) makes sure to first define the terms before using them. If they do use the notation without definition or explanation then that, indeed, would be annoying.

Back when I was teaching this material I tried to use symbols like ##f(x)##, etc., for density functions of continuous random variables (except when using ##\phi(z)## for the density function of the standard normal variable ##Z \sim N(0,1)##), and things like ##p(k)## for the probability mass function of a discrete random variable. Unfortunately that left something like ##F(x)## or ##F(k)## for both.

One final remark about notation: do not use ##X## as an argument of ##F## or ##f##, because it will hardly ever mean what you think it does. It is important to distinguish between the random variable ##X## (upper case) and its possible values ##x## (lower case). If ##F## is the (cumulative) distribution of ##X##, then ##F(X)## is itself a random variable (for any continuous ##X## it is uniformly distributed on ##(0,1)##), so unless you mean that, you should not write it.
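
As an illustration of why ##F(X)## is a random variable rather than a probability, here is a small simulation sketch. It assumes a continuous ##X##, taken here to be exponential with rate 1 purely for convenience (this distribution is not part of the thread); in that case ##F(X)## should behave like a uniform variable on ##(0,1)##.

```python
import math
import random

random.seed(0)                       # fixed seed so the run is repeatable

def F(x: float) -> float:
    """Distribution function of an exponential variable with rate 1."""
    return 1.0 - math.exp(-x)

# Draw values of X, then plug the random variable itself into F.
samples = [random.expovariate(1.0) for _ in range(100_000)]
u = [F(x) for x in samples]

mean_u = sum(u) / len(u)
frac_below_quarter = sum(v < 0.25 for v in u) / len(u)

print(round(mean_u, 2))              # close to 0.5 for a uniform(0, 1) variable
print(round(frac_below_quarter, 2))  # close to 0.25
```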

TheSodesa · Sep 24, 2016

Got it.
