Simple problems regarding sum of IID random variables

In summary, the thread models daily weight change as a sum of IID per-marble gains and losses. The main questions are what fraction of days show a net weight gain, why summing n independent copies of a random variable differs from scaling a single copy by n, and how quickly the distribution of the sum converges to a Gaussian.
  • #1
madilyn
Hi! I'm taking my first course in statistics and am hoping to get some intuition for this set of problems...

Suppose I have a bowl of marbles that each weighs [itex]m_{marble}=0.01[/itex] kg.

For each marble I swallow, there is a chance [itex]p=0.53[/itex] that it adds [itex]m_{marble}[/itex] to my weight, and chance [itex]1-p[/itex] that it causes me to puke, therefore losing [itex]m_{puke}=0.011[/itex] kg of my weight.

1. Assume I religiously swallow [itex]n=10^{4}[/itex] marbles each day. What fraction of the days do I expect to gain weight on?

Let [itex]X_{i}[/itex] denote the random variable for the weight I gain from the [itex]i[/itex]-th swallowed marble, indexed by [itex]i\in\mathbb{Z}^{+}[/itex].

Let [itex]Y[/itex] denote the random variable for my total weight gained each day from swallowing [itex]n[/itex] marbles, [itex]Y=\sum_{i=1}^{n}X_{i}[/itex]. Then, denote
[tex]E\left(X\right) := E\left(X_{1}\right)=E\left(X_{2}\right)=...=E\left(X_{n}\right)[/tex]
[tex]Var\left(X\right) := Var\left(X_{1}\right)=Var\left(X_{2}\right)=...=Var\left(X_{n}\right)[/tex]
such that the theoretical distribution of my daily weight gain is approximately normal with mean
[tex]E\left(Y\right)=E\left(X_{1}+...+X_{n}\right)=nE\left(X\right)[/tex]
and variance
[tex]Var\left(Y\right)=Var\left(X_{1}+...+X_{n}\right)=nVar\left(X\right)[/tex]
Then, I expect to gain weight on [itex]1-P\left(Y\leq0\right)\approx0.892[/itex] of the days. Is this correct?
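
(A minimal Python sketch of this calculation, assuming NumPy and SciPy are available; the binomial simulation at the end is just a sanity check of the normal approximation, not part of the reasoning above.)

[code]
import numpy as np
from scipy import stats

p, m_gain, m_puke, n = 0.53, 0.01, 0.011, 10**4

# Per-marble weight change X: +m_gain with probability p, else -m_puke.
EX = p * m_gain - (1 - p) * m_puke
VarX = p * m_gain**2 + (1 - p) * m_puke**2 - EX**2

# Normal approximation to Y = X_1 + ... + X_n.
mu, sigma = n * EX, np.sqrt(n * VarX)
print("normal approximation:", 1 - stats.norm.cdf(0, loc=mu, scale=sigma))  # ~0.892

# Monte Carlo check: each day, the number of marbles kept is Binomial(n, p).
rng = np.random.default_rng(0)
kept = rng.binomial(n, p, size=200_000)
gains = kept * m_gain - (n - kept) * m_puke
print("simulated fraction:  ", (gains > 0).mean())  # also ~0.89
[/code]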

2. Why does [itex]Y[/itex] approximately follow a normal distribution with mean [itex]nE\left(X\right)[/itex] and standard deviation [itex]\sqrt{nVar\left(X\right)}[/itex]?

Firstly, am I correct that the variance is [itex]nVar\left(X\right)[/itex] and not [itex]n^{2}Var\left(X\right)[/itex]? Can someone remind me of the intuitive difference between the random variable [itex]Y=X_{1}+...+X_{n}[/itex] and [itex]Y=50X[/itex]?

Secondly, it's not immediately obvious to me how the distribution approaches a Gaussian as [itex]n\rightarrow\infty[/itex]. Perhaps I can formulate this in terms of the convolution of a discrete function representing the distribution of my weight gain/loss for each marble swallowed? Will the discrete convolution approach generalize nicely to sums of other discrete random variables?
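
(A rough sketch of that convolution idea, assuming NumPy. Since the daily weight change is an affine function of the number of marbles kept, it is enough to convolve the Bernoulli "kept the marble?" PMF with itself; the same np.convolve loop works for any discrete PMF on an evenly spaced grid.)

[code]
import numpy as np

p, n = 0.53, 50

# PMF of a single Bernoulli "kept the marble?" indicator on {0, 1}.
step = np.array([1 - p, p])

# Convolving n copies gives the PMF of the number of marbles kept in a day.
pmf = step.copy()
for _ in range(n - 1):
    pmf = np.convolve(pmf, step)

# Compare with the Gaussian density of matching mean and variance.
k = np.arange(n + 1)
mu, var = n * p, n * p * (1 - p)
gauss = np.exp(-(k - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
print(np.abs(pmf - gauss).max())  # shrinks as n grows
[/code]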

3. For finite [itex]n[/itex], am I correct that this distribution converges faster to a Gaussian distribution near the center and slower in the tails as [itex]n[/itex] increases? Can I quantify this rate of convergence?

Just from my intuition, I think the best strategy to attack this problem is to express the distribution of the sum [itex]Y=\sum_{i=1}^{n}X_{i}[/itex] via its Fourier transform (characteristic function) and then investigate the rate of convergence using an asymptotic expansion of the inversion integral for large [itex]n[/itex], i.e. the saddle point method?
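
(As a first numerical step in that direction, a sketch assuming NumPy: the characteristic function of the sum is the single-marble characteristic function raised to the n-th power, and it can be compared directly with the Gaussian characteristic function of matching mean and variance. This is only the ingredient for a saddle-point or Edgeworth analysis, not the analysis itself.)

[code]
import numpy as np

p, a, b, n = 0.53, 0.01, 0.011, 10**4

EX = p * a - (1 - p) * b
VarX = p * a**2 + (1 - p) * b**2 - EX**2

# Grid of t-values on the scale where the characteristic functions are non-negligible.
t = np.linspace(-5.0, 5.0, 2001) / np.sqrt(n * VarX)

# Characteristic function of one marble, and of the sum of n marbles.
phi_X = p * np.exp(1j * t * a) + (1 - p) * np.exp(-1j * t * b)
phi_Y = phi_X ** n

# Gaussian characteristic function with the same mean and variance.
phi_G = np.exp(1j * t * n * EX - 0.5 * n * VarX * t**2)

print(np.abs(phi_Y - phi_G).max())  # shrinks roughly like 1/sqrt(n)
[/code]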

Thanks!
 
  • #2
madilyn said:
Hi! I'm taking my first course in statistics and am hoping to get some intuition for this set of problems...

This thread will probably be moved to the homework section. (I think the math homework section would be easier to find as a "math homework" subsection of mathematics, so I sympathize with the misplacement.) For homework, you should state the problem itself rather than leave the helpers to guess it from your reasoning.
Then, I expect to gain weight on [itex]1-P\left(Y\leq0\right)\approx0.892[/itex] of the days. Is this correct?

The question of "did gain" vs. "did not gain" is a different question than "how much" is gained. "Did gain" can be represented by a random variable that only takes on the values 1 or 0 (a Bernoulli random variable). The theorems about expectation and variance apply to such a random variable, but you'd get different answers than you get from answering questions about "how much".
Can someone refresh me what's the intuitive difference between the random variable [itex]Y=X_{1}+...+X_{n}[/itex] as compared to [itex]Y=50X[/itex] again?

Think of a computer simulation. The algorithm for simulating 50X is to make one random determination of X and then multiply it by 50. The algorithm for simulating the sum of 50 different realizations of X is to make 50 random determinations of X and add them. When you make 50 different random determinations, opposite extremes have the chance to "cancel out". You don't get that with a simulation of 50X, since you only make one random determination of X.
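
(A minimal simulation along these lines, assuming NumPy and reusing the marble weights from post #1, makes the difference in spread visible.)

[code]
import numpy as np

p, m_gain, m_puke = 0.53, 0.01, 0.011
rng = np.random.default_rng(0)
trials = 100_000

# One random determination of X per trial, multiplied by 50.
x_once = np.where(rng.random(trials) < p, m_gain, -m_puke)
scaled = 50 * x_once

# Fifty independent determinations of X per trial, added together.
x_many = np.where(rng.random((trials, 50)) < p, m_gain, -m_puke)
summed = x_many.sum(axis=1)

print(scaled.var(), summed.var())  # roughly 2500*Var(X) vs. 50*Var(X)
[/code]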
3. For finite [itex]n[/itex], am I correct that this distribution converges faster to a Gaussian distribution near the center and slower in the tails as [itex]n[/itex] increases? Can I quantify this rate of convergence?

That's a good question! I'd guess there are several ways, but I don't know them. One thing to do is to understand the distinction between "pointwise convergence" and "uniform convergence". Convergence of a sequence of functions to another function is more complicated than convergence of a function evaluated at a sequence of points to a single number. There are several different definitions of "convergence" when we deal with sequences of functions converging to a single function.
 
  • #3
Stephen Tashi said:
This thread will probably be moved to the homework section. (I think the math homework section would be easier to find as a "math homework" subsection of mathematics, so I sympathize with the misplacement.) For homework, you should state the problem itself rather than leave the helpers to guess it from your reasoning.

Oh sorry, I came up with the questions myself so it isn't a homework problem.
Stephen Tashi said:
The question of "did gain" vs. "did not gain" is a different question than "how much" is gained. "Did gain" can be represented by a random variable that only takes on the values 1 or 0 (a Bernoulli random variable). The theorems about expectation and variance apply to such a random variable, but you'd get different answers than you get from answering questions about "how much".

I see. What would you intuitively interpret the 89.2% figure as if not the fraction of days I expect to have positive weight gain?

Stephen Tashi said:
Think of a computer simulation. The algorithm for simulating 50X is to make one random determination of X and then multiply it by 50. The algorithm for simulating the sum of 50 different realizations of X is to make 50 random determinations of X and add them. When you make 50 different random determinations, opposite extremes have the chance to "cancel out". You don't get that with a simulation of 50X, since you only make one random determination of X.

Ah, this made a lot of sense! Thanks!

Stephen Tashi said:
That's a good question! I'd guess there are several ways, but I don't know them. One thing to do is to understand the distinction between "pointwise convergence" and "uniform convergence". Convergence of a sequence of functions to another function is more complicated than convergence of a function evaluated at a sequence of points to a single number. There are several different definitions of "convergence" when we deal with sequences of functions converging to a single function.

Yes, I figured this is a difficult problem, but also one of the more interesting ones I've thought of while learning statistics. What's a good starting point for me to learn about the different definitions of "convergence" that you mentioned in your last sentence?

Thanks so much!
 
  • #4
madilyn said:
3. For finite [itex]n[/itex], am I correct that this distribution converges faster to a Gaussian distribution near the center and slower in the tails as [itex]n[/itex] increases? Can I quantify this rate of convergence?
A couple of notes. (Talking informally in terms of X=sum of events, not divided by n.)

1) You are comparing a discrete distribution (a PMF) with a continuous PDF, so the two will always differ significantly at values of X that the discrete distribution cannot take. You can handle this several ways. I think the best way is to look at convergence of the CDFs instead of the densities. The CDF of a discrete real random variable extends unambiguously to all real values.

2) You know that the discrete CDF is 1.0 for all values of X > n, and n + epsilon is where the normal distribution's CDF is farthest from 1.0. I suspect that this is where the greatest difference between the two CDFs is, and that it gives the rate of uniform convergence. I cannot prove it without some work.
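
(A small sketch of that CDF comparison, assuming NumPy and SciPy, again in terms of the number of marbles kept, which determines the daily sum. It evaluates the largest gap between the exact binomial CDF and the matching normal CDF at the jump points and reports where it occurs.)

[code]
import numpy as np
from scipy import stats

p = 0.53
for n in (10, 100, 1000, 10_000):
    k = np.arange(n + 1)
    exact = stats.binom.cdf(k, n, p)
    approx = stats.norm.cdf(k, loc=n * p, scale=np.sqrt(n * p * (1 - p)))
    gap = np.abs(exact - approx)
    # Max gap shrinks roughly like 1/sqrt(n); the last number is where it occurs.
    print(n, gap.max(), k[gap.argmax()])
[/code]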
 
  • #5
Hello there,

It's great to hear that you're taking your first course in statistics and are already thinking deeply about these problems! Let me try to provide some insights and answers to your questions.

1. Your calculation of the probability of gaining weight on a given day looks correct. Keep in mind, though, that it rests on the normal approximation: the exact distribution of your daily total is discrete, so the 0.892 figure is an approximation rather than an exact value. It's a good starting point, but it's worth remembering the limitations of that assumption.

2. The variance of Y is indeed nVar(X) and not n^2Var(X). When we add independent random variables, their variances add, so the sum of n independent copies has variance n times the variance of a single copy. The difference between Y = X_1 + ... + X_n and Y = 50X is that the first is a sum of n independent realizations of X, each contributing its own randomness, while the second is a single realization of X scaled by 50; scaling multiplies the mean by 50 but the variance by 50^2 = 2500.

The reason Y approximately follows a normal distribution as n increases is the Central Limit Theorem: as the sample size n grows, the distribution of the standardized sum (equivalently, the sample mean) approaches a normal distribution regardless of the distribution of the individual values. Here Y is the sum of n IID random variables, which is just n times their sample mean.

3. It is correct that the distribution converges to a Gaussian faster near the center and more slowly in the tails. The rate of convergence depends on the underlying distribution of the individual values: if they are themselves normally distributed, the sum is exactly normal for every n, while if they have heavy tails (a higher probability of extreme values), convergence is slower.

As for quantifying the rate of convergence, it can be done using methods such as the Berry-Esseen theorem or the Edgeworth expansion. These give explicit bounds and correction terms for the error in the normal approximation as a function of n and the moments of the underlying distribution.
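
(For the marble numbers, a rough sketch of what the Berry-Esseen bound gives, assuming NumPy. The classical statement is [itex]\sup_{x}\left|F_{n}\left(x\right)-\Phi\left(x\right)\right|\leq C\rho/\left(\sigma^{3}\sqrt{n}\right)[/itex] with [itex]\rho=E\left|X-E\left(X\right)\right|^{3}[/itex]; the constant C = 0.4748 used below is one published value and should be treated as an assumption of this sketch.)

[code]
import numpy as np

p, a, b, n = 0.53, 0.01, 0.011, 10**4

EX = p * a - (1 - p) * b
sigma = np.sqrt(p * (a - EX) ** 2 + (1 - p) * (-b - EX) ** 2)
rho = p * abs(a - EX) ** 3 + (1 - p) * abs(-b - EX) ** 3  # third absolute central moment

C = 0.4748  # one published value of the Berry-Esseen constant (assumed here)
print(C * rho / (sigma**3 * np.sqrt(n)))  # uniform bound on the CDF error, about 0.005
[/code]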
 

1. What are IID random variables?

IID stands for Independent and Identically Distributed. In simple terms, it means the random variables are independent of one another and all follow the same probability distribution: no variable affects any other, and each has the same distribution of possible values.

2. What is the formula for calculating the sum of IID random variables?

The formula for calculating the sum of IID random variables is
[tex]S = X_{1} + X_{2} + X_{3} + \dots + X_{n}[/tex]
where [itex]X_{1}, X_{2}, X_{3}, \dots, X_{n}[/itex] are the individual random variables.

3. How do you determine the mean of the sum of IID random variables?

The mean of the sum of IID random variables can be determined by taking the sum of the individual means of the random variables. In other words, the mean of the sum is equal to the sum of the means.
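
For instance, with the marble numbers from the thread ([itex]p=0.53[/itex], gain [itex]0.01[/itex] kg, loss [itex]0.011[/itex] kg, [itex]n=10^{4}[/itex]):
[tex]E\left(Y\right)=\sum_{i=1}^{n}E\left(X_{i}\right)=n\left[p\,m_{marble}-\left(1-p\right)m_{puke}\right]=10^{4}\left[0.53\left(0.01\right)-0.47\left(0.011\right)\right]=1.3\text{ kg}[/tex]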

4. What is the central limit theorem and how does it relate to sum of IID random variables?

The central limit theorem states that the sum of a large number of independent random variables is approximately normally distributed, even if the individual random variables are not. This means that as the number of IID random variables increases, the distribution of their sum approaches a normal distribution, which makes it easier to analyze and make predictions.

5. What are some real-world applications of sum of IID random variables?

The sum of IID random variables is commonly used in statistics, finance, and risk analysis. For example, it can be used to calculate the expected value and variance in stock prices, or to analyze the risk of a portfolio by considering the sum of returns from multiple investments. It is also used in quality control to determine the total variation in a manufacturing process.
