madilyn
Hi! I'm taking my first course in statistics and am hoping to get some intuition for this set of problems...
Suppose I have a bowl of marbles that each weighs m_{marble}=0.01 kg.
For each marble I swallow, there is a chance p=0.53 that it adds m_{marble} to my weight, and a chance 1-p that it causes me to puke, in which case I lose m_{puke}=0.011 kg of my weight.
1. Assume I religiously swallow n=10^{4} marbles each day. What fraction of the days do I expect to gain weight on?
Let X_{i} denote the random variable for my weight gain from the i-th swallowed marble, i\in\mathbb{Z}^{+}, and assume the X_{i} are independent and identically distributed.
Let Y denote the random variable for my total weight gained each day from swallowing n marbles, Y=\sum_{i=1}^{n}X_{i}. Then, denote
E\left(X\right) := E\left(X_{1}\right)=E\left(X_{2}\right)=...=E\left(X_{n}\right)
Var\left(X\right) := Var\left(X_{1}\right)=Var\left(X_{2}\right)=...=Var\left(X_{n}\right)
so that my daily weight gain is approximately normally distributed with mean
E\left(Y\right)=E\left(X_{1}+...+X_{n}\right)=nE\left(X\right)
and variance
Var\left(Y\right)=Var\left(X_{1}+...+X_{n}\right)=nVar\left(X\right)
Then, I expect to gain weight on 1-P\left(Y\leq0\right)\approx0.892 of the days. Is this correct?
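As a sanity check on the number above, here is a minimal sketch using only the Python standard library. It evaluates the normal approximation and, for comparison, the exact binomial tail, using the fact that Y > 0 exactly when the number K of marbles kept down satisfies 21K > 11n. The variable names and the helper `log_binom_pmf` are my own and not part of the problem statement.

```python
import math

p = 0.53            # chance a marble stays down
n = 10_000          # marbles swallowed per day
m_marble = 0.01     # kg gained when a marble stays down
m_puke = 0.011      # kg lost when a marble comes back up

# Mean and variance of a single X_i (a two-point distribution).
mean_x = p * m_marble - (1 - p) * m_puke
var_x = p * (1 - p) * (m_marble + m_puke) ** 2

# Normal approximation: P(Y > 0) = Phi(n E(X) / sqrt(n Var(X))).
z = n * mean_x / math.sqrt(n * var_x)
p_normal = 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Exact tail: Y > 0 iff 10 K > 11 (n - K), i.e. 21 K > 11 n, K ~ Binomial(n, p).
k_min = (11 * n) // 21 + 1

def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) pmf at k, via log-gamma to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

p_exact = sum(math.exp(log_binom_pmf(k, n, p)) for k in range(k_min, n + 1))

print(f"E(Y) = {n * mean_x:.4f} kg, Var(Y) = {n * var_x:.4f} kg^2")
print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial tail:  {p_exact:.4f}")
```

Both values should land close to 0.89; the small gap between them comes from approximating a lattice distribution by a continuous one without a continuity correction.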
2. Why does Y approximately follow the distribution N\left(nE\left(X\right),\sqrt{nVar\left(X\right)}\right)?
Firstly, am I correct that the variance is nVar\left(X\right) and not n^{2}Var\left(X\right)? Can someone remind me of the intuitive difference between the random variable Y=X_{1}+...+X_{n} and Y=nX (a single draw scaled by n)?
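To make the distinction concrete, here is a small simulation sketch. It compares the sample variance of a sum of n independent draws with that of a single draw multiplied by n. The choice n = 50, the number of trials, and the seed are arbitrary illustration values, and `draw` is a hypothetical helper.

```python
import random
import statistics

rng = random.Random(0)             # fixed seed so the run is repeatable
p, gain, loss = 0.53, 0.01, -0.011
n, trials = 50, 20_000             # small n chosen only for speed

def draw():
    """One marble: +0.01 kg with probability p, otherwise -0.011 kg."""
    return gain if rng.random() < p else loss

var_x = p * (1 - p) * (gain - loss) ** 2   # theoretical Var(X)

# Sum of n independent draws: independent fluctuations partly cancel.
sum_of_n = [sum(draw() for _ in range(n)) for _ in range(trials)]
# One draw scaled by n: a single fluctuation is amplified n times.
n_times_one = [n * draw() for _ in range(trials)]

var_sum = statistics.pvariance(sum_of_n)
var_scaled = statistics.pvariance(n_times_one)

print(f"Var(X_1+...+X_n) ~ {var_sum:.5f}   vs n Var(X)   = {n * var_x:.5f}")
print(f"Var(n X)         ~ {var_scaled:.5f}   vs n^2 Var(X) = {n**2 * var_x:.5f}")
```

The sum should track n Var(X), while the scaled single draw should track n^2 Var(X), because in the second case there is only one source of randomness and nothing averages out.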
Secondly, it's not immediately obvious to me why the distribution approaches a Gaussian as n\rightarrow\infty. Perhaps I can formulate this as the repeated convolution of the discrete distribution of my weight gain/loss for each marble swallowed? Would the discrete convolution approach generalize nicely to the sum of any discrete random variables?
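The convolution idea can be tried numerically. The sketch below assumes the per-marble outcome is encoded as a pmf on the lattice {0, 1} (0 = puke, 1 = kept down), convolves it with itself n times, and compares the result with a Gaussian density of matching mean and variance. The function `convolve` accepts any pmf given as a list on an integer lattice, so the same code applies to other lattice-valued discrete variables. The choice n = 200 and all names are illustrative.

```python
import math

def convolve(a, b):
    """Discrete convolution of two pmfs given as lists on 0, 1, 2, ..."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

p, n = 0.53, 200
step = [1 - p, p]        # pmf of one marble: index 0 = puke, index 1 = kept down

# pmf of the number of marbles kept down after n swallows.
pmf = [1.0]
for _ in range(n):
    pmf = convolve(pmf, step)

mu = n * p
sigma = math.sqrt(n * p * (1 - p))

def normal_pdf(k):
    """Gaussian density with the same mean and variance, evaluated at k."""
    return math.exp(-0.5 * ((k - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

max_gap = max(abs(pmf[k] - normal_pdf(k)) for k in range(n + 1))
print(f"largest pmf value:          {max(pmf):.5f}")
print(f"largest |pmf - normal pdf|: {max_gap:.6f}")
```

The daily weight gain is an affine function of this count, Y = 0.021 K - 0.011 n in kg, so a Gaussian shape for K carries over directly to Y.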
3. For finite n, am I correct that this distribution converges faster to a Gaussian distribution near the center and slower in the tails as n increases? Can I quantify this rate of convergence?
My intuition is that the best way to attack this is to express the distribution of the sum Y=\sum_{i=1}^{n}X_{i} through its Fourier transform (the characteristic function) and then investigate the rate of convergence using an asymptotic expansion of the inversion integral for large n, i.e. the saddle point method?
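Before attempting the asymptotic analysis, the rate can be probed empirically. This sketch measures the largest absolute gap between the exact CDF of the number of marbles kept down and the matching normal CDF for several n, then rescales by sqrt(n). A roughly constant rescaled value is consistent with the 1/sqrt(n) rate given by the Berry-Esseen theorem. The sample sizes and helper names are my own choices, and this measures absolute CDF error only. Relative error far out in the tails is a separate question that this check does not address.

```python
import math

def binom_pmf(k, n, p):
    """Binomial(n, p) pmf at k, computed through logs for numerical stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def phi_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def sup_distance(n, p):
    """Largest |F_n(x) - Phi((x - mu) / sigma)|, checked on both sides of each jump."""
    mu, sigma = n * p, math.sqrt(n * p * (1 - p))
    cdf_before, worst = 0.0, 0.0
    for k in range(n + 1):
        cdf_after = cdf_before + binom_pmf(k, n, p)
        target = phi_cdf((k - mu) / sigma)
        worst = max(worst, abs(cdf_before - target), abs(cdf_after - target))
        cdf_before = cdf_after
    return worst

p = 0.53
sizes = [100, 400, 1600, 6400]
distances = [sup_distance(n, p) for n in sizes]
scaled = [d * math.sqrt(n) for d, n in zip(distances, sizes)]

for n, d, s in zip(sizes, distances, scaled):
    print(f"n = {n:5d}   sup distance = {d:.5f}   sqrt(n) * distance = {s:.4f}")
```

Each fourfold increase in n should roughly halve the sup distance, which is the signature of a 1/sqrt(n) rate.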
Thanks!