Monte Carlo method to generate standard normal

SUMMARY

The discussion focuses on using the Monte Carlo method to estimate the mean, variance, and kurtosis of the standard normal distribution, specifically implemented in Matlab. The user initially attempted to calculate the expected value from the probability density function but realized the need to generate a large sample of standard normal random variables using the randn function. The final Matlab code computes the mean, variance, and kurtosis, demonstrating the importance of sufficient sample sizes and an understanding of the statistical properties of the generated data.

PREREQUISITES
  • Understanding of the Monte Carlo method
  • Familiarity with Matlab programming
  • Knowledge of statistical concepts such as mean, variance, and kurtosis
  • Experience with random number generation using randn in Matlab
NEXT STEPS
  • Explore advanced Monte Carlo techniques for higher-dimensional distributions
  • Learn about low-discrepancy sequences for improved convergence rates
  • Investigate the use of var(x,1) and kurtosis(x,1) for uncorrected statistics in Matlab
  • Study the implications of sample size on statistical estimators
USEFUL FOR

Statisticians, data scientists, and Matlab users interested in implementing Monte Carlo simulations for statistical analysis and those looking to enhance their understanding of random variable generation and statistical properties.

Jameson
Problem:
I need to use the Monte Carlo method to generate the mean, variance and kurtosis of the standard normal distribution. It has to be coded in Matlab, so there are two parts to this question:

1) The theory
2) The code

Any help on either is appreciated.

My attempt:
To find $E[X]$ for a random variable we can use the definition $$E[X]=\int_{-\infty}^{\infty}x*f(x)dx$$, where $f(x)$ is the distribution's pdf.

I believe the first step is to generate a random sample to use. My thought is to use standard uniform random variables on $[0,1]$, but I'm not sure.
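(For reference, uniforms on $[0,1]$ can be mapped to standard normals with the inverse-CDF transform. A minimal Matlab sketch, assuming the Statistics Toolbox's norminv is available:)

Code:
u = rand(10000,1);   % standard uniform draws in (0,1)
x = norminv(u);      % inverse normal CDF maps them to N(0,1)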

Once we do that, we average the $n$ sampled values of $X$; the sample mean should converge to $0$.
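Written out, the Monte Carlo estimator is $$\hat{\mu}_n=\frac{1}{n}\sum_{i=1}^{n}X_i,$$ which by the law of large numbers converges to $E[X]=0$ as $n\to\infty$.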

There are clearly some mistakes in this setup, so any guidance or comments are appreciated.
 
Oh man was I way off. :o

I simply need to generate a large number of standard normal random variables and then take their average to estimate $E[X]$.

Here's the Matlab code I did for this in case anyone is interested:

Code:
clear all;

N = 10000;
x = randn(N,1);      % N draws from the standard normal

m = zeros(N,1);      % running estimate of E[X]
for i = 1:N
    m(i) = sum(x(1:i))/i;
end

figure(1)
plot(1:N, m)         % the running mean settles towards 0

Now on to the variance and kurtosis!
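For what it's worth, the same running-mean plot can be produced without the loop using cumsum (a vectorized sketch, not from the original post):

Code:
x = randn(10000,1);
m = cumsum(x) ./ (1:10000)';   % running mean after each draw
figure(1)
plot(1:10000, m)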
 
Jameson said:
Now on to the variance and kurtosis!

Code that generalises could be something like:

Code:
N=100;M=20;
x=randn(N,M);
mN=mean(x);  %takes means of the columns of x, leaves a sample of M means of N standard normals
GrandMean=mean(mN)

Code:
N=100;M=20;
x=randn(N,M);
vN=var(x,0);  %takes var of the columns of x, leaves a sample of M bias corrected vars of N standard normals
GrandVar=mean(vN)

Code:
N=100;M=20;
x=randn(N,M);
kN=kurtosis(x,0);  %takes kurts of the columns of x, leaves a sample of M bias 
                   %corrected kurts of N standard normals
GrandKur=mean(kN)

If you just want the statistics from a single large sample, set M=1.

If you are interested in the sampling distribution of your statistics, look at the histograms of mN, vN and kN.
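For example, with the variables above (a minimal sketch):

Code:
figure
subplot(3,1,1); hist(mN); title('sampling distribution of the mean')
subplot(3,1,2); hist(vN); title('sampling distribution of the variance')
subplot(3,1,3); hist(kN); title('sampling distribution of the kurtosis')

(With only M=20 columns the histograms will be coarse; a larger M, say 1000, shows the shapes more clearly.)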

You might also be interested in the uncorrected statistics, in which case (IIRC) use var(x,1) and kurtosis(x,1).
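In code that would look something like the following, assuming the same x as above (vN_raw and kN_raw are just illustrative names):

Code:
vN_raw = var(x,1);        % normalised by N rather than N-1
kN_raw = kurtosis(x,1);   % flag 1: no bias correction (the default)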

For some reason I have a tendency to write things as one long column or row vector, so I'll use a single 10000x1 vector rather than an NxM matrix. Sometimes the matrix configuration has benefits, as you're pointing out.

Here is the code I just finished. I'm not quite satisfied because I had to cheat a little bit to get the kurtosis and use the fact that I know the mean is 0 and the S.D. is 1, instead of using the estimators I found.

Code:
clear all;

N = 10000;
x = randn(N,1);

% preallocate the running estimates and their sample variances
m = zeros(N,1);  sse = zeros(N,1);  v = zeros(N,1);  k = zeros(N,1);
varianceofm = zeros(N,1);  varianceofv = zeros(N,1);  varianceofk = zeros(N,1);

for i = 1:N
    % running mean
    m(i) = sum(x(1:i))/i;
    varianceofm(i) = var(m(1:i));

    % running variance: squared deviation of the i-th draw from the current running mean
    sse(i) = (x(i) - m(i))^2;
    v(i) = (1/i)*sum(sse(1:i));
    varianceofv(i) = var(v(1:i));

    % running kurtosis, using the known mean 0 and variance 1
    k(i) = (1/i)*sum(x(1:i).^4);
    varianceofk(i) = var(k(1:i));
end
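If one wanted to avoid hard-coding the known mean and variance, the standardised fourth moment could be recomputed from the running estimates instead (a sketch, not the original approach; k2, mi and vi are illustrative names):

Code:
k2 = zeros(numel(x),1);
for i = 2:numel(x)
    mi = mean(x(1:i));
    vi = var(x(1:i), 1);                       % ML variance, normalised by i
    k2(i) = mean((x(1:i) - mi).^4) / vi^2;     % sample kurtosis of the first i draws
end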
 
You can also use a quasirandom (low-discrepancy) number sequence to converge faster and require fewer samples. It's not that big a deal for a 1D integral, but it's good to know, since most numerical computing languages have something like that built in as a near drop-in replacement for e.g. randn().

With pseudorandom sequences the Monte Carlo error decreases as $O(1/\sqrt{n})$ for $n$ samples, whereas low-discrepancy sequences achieve roughly $O(1/n)$ (up to logarithmic factors); i.e. after $n$ samples the error will be on the order of $\frac{1}{n}$ rather than $\frac{1}{\sqrt{n}}$. The catch is that you need to pay attention to how you generate those sequences, as they may be correlated depending on the algorithm used (especially in high dimensions, e.g. > 10, where it just gets messy), but the language will usually take care of that for you.
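In Matlab this could look something like the following sketch, assuming the Statistics Toolbox's haltonset, scramble, net and norminv:

Code:
p = haltonset(1, 'Skip', 1e3, 'Leap', 1e2);   % 1-D Halton (low-discrepancy) sequence
p = scramble(p, 'RR2');                       % scramble to reduce correlation artifacts
u = net(p, 10000);                            % 10000 quasi-random points in (0,1)
x = norminv(u);                               % map to standard normal via the inverse CDF
[mean(x) var(x) kurtosis(x)]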
 
