MHB Monte Carlo method to generate standard normal

AI Thread Summary
The discussion focuses on using the Monte Carlo method to estimate the mean, variance, and kurtosis of the standard normal distribution in Matlab. The initial approach involved generating random samples and calculating averages, but the user realized the need for a more systematic method. Several code snippets were shared, illustrating how to generate standard normal variables and compute the required statistics efficiently. The conversation also touched on the benefits of using quasirandom sequences for faster convergence in numerical computations. Ultimately, the participants emphasized the importance of understanding the statistical properties and coding techniques for accurate results.
Jameson
Insights Author
Gold Member
MHB
Messages
4,533
Reaction score
13
Problem:
I need to use the Monte Carlo method to generate the mean, variance and kurtosis of the standard normal distribution. It has to be coded in Matlab, so there are two parts to this question:

1) The theory
2) The code

Any help on either is appreciated.

My attempt:
To find $E[X]$ for a random variable we can use the definition $$E[X]=\int_{-\infty}^{\infty}x*f(x)dx$$, where $f(x)$ is the distribution's pdf.

I believe the first step is to general a random sample to use. Now my thought is to use random standard uniform variables on $[0,1]$ but I'm not sure.

Once we do that we find the mean for each $X$ and average them for n samples. It should converge to 0.

There are clearly some mistakes here in the set up so any guidance or comments is appreciated.
 
Physics news on Phys.org
Oh man was I way off. :o

I need to simply generate a large amount of standard normal random variables and then take their average to estimate $E[X]$.

Here's the Matlab code I did for this in case anyone is interested:

Code:
clear all;

x=randn(10000,1);

for i=1:10000
    m(i,1)=sum(x(1:i))/i;
end;

figure(1)
plot (1:10000,m(1:10000,:))

Now on to the variance and kurtosis!
 
Jameson said:
Oh man was I way off. :o

I need to simply generate a large amount of standard normal random variables and then take their average to estimate $E[X]$.

Here's the Matlab code I did for this in case anyone is interested:

Code:
clear all;

x=randn(10000,1);

for i=1:10000
    m(i,1)=sum(x(1:i))/i;
end;

figure(1)
plot (1:10000,m(1:10000,:))

Now on to the variance and kurtosis!

Code that generalises could be something like:

Code:
N=100;M=20;
x=randn(N,M);
mN=mean(x);  %takes means of the columns of x, leaves a sample of M means of N standard normals
GrandMean=mean(mN)

Code:
N=100;M=20;
x=randn(N,M);
vN=var(x,0);  %takes var of the columns of x, leaves a sample of M bias corrected vars of N standard normals
GrandVar=mean(vN)

Code:
N=100;M=20;
x=randn(N,M);
kN=kurtosis(x,0);  %takes kurts of the columns of x, leaves a sample of M bias 
                   %corrected kurts of N standard normals
GrandKur=mean(kN)

If you just want the statistic from a large sample set M=1.

If you are interested in the sampling distribution of your statistics look at the histograms of mN, vN and kN.

You might also be interested in the un-corrected stats in which case IIRC use var(x,1) and kurtosis(x,1)

.
 
Last edited:
For some reason I have a tendency to write things as one long column or row vector, so I'll make something that's 100x1 instead of 10x10. Sometimes the 10x10 configuration seems to have benefits, like you're pointing out.

Here is the code I just finished. I'm not quite satisfied because I had to cheat a little bit to get the kurtosis and use the fact that I know the mean is 0 and the S.D. is 1, instead of using the estimators I found.

Code:
clear all;
x=randn(10000,1);

for i=1:10000
    %mean
    m(i,1)=sum(x(1:i))/i;
    varianceofm(i,1)=var(m(1:i));

    %variance
    sse(i,1)=(x(i)-m(i))^2;
    v(i,1)=(1/(i))*sum(sse(1:i));
    varianceofv(i,1)=var(v(1:i));

    %kurtosis
    k(i,1)=(1/i)*sum(x(1:i).^4);
    varianceofk(i,1)=var(k(1:i));
end;
 
You can also use a quasirandom (low discrepancy) number sequence to converge faster and require less samples, though it's not that big a deal for a 1D integral, but it's good to know since most numerical computing languages have something like that built-in as a drop-in replacement for e.g. randn().

Uniform random sequences converge at a rate of $O(\sqrt{n})$ for $n$ samples, whereas low-discrepancy sequences converge at a rate of $O(n)$, i.e. after $n$ samples the error will be on the order of $\frac{1}{n}$ compared to $\frac{1}{\sqrt{n}}$. The catch is that you need to pay attention to how you generate those sequences, as they may be correlated depending on the algorithm used (especially in high dimensions e.g. > 10, where it just gets messy) but the language will take care of that for you.
 
Back
Top