Need statistics help working with normal distribution

tim8691 · Jun 30, 2011

Hello experts,

Thanks to discussions with Stephen Tashi for getting me this far.

The problem statement is on page 1 of the attached PDF. I need help solving for Qc in equation form, as a function of the other variables (N and C), preferably using erfc so I can program an accurate algorithm for very large values of N (>1E+16).

Pages 2 and 3 attempt to solve this problem, but only take me so far. I'm not sure if this is the right approach, or if there's just another step or two needed beyond what's presented there.

I'd appreciate any help figuring this out.

Tim

mathman · Jun 30, 2011

Attached pdf?

tim8691 · Jun 30, 2011

Hmm, not sure why it didn't take. I'll try again.

bpet · Jun 30, 2011

To rephrase, the problem would be to determine the Q such that
[tex]P[\max(X_1,...,X_N)-\min(X_1,...,X_N) \le 2Q]=C[/tex]
where the [itex]X_i[/itex] are iid N(0,1). The LHS can be written as
[tex]\int_{-\infty}^{\infty}N(F(x+2Q)-F(x))^{N-1}f(x) dx[/tex]
where [itex]F(x)[/itex] and [itex]f(x)[/itex] are the Normal CDF and PDF respectively, though the integral may not have a closed form.
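
Even without a closed form, the integral can be evaluated numerically for moderate N. What follows is only a rough sketch in Python, assuming composite Simpson's rule on [-10, 10] and plain bisection for Q. The names `range_cdf` and `solve_half_range` are made up for illustration, and this approach would likely lose accuracy for the very large N in the question.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def range_cdf(n, r, lo=-10.0, hi=10.0, steps=2000):
    """P[max - min <= r] for n iid N(0,1) draws, via composite Simpson's rule.

    The grid limits and step count are assumptions; steps must be even.
    """
    if r <= 0.0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        inner = STD_NORMAL.cdf(x + r) - STD_NORMAL.cdf(x)
        value = n * inner ** (n - 1) * STD_NORMAL.pdf(x)
        # Simpson weights: 1 at the end points, then alternating 4 and 2.
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * value
    return total * h / 3.0


def solve_half_range(n, c, tol=1e-9):
    """Bisection for the Q that satisfies range_cdf(n, 2Q) == c."""
    low, high = 0.0, 10.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if range_cdf(n, 2.0 * mid) < c:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)
```

A quick sanity check: for N = 2 the range is [itex]|X_1 - X_2|[/itex] with [itex]X_1 - X_2[/itex] distributed as N(0,2), so `range_cdf(2, r)` should match [itex]erf(r/2)[/itex].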

If instead you solve for the Q such that
[tex]P[-Q\le \min(X_1,...,X_N) \le \max(X_1,...,X_N) \le Q] = C[/tex]
then the left hand side is
[tex]P[-Q\le X_1 \le Q]^N = (F(Q)-F(-Q))^N = (2F(Q)-1)^N[/tex]
The solution is
[tex]Q = F^{-1}((1+C^{1/N})/2)[/tex]
The normal quantile function is implemented in many computer languages. For example, with C=0.95 and N=1e6, the Excel formula "=NORMSINV((1+0.95^(1/1e6))/2)" returns 5.446768, which agrees with R and MATLAB.

Edit: if you must use erf, use [itex]F(x)=(erf(x/\sqrt{2})+1)/2[/itex] so
[tex]Q=\sqrt{2}erf^{-1}(C^{1/N})[/tex]
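
For very large N, [itex]C^{1/N}[/itex] rounds to exactly 1 in double precision, so evaluating either form of the solution directly breaks down. Rewriting in terms of the tail, [itex]erfc(Q/\sqrt{2}) = 1 - C^{1/N}[/itex], avoids this. Here is a minimal Python sketch, assuming the standard library's `statistics.NormalDist.inv_cdf` as a stand-in for an explicit inverse erfc; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def two_sided_tail(n, c):
    """Return 1 - C**(1/N) without first rounding C**(1/N) to 1.0."""
    # expm1(t) computes exp(t) - 1 accurately for tiny t.
    return -math.expm1(math.log(c) / n)


def q_from_tail(n, c):
    """Q such that erfc(Q / sqrt(2)) == 1 - C**(1/N).

    Uses 1 - F(Q) = tail / 2, so Q = -F^{-1}(tail / 2) by symmetry.
    """
    return -NormalDist().inv_cdf(0.5 * two_sided_tail(n, c))
```

With C=0.95 and N=1e6 this should reproduce the value quoted above. The design choice is to carry the small quantity [itex]1 - C^{1/N}[/itex] itself rather than a number close to 1, which is the same reason one would reach for erfc instead of erf.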

tim8691 · Jul 1, 2011

Thanks so much bpet, I believe your solution above solves the equation:

(p(Q))^N = C

as I've defined on page 2 of the attached document. This Q is the probability of running one experiment of population N and computing the range (e.g. 2Q) of the normally distributed random variable x.

But after that, how do we account for running that experiment many times and solving for Qc? That is: "If we repeat the above experiment an infinite (or very large) number of times, and create a histogram from all the values of 2Q measured, how far into this new histogram (i.e. 2Qc) do we have to go to contain a fraction C of the population?"

I think Qc should differ from Q. Is that right?
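
For what it's worth, the repeated experiment described here can be simulated directly. Below is a rough Monte Carlo sketch in Python, assuming a small N so that it runs quickly. The function names, the trial count, and the seed are arbitrary choices for illustration, not anything taken from the attached document.

```python
import random
from statistics import NormalDist


def simulated_half_range_quantile(n, c, trials=5000, seed=12345):
    """Empirical C-quantile of (max - min) / 2 over repeated experiments."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    half_ranges = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        half_ranges.append(0.5 * (max(sample) - min(sample)))
    half_ranges.sort()
    # Index of the sorted value with roughly a fraction c of values below it.
    index = min(trials - 1, int(c * trials))
    return half_ranges[index]


def centered_interval_q(n, c):
    """Q from the formula F^{-1}((1 + C**(1/N)) / 2) for a centered interval."""
    return NormalDist().inv_cdf(0.5 * (1.0 + c ** (1.0 / n)))
```

Comparing `simulated_half_range_quantile(50, 0.95)` with `centered_interval_q(50, 0.95)` gives a feel for the gap. Since every sample that lies inside [-Q, Q] also has a range of at most 2Q, the simulated Qc should come out no larger than Q, up to sampling noise.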
