Variance of statistic used in runs test

DavideGenoa · Jul 30, 2013

Hi, friends! Since this is my first post, I want to introduce myself: I am an Italian who is trying to teach himself mathematics and the natural sciences after a strictly humanities-centered schooling, and I am very tempted to enrol in a university science course.
I read in the Italian language Wikipedia that the variance [itex]\text{Var}_{H_0}(R)[/itex] of the statistic [itex]R[/itex] used in the Wald-Wolfowitz test, under the null hypothesis that the [itex]X_1,...,X_N[/itex] are independent, is [tex](4N-6)p(1-p)-(12N-20)p^2(1-p)^2.[/tex] It is worth noting that, when the test is discussed, it is common to give what I think are the approximations used with the Gaussian approximation of [itex]R[/itex], rather than the exact expectation and variance...
That statistic, as my book (S.M. Ross, Introduction to Probability and Statistics for Engineers and Scientists) explains, and as I also find in the German language Wikipedia, has the probability mass function [tex]P_{H_0}(R=2k)=2\frac{\binom{N^+ -1}{k-1}\binom{N^- -1}{k-1}}{\binom{N^+ +N^-}{N^+}}[/tex][tex]P_{H_0}(R=2k+1)=\frac{\binom{N^+ -1}{k-1}\binom{N^- -1}{k}+\binom{N^+ -1}{k}\binom{N^- -1}{k-1}}{\binom{N^+ +N^-}{N^+}}[/tex] where [itex]N^+[/itex] and [itex]N^-[/itex] are the numbers of observations of the two kinds.
The Italian language Wikipedia treats the statistic [itex]R[/itex], under the null hypothesis of independence, as equal to [itex]1+\sum_{i=1}^{N-1}|X_i-X_{i+1}|[/itex], where the [itex]X_i[/itex] are Bernoulli random variables with expectation [itex]p[/itex]. The expectation [itex]E_{H_0}[R][/itex] given in that Wikipedia is the same one discussed by a user here, who gives a short proof that [itex]E_{H_0}[R]=1+2(N-1)p(1-p)[/itex].
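As a sanity check on the two formulas quoted above, here is a minimal Python sketch that enumerates every 0/1 sequence of length [itex]N[/itex], assuming independent Bernoulli([itex]p[/itex]) variables as in the Italian Wikipedia formulation. The function names are my own and purely illustrative, and the comparison only makes sense for [itex]N \ge 2[/itex].

```python
from itertools import product


def runs(bits):
    """Number of runs: 1 plus the number of adjacent positions that differ."""
    return 1 + sum(abs(a - b) for a, b in zip(bits, bits[1:]))


def exact_moments(N, p):
    """Exact mean and variance of R over all 2**N sequences of independent Bernoulli(p)."""
    mean = 0.0
    second = 0.0
    for bits in product((0, 1), repeat=N):
        ones = sum(bits)
        prob = p**ones * (1 - p) ** (N - ones)
        r = runs(bits)
        mean += prob * r
        second += prob * r * r
    return mean, second - mean**2


def formula_mean(N, p):
    """Expectation quoted in the thread: 1 + 2(N-1)p(1-p)."""
    return 1 + 2 * (N - 1) * p * (1 - p)


def formula_var(N, p):
    """Variance quoted in the thread: (4N-6)p(1-p) - (12N-20)p^2(1-p)^2."""
    q = p * (1 - p)
    return (4 * N - 6) * q - (12 * N - 20) * q**2


if __name__ == "__main__":
    for N in (2, 5, 10):
        for p in (0.2, 0.5, 0.7):
            m, v = exact_moments(N, p)
            print(N, p, m, formula_mean(N, p), v, formula_var(N, p))
```

Enumeration is exponential in [itex]N[/itex], so this is only practical for small sequences, but it compares the quoted expectation and variance directly with their definitions.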

As for the variance, I have tried hard to prove the formula myself, but without success. I have tried to calculate the second moment by manipulating the sum [itex]\sum_{k=1}^{\min\{N^+ ,N^-\}}(4k^2 P_{H_0}(R=2k)+(2k+1)^2P_{H_0}(R=2k+1))[/itex] if [itex]N^+ \ne N^-[/itex], and by treating the case [itex]N^+ =N^-[/itex] similarly, where I would say that the second moment is [itex]E_{H_0}[R^2]=\sum_{k=1}^{N^+ -1}(4k^2 P_{H_0}(R=2k)+(2k+1)^2 P_{H_0}(R=2k+1))+\frac{8(N^+)^2}{\binom{2N^+}{N^+}}[/itex], but I haven't been able to simplify those sums with their factorials.
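The sums described above can at least be evaluated numerically. The following is a minimal Python sketch, with function names that are my own, which computes the probability mass function for given counts [itex]N^+[/itex] and [itex]N^-[/itex] (both assumed to be at least 1) and checks it against a brute-force count over all arrangements. Note that this probability mass function conditions on the two counts, so the moments it yields are conditional ones and need not coincide with the formula in [itex]p[/itex] quoted earlier. `math.comb` returns 0 when the lower index exceeds the upper one, so the same loop also covers the case [itex]N^+ = N^-[/itex].

```python
from itertools import combinations
from math import comb


def runs_pmf(n_plus, n_minus):
    """Return {r: P(R = r)} for fixed counts, all arrangements equally likely."""
    total = comb(n_plus + n_minus, n_plus)
    pmf = {}
    for k in range(1, min(n_plus, n_minus) + 1):
        even = 2 * comb(n_plus - 1, k - 1) * comb(n_minus - 1, k - 1)
        odd = (comb(n_plus - 1, k - 1) * comb(n_minus - 1, k)
               + comb(n_plus - 1, k) * comb(n_minus - 1, k - 1))
        if even:
            pmf[2 * k] = even / total
        if odd:
            pmf[2 * k + 1] = odd / total
    return pmf


def moments(pmf):
    """Mean, second moment and variance of a pmf given as a dict."""
    mean = sum(r * p for r, p in pmf.items())
    second = sum(r * r * p for r, p in pmf.items())
    return mean, second, second - mean**2


def brute_force_pmf(n_plus, n_minus):
    """Same distribution, obtained by listing every placement of the n_plus symbols."""
    n = n_plus + n_minus
    counts = {}
    for plus_positions in combinations(range(n), n_plus):
        chosen = set(plus_positions)
        seq = [1 if i in chosen else 0 for i in range(n)]
        r = 1 + sum(a != b for a, b in zip(seq, seq[1:]))
        counts[r] = counts.get(r, 0) + 1
    total = comb(n, n_plus)
    return {r: c / total for r, c in counts.items()}


if __name__ == "__main__":
    for n_plus, n_minus in ((3, 5), (4, 4)):
        pmf = runs_pmf(n_plus, n_minus)
        print(n_plus, n_minus, moments(pmf), moments(brute_force_pmf(n_plus, n_minus)))
```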
Does anybody know, or can anybody link, a proof of the formula for the variance [itex]\text{Var}_{H_0}(R)[/itex]?
I [itex]\infty[/itex]-ly thank you all!

Stephen Tashi · Aug 3, 2013

DavideGenoa said:

The Italian language Wikipedia considers the statistic [itex]R[/itex] to be the same, under the null hypothesis of independence, as [itex]1+\sum_{i=1}^{N-1}|X_i-X_{i+1}|[/itex] where the [itex]X_i[/itex] are Bernoulli random variables with expectation [itex]p[/itex]

One thought is to let [itex] Y_i = |X_i - X_{i+1}| [/itex].

Only consecutive [itex]Y_i[/itex] are dependent (non-adjacent ones share no [itex]X[/itex]), so

[tex] Var( 1 + \sum_{i=1}^{N-1} Y_i ) = \sum_{i=1}^{N-1} Var(Y_i) + 2 \sum_{i=1}^{N-2} Cov(Y_i,Y_{i+1}) [/tex]

Then you have to find formulas for [itex] Var(Y_i) [/itex] and [itex] Cov(Y_i,Y_{i+1}) [/itex]. That might not be easy, but at least it focuses our attention on only three of the [itex] X_i [/itex] at a time.
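
A sketch of how that calculation might go, assuming the [itex]X_i[/itex] are independent Bernoulli([itex]p[/itex]) variables and [itex]N \ge 2[/itex], and writing [itex]q = p(1-p)[/itex] as a shorthand introduced only for this sketch. Each [itex]Y_i[/itex] is itself a Bernoulli variable, equal to 1 exactly when [itex]X_i \ne X_{i+1}[/itex], so

[tex] P(Y_i = 1) = 2p(1-p) = 2q, \qquad Var(Y_i) = 2q(1-2q) = 2q - 4q^2 [/tex]

The product [itex]Y_i Y_{i+1}[/itex] equals 1 only when [itex](X_i, X_{i+1}, X_{i+2})[/itex] is [itex](0,1,0)[/itex] or [itex](1,0,1)[/itex], so

[tex] E[Y_i Y_{i+1}] = p(1-p)^2 + p^2(1-p) = q, \qquad Cov(Y_i,Y_{i+1}) = q - (2q)^2 = q - 4q^2 [/tex]

Substituting into the sum above gives

[tex] Var( 1 + \sum_{i=1}^{N-1} Y_i ) = (N-1)(2q - 4q^2) + 2(N-2)(q - 4q^2) = (4N-6)q - (12N-20)q^2 [/tex]

which is the expression [itex](4N-6)p(1-p)-(12N-20)p^2(1-p)^2[/itex] quoted from the Italian language Wikipedia.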
