# Variance of statistic used in runs test

1. Jul 30, 2013

### DavideGenoa

Hi, friends! Since this is my first post, let me introduce myself: I am an Italian trying to teach myself mathematics and the natural sciences, despite a strictly humanities-centred school background, and I am very tempted to enrol in a university science programme.
I read in the Italian-language Wikipedia that the variance $\text{Var}_{H_0}(R)$ of the statistic $R$ used in the Wald-Wolfowitz test, under the null hypothesis that $X_1,\dots,X_N$ are independent, is$$(4N-6)p(1-p)-(12N-20)p^2(1-p)^2.$$ It is worth noticing that, when discussing the test, it is common to give what I believe are approximations used for the Gaussian approximation of $R$, rather than the exact expectation and variance...
That statistic, as my book (S. M. Ross, *Introduction to Probability and Statistics for Engineers and Scientists*) explains, and as I also find in the German-language Wikipedia, has the probability mass function$$P_{H_0}(R=2k)=2\frac{\binom{N^+ -1}{k-1}\binom{N^- -1}{k-1}}{\binom{N^+ +N^-}{N^+}}$$$$P_{H_0}(R=2k+1)=\frac{\binom{N^+ -1}{k-1}\binom{N^- -1}{k}+\binom{N^+ -1}{k}\binom{N^- -1}{k-1}}{\binom{N^+ +N^-}{N^+}}$$
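To make sure I copied the mass function right, here is a small Python sanity check (the counts $N^+=5$, $N^-=7$ are just illustrative values of my own choosing): it builds the PMF with exact fractions and verifies that the probabilities sum to one.

```python
from fractions import Fraction
from math import comb

def runs_pmf(n_plus, n_minus):
    """Exact null distribution of the number of runs R, conditional on
    the counts n_plus and n_minus of the two symbols (every arrangement
    equally likely).  Returns {r: P(R = r)} as exact fractions."""
    total = comb(n_plus + n_minus, n_plus)
    pmf = {}
    for k in range(1, n_plus + n_minus):
        # even number of runs: R = 2k
        p_even = Fraction(2 * comb(n_plus - 1, k - 1) * comb(n_minus - 1, k - 1),
                          total)
        # odd number of runs: R = 2k + 1
        p_odd = Fraction(comb(n_plus - 1, k - 1) * comb(n_minus - 1, k)
                         + comb(n_plus - 1, k) * comb(n_minus - 1, k - 1),
                         total)
        if p_even:
            pmf[2 * k] = p_even
        if p_odd:
            pmf[2 * k + 1] = p_odd
    return pmf

pmf = runs_pmf(5, 7)
print(sum(pmf.values()))  # 1 -- the probabilities sum to one
```

(`math.comb(n, k)` conveniently returns 0 when `k > n`, so the out-of-range binomial coefficients vanish on their own.)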
The Italian-language Wikipedia takes the statistic $R$, under the null hypothesis of independence, to be $1+\sum_{i=1}^{N-1}|X_i-X_{i+1}|$, where the $X_i$ are Bernoulli random variables with expectation $p$; the expectation $E_{H_0}[R]$ given there is the same one discussed by a user here, who gives a short proof that $E_{H_0}[R]=1+2(N-1)p(1-p)$.
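That expectation can at least be checked by brute force for small $N$ (a Python sketch; $N=8$ and $p=0.3$ are arbitrary illustrative values): enumerating all $2^N$ Bernoulli sequences and averaging the run count reproduces $1+2(N-1)p(1-p)$.

```python
from itertools import product

def runs(seq):
    """R = 1 + number of adjacent unequal pairs."""
    return 1 + sum(a != b for a, b in zip(seq, seq[1:]))

def exact_mean_runs(N, p):
    """E[R] by enumerating all 2**N sequences of i.i.d. Bernoulli(p) variables."""
    mean = 0.0
    for seq in product((0, 1), repeat=N):
        ones = sum(seq)
        mean += p ** ones * (1 - p) ** (N - ones) * runs(seq)
    return mean

N, p = 8, 0.3  # illustrative values only
print(exact_mean_runs(N, p))          # agrees with the closed form below
print(1 + 2 * (N - 1) * p * (1 - p))
```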

As to the variance, I have tried a lot, but my efforts to prove the formula by myself have been useless. I have tried to calculate the second moment by manipulating the sum $\sum_{k=1}^{\min\{N^+ ,N^-\}}(4k^2 P_{H_0}(R=2k)+(2k+1)^2P_{H_0}(R=2k+1))$ when $N^+ \ne N^-$, and by treating the case $N^+ = N^-$ similarly, in which I would say that the second moment is $E_{H_0}[R^2]=\sum_{k=1}^{N^+ -1}(4k^2 P_{H_0}(R=2k)+(2k+1)^2 P_{H_0}(R=2k+1))+\frac{8(N^+)^2}{\binom{2N^+}{N^+}}$, but I haven't been able to simplify those sums with their factorials.
Does anybody know of, or can anybody link to, a proof of the formula for the variance $\text{Var}_{H_0}(R)$?
I $\infty$-ly thank you all!!!

Last edited: Jul 30, 2013
2. Aug 3, 2013

### Stephen Tashi

One thought is to let $Y_i = |X_i - X_{i+1}|$.

Only consecutive $Y$'s are dependent, so

$$Var( 1 + \sum_{i=1}^{N-1} Y_i ) = \sum_{i=1}^{N-1} Var(Y_i) + 2 \sum_{i=1}^{N-2} Cov(Y_i,Y_{i+1})$$

Then you have to find formulas for $Var(Y_i)$ and $Cov(Y_i,Y_{i+1})$. That might not be easy, but it at least focuses our attention on only three of the $X$'s at a time.
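For what it's worth, this route can be sanity-checked numerically (a Python sketch; $N=10$ and $p=0.3$ are illustrative values of my own choosing). The candidate ingredients are my own computations, not from the thread: $Y_i$ is Bernoulli with parameter $2u$ where $u=p(1-p)$, and $Y_iY_{i+1}=1$ exactly when $X_i=X_{i+2}\ne X_{i+1}$, an event of probability $u$, giving $\mathrm{Var}(Y_i)=2u(1-2u)$ and $\mathrm{Cov}(Y_i,Y_{i+1})=u-4u^2$. The code compares the resulting decomposition with an exact brute-force $\mathrm{Var}(R)$ and with the Wikipedia closed form.

```python
from itertools import product

def exact_var_runs(N, p):
    """Var(R) by full enumeration, with R = 1 + sum_i |X_i - X_{i+1}|
    and X_1, ..., X_N i.i.d. Bernoulli(p)."""
    m1 = m2 = 0.0
    for seq in product((0, 1), repeat=N):
        ones = sum(seq)
        prob = p ** ones * (1 - p) ** (N - ones)
        r = 1 + sum(a != b for a, b in zip(seq, seq[1:]))
        m1 += prob * r
        m2 += prob * r * r
    return m2 - m1 ** 2

N, p = 10, 0.3  # illustrative values only
u = p * (1 - p)

# Candidate ingredients (my own computation, to be checked below):
# Y_i ~ Bernoulli(2u); Y_i * Y_{i+1} = 1 iff X_i = X_{i+2} != X_{i+1},
# an event of probability u, so Cov(Y_i, Y_{i+1}) = u - (2u)**2.
var_y = 2 * u * (1 - 2 * u)
cov_y = u - 4 * u ** 2

decomposed = (N - 1) * var_y + 2 * (N - 2) * cov_y
wikipedia = (4 * N - 6) * u - (12 * N - 20) * u ** 2

print(exact_var_runs(N, p), decomposed, wikipedia)  # all three agree
```

If those ingredients are right, expanding $(N-1)\,2u(1-2u)+2(N-2)(u-4u^2)$ gives $(4N-6)u-(12N-20)u^2$, which is exactly the Italian-Wikipedia formula.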