Variance of statistic used in runs test

SUMMARY

The variance of the statistic R used in the Wald-Wolfowitz test under the null hypothesis of independence is defined as \text{Var}_{H_0}(R) = (4N-6)p(1-p) - (12N-20)p^2(1-p)^2. The probability mass function for R is given by P_{H_0}(R=2k) and P_{H_0}(R=2k+1), which involve binomial coefficients. The discussion highlights the difficulty in proving the variance formula and suggests focusing on the variance and covariance of the derived random variables Y_i = |X_i - X_{i+1}| to simplify the calculations.

PREREQUISITES
  • Understanding of the Wald-Wolfowitz test and its application in statistics.
  • Familiarity with Bernoulli random variables and their properties.
  • Knowledge of probability mass functions and binomial coefficients.
  • Basic concepts of variance and covariance in probability theory.
NEXT STEPS
  • Study the derivation of the variance formula for the Wald-Wolfowitz test.
  • Learn about covariance and its application in dependent random variables.
  • Explore advanced topics in probability theory, focusing on moment-generating functions.
  • Investigate statistical software tools for performing Wald-Wolfowitz tests and calculating variances.
USEFUL FOR

Statisticians, data scientists, and students of probability theory who are interested in understanding the Wald-Wolfowitz test and its statistical properties.

DavideGenoa
Hi, friends! Since this is my first post, let me introduce myself: I am an Italian with a strictly humanities-centered school background who is trying to teach himself mathematics and the natural sciences, and I am very tempted to enrol in a university science course.
I read in the Italian-language Wikipedia that the variance \text{Var}_{H_0}(R) of the statistic R used in the Wald-Wolfowitz runs test, under the null hypothesis that X_1, \ldots, X_N are independent, is

\text{Var}_{H_0}(R) = (4N-6)p(1-p) - (12N-20)p^2(1-p)^2.

It is worth noticing that, when discussing the test, it is common to give what I think are the approximate expectation and variance used for the Gaussian approximation of R, rather than the exact ones...
That statistic, as my book (S. M. Ross, Introduction to Probability and Statistics for Engineers and Scientists) explains, and as I find in the German-language Wikipedia too, has the probability mass function

P_{H_0}(R=2k) = 2\,\frac{\binom{N^+ -1}{k-1}\binom{N^- -1}{k-1}}{\binom{N^+ +N^-}{N^+}},\qquad P_{H_0}(R=2k+1) = \frac{\binom{N^+ -1}{k-1}\binom{N^- -1}{k}+\binom{N^+ -1}{k}\binom{N^- -1}{k-1}}{\binom{N^+ +N^-}{N^+}},

where N^+ and N^- count the observations of each of the two types.
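As a sanity check on this pmf, the probabilities should sum to 1 over all attainable run counts. A minimal sketch (function and variable names are my own; I assume the denominator is the number of arrangements \binom{N^+ + N^-}{N^+}, all equally likely under H_0):

```python
from math import comb

def runs_pmf(r, n_plus, n_minus):
    """P(R = r) for the number of runs R, given n_plus observations of one
    type and n_minus of the other, all arrangements equally likely."""
    total = comb(n_plus + n_minus, n_plus)
    if r % 2 == 0:                      # R = 2k: equal numbers of runs of each type
        k = r // 2
        return 2 * comb(n_plus - 1, k - 1) * comb(n_minus - 1, k - 1) / total
    else:                               # R = 2k + 1: one extra run of one type
        k = (r - 1) // 2
        return (comb(n_plus - 1, k - 1) * comb(n_minus - 1, k)
                + comb(n_plus - 1, k) * comb(n_minus - 1, k - 1)) / total

# Probabilities over all possible run counts (from 2 up) sum to 1:
n_plus, n_minus = 5, 7
print(sum(runs_pmf(r, n_plus, n_minus) for r in range(2, n_plus + n_minus + 1)))
```

(`math.comb` returns 0 when the lower index exceeds the upper one, so unattainable run counts contribute nothing to the sum.)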
The Italian-language Wikipedia considers the statistic R to be the same, under the null hypothesis of independence, as

R = 1 + \sum_{i=1}^{N-1} |X_i - X_{i+1}|,

where the X_i are Bernoulli random variables with expectation p. The expectation E_{H_0}[R] given in that Wikipedia is the same one discussed by a user here, who gives a short proof that E_{H_0}[R] = 1 + 2(N-1)p(1-p).
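For small N the expectation formula can be verified by brute force. A sketch assuming the X_i are i.i.d. Bernoulli(p), with names of my own choosing:

```python
from itertools import product

def exact_expected_runs(N, p):
    """E[R] with R = 1 + sum |X_i - X_{i+1}|, X_i iid Bernoulli(p),
    computed exactly by enumerating all 2^N binary sequences."""
    total = 0.0
    for xs in product((0, 1), repeat=N):
        prob = 1.0
        for x in xs:
            prob *= p if x == 1 else 1.0 - p       # sequence probability
        runs = 1 + sum(abs(a - b) for a, b in zip(xs, xs[1:]))
        total += prob * runs
    return total

N, p = 8, 0.3
print(exact_expected_runs(N, p), 1 + 2 * (N - 1) * p * (1 - p))  # should agree
```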

As to the variance, I have tried a lot, but my efforts to prove the formula by myself have been useless. I have tried to calculate the second moment by manipulating the sum

\sum_{k=1}^{\min\{N^+ ,N^-\}} \left( 4k^2\, P_{H_0}(R=2k) + (2k+1)^2\, P_{H_0}(R=2k+1) \right)

in the case N^+ \ne N^-, and by treating the case N^+ = N^- similarly, where I would say that the second moment is

E_{H_0}[R^2] = \sum_{k=1}^{N^+ -1} \left( 4k^2\, P_{H_0}(R=2k) + (2k+1)^2\, P_{H_0}(R=2k+1) \right) + \frac{2(N^+)^2}{\binom{2N^+}{N^+}},

but I haven't been able to simplify those sums with their factorials.
Does anybody know, or can anybody link to, a proof of the formula for the variance \text{Var}_{H_0}(R)?
I \infty-ly thank you all!
 
DavideGenoa said:
The Italian language Wikipedia considers the statistic R to be the same, under the null hypothesis of independence, as 1+\sum_{i=1}^{N-1}|X_i-X_{i+1}| where the X_i are Bernoulli random variables with expectation p


One thought is to let Y_i = |X_i - X_{i+1}|.

Only consecutive Y's fail to be independent (Y_i and Y_{i+1} both depend on X_{i+1}), so

Var( 1 + \sum_{i=1}^{N-1} Y_i ) = \sum_{i=1}^{N-1} Var(Y_i) + 2 \sum_{i=1}^{N-2} Cov(Y_i,Y_{i+1})

Then you have to find formulas for Var(Y_i) and Cov(Y_i, Y_{i+1}). That might not be easy, but it at least focuses our attention on only three of the X's at a time.
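The suggestion above can be checked numerically. Under the stated model (X_i i.i.d. Bernoulli(p)), Y_i is Bernoulli with q = P(Y_i = 1) = 2p(1-p), so Var(Y_i) = q(1-q); and Y_i Y_{i+1} = 1 exactly for the patterns 010 and 101, whose total probability is p(1-p), so Cov(Y_i, Y_{i+1}) = p(1-p) - q^2. A sketch (names are mine) that assembles these pieces and compares the result with the closed form and with exact enumeration:

```python
from itertools import product

def exact_var_runs(N, p):
    """Var(R) for R = 1 + sum |X_i - X_{i+1}|, computed exactly by
    enumerating all 2^N sequences of iid Bernoulli(p) variables."""
    m1 = m2 = 0.0
    for xs in product((0, 1), repeat=N):
        prob = 1.0
        for x in xs:
            prob *= p if x == 1 else 1.0 - p
        r = 1 + sum(abs(a - b) for a, b in zip(xs, xs[1:]))
        m1 += prob * r
        m2 += prob * r * r
    return m2 - m1 * m1

def closed_form_var(N, p):
    """The claimed formula: (4N-6)p(1-p) - (12N-20)p^2(1-p)^2."""
    u = p * (1 - p)
    return (4 * N - 6) * u - (12 * N - 20) * u * u

N, p = 9, 0.4
q = 2 * p * (1 - p)                 # P(Y_i = 1): X_i and X_{i+1} differ
var_y = q * (1 - q)                 # Var(Y_i), a Bernoulli variance
cov_y = p * (1 - p) - q * q         # E[Y_i Y_{i+1}] = p(1-p): patterns 010, 101
assembled = (N - 1) * var_y + 2 * (N - 2) * cov_y

print(assembled, closed_form_var(N, p), exact_var_runs(N, p))  # all three agree
```

Expanding (N-1)q(1-q) + 2(N-2)(p(1-p) - q^2) with u = p(1-p) indeed gives (4N-6)u - (12N-20)u^2, so this decomposition proves the Wikipedia formula.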
 
