Binomial Distribution: Likelihood Ratio Test for Equality of Several Proportions

In summary: The final ratio is $$\lambda=\frac{\prod_{i=1}^4\left(y_i/(n-y_i)\right)^{y_i}}{\prod_{i=1}^4\left(y_i/(n-y_i)\right)^{n-y_i}}.$$ A likelihood ratio test is constructed to compare the fractions of voters favoring candidate $A$ in four midcity political wards. The test is based on the null hypothesis that the proportions are the same in all four wards and the alternative hypothesis that at least one proportion is different. The likelihood function is used to calculate the likelihood ratio, which is then simplified.
  • #1
Ackbach
$\newcommand{\szdp}[1]{\!\left(#1\right)}
\newcommand{\szdb}[1]{\!\left[#1\right]}$
Problem Statement: A survey of voter sentiment was conducted in four midcity political wards to compare the fraction of voters favoring candidate $A.$ Random samples of $200$ voters were polled in each of the four wards. The numbers of voters favoring $A$ in the four samples can be regarded as four independent binomial random variables. Construct a likelihood ratio test of the hypothesis that the fractions of voters favoring candidate $A$ are the same in all four wards. Use $\alpha=0.05.$

Note 1: This is essentially Exercise 10.88 in Mathematical Statistics with Applications, 5th Ed., by Wackerly, Mendenhall, and Sheaffer.

Note 2: This is cross-posted here.

My Work So Far: Let $p_i$ be the proportion of voters favoring $A$ in Ward $i.$ So the null hypothesis is that $p_1=p_2=p_3=p_4,$ while the alternative hypothesis is that at least one proportion is different from the others. We have $f$ as the underlying distribution:
$$f(y_i)=\binom{n}{y_i}p_i^{y_i}(1-p_i)^{n-{y_i}}.$$
It follows that the likelihood function is
$$L(p_1,p_2,p_3,p_4)
=\prod_{i=1}^4\szdb{\binom{n}{y_i}p_i^{y_i}(1-p_i)^{n-y_i}}.$$
Then we construct $L\big(\hat\Omega_0\big)$ and $L\big(\hat\Omega\big).$ Note that under the null hypothesis, we will set $p_1=p_2=p_3=p_4=p.$ Hence,
$$L\big(\hat\Omega_0\big)
=\prod_{i=1}^4\szdb{\binom{n}{y_i}p^{y_i}(1-p)^{n-y_i}}.$$
The one remaining parameter $p$ we will replace with its MLE, which we can confidently say is $\big(\sum y_i\big)/(4n).$ Hence
\begin{align*}
L\big(\hat\Omega_0\big)
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{\sum
y_i}{4n}}^{\!\!y_i}\szdp{1-\frac{\sum y_i}{4n}}^{\!\!n-y_i}}\\
&=\frac{1}{(4n)^n}\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\sum
y_i}^{\!y_i}\szdp{4n-\sum y_i}^{\!n-y_i}}.
\end{align*}
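As a quick check of the MLE used above (a sketch, writing $S=\sum_{j=1}^4 y_j$ as shorthand that is not in the original post): under the null hypothesis the log-likelihood is
$$\ln L(p)=\sum_{i=1}^4\ln\binom{n}{y_i}+S\ln p+(4n-S)\ln(1-p),$$
and setting its derivative with respect to $p$ to zero gives
$$\frac{S}{p}-\frac{4n-S}{1-p}=0\quad\Longrightarrow\quad\hat p=\frac{S}{4n}.$$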
Next, we turn our attention to $L\big(\hat\Omega\big):$
\begin{align*}
L\big(\hat\Omega\big)
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{y_i}{n}}^{\!y_i}
\szdp{1-\frac{y_i}{n}}^{\!n-y_i}}\\
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{y_i}{n}}^{\!y_i}
\szdp{\frac{n-y_i}{n}}^{\!n-y_i}}\\
&=\frac{1}{n^{4n}}\prod_{i=1}^4\szdb{\binom{n}{y_i}y_i^{y_i}
\,\szdp{n-y_i}^{n-y_i}}.
\end{align*}
Next we form the likelihood ratio:
\begin{align*}
\lambda
&=\frac{L\big(\hat\Omega_0\big)}{L\big(\hat\Omega\big)}\\
&=\frac{\displaystyle
\frac{1}{(4n)^n}\prod_{i=1}^4\szdb{\binom{n}{y_i}
\szdp{\sum y_i}^{\!y_i}\szdp{4n-\sum y_i}^{\!n-y_i}}}
{\displaystyle
\frac{1}{n^{4n}}\prod_{i=1}^4\szdb{\binom{n}{y_i}y_i^{y_i}
\,\szdp{n-y_i}^{n-y_i}}}\\
&=\frac{n^{4n}}{4^n\,n^n}\cdot
\prod_{i=1}^4\szdb{\szdp{\frac{\sum y_j}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum y_j}{n-y_i}}^{\!n-y_i}}\\
&=\szdp{\frac{n^3}{4}}^{\!\!n}\cdot
\prod_{i=1}^4\szdb{\szdp{\frac{\sum y_j}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum y_j}{n-y_i}}^{\!n-y_i}}.
\end{align*}
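A numerical sanity check can help here. The sketch below compares the closed form above against $\ln\lambda$ computed directly from the two maximized likelihoods. The counts in `y` are hypothetical (the survey data are not given in this post), the function names are made up for illustration, and the code assumes $0<y_i<n$ for every ward.

```python
import math

def log_lambda_direct(y, n):
    """ln(lambda) straight from the two maximized likelihoods."""
    k = len(y)
    p0 = sum(y) / (k * n)  # pooled MLE under the null hypothesis
    log_l0 = sum(yi * math.log(p0) + (n - yi) * math.log(1 - p0) for yi in y)
    log_l1 = sum(yi * math.log(yi / n) + (n - yi) * math.log(1 - yi / n) for yi in y)
    return log_l0 - log_l1  # the binomial coefficients cancel in the ratio

def log_lambda_closed_form(y, n):
    """ln of the closed form above, (n^3/4)^n times the product, for 4 wards."""
    s = sum(y)
    log_prefactor = n * math.log(n**3 / 4)
    log_product = sum(
        yi * math.log(s / yi) + (n - yi) * math.log((4 * n - s) / (n - yi))
        for yi in y
    )
    return log_prefactor + log_product

y = [80, 60, 70, 50]  # hypothetical counts, NOT the survey data
n = 200

direct = log_lambda_direct(y, n)
closed = log_lambda_closed_form(y, n)
print(f"direct ln(lambda)      = {direct:.4f}")
print(f"closed-form ln(lambda) = {closed:.4f}")
print(f"gap                    = {closed - direct:.4f}")
print(f"3 n ln(4n)             = {3 * n * math.log(4 * n):.4f}")
```

On these made-up counts the direct value is negative (so $\lambda\le 1$, as it should be), the closed form is large and positive, and the gap between them equals $3n\ln(4n)$, which points at a power of $4n$ going missing in one of the prefactors.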

My Questions:
1. This looks wrong to me, because I'm told (and it totally makes sense) that $0\le\lambda\le 1,$ whereas everything in sight is greater than $1.$
2. Supposing this expression can be salvaged, what are the next steps? Should I take logs and try to simplify somehow?
3. I'm expecting to be able to obtain a test something along the lines of
$$\frac{(1/(4n))\sum_{j=1}^4y_j-\sum_{j=1}^4(y_j/n)}{\displaystyle\sqrt{\sum_{j=1}^4\dfrac{(y_j/n)(1-y_j/n)}{n}}},$$
although this test doesn't strike me as sensitive enough. We could have $y_1/n$ much too low, and $y_4/n$ much too high, and this test could still mark them down as equal because they "average out" to the right thing. What's the right generalization to the standard difference of proportions test?
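On questions 2 and 3, one standard route (a sketch, not the worked answer from this thread) is the large-sample result that $-2\ln\lambda$ is approximately chi-square under the null hypothesis, with degrees of freedom equal to the number of parameters the null fixes, here $4-1=3$. At $\alpha=0.05$ the rejection region is then $-2\ln\lambda>7.815.$ The Pearson chi-square test of homogeneity is the usual generalization of the two-sample difference-of-proportions test, and because it sums squared deviations it cannot "average out." The counts below are hypothetical and the function names are made up for illustration.

```python
import math

CHI2_CRIT_3DF_05 = 7.815  # upper 5% point of chi-square, 3 df (standard table value)

def xlogy(x, y):
    """Return x*log(y), using the convention 0*log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def neg2_log_lambda(y, n):
    """-2 ln(lambda) for H0: all binomial(n, p_i) proportions are equal."""
    k = len(y)
    p0 = sum(y) / (k * n)  # pooled MLE under H0
    log_l0 = sum(xlogy(yi, p0) + xlogy(n - yi, 1 - p0) for yi in y)
    log_l1 = sum(xlogy(yi, yi / n) + xlogy(n - yi, 1 - yi / n) for yi in y)
    return -2.0 * (log_l0 - log_l1)

def pearson_chi2(y, n):
    """Pearson chi-square statistic for homogeneity (requires 0 < pooled p < 1)."""
    k = len(y)
    p0 = sum(y) / (k * n)
    return sum((yi - n * p0) ** 2 for yi in y) / (n * p0 * (1 - p0))

y = [80, 60, 70, 50]  # hypothetical counts, NOT the survey data
n = 200

g = neg2_log_lambda(y, n)
x2 = pearson_chi2(y, n)
print(f"-2 ln(lambda) = {g:.3f}, Pearson X^2 = {x2:.3f}")
print("reject H0" if g > CHI2_CRIT_3DF_05 else "fail to reject H0")
```

With samples of $200$ per ward the two statistics are typically close to each other, so either can be compared with the same critical value.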
 
  • #2
COOLSerdash on CV.SE was able to simplify the likelihood ratio in such a way that I could tell mine is incorrect. Must have messed up in the algebra somewhere.
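One place where the algebra in post #1 appears to slip (offered as a sketch, not as a reproduction of COOLSerdash's simplification): each of the four factors in $L\big(\hat\Omega_0\big)$ contributes $(4n)^{-y_i}(4n)^{-(n-y_i)}=(4n)^{-n},$ so the prefactor should be $(4n)^{-4n}$ rather than $(4n)^{-n}.$ Writing $S=\sum_{j=1}^4y_j,$ $\hat p=S/(4n),$ and $\hat p_i=y_i/n$ (shorthand introduced here), the ratio becomes
\begin{align*}
\lambda
&=\frac{n^{4n}}{(4n)^{4n}}\prod_{i=1}^4\szdb{\szdp{\frac{S}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-S}{n-y_i}}^{\!n-y_i}}\\
&=\prod_{i=1}^4\szdb{\szdp{\frac{\hat p}{\hat p_i}}^{\!y_i}\,
\szdp{\frac{1-\hat p}{1-\hat p_i}}^{\!n-y_i}},
\end{align*}
which satisfies $0\le\lambda\le 1$ because the $\hat p_i$ maximize the unrestricted likelihood.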
 