MHB Binomial Distribution: Likelihood Ratio Test for Equality of Several Proportions

The discussion focuses on constructing a likelihood ratio test to compare voter proportions favoring candidate A across four political wards, using a significance level of α=0.05. The null hypothesis posits that the proportions are equal, while the alternative suggests at least one differs. The likelihood functions under both hypotheses are derived, but the resulting likelihood ratio, λ, appears to exceed 1, which contradicts expected properties. Participants discuss the need to take logarithms for simplification and express concerns about the sensitivity of the resulting test. The conversation emphasizes the importance of correctly deriving the likelihood ratio and its implications for statistical testing.
Ackbach
$\newcommand{\szdp}[1]{\!\left(#1\right)}
\newcommand{\szdb}[1]{\!\left[#1\right]}$
Problem Statement: A survey of voter sentiment was conducted in four midcity political wards to compare the fraction of voters favoring candidate $A.$ Random samples of $200$ voters were polled in each of the four wards. The numbers of voters favoring $A$ in the four samples can be regarded as four independent binomial random variables. Construct a likelihood ratio test of the hypothesis that the fractions of voters favoring candidate $A$ are the same in all four wards. Use $\alpha=0.05.$

Note 1: This is essentially Exercise 10.88 in Mathematical Statistics with Applications, 5th Ed., by Wackerly, Mendenhall, and Sheaffer.

Note 2: This is cross-posted here.

My Work So Far: Let $p_i$ be the proportion of voters favoring $A$ in Ward $i,$ let $y_i$ be the number of sampled voters favoring $A$ in Ward $i,$ and let $n=200$ be the common sample size. The null hypothesis is that $p_1=p_2=p_3=p_4,$ while the alternative hypothesis is that at least one proportion differs from the others. Each $y_i$ has the binomial probability function
$$f(y_i)=\binom{n}{y_i}p_i^{y_i}(1-p_i)^{n-{y_i}}.$$
It follows that the likelihood function is
$$L(p_1,p_2,p_3,p_4)
=\prod_{i=1}^4\szdb{\binom{n}{y_i}p_i^{y_i}(1-p_i)^{n-y_i}}.$$
Then we construct $L\big(\hat\Omega_0\big)$ and $L\big(\hat\Omega\big).$ Note that under the null hypothesis, we will set $p_1=p_2=p_3=p_4=p.$ Hence,
$$L\big(\hat\Omega_0\big)
=\prod_{i=1}^4\szdb{\binom{n}{y_i}p^{y_i}(1-p)^{n-y_i}}.$$
The one remaining parameter $p$ we will replace with its MLE, which we can confidently say is $\big(\sum y_i\big)/(4n).$ Hence
\begin{align*}
L\big(\hat\Omega_0\big)
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{\sum
y_i}{4n}}^{\!\!y_i}\szdp{1-\frac{\sum y_i}{4n}}^{\!\!n-y_i}}\\
&=\frac{1}{(4n)^n}\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\sum
y_i}^{\!y_i}\szdp{4n-\sum y_i}^{\!n-y_i}}.
\end{align*}
Next, we turn our attention to $L\big(\hat\Omega\big):$
\begin{align*}
L\big(\hat\Omega\big)
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{y_i}{n}}^{\!y_i}
\szdp{1-\frac{y_i}{n}}^{\!n-y_i}}\\
&=\prod_{i=1}^4\szdb{\binom{n}{y_i}\szdp{\frac{y_i}{n}}^{\!y_i}
\szdp{\frac{n-y_i}{n}}^{\!n-y_i}}\\
&=\frac{1}{n^{4n}}\prod_{i=1}^4\szdb{\binom{n}{y_i}y_i^{y_i}
\,\szdp{n-y_i}^{n-y_i}}.
\end{align*}
Next we form the likelihood ratio:
\begin{align*}
\lambda
&=\frac{L\big(\hat\Omega_0\big)}{L\big(\hat\Omega\big)}\\
&=\frac{\displaystyle
\frac{1}{(4n)^n}\prod_{i=1}^4\szdb{\binom{n}{y_i}
\szdp{\sum y_i}^{\!y_i}\szdp{4n-\sum y_i}^{\!n-y_i}}}
{\displaystyle
\frac{1}{n^{4n}}\prod_{i=1}^4\szdb{\binom{n}{y_i}y_i^{y_i}
\,\szdp{n-y_i}^{n-y_i}}}\\
&=\frac{n^{4n}}{4^n\,n^n}\cdot
\prod_{i=1}^4\szdb{\szdp{\frac{\sum y_j}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum y_j}{n-y_i}}^{\!n-y_i}}\\
&=\szdp{\frac{n^3}{4}}^{\!\!n}\cdot
\prod_{i=1}^4\szdb{\szdp{\frac{\sum y_j}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum y_j}{n-y_i}}^{\!n-y_i}}.
\end{align*}

My Questions:
1. This looks wrong to me, because I'm told (and it totally makes sense) that $0\le\lambda\le 1,$ whereas everything in sight is greater than $1.$
2. Supposing this expression can be salvaged, what are the next steps? Should I take logs and try to simplify somehow?
3. I'm expecting to be able to obtain a test something along the lines of
$$\frac{(1/(4n))\sum_{j=1}^4y_j-\sum_{j=1}^4(y_j/n)}{\displaystyle\sqrt{\sum_{j=1}^4\dfrac{(y_j/n)(1-y_j/n)}{n}}},$$
although this test doesn't strike me as sensitive enough. We could have $y_1/n$ much too low, and $y_4/n$ much too high, and this test could still mark them down as equal because they "average out" to the right thing. What's the right generalization to the standard difference of proportions test?
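On question 2, taking logs is the usual route. The sketch below computes $\ln\lambda$ as the difference of the two maximized log-likelihoods (the binomial coefficients cancel in the ratio, so they are omitted) and then forms $-2\ln\lambda.$ Under the null hypothesis and for large $n,$ the standard large-sample result is that $-2\ln\lambda$ is approximately chi-square with $4-1=3$ degrees of freedom, which is one way to use $\alpha=0.05.$ Unlike the statistic guessed at in question 3, every ward contributes its own nonnegative term, so a low ward and a high ward cannot cancel. The ward counts and all function names here are hypothetical, chosen only for illustration.

```python
import math


def xlogy(x, y):
    """Return x*log(y), using the convention 0*log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def log_lik(counts, n, probs):
    """Binomial log-likelihood without the log binomial coefficients.

    The coefficients are identical under both hypotheses, so they
    cancel when the likelihood ratio is formed.
    """
    return sum(xlogy(y, p) + xlogy(n - y, 1.0 - p)
               for y, p in zip(counts, probs))


def lrt_equal_proportions(counts, n):
    """Return (ln lambda, -2 ln lambda) for H0: all proportions equal."""
    k = len(counts)
    p_pooled = sum(counts) / (k * n)          # MLE under H0
    p_each = [y / n for y in counts]          # MLEs with no restriction
    log_lambda = (log_lik(counts, n, [p_pooled] * k)
                  - log_lik(counts, n, p_each))
    stat = max(0.0, -2.0 * log_lambda)        # guard against rounding below 0
    return log_lambda, stat


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 df.

    Uses the closed form available for 3 degrees of freedom; this
    helper is only valid for the four-ward case (df = 4 - 1).
    """
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


if __name__ == "__main__":
    n = 200
    counts = [76, 53, 59, 48]                 # hypothetical ward counts
    log_lambda, stat = lrt_equal_proportions(counts, n)
    p_value = chi2_sf_3df(stat)
    print(f"ln(lambda) = {log_lambda:.4f}")
    print(f"-2 ln(lambda) = {stat:.4f}, approximate p-value = {p_value:.4f}")
    print("reject H0 at alpha = 0.05" if p_value < 0.05
          else "do not reject H0 at alpha = 0.05")
```

Working on the log scale also avoids forming powers like $y_i^{y_i}$ directly, which would overflow for $n=200.$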
 
COOLSerdash on CV.SE was able to simplify the likelihood ratio in such a way that I could tell mine is incorrect. Must have messed up in the algebra somewhere.
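One candidate for the slip, offered as a sketch rather than a confirmed diagnosis: in $L\big(\hat\Omega_0\big),$ each of the four factors contributes $(4n)^{-y_i}(4n)^{-(n-y_i)}=(4n)^{-n},$ so the prefactor would be $(4n)^{-4n}$ rather than $(4n)^{-n}.$ With that change, and writing $\hat p=\big(\sum_j y_j\big)/(4n)$ and $\hat p_i=y_i/n$ (shorthand introduced here, not used in the post above), the ratio would become
\begin{align*}
\lambda
&=\frac{n^{4n}}{(4n)^{4n}}\cdot
\prod_{i=1}^4\szdb{\szdp{\frac{\sum_j y_j}{y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum_j y_j}{n-y_i}}^{\!n-y_i}}\\
&=\prod_{i=1}^4\szdb{\szdp{\frac{\sum_j y_j}{4y_i}}^{\!y_i}\,
\szdp{\frac{4n-\sum_j y_j}{4(n-y_i)}}^{\!n-y_i}}\\
&=\prod_{i=1}^4\szdb{\szdp{\frac{\hat p}{\hat p_i}}^{\!y_i}\,
\szdp{\frac{1-\hat p}{1-\hat p_i}}^{\!n-y_i}}.
\end{align*}
The second line uses $4^{-4n}=\prod_{i=1}^4 4^{-y_i}\,4^{-(n-y_i)}.$ In this form the individual factors are no longer all greater than $1,$ and $0\le\lambda\le 1$ holds because the $\hat p_i$ maximize the unrestricted likelihood, so the numerator can never exceed the denominator.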
 