# Hypothesis testing for pairs of means

kingwinner

## Homework Statement

1a) The strength of concrete depends to some extent on the method used for drying it. Two different drying methods were tested independently on specimens. The strength using each of the methods follows a normal distribution, with means μ_x and μ_y respectively and the same variance. The results are:
method 1: n1=7, x bar=3250, s1=210
method 2: n2=10, y bar=3240, s2=190
Do the methods appear (use alpha=0.05) to produce concrete with different mean strength?

1b) Suppose σ_x=210, σ_y=190 (n1, n2, x bar, y bar same as part a). Find the probability of deciding that the methods are not different when the true difference in means is 2.

## Homework Equations

Hypothesis testing for pairs of means

## The Attempt at a Solution

I am OK with part a). Here the test is H_o: μ_x=μ_y vs. H_a: μ_x≠μ_y. I computed a p-value greater than 0.2, which exceeds alpha (0.05), so we fail to reject H_o and the answer is "no".

Now I have some trouble with part b)...
Here we have:
Ho: μ_x = μ_y
Ha: μ_x - μ_y = 2
I think in part b) we have to find P(type II error) = P(fail to reject H_o | μ_x-μ_y=2)

How can we translate "fail to reject H_o" into a mathematical statement for which we can compute the probability? How can we find the "rejection region" in this case??

Thanks a lot!

Billy Bob
For (b), Ha is still μ_x≠μ_y as in (a), you still use s1 and s2 not sigma1 and sigma2, and the rejection region for (b) is still what you used in (a).

The additional assumptions in (b) are that you know reality (i.e. true sigmas and true diff of means), so that P(type II error) can actually be calculated.

kingwinner
But I think P(type II error) depends on a specific value in H_a so H_a should be a simple (not composite) hypothesis like Ha: μ_x - μ_y = 2 ?

Also, in part b) the population standard deviations are KNOWN, so I think the test statistic would be different from that in part a), in which the population standard deviations are UNKNOWN? What would be the test statistic in part b)?

Thanks!

Billy Bob
I still think my interpretation is correct. Also, the test statistic from (a) depends only on the difference μ_x - μ_y, not the individual values of μ_x and μ_y, so you can do (b) my way.

(b) is not asking for a different test. It is asking for the value of beta in question (a). This can't be found without knowing true sigma_x, true sigma_y, and true difference μ_x - μ_y, which is why those values were given.

Billy Bob
On further thought, you may be more right than I thought. You may have to use a different statistic to calculate (b). The question is, what is the cutoff - a new one? or the same as (a)?

Write down your work from (a), especially the statistic you used.

kingwinner
For part a), I used the test statistic
t_stat = (X bar - Y bar) / [S_p sqrt(1/n1 + 1/n2)]
where S_p is the pooled estimator of the common variance.

S_p is based on the sample standard deviations, so it cannot be used in part b (we don't have the values of the sample standard deviations in part b, but instead the population standard deviations are given).
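
As a quick numerical check of part (a), here is a minimal Python sketch of the pooled t statistic described above. It assumes the equal-variance setup from the problem statement; the critical value 2.131 is the two-sided t-table value for 15 degrees of freedom at alpha = 0.05, and the variable names are illustrative only.

```python
import math

# Summary statistics from the problem statement.
n1, xbar, s1 = 7, 3250.0, 210.0
n2, ybar, s2 = 10, 3240.0, 190.0

# Pooled variance estimate with n1 + n2 - 2 degrees of freedom.
df = n1 + n2 - 2
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
sp = math.sqrt(sp2)

# Pooled two-sample t statistic for H_o: mu_x = mu_y.
t_stat = (xbar - ybar) / (sp * math.sqrt(1 / n1 + 1 / n2))

# Two-sided critical value t_{0.025, 15}, taken from a t-table (assumed input).
t_crit = 2.131

reject_h0 = abs(t_stat) > t_crit
print(f"S_p = {sp:.2f}, t = {t_stat:.4f}, df = {df}, reject H_o: {reject_h0}")
```

The statistic comes out far inside the acceptance region, which agrees with the "fail to reject" conclusion in part (a).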

Billy Bob
OK, I think you are more right than I was. For (b), use

Z = numerator / denominator

where numerator = Xbar - Ybar - (μx - μy)

and denominator = √( σ²x / n1 - σ²y / n2 )

kingwinner
> OK, I think you are more right than I was. For (b), use
>
> Z = numerator / denominator
>
> where numerator = Xbar - Ybar - (μx - μy)
>
> and denominator = √( σ²x / n1 - σ²y / n2 )

For the denominator, should it be a + (instead of -)?

Also, for part b, should we use a one-tailed test or two-tailed test? Would the alternative hypothesis still be H_a: μ_x≠μ_y?

Thanks!

Billy Bob
> For the denominator, should it be a + (instead of -)?

Yes, use + instead of - !! I'm not good at "non-TeX."

> Also, for part b, should we use a one-tailed test or two-tailed test?

Two.

> Would the alternative hypothesis still be H_a: μ_x≠μ_y?

Yes.

kingwinner
um...but how can we find beta for a TWO-tailed test?

P(type II error)
= P(fail to reject H_o | μ_x-μ_y=2)
= P( ??? | μ_x-μ_y=2) ?

Thanks!

Billy Bob
Maybe it is one-tailed. But can't you do two-tailed like this?

Numerator0 = Num0 = Xbar - Ybar - 0

Numerator1 = Num1 = Xbar - Ybar - 2

Denominator = D = √( σ²x / n1 + σ²y / n2 )

If H0 were true, you'd know P(-1.96<Num0/D<1.96)=0.95

But if the true difference of means were 2, what is P(-1.96<Num0/D<1.96)? Note that in that case it is Num1/D, not Num0/D, that is standard normal.

P(-1.96 < Num0/D < 1.96)
=P(-1.96D < Num0 < 1.96D)
=P(-1.96D-2 < Num1 < 1.96D-2)
=P(-1.96-2/D < Num1/D < 1.96-2/D)
=P(-1.96-2/D < Z < 1.96-2/D) then look it up in the table.

If you think it should be one tail, then modify accordingly. Write up both and turn them both in?

What do you think, does this work?
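
One way to check the algebra above numerically is a short Python sketch. It assumes alpha = 0.05, a two-tailed test, and the known sigmas from part (b), and it uses the standard library's `NormalDist` in place of a table. Names such as `shift` and `beta` are illustrative only.

```python
import math
from statistics import NormalDist

n1, n2 = 7, 10
sigma_x, sigma_y = 210.0, 190.0
delta = 2.0    # assumed true difference mu_x - mu_y
alpha = 0.05

std_normal = NormalDist()                    # N(0, 1)
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96

# D is the standard deviation of Xbar - Ybar when the sigmas are known.
D = math.sqrt(sigma_x**2 / n1 + sigma_y**2 / n2)
shift = delta / D

# beta = P(-z_crit - delta/D < Z < z_crit - delta/D), with Z standard normal.
beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
power = 1 - beta

print(f"D = {D:.3f}, 2/D = {shift:.4f}, beta = {beta:.4f}, power = {power:.4f}")
```

Because a true difference of 2 is tiny compared with D (roughly 100), beta comes out only slightly below 0.95: the test almost never detects so small a difference.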

kingwinner
I think it has to be a TWO-tailed test, because we're testing for the difference.

Assuming alpha=0.05,

P(type II error)
= P(fail to reject H_o | μ_x-μ_y=2)
= P( -1.96<Z<1.96 | μ_x-μ_y=2)
But what is Z here?
Is it
Z = (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ]
or Z = (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] ??? Which one?? I am a little confused...

Billy Bob
> But what is Z here?
> Is it
> Z = (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ]
> or Z = (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] ???

It's

P( -1.96 < (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < 1.96 | μ_x-μ_y=2)

but I wouldn't use Z as the name of (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] at this point, because (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] isn't standard normal once you assume μ_x-μ_y=2.

You now have to manipulate

P( -1.96 < (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < 1.96 | μ_x-μ_y=2)

into the form

P( thing < (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < other thing),

and since (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] is standard normal, you can then finish off the problem.

kingwinner
But commonly, the Z statistic is denoted by Z (which is also the notation for a standard normal random variable):
Z = (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ]
So this is why (due to abuse of notation) we need to be CAREFUL with the following, right?

P( -1.96<Z<1.96 | μ_x-μ_y=2)
= P( -1.96 < (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < 1.96 | μ_x-μ_y=2)
But for the -1.96<Z<1.96 part, the Z is NOT standard normal in this case; it is the Z statistic.

Billy Bob
Seems like you understand it. I prefer reserving the letter Z for an (at least approximately) N(0,1) random variable, exclusively.

When I write out a hypothesis test and call the test statistic Z, I am doing so because if H0 is true, this statistic is N(0,1).

When I calculate beta or power, I no longer use the letter Z to refer to that same statistic, because H0 is not true, so the statistic is not N(0,1).
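
The point about the statistic no longer being N(0,1) can be illustrated with a rough Monte Carlo sketch. Everything here is an assumption for illustration: the individual means 3242 and 3240 are hypothetical (only their difference of 2 matters), the seed and number of simulations are arbitrary, and the sample means are drawn directly from their normal sampling distributions.

```python
import math
import random

random.seed(0)  # arbitrary seed, only for reproducibility

n1, n2 = 7, 10
sigma_x, sigma_y = 210.0, 190.0
mu_x, mu_y = 3242.0, 3240.0   # hypothetical means with mu_x - mu_y = 2
D = math.sqrt(sigma_x**2 / n1 + sigma_y**2 / n2)

n_sims = 50_000
stats = []
for _ in range(n_sims):
    # Xbar ~ N(mu_x, sigma_x^2 / n1) and Ybar ~ N(mu_y, sigma_y^2 / n2).
    xbar = random.gauss(mu_x, sigma_x / math.sqrt(n1))
    ybar = random.gauss(mu_y, sigma_y / math.sqrt(n2))
    # Test statistic computed as if H_o (difference 0) were true.
    stats.append((xbar - ybar) / D)

# Under mu_x - mu_y = 2 the statistic is N(2/D, 1), not N(0, 1).
mean_stat = sum(stats) / n_sims
# Fraction of simulated statistics in the acceptance region estimates beta.
beta_hat = sum(-1.96 < s < 1.96 for s in stats) / n_sims

print(f"2/D = {2 / D:.4f}, simulated mean = {mean_stat:.4f}, beta estimate = {beta_hat:.4f}")
```

The simulated mean of the statistic should sit near 2/D rather than 0, and the fraction falling in the acceptance region gives a simulation estimate of beta.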