# Hypothesis testing for pairs of means

• kingwinner
In summary: When computing the type II error, use a name other than Z for (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ], because it is no longer standard normal once μ_x - μ_y = 2 is assumed. Note that in practice you don't know μ_x, μ_y, σ_x, or σ_y, so there is no practical reason to ever compute this; it is only done in a theoretical exercise like this one. It is not certain whether the test is meant to be two-tailed; for a one-tailed test the answer has to be modified.

## Homework Statement

1a) The strength of concrete depends to some extent on the method used for drying it. Two different drying methods were tested independently on specimens. The strength using each of the methods follow a normal distribution with mean μ_x and μ_y respectively and the same variance. The results are:
method 1: n1=7, x bar=3250, s1=210
method 2: n2=10, y bar=3240, s2=190
Do the methods appear (use alpha=0.05) to produce concrete with different mean strength?

1b) Suppose σ_x=210, σ_y=190 (n1, n2, x bar, y bar same as part a). Find the probability of deciding that the methods are not different when the true difference in means is 2.

## Homework Equations

Hypothesis testing for pairs of means

## The Attempt at a Solution

I am OK with part a). Here the test is H_o: μ_x=μ_y vs. H_a: μ_x≠μ_y. I computed a p-value greater than 0.2, which exceeds alpha (0.05), so we fail to reject H_o and the answer is "no".

Now I have some troubles with part b)...
Here we have:
Ho: μ_x = μ_y
Ha: μ_x - μ_y = 2
I think in part b) we have to find P(type II error) = P(fail to reject H_o | μ_x-μ_y=2)

How can we translate "fail to reject H_o" into a mathematical statement for which we can compute the probability? How can we find the "rejection region" in this case??

Thanks a lot!

For (b), Ha is still μ_x≠μ_y as in (a), you still use s1 and s2 not sigma1 and sigma2, and the rejection region for (b) is still what you used in (a).

The additional assumptions in (b) are that you know reality (i.e. true sigmas and true diff of means), so that P(type II error) can actually be calculated.

But I think P(type II error) depends on a specific value in H_a so H_a should be a simple (not composite) hypothesis like Ha: μ_x - μ_y = 2 ?

Also, in part b) the population standard deviations are KNOWN, so I think the test statistic would be different from that in part a), where the population standard deviations are UNKNOWN? What would be the test statistic in part b)?

Thanks!

I still think my interpretation is correct. Also, the test statistic from (a) depends only on the difference μ_x - μ_y, not the individual values of μ_x and μ_y, so you can do (b) my way.

(b) is not asking for a different test. It is asking for the value of beta in question (a). This can't be found without knowing true sigma_x, true sigma_y, and true difference μ_x - μ_y, which is why those values were given.

On further thought, you may be more right than I thought. You may have to use a different statistic to calculate (b). The question is, what is the cutoff - a new one? or the same as (a)?

Write down your work from (a), especially the statistic you used.

For part a), I used the test statistic
t_stat = (X bar - Y bar) / [S_p sqrt(1/n1 + 1/n2)]
where S_p is the pooled estimator of the common variance.

S_p is based on the sample standard deviations, so it cannot be used in part b (we don't have the values of the sample standard deviations in part b, but instead the population standard deviations are given).
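For reference, the part (a) statistic described above can be sketched in a few lines of Python. This is only an illustration using the numbers from the problem statement; the function name `pooled_t` is made up here, and the critical value 2.131 for 15 degrees of freedom is taken from a standard t table rather than computed.

```python
import math

# Summary statistics from the problem statement (part a).
n1, xbar, s1 = 7, 3250.0, 210.0
n2, ybar, s2 = 10, 3240.0, 190.0


def pooled_t(n1, xbar, s1, n2, ybar, s2):
    """Return (t statistic, degrees of freedom) for the equal-variance two-sample t test."""
    df = n1 + n2 - 2
    # Pooled estimator of the common variance, S_p^2.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of Xbar - Ybar: S_p * sqrt(1/n1 + 1/n2).
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (xbar - ybar) / se, df


t_stat, df = pooled_t(n1, xbar, s1, n2, ybar, s2)

# Two-sided critical value t_{0.025, 15}, read from a t table (assumption: alpha = 0.05).
T_CRIT = 2.131
reject_h0 = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic comes out at roughly 0.10, far inside the acceptance region, which agrees with the "fail to reject" conclusion in part (a).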

OK, I think you are more right than I was. For (b), use

Z = numerator / denominator

where numerator = Xbar - Ybar - (μx - μy)

and denominator = √( σ²x / n1 - σ²y / n2 )

Billy Bob said:
OK, I think you are more right than I was. For (b), use

Z = numerator / denominator

where numerator = Xbar - Ybar - (μx - μy)

and denominator = √( σ²x / n1 - σ²y / n2 )

For the denominator, should it be a + (instead of -)?

Also, for part b, should we use a one-tailed test or two-tailed test? Would the alternative hypothesis still be H_a: μ_x≠μ_y?

Thanks!

For the denominator, should it be a + (instead of -)?

Yes, use + instead of - ! I'm not good at "non-TeX."

Also, for part b, should we use a one-tailed test or two-tailed test?

Two.

Would the alternative hypothesis still be H_a: μ_x≠μ_y?

Yes.

um...but how can we find beta for a TWO-tailed test?

P(type II error)
= P(fail to reject H_o | μ_x-μ_y=2)
= P( ? | μ_x-μ_y=2) ?

Thanks!

Maybe it is one-tailed. But can't you do the two-tailed case like this (?):

Numerator0 = Num0 = Xbar - Ybar - 0

Numerator1 = Num1 = Xbar - Ybar - 2

Denominator = D = √( σ²x / n1 + σ²y / n2 )

If H0 were true, you'd know P(-1.96<Num0/D<1.96)=0.95

But if true diff of means were 2, what is P(-1.96<Num0/D<1.96), realizing that Num1/D is std normal.

P(-1.96 < Num0/D < 1.96)
=P(-1.96D < Num0 < 1.96D)
=P(-1.96D-2 < Num1 < 1.96D-2)
=P(-1.96-2/D < Num1/D < 1.96-2/D)
=P(-1.96-2/D < Z < 1.96-2/D) then look it up in the table.

If you think it should be one tail, then modify accordingly. Write up both and turn them both in?

What do you think, does this work?
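A quick numerical check of the two-tailed calculation above, as a sketch only. It assumes alpha = 0.05 (so the cutoff is 1.96), the known sigmas from part (b), and a true difference of 2; the variable names are just for illustration. `statistics.NormalDist` from the Python standard library stands in for the normal table.

```python
import math
from statistics import NormalDist

# Values given in part (b).
n1, n2 = 7, 10
sigma_x, sigma_y = 210.0, 190.0
true_diff = 2.0   # assumed true value of mu_x - mu_y
z_crit = 1.96     # two-sided cutoff for alpha = 0.05

# D = sqrt(sigma_x^2 / n1 + sigma_y^2 / n2), the standard deviation of Xbar - Ybar.
D = math.sqrt(sigma_x**2 / n1 + sigma_y**2 / n2)

# beta = P(-1.96 - 2/D < Z < 1.96 - 2/D), with Z standard normal.
shift = true_diff / D
phi = NormalDist().cdf
beta = phi(z_crit - shift) - phi(-z_crit - shift)

print(f"D = {D:.3f}, 2/D = {shift:.4f}, beta = {beta:.4f}")
```

Because 2/D is only about 0.02, the interval barely moves and beta stays very close to 0.95: with these sample sizes the test has almost no power to detect a difference of 2.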

I think it has to be a TWO-tailed test, because we're testing for the difference.

Assuming alpha=0.05,

P(type II error)
= P(fail to reject H_o | μ_x-μ_y=2)
= P( -1.96<Z<1.96 | μ_x-μ_y=2)
But what is Z here?
Is it
Z = (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ]
or Z = (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] ? Which one?? I am a little confused...

But what is Z here?
Is it
Z = (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ]
or Z = (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] ?

It's

P( -1.96 < (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < 1.96 | μ_x-μ_y=2)

but I wouldn't use Z as the name of (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] at this point, because that quantity isn't standard normal once you assume μ_x-μ_y=2.

You now have to manipulate

P( -1.96 < (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < 1.96 | μ_x-μ_y=2)

into the form

P( thing < (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < other thing),

and since (Xbar - Ybar - 2) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] is standard normal when μ_x-μ_y=2, you can then finish off the problem.

But commonly, the test statistic is denoted by Z (which is also the notation for a standard normal random variable):
Z = (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ]
So this is why (due to abuse of notation) we need to be CAREFUL with the following, right?

P( -1.96<Z<1.96 | μ_x-μ_y=2)
= P( -1.96 < (Xbar - Ybar) / √[ (σ_x)^2 / n1 + (σ_y)^2 / n2 ] < 1.96 | μ_x-μ_y=2)
But in the -1.96<Z<1.96 part, Z is NOT standard normal in this case; it is just the test statistic.

Seems like you understand it. I prefer reserving the letter Z for an (at least approximately) N(0,1) random variable, exclusively.

When I write out a hypothesis test and call the test statistic Z, I am doing so because if H0 is true, this statistic is N(0,1).

When I calculate beta or power, I no longer use the letter Z to refer to that same statistic, because H0 is not true, so the statistic is not N(0,1).
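To see this point numerically, here is a small simulation sketch. It assumes the part (b) setup with a true difference of 2; since only the difference of the means matters, the individual value `mu_y = 3240` is a hypothetical choice made just to have something to simulate from, and the seed and number of trials are arbitrary.

```python
import math
import random

random.seed(0)  # arbitrary seed, only for reproducibility

n1, n2 = 7, 10
sigma_x, sigma_y = 210.0, 190.0
mu_y = 3240.0        # hypothetical value; only mu_x - mu_y matters
mu_x = mu_y + 2.0    # true difference of 2, as in part (b)

D = math.sqrt(sigma_x**2 / n1 + sigma_y**2 / n2)

trials = 100_000
inside = 0
total = 0.0
for _ in range(trials):
    # Sample means of normal data are themselves normal.
    xbar = random.gauss(mu_x, sigma_x / math.sqrt(n1))
    ybar = random.gauss(mu_y, sigma_y / math.sqrt(n2))
    stat = (xbar - ybar) / D   # the test statistic, centered as if H0 were true
    total += stat
    inside += -1.96 < stat < 1.96

mean_stat = total / trials     # should be near 2/D rather than 0
frac_inside = inside / trials  # Monte Carlo estimate of beta

print(f"mean of statistic = {mean_stat:.4f} (2/D = {2 / D:.4f})")
print(f"fraction inside (-1.96, 1.96) = {frac_inside:.4f}")
```

The statistic still has standard deviation 1, but its mean is 2/D instead of 0, so it is N(2/D, 1) and not N(0,1). That shift is exactly what the algebra with "Xbar - Ybar - 2" accounts for.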

## 1. What is hypothesis testing for pairs of means?

Hypothesis testing for pairs of means is a statistical method used to compare the means of two groups or samples. It involves stating a hypothesis about the difference between the population means and then using sample data to decide whether there is enough evidence to reject the null hypothesis; if there is not, we fail to reject it (which is not the same as accepting it).

## 2. How is hypothesis testing for pairs of means different from other types of hypothesis testing?

Hypothesis testing for pairs of means is specifically used when comparing the means of two groups or samples. Other types of hypothesis testing may involve comparing means of more than two groups, proportions, or variances.

## 3. What is the null hypothesis in hypothesis testing for pairs of means?

The null hypothesis in hypothesis testing for pairs of means is the assumption that there is no difference between the population means of the two groups being compared (for example, μ_x = μ_y). It is often denoted H0.

## 4. How is the significance level determined in hypothesis testing for pairs of means?

The significance level in hypothesis testing for pairs of means is typically set at 0.05 or 5%. This means that there is a 5% chance of rejecting the null hypothesis when it is actually true. The significance level can also be adjusted based on the specific research question and the desired level of confidence.

## 5. What is the purpose of conducting a power analysis in hypothesis testing for pairs of means?

A power analysis is used to determine the sample size needed to detect a given difference between the means of two groups. This helps to ensure that the study has enough statistical power to detect a true difference if it exists. Higher power means a lower chance of making a Type II error, that is, failing to reject the null hypothesis when it is actually false.
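As a rough illustration of such a calculation, the sketch below uses the usual normal-approximation formula for a two-sided, two-sample z test with known standard deviations and equal group sizes, n = (z_{α/2} + z_β)² (σ_x² + σ_y²) / δ². The function name `n_per_group`, the 80% target power, and the equal-group-size assumption are choices made for this example, not something taken from the thread.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma_x, sigma_y, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample z test.

    Assumes known standard deviations and equal group sizes, and ignores
    the negligible rejection probability in the far tail.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = z(power)            # about 0.84 for 80% power
    n = (z_alpha + z_beta) ** 2 * (sigma_x**2 + sigma_y**2) / delta**2
    return math.ceil(n)


# Numbers from the concrete-strength problem: a true difference of 2
# with sigma_x = 210 and sigma_y = 190.
n_needed = n_per_group(2.0, 210.0, 190.0)
print(f"specimens needed per method: {n_needed}")
```

With a difference of 2 against standard deviations near 200, the required sample size is on the order of 157,000 specimens per method, which puts the samples of 7 and 10 in the problem into perspective.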