Undergrad Hypothesis Testing: Comparing Gaussian Distributions

The discussion focuses on using hypothesis testing to compare two Gaussian distributions, specifically testing if the mean of the data is greater than zero. The maximum likelihood ratio is proposed as the test statistic, but there are concerns about the validity of the alternative hypothesis when the mean is very close to zero. It is emphasized that a hypothesis must be specific enough to compute the probability of observed data, and a Bayesian approach may be necessary for treating the mean as a random variable. Frequentist methods, such as power curves, can help analyze the effectiveness of the test, but they do not constitute a hypothesis test on their own. Overall, a one-tailed t-test is suggested as a suitable method for this scenario.
Gaussian97
TL;DR: How to formalize the hypothesis of having a mean bigger than some value.
Hi, I have some set of data and I want to use Hypothesis Testing to discriminate between two hypotheses:
H0: My data follows a Gaussian distribution with a given mean and a given std (the actual values are ugly, so let's say mean = 0 and std = 1).
H1: My data follows a Gaussian distribution with mean > 0 and std = 1 (the same as before).

So, I want to use the maximum likelihood ratio to define my test statistic as
$$t(\vec{x})=\frac{f(\vec{x}|H_1)}{f(\vec{x}|H_0)}$$
So, for ##H_0## it's clear that ##f(x|H_0)=N(0,1)##, but how do I find the expression for ##f(x|H_1)##?

Would it be valid to compute the sample mean and, since it's actually bigger than 0, use $$f(x|H_1)=N(\bar{x},1)$$?

Thanks.
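
A minimal sketch of what the plug-in proposal above amounts to, assuming i.i.d. data with known std = 1 and taking the mean under ##H_1## to be ##\hat{\mu}=\max(\bar{x},0)##. The function names and the sample values are hypothetical, chosen only for illustration. Under these assumptions the log of the ratio reduces to ##n\bar{x}^2/2## when ##\bar{x}>0##, so the statistic is an increasing function of the sample mean.

```python
import math

def log_likelihood(xs, mu, sigma=1.0):
    """Log-likelihood of i.i.d. N(mu, sigma^2) observations."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - ss / (2 * sigma ** 2)

def log_lr_plugin(xs):
    """log[f(x | mu = mu_hat) / f(x | mu = 0)] with mu_hat = max(sample mean, 0)."""
    xbar = sum(xs) / len(xs)
    mu_hat = max(xbar, 0.0)  # maximizer of the likelihood over mu >= 0 (assumption: std = 1)
    return log_likelihood(xs, mu_hat) - log_likelihood(xs, 0.0)

# Hypothetical sample, not taken from the thread.
sample = [0.3, 1.2, -0.4, 0.9, 0.5]
n = len(sample)
xbar = sum(sample) / n

# Closed form of the same quantity: n * xbar^2 / 2 when xbar > 0, else 0.
closed_form = n * xbar ** 2 / 2 if xbar > 0 else 0.0

print(log_lr_plugin(sample), closed_form)
```

Because the ratio depends on the data only through ##\bar{x}##, thresholding it is the same as thresholding the sample mean.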
 
The two cases that you describe cannot have a useful test. The cases of mean = 0 versus mean = 0.0000000000001 will not be distinguishable without billions of samples. You must start with an alternative hypothesis where a reasonable sample has a chance of being convincing. It is not usually necessary to determine the distribution associated with the alternative hypothesis. The null hypothesis gives you the distribution that you will use, and a sample that is out of line with that distribution allows you to convincingly state that the alternative hypothesis is the better choice.

CORRECTION: In your stated cases, if the sample mean is large enough, you can reject the null hypothesis.
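
As a rough sketch of that rejection rule, assuming i.i.d. data and a known std of 1 (which makes this a one-sided z-test rather than a t-test), the p-value can be computed from the null distribution alone. The function name and the sample are hypothetical.

```python
import math

def one_sided_z_test(xs, mu0=0.0, sigma=1.0):
    """Return (z, p_value) for H0: mean = mu0 against the one-sided alternative mean > mu0."""
    n = len(xs)
    xbar = sum(xs) / n
    # Under H0 the sample mean is N(mu0, sigma^2 / n), so z is standard normal.
    z = (xbar - mu0) * math.sqrt(n) / sigma
    # Upper-tail probability P(Z >= z) written with the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical sample, not taken from the thread.
sample = [0.3, 1.2, -0.4, 0.9, 0.5]
z, p = one_sided_z_test(sample)
print(f"z = {z:.3f}, one-sided p-value = {p:.3f}")
```

The null hypothesis is rejected at level ##\alpha## when the p-value falls below ##\alpha##. No distribution under ##H_1## is needed for this step.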
 
Gaussian97 said:
Summary: How to formalize the hypothesis of having a mean bigger than some value.

A hypothesis for a hypothesis test must be specific enough to allow computing the probability of the observed data using the assumption that the hypothesis is correct. A statement like "The mean of the distribution is greater than 1" is not specific enough. So you must add additional assumptions.

A Bayesian approach is to treat the mean as a random variable that has a "prior" probability distribution. You must assume a particular distribution for the mean. (This is like saying that Nature picked the value of the mean in some random way when she created the population.)
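
To make this concrete, here is a sketch under one particular and entirely assumed prior: a half-normal distribution with scale `tau` on the mean, restricted to positive values. With that choice ##f(\vec{x}|H_1)## is the likelihood averaged over the prior, and its ratio to ##f(\vec{x}|H_0)## (a Bayes factor) has a closed form. The prior, the function names, and the sample are illustrative assumptions, not something proposed in the thread.

```python
import math
from statistics import NormalDist

def bayes_factor_half_normal(xs, tau=1.0):
    """f(x | H1) / f(x | H0) for unit-variance Gaussian data.

    H0: mean = 0.  H1: mean > 0 with an assumed half-normal(tau) prior.
    """
    n = len(xs)
    b = sum(xs)                # n * sample mean
    a = n + 1.0 / tau ** 2     # precision after combining likelihood and prior
    return (2.0 / (tau * math.sqrt(a))) * math.exp(b * b / (2 * a)) * NormalDist().cdf(b / math.sqrt(a))

def bayes_factor_numeric(xs, tau=1.0, upper=20.0, steps=20_000):
    """Same quantity by trapezoidal integration over the mean, as a cross-check."""
    n = len(xs)
    b = sum(xs)
    h = upper / steps

    def integrand(mu):
        prior = 2.0 / (tau * math.sqrt(2 * math.pi)) * math.exp(-mu * mu / (2 * tau * tau))
        # Likelihood ratio f(x | mu) / f(x | 0) for unit-variance data.
        return math.exp(b * mu - n * mu * mu / 2) * prior

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, steps))
    return total * h

# Hypothetical sample, not taken from the thread.
sample = [0.3, 1.2, -0.4, 0.9, 0.5]
print(bayes_factor_half_normal(sample), bayes_factor_numeric(sample))
```

A different prior would give a different number, which is exactly the extra assumption the Bayesian approach requires.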

If you wish to stay with the "frequentist" outlook that the mean has a "fixed but unknown value" (i.e. it is not a random variable) then you can't compute the probability of the data only knowing that the mean is greater than 1. So no hypothesis test is possible.

A frequentist might resort to looking at "power curves". These curves are used to analyze the "power" of a statistical test. In your example, the hypothesis that the mean and standard deviation have given values is specific enough to do a "one tailed" hypothesis test using those values as the null hypothesis. You can look at power curves for that test. However, making a decision based on the behavior of power curves is not a hypothesis test.
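
A power curve for the one-tailed test can be sketched as follows, assuming a z-test with known std = 1, a significance level of 0.05, and a sample size of 25 (all three are illustrative choices). The power at a true mean ##\mu## is the probability that the test rejects the null when the data really come from ##N(\mu,1)##.

```python
import math
from statistics import NormalDist

def power_one_sided_z(mu, n, alpha=0.05, sigma=1.0):
    """Probability that the one-sided z-test of H0: mean = 0 rejects when the true mean is mu."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold for the z statistic
    # Under a true mean mu the z statistic is N(mu * sqrt(n) / sigma, 1).
    return 1 - std_normal.cdf(z_crit - mu * math.sqrt(n) / sigma)

# Evaluate the curve at a few hypothetical true means, with n = 25 observations.
for mu in (0.0, 1e-13, 0.1, 0.5, 1.0):
    print(f"mu = {mu:g}: power = {power_one_sided_z(mu, n=25):.4f}")
```

At ##\mu=0## the power equals the significance level, and for a true mean that is only barely above zero it stays essentially at that level, which is why such an alternative cannot be detected with a sample of reasonable size.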

It's important to keep in mind that a hypothesis test is not a mathematical deduction and, unless a lot more additional information is assumed than in your example, it is not a procedure that produces an optimal decision. (What function would we be optimizing?) A hypothesis test is simply a procedure. Hypothesis tests have proven empirically useful in many fields of study. However, the mathematical theory (Optimal Statistical Decisions) that justifies the use of particular hypothesis tests requires assuming more structure and given information than is found in the garden variety problems (like yours) that occur in introductory books on statistics.
 
