Undergrad Hypothesis Testing: Comparing Gaussian Distributions

Gaussian97 · Nov 26, 2019

Hi, I have a set of data and I want to use hypothesis testing to discriminate between two hypotheses:
H0: My data follows a Gaussian distribution with a given mean and a given std (the actual values are ugly, so let's say mean = 0 and std = 1).
H1: My data follows a Gaussian distribution with mean > 0 and std = 1 (the same as before).

So, I want to use the maximum likelihood ratio to define my test statistic as
$$t(\vec{x})=\frac{f(\vec{x}|H_1)}{f(\vec{x}|H_0)}$$
So, for ##H_0## it's clear that ##f(x|H_0)=N(0,1)##, but how do I find the expression for ##f(x|H_1)##?

Would it be valid to compute the sample mean and, since it's actually bigger than 0, use $$f(x|H_1)=N(\bar{x},1)$$?

Thanks.
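A sketch of where the plug-in idea leads, assuming the ##H_1## density is evaluated at the maximum-likelihood estimate ##\hat{\mu}=\bar{x}## (which requires ##\bar{x}>0##) and writing ##n## for the number of independent samples (a symbol not used in the post above):

$$\ln t(\vec{x})=\sum_{i=1}^{n}\left[\frac{x_i^2}{2}-\frac{(x_i-\bar{x})^2}{2}\right]=n\bar{x}^2-\frac{n\bar{x}^2}{2}=\frac{n\bar{x}^2}{2}$$

Under these assumptions ##t## is an increasing function of ##\bar{x}## for ##\bar{x}>0##, so putting a threshold on ##t## is the same as putting a threshold on ##\bar{x}##, and under ##H_0## the quantity ##\sqrt{n}\,\bar{x}## follows ##N(0,1)##.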

FactChecker · Nov 26, 2019

The two cases that you describe cannot have a useful test. The cases of mean=0 versus mean=0.0000000000001 will not be distinguishable without billions of samples. You must start with an alternative hypothesis where a reasonable sample has a chance of being convincing. It is not usually necessary to determine the distribution associated with the alternative hypothesis. The assumption of the null hypothesis gives you the distribution that you will use, and a sample that is out of line with that distribution allows you to convincingly state that the alternative hypothesis is the better choice.

CORRECTION: In your stated cases, if the sample mean is large enough, you can reject the null hypothesis.
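As a rough illustration of "the sample mean is large enough", here is a minimal sketch of a one-sided z-test under the null hypothesis mean = 0, std = 1. The function name `one_sided_z_test`, the significance level `alpha=0.05`, and the sample values are all made up for the example; they do not come from the thread.

```python
import math
from statistics import NormalDist

def one_sided_z_test(data, mu0=0.0, sigma=1.0, alpha=0.05):
    """One-sided z-test of H0: mean = mu0 against mean > mu0, with known sigma."""
    n = len(data)
    xbar = sum(data) / n
    # Under H0 the standardized sample mean follows N(0, 1).
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Upper-tail probability of seeing a z at least this large under H0.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha

# Hypothetical data, for illustration only.
sample = [0.8, 1.3, -0.2, 0.9, 1.6, 0.4, 1.1, 0.7]
z, p_value, reject = one_sided_z_test(sample)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Only the null distribution is needed to compute the p-value here; the alternative enters only through the choice of the upper tail.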

Stephen Tashi · Nov 28, 2019

Gaussian97 said:

Summary: How to formalize the hypothesis of having a mean bigger than some value.

A hypothesis for a hypothesis test must be specific enough to allow computing the probability of the observed data using the assumption that the hypothesis is correct. A statement like "The mean of the distribution is greater than 1" is not specific enough. So you must add additional assumptions.

A Bayesian approach is to treat the mean as a random variable that has a "prior" probability distribution. You must assume a particular distribution for the mean. (This is like saying that Nature picked the value of the mean in some random way when she created the population.)
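To make the Bayesian route concrete, here is a minimal sketch that averages the likelihood ratio over a prior on the mean. The half-normal prior on mean > 0 with scale `tau` is purely an assumption chosen for illustration (the thread does not specify any prior), as are the function name and the sample values. With that prior the integral has a closed form.

```python
import math
from statistics import NormalDist

def bayes_factor_half_normal(data, tau=1.0):
    """Marginal likelihood ratio of H1 to H0 for N(mu, 1) data.

    H0: mu = 0.  H1: mu > 0 with an ASSUMED half-normal prior of scale tau.
    For fixed mu the likelihood ratio is exp(n*mu*xbar - n*mu**2/2);
    integrating it against the prior over mu > 0 gives the expression below.
    """
    n = len(data)
    xbar = sum(data) / n
    a = n + 1.0 / tau**2   # coefficient of -mu^2/2 in the combined exponent
    b = n * xbar           # coefficient of mu in the combined exponent
    return (2.0 / (tau * math.sqrt(a))) * math.exp(b * b / (2.0 * a)) \
        * NormalDist().cdf(b / math.sqrt(a))

# Hypothetical data, for illustration only.
example = [0.8, 1.3, -0.2, 0.9]
bf = bayes_factor_half_normal(example)
print(f"Bayes factor (H1 over H0): {bf:.3f}")
```

A different prior would give a different number, which is the sense in which extra assumptions have to be added.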

If you wish to stay with the "frequentist" outlook that the mean has a "fixed but unknown value" (i.e. it is not a random variable) then you can't compute the probability of the data only knowing that the mean is greater than 1. So no hypothesis test is possible.

A frequentist might resort to looking at "power curves". These curves are used to analyze the "power" of a statistical test. In your example, the hypothesis that the mean and standard deviations have given values is specific enough to do a "one tailed" hypothesis test using those values as the null hypothesis. You can look at power curves for that test. However, making a decision based on the behavior of power curves is not a hypothesis test.
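For reference, a minimal sketch of how points on such a power curve could be computed for the one-tailed z-test with known std = 1. The function name `power`, the sample size `n=25`, and `alpha=0.05` are illustrative assumptions, not values from the thread.

```python
import math
from statistics import NormalDist

def power(mu, n, sigma=1.0, alpha=0.05):
    """Probability that the one-tailed z-test of H0: mean = 0 rejects
    when the true mean is mu (known sigma, n independent samples)."""
    std_normal = NormalDist()
    # Rejection threshold for the standardized sample mean under H0.
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # Under the true mean, the standardized sample mean is shifted by mu*sqrt(n)/sigma.
    return 1.0 - std_normal.cdf(z_crit - mu * math.sqrt(n) / sigma)

# Tabulate a few points of the curve for a hypothetical n = 25.
for mu in (0.0, 0.1, 0.5, 1.0):
    print(f"mu = {mu:.1f}  power = {power(mu, n=25):.3f}")
```

At mu = 0 the power equals the significance level, and it rises toward 1 as the true mean moves away from 0 or as n grows.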

It's important to keep in mind that a hypothesis test is not a mathematical deduction and, unless a lot more additional information is assumed than in your example, it is not a procedure that produces an optimal decision. (What function would we be optimizing?) A hypothesis test is simply a procedure. Hypothesis tests have proven empirically useful in many fields of study. However, the mathematical theory (Optimal Statistical Decisions) that justifies the use of particular hypothesis tests requires assuming more structure and given information than is found in the garden variety problems (like yours) that occur in introductory books on statistics.

Ygggdrasil · Nov 29, 2019

It sounds like something like a one-tailed t-test would be appropriate. See https://stats.idre.ucla.edu/other/m...nces-between-one-tailed-and-two-tailed-tests/ for a discussion of one-tailed hypothesis tests.
