Type 1 Error Increases with Sample Size?


Discussion Overview

The discussion centers around the relationship between sample size and Type I error rates in hypothesis testing, specifically whether Type I error increases with larger sample sizes while keeping alpha constant. Participants explore examples, clarify definitions, and question the implications of statistical analysis in various contexts.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants suggest that increasing sample size while keeping alpha constant may lead to an increased likelihood of Type I error, citing examples such as coin flips.
  • Others challenge this view, asserting that keeping alpha constant means Type I error remains constant, and that the probability of rejecting the null hypothesis does not inherently increase with sample size.
  • A participant proposes that Type I error might actually decrease with larger sample sizes, although this is not substantiated with detailed reasoning.
  • Clarifications are made regarding the definitions of p-values and their relation to alpha, with some participants expressing confusion over the correct interpretation of these terms in the context of hypothesis testing.
  • There is a discussion about when to use the normal distribution versus Student's t-distribution, with some participants noting that for large sample sizes the t-distribution is essentially the same as the normal.
  • Participants express uncertainty about the implications of hypothesis testing methods and the rationale behind using probabilities of outcomes that did not occur.

Areas of Agreement / Disagreement

Participants do not reach a consensus on whether Type I error increases with sample size. Multiple competing views are presented, and the discussion remains unresolved regarding the implications of sample size on Type I error rates.

Contextual Notes

Participants note limitations in their understanding of statistical concepts, such as the definitions of p-values and Type I error, as well as the relationship between sample size and hypothesis testing methods. There is also mention of the need for more sophisticated concepts like "power" and "power curves" to fully address the questions raised.

beakymango
TL;DR
Can you argue that type 1 error increases with sample size?
My professor is teaching us that type 1 error increases with sample size if you keep alpha constant, and I think I understand what she's getting at, but I can't find anything online that supports the idea. Here's what I'm thinking:

We accept that there is an equal chance that a flipped coin will land on heads or tails. This is one scenario where we know that the null hypothesis cannot be rejected. However, if you flip 10,000 coins and you find that 5,005 coins land on heads and 4,995 coins land on tails, you might be able to show that p<0.05 that coins are more likely to land on heads, so you would falsely reject the null. With a smaller sample size, you would be able to disregard the variation as insignificant.

But I'm pretty sure we don't apply statistical analysis to things like this. And when we're testing the efficacy of a drug compared to placebo, we use statistical analysis instead of testing it on increasing sample sizes to see if the numbers converge. I can't exactly put my finger on why that is (besides practicality), but I think that's why my coin example isn't valid.
 
beakymango said:
We accept that there is an equal chance that a flipped coin will land on heads or tails. This is one scenario where we know that the null hypothesis cannot be rejected. However, if you flip 10,000 coins and you find that 5,005 coins land on heads and 4,995 coins land on tails, you might be able to show that p<0.05 that coins are more likely to land on heads, so you would falsely reject the null. With a smaller sample size, you would be able to disregard the variation as insignificant.
Your probability is off. The null hypothesis, ##H_0## is p = 0.5. The alternate hypothesis, ##H_a##, would be p < 0.5, not p < 0.05.
beakymango said:
But I'm pretty sure we don't apply statistical analysis to things like this.
No, that's not correct. That's exactly how you would determine which hypothesis to accept. In this case, with a large number of sample coin flips, a normal distribution could be used for this binomial probability. It's been quite a few years since I taught any statistics classes, but I believe what I'm saying is true.
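As a sketch of the normal-approximation test described above (standard library only; the counts are the ones from the original post):

```python
from math import sqrt
from statistics import NormalDist

# Two-sided z-test of H0: p = 0.5, using the normal approximation
# to the binomial (reasonable here because n is large).
n, heads = 10_000, 5_005
z = (heads - 0.5 * n) / sqrt(0.25 * n)          # standardized test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed p-value
print(z, round(p_value, 2))                     # z = 0.1, p ≈ 0.92
```

So 5,005 heads in 10,000 tosses is nowhere near significance at ##\alpha = 0.05##; the test does not falsely reject on data this close to 50/50.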
 
Mark44 said:
Your probability is off. The null hypothesis, ##H_0## is p = 0.5. The alternate hypothesis, ##H_a##, would be p < 0.5, not p < 0.05.
Sorry if I was unclear. I didn't mean p in this case to be probability, I meant p as a Student T-test. Like p<alpha. My main question is if type 1 error increases with sample size.
 
beakymango said:
Sorry if I was unclear. I didn't mean p in this case to be probability, I meant p as a Student T-test. Like p<alpha. My main question is if type 1 error increases with sample size.
I'm still not following how p as you describe it relates to ##\alpha## or the Student's t-test. For one thing, the Student's t-test is generally used for relatively small sample sizes. With large sample sizes, like 10,000 in your first post, the t distribution is essentially identical to the normal distribution.
For a binomial distribution, p represents the probability that one of two events occurs.
Also, a Type I error is defined as ##P(\text{Type I error}) = P(\text{we reject H}_0 | \text{H}_0 \text{ is true})##.
##\alpha## is the probability, under ##H_0##, that the test statistic falls in the critical (rejection) region.
Please describe what you are using p to represent.

Just off the top of my head, I'd say that Type I error decreases with larger sample sizes, but I haven't worked out anything to allow me to justify that.
 
beakymango said:
My professor is teaching us that type 1 error increases with sample size if you keep alpha constant,

Are you sure that's what the professor said? Try stating what was said word for word.

By the usual terminology, alpha is the probability of rejecting the null hypothesis when it is true and such an error is called type 1 error. So if you keep alpha constant, you keep type 1 error constant.
We accept that there is an equal chance that a flipped coin will land on heads or tails. This is one scenario where we know that the null hypothesis cannot be rejected.
You mean "should not" be rejected.

However, if you flip 10,000 coins and you find that 5,005 coins land on heads and 4,995 coins land on tails, you might be able to show that p<0.05 that coins are more likely to land on heads,
Taking a guess at what the professor actually said, we can say this:
We usually don't base rejecting the null hypothesis on the probability of the exact outcome of an experiment. The probability of each exact outcome of a fair coin-tossing experiment goes down as we increase the number of tosses. For example, in 2 tosses, the probability of 1 head and 1 tail (in some order) is 1/2. By contrast, the probability of the exact outcome of 5,005 heads and 4,995 tails (in some order) is ##{10000 \choose 5005} (1/2)^{10000}##. Whatever that is, it's smaller than 1/2.

However, the commonly used hypothesis tests are not based on assuming the null hypothesis and computing the probability of the exact outcome of an experiment. Instead, they are so-called "one tailed" and "two tailed tests" where we compute the probability of an event that includes the exact outcome of the experiment plus other outcomes that did not happen.

If you scrutinize hypothesis testing, you can ask the very good question "Why should hypothesis testing involve the probability of outcomes that did not occur?". It is easy to give various intuitive answers to this question. However, a mathematically precise answer requires introducing the concept of the "power" of a statistical test and the concept of "power curves". The usual approach in introductory statistics is to present one tailed and two tailed hypothesis tests as "accepted" procedures, without getting into the sophisticated concepts that somewhat justify using them.
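The contrast between the probability of the exact outcome and the tail events a test actually uses can be computed directly. This sketch uses exact rational arithmetic, since ##(1/2)^{10000}## underflows ordinary floating point:

```python
from fractions import Fraction
from math import comb

# Probability of the *exact* outcome under H0 (fair coin).
# 2 tosses: 1 head and 1 tail, in either order.
p_two = Fraction(comb(2, 1), 2**2)                 # = 1/2
# 10,000 tosses: exactly 5,005 heads.
p_big = Fraction(comb(10_000, 5_005), 2**10_000)

print(float(p_two))   # 0.5
print(float(p_big))   # ≈ 0.0079 — tiny even though H0 is true
```

The second probability is tiny even though the null hypothesis is true, which is exactly why tests are built on tail events (the observed outcome plus more extreme ones) rather than on exact-outcome probabilities.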
 
Stephen Tashi said:
If you scrutinize hypothesis testing, you can ask the very good question "Why should hypothesis testing involve the probability of outcomes that did not occur?". It is easy to give various intuitive answers to this question. However, a mathematically precise answer requires introducing the concept of the "power" of a statistical test and the concept of "power curves". The usual approach in introductory statistics is to present one tailed and two tailed hypothesis tests as "accepted" procedures, without getting into the sophisticated concepts that somewhat justify using them.

Okay -- that makes a lot of sense. I think I can conceptualize why we care about the probability of outcomes that do not occur. Nonetheless, I still don't quite understand what my professor is saying. We defined alpha as the researcher's tolerance for false positives, if that helps? This is copy pasted off our powerpoint: "Larger sample size while keeping same α results in higher chance of type I error"
 
beakymango said:
This is copy pasted off our powerpoint: "Larger sample size while keeping same α results in higher chance of type I error"

We need to look at the definitions of "alpha" and "type I error" that your course is using. Is the powerpoint about a specific hypothesis test? - or is it a general claim?
 
Stephen Tashi said:
We need to look at the definitions of "alpha" and "type I error" that your course is using. Is the powerpoint about a specific hypothesis test? - or is it a general claim?
Just in general, but we are a biostats class if that makes a difference. We define Type 1 Errors as "false positive" and alpha as "the highest risk of making a false positive error." I asked for clarification today, and she said it's because as you increase the sample size, you make it more likely to reject the null hypothesis, so you increase the risk of making a false positive. She explained that if you use a large enough sample size, you can almost always prove a statistically significant difference between two groups of people. This sort of makes sense to me, but I also thought increasing sample sizes made it easier to detect smaller (but true) differences between two groups by shrinking the standard error.
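That last intuition can be illustrated by simulation (my own sketch, not from the thread): with a genuinely unfair coin, say a true heads probability of 0.51, the standard error shrinks as n grows, so the fixed-##\alpha## test rejects more and more often. That is power increasing for a true effect, not Type I error increasing.

```python
import random
from math import sqrt
from statistics import NormalDist

def reject_rate(n, p_true, trials=300, alpha=0.05, seed=1):
    """Fraction of simulated experiments in which H0: p = 0.5 is rejected."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        heads = sum(rng.random() < p_true for _ in range(n))
        z = (heads - 0.5 * n) / sqrt(0.25 * n)
        rejections += abs(z) > crit
    return rejections / trials

for n in (100, 2_000, 20_000):
    print(n, reject_rate(n, p_true=0.51))   # rejection rate climbs toward 1
```

With a small true bias, the rejection rate is near ##\alpha## at n = 100 but approaches 1 by n = 20,000: a large enough sample will almost always flag a real difference, however small.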
 
beakymango said:
Just in general, but we are a biostats class if that makes anything different. We define Type 1 Errors as "false positive"
That use of "Type 1 Error" is like defining it to be getting a head in a single toss of a coin. By contrast, the event used in defining Type 1 Error for a coin toss experiment will be something like "7000 or more heads out of 10,000 tosses".

Of course, it's true that you're more likely to get, say, at least 1 head in 10,000 tosses of a fair coin than to get at least 1 head in 2 tosses of a fair coin.

and alpha as "the highest risk of making a false positive error."

Perhaps "alpha" is being used to refer to the probability of a false positive on a single trial.

In mathematical statistics, "type one error" does not refer to the occurrence of a single "a false positive" and "alpha" does not refer to the probability of getting a false positive on a single trial. So it isn't surprising that you found no sources to support the professor's claims. They are incorrect vis-a-vis the standard terminology.
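The standard-terminology point, that fixing ##\alpha## fixes the Type I error rate whatever the sample size, can be checked by simulation (a sketch, standard library only):

```python
import random
from math import sqrt
from statistics import NormalDist

def type1_rate(n, trials=500, alpha=0.05, seed=0):
    """Simulate fair-coin experiments (H0 true); return the rejection rate."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    rejections = 0
    for _ in range(trials):
        heads = sum(rng.getrandbits(1) for _ in range(n))  # n fair tosses
        z = (heads - 0.5 * n) / sqrt(0.25 * n)
        rejections += abs(z) > crit
    return rejections / trials

for n in (100, 1_000, 5_000):
    print(n, type1_rate(n))   # stays near alpha = 0.05 for every n
```

The empirical rejection rate hovers around 0.05 at every sample size; increasing n does not inflate the Type I error rate when ##\alpha## is held fixed.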
 
I would expect the opposite to be the case, because of the Law of Large Numbers and related results: effect-size estimates approach the true value the more evidence you have.
 
