Continuity correction when using the normal as an approximation to the binomial

SUMMARY

The discussion centers on applying a continuity correction when approximating a binomial distribution with a normal distribution. The original poster asks what to do when the value of X is not an integer, specifically whether P(X < 1.2) should become P(X < 1.2 - 0.5) or P(X < 1.2 - 0.05). The thread establishes that a continuity correction is needed when a continuous distribution is used to approximate a discrete one, and that the size of the correction depends on the spacing between the values the variable can possibly take, not on the values that happen to be observed. For irregularly spaced values, a midpoint correction is only appropriate if the data are binned with matching variable-width bins; otherwise it is better to work with the individual samples.

PREREQUISITES
  • Understanding of binomial distribution and its properties
  • Familiarity with normal distribution and continuity correction
  • Knowledge of discrete versus continuous data types
  • Basic statistical concepts, including probability notation
NEXT STEPS
  • Research "continuity correction in statistics" for deeper insights
  • Study "binomial to normal approximation" techniques
  • Explore "discrete vs continuous data analysis" methodologies
  • Learn about "variable width binning" in statistical data analysis
USEFUL FOR

Statisticians, data analysts, students in statistics courses, and anyone involved in statistical modeling and analysis of discrete data distributions.

songoku
TL;DR
I have learnt how to do a continuity correction when the value of the random variable is an integer, such as P(X < 5) changing to P(X < 4.5) when the distribution changes from binomial to normal.
What if the value of X is not an integer, such as P(X < 1.2)?

a) Will the continuity correction be P(X < 1.2 - 0.5) = P(X < 0.7)?

or

b) Will the continuity correction be P(X < 1.2 - 0.05) = P(X < 1.15)?

or

c) Something else?

Thanks
 
Do you understand why we have to make the continuity correction for an integer (i.e. discontinuous) variable?
 
pbuk said:
Do you understand why we have to make the continuity correction for an integer (i.e. discontinuous) variable?
Because the binomial distribution is discrete, an adjustment is needed when approximating it with the normal distribution, which is continuous. I think of it as turning the line at x = 4 (discrete) into a box whose left edge is at 3.5 and whose right edge is at 4.5, so that it touches the boxes made from the lines at x = 3 and x = 5 (giving a continuous distribution).
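The box picture can be checked numerically. The following is a minimal sketch, not part of the original thread: the parameters n = 20 and p = 0.3 are assumptions chosen only for illustration, and the helper names are hypothetical. It compares the exact binomial P(X < 5) with the normal approximation evaluated at 5 (no correction) and at 4.5 (with correction).

```python
from math import comb, erf, sqrt


def binom_prob_below(k, n, p):
    """Exact P(X < k) for X ~ Binomial(n, p), where k is an integer."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


def normal_cdf(x, mean, sd):
    """P(Z < x) for Z ~ Normal(mean, sd^2)."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


# Assumed example parameters (not from the thread).
n, p = 20, 0.3
mean, sd = n * p, sqrt(n * p * (1 - p))

exact = binom_prob_below(5, n, p)        # P(X < 5) = P(X <= 4)
uncorrected = normal_cdf(5.0, mean, sd)  # cut at the integer itself
corrected = normal_cdf(4.5, mean, sd)    # cut at the edge of the box for 4

print(f"exact       = {exact:.4f}")
print(f"uncorrected = {uncorrected:.4f}")
print(f"corrected   = {corrected:.4f}")
```

For these assumed parameters the corrected value lands closer to the exact probability, because cutting at 4.5 includes the whole box for X = 4 and none of the box for X = 5.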
 
So how does that apply when you have a line at 1.2?
 
pbuk said:
So how does that apply when you have a line at 1.2?
It means I have to know the location of other lines. I am trying to form a hypothetical question regarding this but I just can't think of one.

So if the other lines are at 1.1 and 1.3, P(X < 1.2) will be P(X < 1.15) and P(X > 1.2) will be P(X > 1.25)

But if the other lines are not at regular intervals, such as one at 1.1 and the other at 1.4, then P(X < 1.2) will be P(X < 1.15) and P(X > 1.2) will be P(X > 1.3)?

Thanks
 
If your data are discrete so that it is only possible for a 'line' (whatever that is) to take values of 1.1, 1.2, 1.3, 1.4 etc. then a continuity correction may make sense, but if the data are by nature continuous and there just happen to be some gaps then it would not make sense.
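To make the regularly spaced case concrete, here is a rough sketch under an assumption that is not in the thread: X = Y / 10 with Y ~ Binomial(30, 0.4), so X can only take the values 0.0, 0.1, 0.2, and so on. Under that assumption P(X < 1.2) is exactly P(Y < 12), and the sketch compares options (a) and (b) from the opening post against the exact value.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mean, sd):
    """P(Z < x) for Z ~ Normal(mean, sd^2)."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


# Assumed setup (illustration only): X = step * Y, Y ~ Binomial(n, p).
n, p, step = 30, 0.4, 0.1

# Exact P(X < 1.2) = P(Y < 12), computed on the integer scale.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(12))

# Normal approximation on the scale of X.
mean_x = step * n * p
sd_x = step * sqrt(n * p * (1 - p))

option_a = normal_cdf(1.2 - 0.5, mean_x, sd_x)       # subtract 0.5
option_b = normal_cdf(1.2 - step / 2, mean_x, sd_x)  # subtract half the spacing
uncorrected = normal_cdf(1.2, mean_x, sd_x)

for name, value in [("exact", exact), ("option a", option_a),
                    ("option b", option_b), ("uncorrected", uncorrected)]:
    print(f"{name:12s} {value:.4f}")
```

In this assumed setup option (b) is the closest to the exact value: the correction is half the spacing between possible values, which is 0.5 only when that spacing is 1.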
 
pbuk said:
If your data are discrete so that it is only possible for a 'line' (whatever that is) to take values of 1.1, 1.2, 1.3, 1.4 etc. then a continuity correction may make sense, but if the data are by nature continuous and there just happen to be some gaps then it would not make sense.
I understand. I haven't encountered such questions. All the practice questions are about integers so my query is only due to curiosity.

Is it not possible to have discrete data with irregular intervals, such as 1.1, 1.3, 1.4, 1.9?

Thanks
 
songoku said:
Is it not possible to have discrete data with irregular intervals, such as 1.1, 1.3, 1.4, 1.9?
Of course it is, but the continuity correction is not about what values the data actually take, rather it is about what values the data can possibly take.
 
pbuk said:
Of course it is, but the continuity correction is not about what values the data actually take, rather it is about what values the data can possibly take.
So is it correct to say that if the data are 1.1, 1.3, 1.4 and 1.9, the continuity correction for P(X < 1.3) is P(X < 1.2) and for P(X > 1.3) is P(X > 1.35)?

What about P(X < 1.5)? Since the midpoint of 1.4 and 1.9 is 1.65, would the continuity correction for P(X < 1.5) be P(X < 1.65)?

Thanks
 
songoku said:
So is it correct to say that if the data are 1.1, 1.3, 1.4 and 1.9, the continuity correction for P(X < 1.3) is P(X < 1.2) and for P(X > 1.3) is P(X > 1.35)?
This would only be the case if the data was binned with variable width bins including (1.1, 1.3) and (1.3, 1.4).

songoku said:
What about P(X < 1.5)? Since the midpoint of 1.4 and 1.9 is 1.65, would the continuity correction for P(X < 1.5) be P(X < 1.65)?
This would only be the case if the data was binned with variable width bins including (1.4, 1.9).

But unless there was a good reason for binning the data you would get a better result by using individual samples.

It is good to be curious, but I think you have got lost down a rabbit hole; it's time to move on.
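For readers who want to reproduce the midpoint arithmetic from this exchange, the sketch below computes the cut point halfway between neighbouring possible values. The function name `corrected_cutoff` is hypothetical, and, as noted above, a cut of this kind only makes sense if the data really are binned with variable-width bins whose edges sit between those values; it is not a general recommendation.

```python
from bisect import bisect_left, bisect_right


def corrected_cutoff(support, x, direction):
    """Midpoint cut for P(X < x) or P(X > x) on a sorted list of possible values.

    Hypothetical helper: assumes bins whose edges are the midpoints
    between neighbouring possible values.
    """
    if direction == "<":
        i = bisect_left(support, x)   # index of first value >= x
    elif direction == ">":
        i = bisect_right(support, x)  # index of first value > x
    else:
        raise ValueError("direction must be '<' or '>'")
    if i == 0 or i == len(support):
        raise ValueError("x has no possible values on both sides of the cut")
    # The cut separates support[i - 1] from support[i].
    return (support[i - 1] + support[i]) / 2


support = [1.1, 1.3, 1.4, 1.9]  # the values used in the thread

below_13 = corrected_cutoff(support, 1.3, "<")  # between 1.1 and 1.3
above_13 = corrected_cutoff(support, 1.3, ">")  # between 1.3 and 1.4
below_15 = corrected_cutoff(support, 1.5, "<")  # between 1.4 and 1.9

print(round(below_13, 2), round(above_13, 2), round(below_15, 2))
```

The three results correspond to the cut points 1.2, 1.35 and 1.65 discussed in the posts above, up to floating-point rounding.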
 
Thank you very much pbuk
 
