Continuity correction when using normal as approximation for binomial

AI Thread Summary
The discussion centers on the application of continuity correction when approximating a binomial distribution with a normal distribution, particularly when dealing with non-integer values like P(X < 1.2). Participants explore whether the correction should be based on values like 1.15 or 0.7, emphasizing that the correction is necessary due to the discrete nature of the binomial distribution. It is clarified that continuity correction depends on the possible values the data can take, rather than the actual values observed. The conversation also touches on the implications of using irregular intervals in discrete data and the importance of understanding binning in this context. Ultimately, the discussion highlights the nuances of applying continuity correction in various scenarios.
songoku
TL;DR Summary
I have learnt how to apply a continuity correction when the random variable takes integer values, e.g. P(X < 5) becomes P(X < 4.5) when a binomial distribution is approximated by a normal distribution.
What if the value of X is not an integer, such as P(X < 1.2)?

a) Will the continuity correction be P(X < 1.2 - 0.5) = P(X < 0.7)?

or

b) Will the continuity correction be P(X < 1.2 - 0.05) = P(X < 1.15)?

or

c) Something else?

Thanks
 
Do you understand why we have to make the continuity correction for an integer (i.e. discontinuous) variable?
 
pbuk said:
Do you understand why we have to make the continuity correction for an integer (i.e. discontinuous) variable?
Because the binomial distribution is discrete, an adjustment is needed when it is approximated by the normal distribution, which is continuous. I think of it as replacing the line at x = 4 (discrete) with a box whose left edge is at 3.5 and whose right edge is at 4.5, so that it touches the boxes built from the lines at x = 3 and x = 5 (giving a continuous distribution).
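The box picture above can be checked numerically. The following is only a minimal sketch: the parameters n = 20 and p = 0.3 are hypothetical (they do not come from the thread), and the helper names `binom_pmf` and `normal_cdf` are made up for illustration. It compares the exact binomial probability at x = 4 with the area of the normal curve over the box from 3.5 to 4.5, and then compares P(X < 5) with and without the correction.

```python
from math import comb, erf, sqrt

def binom_pmf(k, n, p):
    """Exact binomial probability P(X = k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x, mu, sigma):
    """Normal CDF, written in terms of the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Hypothetical parameters, chosen only for illustration.
n, p = 20, 0.3
mu, sigma = n * p, sqrt(n * p * (1 - p))

# The "line" at x = 4 versus the "box" from 3.5 to 4.5.
exact_point = binom_pmf(4, n, p)
box_area = normal_cdf(4.5, mu, sigma) - normal_cdf(3.5, mu, sigma)

# P(X < 5) = P(X <= 4): exact, corrected (4.5) and uncorrected (5).
exact_less_than_5 = sum(binom_pmf(k, n, p) for k in range(5))
corrected = normal_cdf(4.5, mu, sigma)
uncorrected = normal_cdf(5, mu, sigma)

print("P(X = 4) exact:", exact_point, " box area:", box_area)
print("P(X < 5) exact:", exact_less_than_5)
print("normal, cut at 4.5:", corrected, " normal, cut at 5:", uncorrected)
```

For these parameters, cutting at 4.5 should land much closer to the exact value than cutting at 5, because 4.5 is the right-hand edge of the box for x = 4.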
 
So how does that apply when you have a line at 1.2?
 
pbuk said:
So how does that apply when you have a line at 1.2?
It means I have to know the location of other lines. I am trying to form a hypothetical question regarding this but I just can't think of one.

So if the other lines are at 1.1 and 1.3, P(X < 1.2) will be P(X < 1.15) and P(X > 1.2) will be P(X > 1.25).

But if the other lines are not at regular intervals, such as one at 1.1 and the other at 1.4, then P(X < 1.2) will be P(X < 1.15) and P(X > 1.2) will be P(X > 1.3)?

Thanks
 
If your data are discrete, so that a 'line' (whatever that is) can only take values of 1.1, 1.2, 1.3, 1.4 etc., then a continuity correction may make sense; but if the data are by nature continuous and there just happen to be some gaps, then it would not make sense.
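As a rough illustration of the evenly spaced case, suppose (purely as an assumption, not something stated in the thread) that X = Y / 10 with Y binomial, so X can only take values 0.0, 0.1, 0.2, ... The correction is then half the spacing, 0.05, and P(X < 1.2) becomes P(X < 1.15), which is the same as correcting P(Y < 12) to P(Y < 11.5). The parameters n = 40, p = 0.3 and the helper names are hypothetical.

```python
from math import comb, erf, sqrt

def binom_pmf(k, n, p):
    """Exact binomial probability P(Y = k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_cdf(x, mu, sigma):
    """Normal CDF, written in terms of the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Hypothetical set-up: Y ~ Binomial(40, 0.3) and X = step * Y.
n, p = 40, 0.3
step = 0.1  # spacing between the possible values of X

mu_y, sigma_y = n * p, sqrt(n * p * (1 - p))
mu_x, sigma_x = step * mu_y, step * sigma_y

# Exact: P(X < 1.2) = P(Y < 12) = P(Y <= 11).
exact = sum(binom_pmf(k, n, p) for k in range(12))

# Correction of half a step on the X scale ...
corrected_x = normal_cdf(1.2 - step / 2, mu_x, sigma_x)
# ... is the usual 0.5 correction on the integer Y scale.
corrected_y = normal_cdf(12 - 0.5, mu_y, sigma_y)
# No correction at all, for comparison.
uncorrected_x = normal_cdf(1.2, mu_x, sigma_x)

print("exact P(X < 1.2):", exact)
print("normal, cut at 1.15:", corrected_x, "(same as Y cut at 11.5:", corrected_y, ")")
print("normal, cut at 1.2:", uncorrected_x)
```

The point of the sketch is that the size of the correction follows from the spacing of the values X can possibly take, not from 0.5 as a fixed number.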
 
pbuk said:
If your data are discrete, so that a 'line' (whatever that is) can only take values of 1.1, 1.2, 1.3, 1.4 etc., then a continuity correction may make sense; but if the data are by nature continuous and there just happen to be some gaps, then it would not make sense.
I understand. I haven't encountered such questions. All the practice questions are about integers so my query is only due to curiosity.

Is it not possible to have discrete data with irregular intervals, such as 1.1, 1.3, 1.4, 1.9?

Thanks
 
songoku said:
Is it not possible to have discrete data with irregular intervals, such as 1.1, 1.3, 1.4, 1.9?
Of course it is, but the continuity correction is not about what values the data actually take, rather it is about what values the data can possibly take.
 
pbuk said:
Of course it is, but the continuity correction is not about what values the data actually take, rather it is about what values the data can possibly take.
So is it correct to say that if the data are 1.1, 1.3, 1.4 and 1.9, the continuity correction for P(X < 1.3) is P(X < 1.2) and for P(X > 1.3) is P(X > 1.35)?

What about P(X < 1.5)? Since the midpoint of 1.4 and 1.9 is 1.65, would the continuity correction for P(X < 1.5) be P(X > 1.65)?

Thanks
 
songoku said:
So is it correct to say that if the data are 1.1, 1.3, 1.4 and 1.9, the continuity correction for P(X < 1.3) is P(X < 1.2) and for P(X > 1.3) is P(X > 1.35)?
This would only be the case if the data was binned with variable width bins including (1.1, 1.3) and (1.3, 1.4).

songoku said:
What about P(X < 1.5)? Since the midpoint of 1.4 and 1.9 is 1.65, would the continuity correction for P(X < 1.5) be P(X > 1.65)?
This would only be the case if the data was binned with variable width bins including (1.4, 1.9).

But unless there was a good reason for binning the data you would get a better result by using individual samples.

It is good to be curious, but I think you have got lost down a rabbit hole; it's time to move on.
 
Thank you very much pbuk
 