A new computer virus attacks a folder consisting of 200 files

Summary:
A new computer virus damages each file in a folder of 200 files with probability 0.2, independently of the others, so the number of damaged files follows a binomial distribution. The mean (μ) is calculated as 40, and the standard deviation (σ) is approximately 5.7. The normal approximation is used to find the probability of fewer than 50 files being damaged, giving a Z-score of 1.8. A reply notes that npq is the variance rather than the standard deviation, and that a continuity correction should be applied, moving the upper bound to 49 or 49.5.
TomJerry
Question:
A new computer virus attacks a folder consisting of 200 files. Each file gets damaged with probability 0.2 independently of other files. Using Normal approximation of binomial distribution, find the probability that fewer than 50 files get damaged.


Solution:

Here n = 200, p = 0.2, q = 0.8.

The formula for the normal approximation is Z = (X - \mu)/\sigma

For a binomial distribution, \mu = np and \sigma^2 = npq

Therefore
\mu = 40

\sigma = 5.7

When X = 50

Z = (50 - 40)/5.7 = 1.8

P(X<50) = P(Z<1.8) = 0.5 + P(0<Z<1.8) = 0.5 + 0.4641 = 0.9641

Is this correct?
 
CompuChip said:
I think that you have made some minor mistakes.
First of all, npq is not the standard deviation, it is the variance (which is related to the st.dev. how?)

Moreover, this distribution is actually discrete and you are only using the normal distribution as an approximation. This means that you should probably apply the so-called continuity correction, by changing your upper bound a bit (it should not be 50 but ...?)

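The relation between variance and standard deviation hinted at here can be checked numerically. The following is a minimal Python sketch, not part of the original posts; the variable names (`variance`, `std_dev`) are chosen for illustration only.

```python
from math import sqrt

# Parameters from the question: 200 files, each damaged with probability 0.2.
n, p = 200, 0.2
q = 1 - p

# For a binomial distribution the variance is npq;
# the standard deviation is its square root.
variance = n * p * q
std_dev = sqrt(variance)

print(f"variance = {variance:.1f}, standard deviation = {std_dev:.2f}")
# prints: variance = 32.0, standard deviation = 5.66
```

Rounded to one decimal place, the standard deviation is the 5.7 used in the solution above.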
Thanks. X should be 49 and not 50. Isn't that correct?
 
How about 49.5?
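To see how much the choice of upper bound matters, the three candidates raised in this thread (50, 49 and 49.5) can be compared against the exact binomial probability. This is a rough sketch using only the Python standard library; the helper `normal_approx` is a hypothetical name introduced here, not something from the posts.

```python
from math import comb, sqrt
from statistics import NormalDist

# Parameters from the question.
n, p = 200, 0.2
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Exact binomial probability: P(X < 50) = P(X <= 49).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(50))

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def normal_approx(upper_bound):
    """Normal approximation P(Z < (upper_bound - mu) / sigma)."""
    return std_normal.cdf((upper_bound - mu) / sigma)


# Compare the upper bounds discussed in the thread.
for bound in (50, 49, 49.5):
    print(f"upper bound {bound}: P = {normal_approx(bound):.4f}")
print(f"exact binomial: P = {exact:.4f}")
```

Since "fewer than 50" means X ≤ 49 for a discrete variable, the continuity correction puts the cut-off halfway between 49 and 50, which is where the 49.5 comes from.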
 