# Sigma confidence rating

1. Jul 5, 2012

### g.lemaitre

I'm trying to figure out the odds that a 5 sigma confidence rating is wrong. According to one website it is 0.000028%, which is 1 in 35,000, but I've seen so many divergent answers about the odds of 5 sigma being wrong that I want to be sure. I've seen people say it is as high as 1 in 3 million or as low as 1 in 700.

2. Jul 5, 2012

### chiro

Hey g.lemaitre and welcome to the forums.

For this problem, I'm assuming you have a standard normal and wish to figure out the probability of being greater than 5 standard deviations outside of the mean.

If you are using different distributions, different assumptions, or you have a specific problem then please inform the rest of the readers here so that we can give you better advice.

Using R, I got the answer to be 2 * 2.866515718791939e-07 = 5.733031437583878e-07 ≈ 0.000000573303, which is really small. Taking the inverse of this gives 1744277.89, which equates roughly to a 1 in 1,744,278 chance, or say a 1 in 1.7 million chance.

If you only considered one tail it would be just under a 1 in 3.5 million chance.

The thing is, though, that this is misleading if you don't provide more information: it assumes that what you are measuring follows a Gaussian distribution. If it doesn't, or if you need to use another model, then this assumption will be wrong.

To get the calculation in R I used `pnorm(-5.0,0,1)` and multiplied that by 2 to get the final probability (because of symmetry).
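For readers without R, the same calculation can be sketched in Python. This is only a sketch under the same standard-normal assumption; the helper name `pnorm` is chosen here to mirror the R call and is not a Python standard-library function.

```python
import math

def pnorm(x, mean=0.0, sd=1.0):
    """Lower-tail normal probability P(X < x), mirroring R's pnorm(x, mean, sd).

    Hypothetical helper written for this sketch. It uses erfc rather than
    1 + erf so that precision is kept in the far tail.
    """
    z = (x - mean) / sd
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

one_tail = pnorm(-5.0, 0.0, 1.0)   # P(Z < -5)
two_tail = 2.0 * one_tail          # both tails, by symmetry

print(f"one tail:  {one_tail:.6e}  (about 1 in {1.0 / one_tail:,.0f})")
print(f"two tails: {two_tail:.6e}  (about 1 in {1.0 / two_tail:,.0f})")
```

Changing the first argument to another multiple of sigma (for example `-3.0`) gives the corresponding tail probability under the same Gaussian assumption.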

3. Jul 5, 2012

### Number Nine

You're not being very precise here (e.g. "odds of being wrong" is more complex than you realize, as is sigma); the best I can tell you is that, for normally distributed data, approximately 99.99994% of the data lie within 5 standard deviations of the mean.

4. Jul 5, 2012

### g.lemaitre

let me give an exact quote

5. Jul 5, 2012

### chiro

This quote looks like it assumes a normal distribution and refers to the quantities P(Z < z) for z = -3 and z = -5 respectively. If they are assuming a Gaussian distribution, though, the figure for -5 is off by two decimal places according to R's pnorm function.

What this means is that there is a cutoff value for the probability: if a result goes below one cutoff or above the other, it is considered to be more than 3 or 5 standard deviations from the mean in the respective direction.
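One possible source of a two-decimal-place discrepancy is reading a percentage as a plain fraction, since the two differ by a factor of 100. The sketch below prints P(Z < z) both ways for z = -3 and z = -5 so a quoted figure can be compared against either form. It assumes a standard normal, and the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)  # assumed: standard normal

tails = {}
for z in (-3.0, -5.0):
    p = std_normal.cdf(z)   # P(Z < z), one tail only
    tails[z] = p
    # The same number shown two ways: a percentage is the fraction times 100.
    print(f"z = {z:+.0f}: fraction = {p:.3e}, percent = {100.0 * p:.3e} %")
```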

Can you point the readers to the article?

6. Jul 6, 2012

### cmos

It looks like you're trying to make sense of the numbers being thrown around regarding the Higgs boson, so I'll just throw out a summary.

For starters, the integral of a normal distribution from -n*sigma to n*sigma is equal to erf(n/sqrt(2)), where n is any real number and erf is the error function. We can therefore say that a 5-sigma result corresponds to a probability of erf(5/sqrt(2)) = 0.9999994 (i.e. 99.99994%) of lying within 5 sigma of the mean.

Now, a lot of the news reports are saying that this indicates that there is a "1 in 3.5 million" chance that there was no Higgs detection. This number is equal to
0.5 - erf(5/sqrt(2))/2 ≈ 2.87e-7.

Why do they divide by 2? Because they are looking for "bumps" above a "noise" level. In other words, they are only considering one side of the distribution. The 0.5 is just the integral over half of the normal distribution.

But you want to know about "odds." By definition, odds = P(failure)/P(success), where P means "probability of." Therefore, the odds of the Higgs result being a fluke are
[1 - erf(5/sqrt(2))] / erf(5/sqrt(2)) ≈ "1 to 1.74 million."
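The three erf expressions above can be evaluated directly. This is a minimal sketch assuming a normal distribution; the function name `coverage` and the variable names are made up for illustration.

```python
import math

def coverage(n):
    """Probability that a normal variable lies within n sigma of its mean."""
    return math.erf(n / math.sqrt(2.0))

n = 5.0
inside = coverage(n)             # erf(5/sqrt(2))
one_sided = 0.5 - inside / 2.0   # one tail only ("bumps" above the noise)
p_fail = 1.0 - inside            # both tails
odds = p_fail / inside           # odds = P(failure) / P(success)

print(f"within {n:g} sigma: {inside:.7f}")
print(f"one-sided tail: about 1 in {1.0 / one_sided:,.0f}")
print(f"odds: about 1 to {1.0 / odds:,.0f}")
```

Subtracting a number this close to 1 loses a few digits of precision. Where that matters, `math.erfc(n / math.sqrt(2.0))` computes 1 - erf directly.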

7. Jul 9, 2012

### haruspex

No, you've missed the '%': 0.000028% is 1 in 3.5 million, not 1 in 35,000.
Btw, this is not the chance of being wrong in rejecting the null hypothesis. It is the chance that data like this would arise merely by chance, i.e. the chance of the data being thus given the null hypothesis. This is not the same thing as the chance that the null hypothesis is correct.
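To see why the two quantities differ, here is a rough sketch using Bayes' rule with two competing hypotheses. Everything except the 5 sigma tail figure is an invented assumption: the prior probabilities and the probability of the data under the alternative are made up for illustration, and treating a tail probability as a likelihood is itself a simplification.

```python
def posterior_null(p_data_given_null, p_data_given_alt, prior_null):
    """Bayes' rule for P(null | data) when there are only two hypotheses."""
    num = p_data_given_null * prior_null
    den = num + p_data_given_alt * (1.0 - prior_null)
    return num / den

P_DATA_GIVEN_NULL = 2.87e-7   # one-tailed 5 sigma figure from the thread
P_DATA_GIVEN_ALT = 0.5        # ASSUMPTION: invented for illustration

results = {}
for prior_null in (0.5, 0.999999):   # ASSUMPTION: invented priors
    results[prior_null] = posterior_null(
        P_DATA_GIVEN_NULL, P_DATA_GIVEN_ALT, prior_null
    )
    print(f"prior P(null) = {prior_null}: "
          f"P(null | data) = {results[prior_null]:.3g}")
```

The same P(data | null) yields very different values of P(null | data) depending on the assumed prior, which is why the 1 in 3.5 million figure cannot be read as the chance that the null hypothesis is true.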