Plus/minus What? How to Interpret Error Bars


When a new physics result is presented, people sometimes find themselves staring at a number with a ± in it. But what does it mean? The aim of this Insight is to give a fast overview of how physicists (and other scientists) tend to present their results in terms of statistics and measurement errors. If we are faced with a value ##m_H = 125.7\pm 0.4## GeV, does that mean that the Higgs mass definitely has to be within the range 125.3 to 126.1 GeV? Is 125.3 GeV as likely as the central value of 125.7 GeV?

Confidence Levels

When performing a statistical analysis of an experiment, there are two possible interpretations of probability. We will here only deal with the one most common in high-energy physics, called the frequentist interpretation. This interpretation answers questions regarding how likely a certain outcome was, given some underlying assumption such as a physics model. This is quantified by quoting the frequency with which we would obtain the outcome, or a more extreme one, if we repeated the experiment an infinite number of times.

Naturally, we cannot perform any experiment an infinite, or even a large enough, number of times, which is why this frequency is generally inferred through assumptions on the distribution of the outcomes or by numerical simulation. For each outcome, the frequency with which it, or a more extreme outcome, would occur is called the “p-value” of the outcome.

When an experiment is performed and a particular outcome has occurred, we can use the p-value to infer the “confidence level” (CL) at which the underlying hypothesis can be ruled out. The CL is given by one minus the p-value of the outcome that occurred, i.e., if we have a hypothesis where the outcome from our experiment is among the 5% most extreme ones, we would say that the hypothesis is ruled out at the 100% - 5% = 95% CL.
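As a rough illustration of the two preceding paragraphs, the sketch below estimates a p-value by simulating many pseudo-experiments and then converts it to a confidence level. It assumes, purely for illustration, that the outcome is a single number that follows a standard Gaussian under the hypothesis, and that "more extreme" means further from zero in either direction. The function names and the observed value 1.96 are made up for this example.

```python
import random


def simulated_p_value(observed, n_trials=200_000, seed=1):
    """Estimate the two-sided p-value of `observed` by brute-force simulation.

    Assumption: under the hypothesis, the outcome follows a standard Gaussian
    (mean 0, standard deviation 1).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Count the pseudo-experiments that are at least as extreme as the observation.
    n_extreme = sum(
        1 for _ in range(n_trials) if abs(rng.gauss(0.0, 1.0)) >= abs(observed)
    )
    return n_extreme / n_trials


def confidence_level(p_value):
    """CL at which the hypothesis is ruled out: one minus the p-value."""
    return 1.0 - p_value


p = simulated_p_value(1.96)
print(f"p-value ~ {p:.3f}, ruled out at ~ {100 * confidence_level(p):.1f}% CL")
```

With this many pseudo-experiments the estimate should land close to a p-value of 0.05, i.e., a 95% CL, up to statistical fluctuations of the simulation itself.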


Another commonly encountered nomenclature is that of using a number of σ. This is really not much different from the CL introduced above and is simply a way of referring to the CL at which an outcome would be ruled out if the distribution was Gaussian and the outcome was a given number of standard deviations away from the mean. The following list summarises the confidence levels associated with the most common numbers of sigmas:

1σ = 68.27% CL
2σ = 95.45% CL
3σ = 99.73% CL
5σ = 99.999943% CL
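The conversion behind the list above can be sketched in a few lines. For a Gaussian distribution, the probability of falling within ##n## standard deviations of the mean is ##\mathrm{erf}(n/\sqrt{2})##. The function names below are illustrative only, and the inverse conversion relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
import math
from statistics import NormalDist


def sigma_to_cl(n_sigma):
    """Two-sided Gaussian CL: probability of lying within n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2.0))


def cl_to_sigma(cl):
    """Inverse conversion: number of sigmas corresponding to a two-sided CL."""
    p_value = 1.0 - cl
    # Half of the p-value sits in each tail of the Gaussian.
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)


for n in (1, 2, 3, 5):
    print(f"{n} sigma = {100 * sigma_to_cl(n):.6f}% CL")

print(f"95% CL = {cl_to_sigma(0.95):.2f} sigma")
```

Note that this uses the two-sided convention, where deviations in both directions count as extreme, which is the convention the numbers in the list correspond to.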

In particle physics, the confidence levels of 3σ and 5σ have a special standing. If the hypothesis that a particle does not exist can be ruled out at 3σ, we refer to the outcome as “evidence” for the existence of the particle, while if it can be ruled out at 5σ, we refer to it as a “discovery” of the particle. Therefore, when we say that we have discovered a particle, what is really being implied is that, if the particle did not exist, the experimental outcome would only happen by chance in 0.000057% of the experiments if we repeated the experiment an infinite number of times. This means that if you could perform an experiment every day, it would take you on average about 5000 years to get such an extreme result by chance.
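The "about 5000 years" figure follows from simple arithmetic, sketched below under the assumptions that the daily experiments are independent and that the 5σ p-value is the two-sided Gaussian one used above. The variable names are illustrative.

```python
import math

N_SIGMA = 5
# Two-sided Gaussian tail probability; erfc avoids the rounding loss of 1 - erf.
p_value = math.erfc(N_SIGMA / math.sqrt(2.0))

# With one independent experiment per day, the waiting time for the first
# chance fluctuation follows a geometric distribution with mean 1 / p.
expected_days = 1.0 / p_value
expected_years = expected_days / 365.25

print(f"p-value: {p_value:.2e}")
print(f"expected wait: {expected_years:.0f} years")
```

The result is a little under 5000 years, consistent with the rounded figure quoted in the text.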

It is worth noting that this interpretation of probabilities makes no attempt to quantify how likely it is that the hypothesis is true or false, but only refers to how likely the outcomes are if it is true. Many physicists get this wrong too! You may often hear statements such as “we are 99.999943% certain that the new particle exists”. Such statements are generally not accurate and rather belong to the other interpretation of probability, the Bayesian one, which we will not cover here.

Error Bars

So what does all of this have to do with the ± we talked about in the beginning when discussing the errors in some parameter? Unless otherwise specified, the quoted errors generally refer to the errors at 1σ, i.e., 68.27% CL. What this means is that all of the values outside the error bars are excluded at the 1σ level or stronger. Consequently, all the values inside the error bars are not excluded at this confidence level, meaning that the observed outcome was among the 68.27% least extreme ones for those values of the parameter.

This also means that the error bars are not sharp cutoffs: a value just outside the error bars will generally not be excluded at a level much stronger than 1σ, and a value just inside the error bars will generally be excluded at almost 1σ.
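To make this concrete, the sketch below takes the ##m_H = 125.7\pm 0.4## GeV example from the introduction and computes the level at which a few hypothetical mass values would be excluded. It assumes that the measurement is Gaussian with symmetric errors, which a real analysis need not satisfy, and the function names are invented for this illustration.

```python
import math


def exclusion_n_sigma(hypothesis, central, error):
    """Distance between a hypothetical value and the central value, in sigmas."""
    return abs(hypothesis - central) / error


def exclusion_cl(hypothesis, central, error):
    """CL at which `hypothesis` is excluded, assuming a Gaussian measurement."""
    n_sigma = exclusion_n_sigma(hypothesis, central, error)
    return math.erf(n_sigma / math.sqrt(2.0))


CENTRAL, ERROR = 125.7, 0.4  # GeV, the example value from the introduction

for mass in (125.7, 125.4, 125.3, 125.2, 124.9):
    n_sigma = exclusion_n_sigma(mass, CENTRAL, ERROR)
    cl = exclusion_cl(mass, CENTRAL, ERROR)
    print(f"m_H = {mass} GeV: {n_sigma:.2f} sigma, excluded at {100 * cl:.1f}% CL")
```

Under these assumptions, the exclusion level changes smoothly across the edge of the error bar at 125.3 GeV rather than jumping there, and the central value itself is not excluded at any level.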

The last observation we will make is that the confidence level of a given interval must not be interpreted as the probability that the parameters are actually in that interval. Again, this is a question which is not treated by frequentist statistics. You may therefore not say that the true value of the parameter will be within the 1σ confidence interval with a probability of 68.27%. The confidence interval is only a means of telling you for which values of the parameter the outcome was not very unlikely. To conclude, let us return to the statement ##m_H = 125.7\pm 0.4## GeV and interpret its meaning:

“If the Higgs mass is between 125.3 and 126.1 GeV, then what we have observed so far is among the 68.27% least extreme results. If the Higgs mass is outside of the interval, it is among the 31.73% most extreme results.”

Associate professor in theoretical astroparticle physics. He did his thesis on phenomenological neutrino physics and is currently also working with different aspects of dark matter as well as physics beyond the Standard Model. Author of “Mathematical Methods for Physics and Engineering” (see Insight “The Birth of a Textbook”). A member at Physics Forums since 2014.

3 replies
  1. Stephen Tashi says:

    I think you should clarify the point that there are two distinct types of confidence intervals in frequentist statistics. The kind of confidence interval treated by statistical theory is an interval defined with respect to a fixed but unknown population parameter such as the unknown population mean. This type of interval can have a known numerical width but it does not have known numerical endpoints since the population parameter is unknown. For this type of interval, we can state a probability (e.g. 0.6827) that the true population parameter is within it; we just don't know where the interval is!

    The second type of confidence interval is an interval stated with respect to an observed or estimated value of a population parameter, such as 125.7. (Some statistics textbooks say that it is "an abuse of language" to call such an interval a "confidence interval".)

    As you point out, the interpretation of a "confidence interval" with numerical endpoints is best done by putting the scenario for statistical confidence intervals out of one's mind and replacing it by the scenario for hypothesis testing.

  2. gleem says:

    It might be instructive to show the type of data that is used to get the result ##m_H = 125.7\pm 0.4## GeV. Unlike in the biological sciences, physics experiments are not usually dominated by statistical uncertainties.
