Confidence in estimated parameters

  • Thread starter: Fixxxer125
  • Tags: Parameters
AI Thread Summary
The discussion revolves around the correct interpretation of "confidence" in statistical terms, particularly in relation to fitting a cosine model to experimental data on dark matter. It highlights the distinction between confidence intervals and probabilities, emphasizing that confidence does not equate to probability in a strict statistical sense. The conversation also touches on the use of Bayesian versus frequentist approaches for making statements about parameter estimates and hypothesis testing. The participant grapples with calculating probabilities based on Gaussian errors and how to properly reject or accept hypotheses based on statistical significance. Ultimately, the importance of using precise terminology and established statistical methods in scientific reporting is underscored.
Fixxxer125
Hi there, I just wondered if someone could help me.
I have a set of data from an experiment and have fitted a cosine to it. If the theoretical model for dark matter I have says the value of θ should be, say, 2, and the cosine I have fitted gives θ = 2 + 2.2σ, how could I go about saying, for example, "I can say with x% confidence that my experimental data is consistent with the phase expected from dark matter"? I was trying to assume a Gaussian error distribution and relate it to the area under a one-tailed Gaussian, but I am unsure whether this is the correct way.
Many thanks
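For concreteness, here is a minimal sketch of this kind of fit with SciPy's curve_fit, using made-up data and an illustrative cosine model (none of the numbers below come from the thread); the 1σ uncertainty on the phase is read off the diagonal of the returned covariance matrix:

```python
# Minimal sketch (made-up data, hypothetical model): fit a cosine and read off
# the fitted phase and its 1-sigma uncertainty from the covariance matrix.
import numpy as np
from scipy.optimize import curve_fit

def cosine_model(t, amplitude, period, phase, offset):
    # Illustrative cosine modulation; not necessarily the model used in the thread.
    return amplitude * np.cos(2 * np.pi * t / period + phase) + offset

rng = np.random.default_rng(0)
t = np.linspace(0, 730, 100)                       # e.g. time in days
y = cosine_model(t, 1.0, 365.0, 2.0, 0.0) + rng.normal(scale=0.3, size=t.size)
y_err = np.full_like(t, 0.3)                       # assumed per-point Gaussian errors

popt, pcov = curve_fit(cosine_model, t, y, sigma=y_err, absolute_sigma=True,
                       p0=[1.0, 365.0, 2.0, 0.0])
phase_fit = popt[2]
phase_sigma = np.sqrt(pcov[2, 2])                  # 1-sigma uncertainty on the phase
print(f"fitted phase = {phase_fit:.3f} +/- {phase_sigma:.3f}")
```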
 
Fixxxer125 said:
how could I go about saying for example "I can say with x% confidence my experimental data is consistent with the phase expected from dark matter?"

Are you using "confidence" as a synonym for "probability"? If not, what do you mean by "confidence"? In mathematical statistics, there is a definition of "confidence" - as in "confidence interval", but if you read about that concept, it might not give you much "confidence" in the ordinary sense of the word.
 
I've read about confidence levels, but I couldn't work out how to relate a confidence level to a Gaussian probability. All I could work out was that if x is the theoretical value and my experimentally obtained value is x + 2.2σ, there is a 0.45% chance of getting a value for the parameter greater than the one I have calculated. But I assumed (or hoped) I was doing something incorrectly, since this seems to indicate that my parameter is wrong, when by eye the fit appears very good and has a reduced chi-squared of 1.02.
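A sketch of how the one-tailed Gaussian tail probability for a deviation quoted in σ is usually computed; whether it reproduces the exact figure above depends on how the deviation and the tail were defined:

```python
# Sketch: one-tailed tail probability for a result 2.2 sigma from the expected
# value, assuming Gaussian errors (standard normal z-score).
from scipy.stats import norm

z = 2.2
print(norm.sf(z))   # P(Z > 2.2), about 0.014
```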
 
You evaded the questions: "Are you using confidence as a synonym for probability? If not, what do you mean by 'confidence'?"
 
Ah sorry, erm, I guess so. If the event is highly probable, I would like to convert this into a probability of the data being from the theoretical model, if that answers the question?
 
Fixxxer125 said:
I guess so

Well, we should really be clear on this since "confidence" has a very technical definition in statistics and it is not a synonym for "probability".

If you are looking for a statement that you can make in a scientific publication, then the best thing to do is to look at work that got published and see exactly what statistics those papers used that passed the peer review process in your field. (Statistics is highly subjective, and which procedures are favored in various academic arenas is largely a matter of tradition.)

If you are seeking an answer that satisfies you personally, then you must be clear about what you are asking. If you want to make statements like "There is a 0.97 probability that the true value of theta is 2, given the data I observed", then you must use Bayesian statistics. This is also the situation if you want to make a statement about a "credible interval", such as "There is a 0.97 probability that the true value of theta is between 1.98 and 2.02".
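As an illustration of the Bayesian version, a sketch with a Gaussian likelihood of known width and a Gaussian prior (the measurement, its error, and the prior values are invented for the example):

```python
# Sketch: a Bayesian credible interval for theta, assuming a Gaussian likelihood
# with known sigma and a Gaussian prior (all numbers are invented).
import numpy as np
from scipy.stats import norm

theta_obs, sigma_obs = 2.2, 0.09     # hypothetical measurement and its 1-sigma error
prior_mean, prior_sigma = 2.0, 1.0   # hypothetical prior belief about theta

# Conjugate normal-normal update: the posterior is also Gaussian.
post_var = 1.0 / (1.0 / prior_sigma**2 + 1.0 / sigma_obs**2)
post_mean = post_var * (prior_mean / prior_sigma**2 + theta_obs / sigma_obs**2)
post_sigma = np.sqrt(post_var)

# 95% credible interval: "there is a 0.95 probability that theta lies in this
# range, given the data and the prior".
lo, hi = norm.interval(0.95, loc=post_mean, scale=post_sigma)
print(f"posterior {post_mean:.3f} +/- {post_sigma:.3f}, 95% credible interval ({lo:.3f}, {hi:.3f})")
```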

The situation in non-Bayesian ("frequentist") statistics is that you make statements such as "There is a 0.97 probability that the value of theta observed in the experiment is within plus or minus 0.2 of the true value", but you can't take a particular numerical value, such as 2, and substitute it in place of "the true value of theta" in that statement. You also can't take a particular numerical value such as 2.2 and substitute it for "the value of theta observed in the experiment". This is because a statement of a "confidence interval" says something about the general quality of the sampling process you use. It doesn't make any claims about one particular outcome of that process.
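A small simulation makes the frequentist statement concrete: the quoted percentage refers to how often intervals built this way cover the true value over many repetitions of the experiment, not to any single interval (the true value, error, and 95% level below are invented):

```python
# Sketch: coverage of a 95% confidence interval over many repeated experiments.
# Any single interval either contains the true value or it doesn't; the 95%
# describes the long-run behaviour of the procedure.
import numpy as np

rng = np.random.default_rng(1)
true_theta, sigma = 2.0, 0.1          # hypothetical true value and measurement error
n_experiments = 10_000

covered = 0
for _ in range(n_experiments):
    theta_hat = rng.normal(true_theta, sigma)              # one simulated measurement
    lo, hi = theta_hat - 1.96 * sigma, theta_hat + 1.96 * sigma
    covered += (lo <= true_theta <= hi)

print(f"fraction of intervals covering the true value: {covered / n_experiments:.3f}")
```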
 
...and another approach would be to get into "hypothesis testing" and "statistical significance", which have technicalities all their own. The frequentist approach would be to take "theta = 2" as the "null hypothesis".
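A sketch of what that test might look like, using the 2.2σ deviation from θ = 2 mentioned earlier in the thread (the choice of a two-tailed test and the 5% level are illustrative, not from the thread):

```python
# Sketch: z-test of the null hypothesis theta = 2, given a fit 2.2 sigma away.
from scipy.stats import norm

z = 2.2                           # deviation of the fitted theta from the null value, in sigma
p_value = 2 * norm.sf(z)          # two-tailed p-value, about 0.028
print(f"p-value = {p_value:.3f}")
print("reject at the 5% level" if p_value < 0.05 else "do not reject at the 5% level")
```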
 
Cheers, I'll have another look
 
Ok, I've had another go at this after reading up on it, and I thought I'd use an example to make myself clearer.
If I obtain a value for parameter A = 361.52 ± 5.87 and the theoretical value, if the model is correct, is 365, then I assume Gaussian errors and say my result is 0.6σ from the value in the model. Using a one-tailed Gaussian I find that there is a 237.43% chance of getting 365 or greater if my A is from the same distribution (is this correct?). Therefore I can say with 27% confidence that my data is from the same distribution as the theory and the result occurred by chance due to normal deviation, and there is a 73% chance the results are not from the same distribution, so I would reject the hypothesis of a cosine modulation with a time period A of 365 days with 73% confidence?
I think this is correct; please could you verify?
Thanks
 
Fixxxer125 said:
assume Gaussian errors and say my result is 0.6σ from the value in the model. Using a one tailed Gaussian I find that there is a 237.43% chance of getting 365 or greater if my A is from the same distribution (is this correct?)
An online calculator shows 27.425% = 0.27425 probability, which is probably what you meant.
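For reference, a sketch reproducing that number with SciPy (0.27425 corresponds to the rounded 0.6σ; the unrounded deviation gives a slightly different value):

```python
# Sketch: one-tailed tail probability for the quoted deviation of A from 365.
from scipy.stats import norm

z = (365.0 - 361.52) / 5.87     # about 0.59 sigma, rounded to 0.6 in the thread
print(norm.sf(0.6))             # about 0.274, matching the online calculator
print(norm.sf(z))               # about 0.277 with the unrounded z
```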

Therefore I can say with 27% confidence that my data is from the same distribution as theory and the result occurred by chance due to normal deviation

No. "Confidence" in the technical sense would say something about estimating a parameter. Even if you intend "confidence" to mean "probability", you can't conclude that there is a 27% chance that your data is from the same distribution. You used the assumption that it was from that distribution to get the figure of 27% in the first place. You assumed (with certainty) that data was from that distribution. You can't now begin talking about having a probability ( of less than 1) that it was from that distribution.

and there is a 73% chance the results are not from the same distribution

No, you haven't computed anything about the probability of the results being from a certain distribution. You have assumed a certain distribution (which you described above) and computed the probability of the data given that assumption. You haven't computed the probability that the assumption is true given the observed data. (It's the difference between Pr(A given B) and Pr(B given A).)
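A toy calculation shows why the direction matters: turning Pr(data given model) into Pr(model given data) requires Bayes' rule and a prior, and the answer depends on both (all numbers below are invented for illustration):

```python
# Sketch: P(data | model) is not P(model | data); Bayes' rule needs a prior
# and a model for the alternative (all numbers invented for illustration).
p_data_given_model     = 0.27   # e.g. a tail probability like the one above
p_data_given_not_model = 0.05   # hypothetical: how likely the data is under some alternative
p_model                = 0.50   # hypothetical prior probability of the model

p_data = p_data_given_model * p_model + p_data_given_not_model * (1 - p_model)
p_model_given_data = p_data_given_model * p_model / p_data
print(f"P(model | data) = {p_model_given_data:.2f}")   # changes if the prior changes
```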

so I would reject the hypothesis of a cosine modulation with a time period A of 365 days with 73% confidence?

The idea of "rejecting" an assumption you already made is part of the procedure of "hypothesis testing". The words to use if you are doing hypothesis testing are such terms as "p value", "level of significance", "type I error" etc. Are you using "confidence" to mean one of those terms?

Hypothesis testing is subjective. If this were an exercise in a statistics text, I think the answer book would show the use of a two-tailed test, to test whether your curve differs (in any way, in "either direction") from the theoretical distribution.
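A sketch of such a two-tailed test, using the A = 361.52 ± 5.87 estimate and the null value of 365 quoted earlier (the 5% significance level is a conventional choice, not something from the thread):

```python
# Sketch: two-tailed z-test of the null hypothesis A = 365 days.
from scipy.stats import norm

a_hat, a_sigma, a_null = 361.52, 5.87, 365.0
z = (a_hat - a_null) / a_sigma            # about -0.59
p_two_tailed = 2 * norm.sf(abs(z))        # about 0.55

alpha = 0.05                              # conventional significance level
print(f"z = {z:.2f}, two-tailed p-value = {p_two_tailed:.2f}")
print("reject the null" if p_two_tailed < alpha else "do not reject the null")
```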
 
