Stephen Tashi said:
Assuming that I interpret "certainty" to mean "probability", do you want to be able to state a specific numerical interval based on an observed value taken from your data? - something like "There is a 95% probability that the mean of the distribution is between 1.8 and 3.8"?
If you want that, you are asking for a "credible interval" - at least according to the current Wikipedia article on "confidence interval". This is a natural thing to want, but you have to make enough assumptions to employ Bayesian statistics in order to get it.
In many fields of study, published papers use "confidence interval" by tradition. So if you are writing some kind of report, there is that consideration. We can discuss either approach, but "confidence intervals" do not have the same interpretation as "credible intervals".
This is strictly for my own uses, so tradition is not an important aspect.
I read briefly through the articles on credible intervals and confidence intervals and just want to paste in the following (if for nothing else, then so that I can access it more easily in this thread):
A confidence interval with a particular confidence level is intended to give the assurance that, if the statistical model is correct, then taken over all the data that might have been obtained, the procedure for constructing the interval would deliver a confidence interval that included the true value of the parameter the proportion of the time set by the confidence level.
A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained. (An interval intended to have such a property, called a credible interval, can be estimated using Bayesian methods; but such methods bring with them their own distinct strengths and weaknesses).
With that said, what I was asking for in the first post is, just as you said, the credible interval. However, upon reading this I realize that my question was partially based on the misconception that a confidence interval for an estimate of a parameter contains the parameter with a probability given by the confidence level. I'm not sure whether I misremember the one statistics class I've taken, whether it was taught erroneously, or whether it only held for some subset of problems where the two coincide.
If possible, I would very much like to be able to give a credible interval; presuming I understand the idea correctly, it is what feels most natural for describing the certainty of an estimate when only a single experiment is done. However, the Wiki article says that
credible intervals incorporate problem-specific contextual information from the prior distribution whereas confidence intervals are based only on the data
and as far as I can tell, I have no definite prior distribution to work with. How would one arrive at a prior distribution? Could you select any prior distribution and work with it, the only caveat being that the posterior distribution would vary depending on how you chose your prior?
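To make the prior-dependence concrete, here is a small sketch of my own (not from the thread): estimating a Bernoulli success probability from 7 successes in 10 trials, once with a flat prior and once with a prior concentrated near 0.5 (a Beta(10, 10)). The posterior is evaluated on a grid so only the standard library is needed; the numbers and priors are purely illustrative assumptions.

```python
# Illustrative sketch: how the choice of prior changes a Bayesian credible
# interval. Data assumed: 7 successes, 3 failures in 10 Bernoulli trials.

def posterior_on_grid(prior, successes=7, failures=3, grid_size=2001):
    """Normalised posterior for a success rate p, evaluated on a grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    post = [prior(p) * p**successes * (1 - p)**failures for p in grid]
    total = sum(post)
    return grid, [w / total for w in post]

def central_interval(grid, weights, level=0.90):
    """Central credible interval: cut (1 - level)/2 of mass from each tail."""
    tail = (1 - level) / 2
    cum, lo, hi = 0.0, None, None
    for p, w in zip(grid, weights):
        cum += w
        if lo is None and cum >= tail:
            lo = p
        if hi is None and cum >= 1 - tail:
            hi = p
            break
    return lo, hi

flat = lambda p: 1.0                  # uniform prior: Beta(1, 1)
fair = lambda p: (p * (1 - p)) ** 9   # prior peaked at 0.5: Beta(10, 10)

results = {}
for name, prior in [("flat", flat), ("near-fair", fair)]:
    grid, weights = posterior_on_grid(prior)
    results[name] = central_interval(grid, weights)
    print(f"{name:9s} prior -> 90% credible interval "
          f"({results[name][0]:.3f}, {results[name][1]:.3f})")
```

With the same data, the "near-fair" prior pulls the interval toward 0.5, which is exactly the caveat above: the posterior, and hence the credible interval, varies with the choice of prior.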
If it is not possible, or if it proves to be too complex for me to learn right now, then a confidence interval would work.
But, just to see if I understand the distinction between the two, if I used a confidence interval instead of a credible interval:
1. I could not say that the true value of the estimated parameter lies within the confidence interval with a probability equal to the confidence level (e.g. that there is a 90% probability that the true value is within a 90% confidence interval).
2. I could say that if a large number of experiments were run, and a confidence interval calculated for each, then the fraction of those intervals containing the true value would equal the confidence level (e.g. for a 90% confidence level, 90% of the calculated confidence intervals would contain the true value).
Is that right?
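The long-run interpretation in point 2 can be checked with a quick simulation. This is my own sketch under simplifying assumptions: the data are normal with a known standard deviation, so the 90% interval is just the sample mean plus or minus 1.645 standard errors; all the numbers are made up for illustration.

```python
import math
import random

# Simulate the long-run coverage of a 90% confidence interval for a normal
# mean with known sigma: repeat the "experiment" many times and count how
# often the interval actually contains the true mean.

random.seed(42)
TRUE_MU, SIGMA, N, TRIALS = 5.0, 2.0, 20, 10_000
z90 = 1.645                                # two-sided 90% normal quantile
half_width = z90 * SIGMA / math.sqrt(N)    # fixed width since sigma is known

covered = 0
for _ in range(TRIALS):
    sample_mean = sum(random.gauss(TRUE_MU, SIGMA) for _ in range(N)) / N
    if sample_mean - half_width <= TRUE_MU <= sample_mean + half_width:
        covered += 1

coverage = covered / TRIALS
print(f"empirical coverage: {coverage:.3f}")   # close to 0.90
```

Note what the simulation does and does not show: each individual interval either contains TRUE_MU or it doesn't; only the procedure, repeated across experiments, has the 90% property.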