Confidence Interval: Calculating Parameters of Populations

In summary, the value 9.99 ± 0.002 is an estimate of some parameter at a 95% confidence level, meaning that the procedure used to construct such intervals captures the true value 95% of the time. This is not a statement about the probability of samples falling within this interval. Statistical inference is used to infer information about and from statistics, and assumptions about the underlying data can affect the properties of these estimates. Additionally, the formula for the variance does not assume normality, but the properties of the estimator may change depending on the distribution.
  • #1
ktran03
When we calculate a value at 95% confidence, say 9.99 ± 0.002,

this value is describing a parameter of the true population, right?
It is not saying that 95% of the time we take samples, they will fall in this range, correct?

We never infer anything about statistics, only parameters of populations right?



I'm pretty sure that is correct; I just came across something that interprets this differently.
 
  • #2
The [tex] 9.99 \pm .002 [/tex] is an estimate of some parameter. The 95% is the confidence level, also called the confidence coefficient. We typically say that we have 95% confidence that the given confidence interval contains the parameter.

The construction of one of these intervals is but one use of a procedure that, in the long run, produces intervals that will contain the parameter 95% of the time. It does not mean that 95% of the time we take samples, anything will fall inside this particular interval.
 
  • #3
Confidence intervals are closely allied with the concept of statistical significance. Suppose that a finite sample size experiment yields 9.99 as an estimate of some parameter. We don't know whether the true value of that parameter is 9.991, 9.992, 9.993, or 200 thousand.

That 200 thousand value can easily be rejected if one has any idea of the underlying random process: that the true parameter value is 200 thousand will not be credible at any reasonable level of significance. How about a true value of 9.993? In this case, we can reject the possibility that the true value of the parameter exceeds 9.99 + 0.002 at the 5% (100% − 95%) significance level.
ktran03 said:
We never infer anything about statistics, only parameters of populations right?
I don't know what exactly you mean by this statement. However, we often infer many things about and from statistics. For example, one might ask "are these the right statistics?" Suppose you use the standard set of equations to compute a sample mean and standard deviation. These equations implicitly assume that the underlying data is normally distributed. What if it isn't? Those estimates might be very wrong if the random process is not normal. The collected data gives you some ammo to test whether that assumption is correct. You can infer something about the underlying process as well as about the statistics themselves from your collected data.

The purpose of statistical inference is to infer information about and from statistics.
 
  • #4
The situation is even more complicated: when you repeat the experiment, you will in general get both a different estimate, e.g. 9.97 instead of 9.99, and a different confidence interval, e.g. 9.97 ± 0.003 instead of 9.99 ± 0.002. Of all these different confidence intervals from different experiments, 95% will cover the true value.
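This long-run reading can be checked with a quick simulation. Below is a minimal sketch in Python; the true mean, sigma, sample size, and the z = 1.96 critical value are all illustrative assumptions, not values from this thread.

```python
import random
import statistics

# Repeat the experiment many times: each run draws a sample, builds a
# 95% z-interval for the mean, and we count how often that interval
# actually contains the true mean. All parameters are illustrative.
random.seed(42)

TRUE_MEAN, SIGMA, N, Z = 9.99, 0.01, 30, 1.96
trials = 10_000
covered = 0
for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    half_width = Z * SIGMA / N ** 0.5   # known-sigma interval, for simplicity
    if mean - half_width <= TRUE_MEAN <= mean + half_width:
        covered += 1

coverage = covered / trials
print(f"empirical coverage over {trials} intervals: {coverage:.3f}")
```

Each run produces a different estimate and a different interval, exactly as described above, but in the long run about 95% of the intervals contain the true mean.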
 
  • #5
"These equations implicitly assume that the underlying data is normally distributed. What if it isn't? Those estimates might be very wrong if the random process is not normal. "

The formula for variance doesn't assume normality at all - many distributions have a variance.

Second - nothing is truly normally distributed; the normal distribution, like every distribution, is an idealized description of population behavior. The "test about the assumption" really indicates whether things differ enough from this idealized behavior to make the assumption of normality a poor one.
 
  • #6
statdad said:
The formula for variance doesn't assume normality at all - many distributions have a variance.
The simple expression for the sample variance,

[tex]s^2 = \frac 1 N \sum_{i=1}^N (x_i-\bar x)^2[/tex]

is a biased estimator. Removing the bias leads to

[tex]s^2 = \frac 1 {N-1} \sum_{i=1}^N (x_i-\bar x)^2[/tex]

But this is still a biased estimator of the standard deviation. The UMVU estimator for one distribution will not be the same as the UMVU estimator for another.
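The bias of the 1/N formula, and its removal by the 1/(N−1) formula, can be seen in a short simulation. This is a sketch; the uniform distribution is chosen deliberately (it is not normal) and the sample size is illustrative.

```python
import random

# Compare the 1/N and 1/(N-1) variance formulas over many small samples
# from a Uniform(0, 1) population (true variance 1/12), to illustrate
# that unbiasedness of the 1/(N-1) formula does not depend on normality.
random.seed(0)

N = 5
TRUE_VAR = 1 / 12
trials = 100_000
biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.random() for _ in range(N)]
    m = sum(xs) / N
    ss = sum((x - m) ** 2 for x in xs)   # sum of squared deviations
    biased_sum += ss / N
    unbiased_sum += ss / (N - 1)

print(f"true variance:          {TRUE_VAR:.4f}")
print(f"mean of 1/N formula:    {biased_sum / trials:.4f}")    # ≈ (N-1)/N of the truth
print(f"mean of 1/(N-1) formula: {unbiased_sum / trials:.4f}")  # ≈ 1/12
```

The 1/N version averages about (N−1)/N of the true variance, which is exactly the bias that the N−1 denominator removes.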
 
  • #7
[tex] s^2 [/tex] is not an estimator of the standard deviation - you misspoke.

The sample variance (your second version) is unbiased for the population variance regardless of the distribution, as long as second moments exist. When you begin speaking about other properties, distributional assumptions may come forward; but that isn't the original comment, which was "the formula for variance doesn't assume normality at all".

Also, remember it isn't simply distributional assumptions: even with normality, if [tex] \mu [/tex] is known [tex] s^2 [/tex] is no longer UMVU for [tex] \sigma^2 [/tex].
 
  • #8
statdad said:
[tex] s^2 [/tex] is not an estimator of the standard deviation - you misspoke.
Taking the square root of [tex]s^2[/tex] obviously provides an estimate of the standard deviation. Did I really need to spell that out? Taking the square root of this unbiased estimate of the variance leads to a biased estimate of the standard deviation.
 
  • #9
I understand all this, and I said you simply misspoke. I pointed out that

1) The formulae for variance and standard deviation do not depend on the assumption of normality.
2) Properties of those estimators can change when the distributional assumptions change.

The result you state about the sample standard deviation not being an unbiased estimator of [tex] \sigma [/tex] has nothing to do with normality either: it is because

[tex]
E\bigg[\sqrt{s^2}\bigg] \ne \sqrt{E[s^2]}
[/tex]

This isn't a big problem since

[tex]
\sqrt{\frac{n-1}2} \frac{\Gamma\left(\frac{n-1}2\right)}{\Gamma\left(\frac n 2\right)} s
[/tex]

is UMVU for [tex] \sigma [/tex] in the normally distributed setting.
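As a sanity check, here is a small simulation sketch comparing plain s with the Γ-corrected statistic √((n−1)/2) · Γ((n−1)/2)/Γ(n/2) · s under normality; σ and n are illustrative choices, not values from the thread.

```python
import math
import random

# Under normality, plain s is biased low for sigma, while the
# Gamma-corrected multiple of s is (exactly) unbiased. We estimate
# both expectations by simulation. SIGMA and N are illustrative.
random.seed(1)

SIGMA, N = 2.0, 5
correction = (math.sqrt((N - 1) / 2)
              * math.gamma((N - 1) / 2) / math.gamma(N / 2))

trials = 100_000
s_sum = corrected_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, SIGMA) for _ in range(N)]
    m = sum(xs) / N
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (N - 1))
    s_sum += s
    corrected_sum += correction * s

print(f"sigma:             {SIGMA:.4f}")
print(f"mean of s:         {s_sum / trials:.4f}")          # biased low
print(f"mean of corrected: {corrected_sum / trials:.4f}")  # ≈ sigma
```

For n = 5 the correction factor is about 1.064, i.e. plain s underestimates σ by roughly 6% on average at this sample size.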
 

1. What is a confidence interval?

A confidence interval is a range of values that is likely to contain the true value of a population parameter, with a stated degree of confidence. It is typically calculated from a sample of data and is used to estimate the true value of a population parameter.

2. How is a confidence interval calculated?

A confidence interval is calculated by taking the point estimate of a population parameter (such as the mean or proportion) and adding and subtracting the margin of error. The margin of error is based on the sample size, level of confidence, and standard deviation of the sample data.
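The "point estimate ± margin of error" recipe can be sketched in a few lines of Python. The data values and the large-sample z = 1.96 critical value below are illustrative assumptions; for small samples a t critical value would normally be used instead.

```python
import statistics

# Compute a large-sample 95% confidence interval for a mean:
# point estimate (sample mean) plus/minus z * s / sqrt(n).
# The data here are made up for illustration.
data = [9.991, 9.988, 9.992, 9.990, 9.989, 9.993, 9.987, 9.990]

n = len(data)
mean = statistics.fmean(data)
s = statistics.stdev(data)        # 1/(n-1) sample standard deviation
margin = 1.96 * s / n ** 0.5      # large-sample 95% margin of error

print(f"95% CI: {mean:.4f} ± {margin:.4f}")
```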

3. What is the significance of the level of confidence in a confidence interval?

The level of confidence in a confidence interval represents the long-run reliability of the procedure used to construct it. For example, a 95% confidence level means that if the sampling and interval construction were repeated many times, about 95% of the resulting intervals would contain the true value of the population parameter.

4. How does sample size affect the width of a confidence interval?

The larger the sample size, the smaller the margin of error and therefore the narrower the confidence interval. This is because a larger sample size provides more precise estimates of the population parameter, resulting in a more accurate confidence interval.
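The 1/√n behavior is easy to see numerically; the σ, z, and sample sizes below are illustrative.

```python
# Margin of error shrinks like 1/sqrt(n): quadrupling the sample size
# halves the interval width. SIGMA and Z are illustrative values.
SIGMA, Z = 1.0, 1.96
margins = {n: Z * SIGMA / n ** 0.5 for n in (25, 100, 400)}
for n, m in margins.items():
    print(f"n = {n:4d}  margin of error = {m:.3f}")
```

Note that going from n = 100 to n = 400 requires four times the data just to halve the width.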

5. How are confidence intervals used in hypothesis testing?

Confidence intervals are often used in hypothesis testing to determine if a sample statistic is significantly different from a hypothesized value. If the hypothesized value falls outside the confidence interval, it can be concluded that there is a statistically significant difference between the sample statistic and the hypothesized value.
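The duality between a 95% interval and a 5%-level test can be sketched as follows; the data, the hypothesized value, and the large-sample z = 1.96 are illustrative assumptions.

```python
import statistics

# A hypothesized mean is rejected at the 5% level exactly when it
# falls outside the 95% confidence interval. Data are made up.
data = [10.2, 9.8, 10.1, 10.4, 9.9, 10.3, 10.0, 10.2, 10.1, 10.0]
H0_MEAN = 9.5

n = len(data)
mean = statistics.fmean(data)
margin = 1.96 * statistics.stdev(data) / n ** 0.5
lo, hi = mean - margin, mean + margin

reject = not (lo <= H0_MEAN <= hi)
print(f"95% CI: ({lo:.3f}, {hi:.3f}); reject H0: mean = {H0_MEAN}? {reject}")
```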
