Conf. intervals for fitted parameters: divide by sqrt(n)?

In summary, the conversation discusses fitting a parametrized model to data and obtaining the optimized parameters and the covariance matrix. The square roots of the diagonal elements of the covariance matrix are the standard errors of the optimized parameters, and multiplying these errors by 1.96 gives a 95% confidence interval. The question raised is whether one should also divide by √n, as when constructing a confidence interval for a mean; the answer is that this factor is already built into the covariance matrix, whose diagonal elements decrease with increasing n. The conversation also discusses the correlation between the standard errors of m and c, which limits the variability of the regression line, and notes that most computer programs estimate the statistical distributions of the model parameters by linearizing the model around the fitted values.
  • #1
Jonas Hall
If you fit a parametrized model (e.g. y = a log(x + b) + c) to some data points, the output is typically the optimized parameters (i.e. a, b, c) and the covariance matrix. The squares of the diagonal elements of this matrix are the standard errors of the optimized parameters (i.e. ##se_a##, ##se_b##, ##se_c##). Now, to get a 95% confidence interval for a parameter, you typically multiply this error by 1.96 (assuming a normal distribution), i.e. ##a \pm 1.96\, se_a##. At least this is what I have found so far. But I wonder if this is the whole truth. Shouldn't you also divide by √n, the way you do when you create a confidence interval for a mean? It just seems to me that the more data you have, the better the estimates of the parameters should become. Also, I find that if I don't divide by √n, the values seem rather large, sometimes falling on the wrong side of 0.

Or... do the values in the covariance matrix themselves grow smaller with increasing n, and is this the reason that you don't divide by √n and that the values are supposed to be quite large?
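For concreteness, here is roughly what I mean (a minimal sketch with made-up data, using SciPy's curve_fit, which produces exactly this kind of output):

```python
import numpy as np
from scipy.optimize import curve_fit

# The kind of model I mean: y = a*log(x + b) + c
def model(x, a, b, c):
    return a * np.log(x + b) + c

# Made-up data for illustration
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 50)
y = model(x, 2.0, 1.0, 0.5) + rng.normal(0, 0.1, x.size)

popt, pcov = curve_fit(model, x, y, p0=[1, 1, 0])
perr = np.sqrt(np.diag(pcov))   # standard errors of a, b, c
for name, val, err in zip("abc", popt, perr):
    print(f"{name} = {val:.3f} ± {err:.3f}  (95% CI half-width {1.96*err:.3f})")
```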

Grateful if someone could make this clear to me. I have never studied statistics "properly" but dabble in mathematical models and teach math in upper secondary school.
 
  • #2
Jonas Hall said:
Shouldn't you also divide by √n the way you do when you create a confidence interval for a mean?
When you divide the standard deviation by the square root of n you obtain the standard error of the mean. You said that the values you have are already standard errors, so it wouldn’t make sense to divide them again.
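A quick numerical sketch of that distinction, with made-up numbers; the division by √n happens exactly once, when a standard deviation is turned into a standard error:

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(10.0, 2.0, size=100)

sd = sample.std(ddof=1)            # standard deviation of the data
sem = sd / np.sqrt(sample.size)    # standard error of the mean
print(sd, sem)                     # dividing by sqrt(n) happens here, once
```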
 
  • #3
The √n factor does come into it. For example, for the equation ##y = mx + c##, if the error variance of the points is ##\sigma^2##, the variance of m is
##\sigma^2 / \left[ n \left( \langle x^2 \rangle - \langle x \rangle^2 \right) \right]##
If your measured error variance is ##s^2##, the estimated variance of m is ##s^2 / \left[ n \left( \langle x^2 \rangle - \langle x \rangle^2 \right) \right]##. This is the diagonal element of the covariance matrix. (I think you meant to say "The diagonal elements of this matrix are the squares of the standard errors of the optimized parameters (i.e. ##se_a##, ##se_b##, ##se_c##).") As you suggest, the value decreases with increasing n.
If you did p separate experiments, with independent data sets, and determined a value of m for each, and calculated the mean value of m, the standard error of this mean value would be the standard error of m divided by √p.
Another point is that though the standard errors of m and c might be large, the values of m and c are usually strongly correlated, so you can't have any value of m in its confidence interval with any value of c in its confidence interval. This limits the variability of the regression line more than might at first appear.
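To make that concrete, here is a minimal sketch (made-up data, assuming SciPy's curve_fit) checking that the diagonal element of the covariance matrix for the slope matches ##s^2 / \left[ n \left( \langle x^2 \rangle - \langle x \rangle^2 \right) \right]## and shrinks as n grows:

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, m, c):
    return m * x + c

rng = np.random.default_rng(2)
for n in (10, 100, 1000):
    x = rng.uniform(0, 10, n)
    y = line(x, 3.0, 1.0) + rng.normal(0, 0.5, n)
    popt, pcov = curve_fit(line, x, y)
    resid = y - line(x, *popt)
    s2 = resid @ resid / (n - 2)                        # measured error variance
    var_m = s2 / (n * (np.mean(x**2) - np.mean(x)**2))  # formula above
    print(n, pcov[0, 0], var_m)                         # should agree, and shrink with n
```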
 
  • #4
Jonas Hall said:
teach math in upper secondary school
mjc123 said:
though the standard errors of m and c might be large, the values of m and c are usually strongly correlated, so you can't have any value of m in its confidence interval with any value of c in its confidence interval. This limits the variability of the regression line more than might at first appear.
For a straight-line fit, for example, it's easy to check that the best line goes 'through the center of mass of the measured points' and can wiggle its slope due to the (hopefully) random errors. The intercept error is a combination of this wiggling and of shifting up and down a bit. The correlation disappears if the origin is at the 'center of mass'.
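A minimal sketch of that point (made-up data, assuming SciPy's curve_fit): the off-diagonal covariance between m and c essentially vanishes once x is centered at its mean.

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, m, c):
    return m * x + c

rng = np.random.default_rng(3)
x = rng.uniform(5, 15, 50)                 # origin well away from the data
y = line(x, 2.0, -3.0) + rng.normal(0, 1.0, x.size)

_, pcov_raw = curve_fit(line, x, y)
_, pcov_centered = curve_fit(line, x - x.mean(), y)   # origin at 'center of mass'

print(pcov_raw[0, 1])       # strong m-c covariance with a distant origin
print(pcov_centered[0, 1])  # essentially zero after centering
```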
 
  • #5
Jonas Hall said:
The squares of the diagonal elements of this matrix are the standard errors of the optimized parameters (i.e. ##se_a##, ##se_b##, ##se_c##).

It would be interesting to know if that's actually true for an arbitrary nonlinear model. In fact, it would be interesting to know how one can speak of statistical errors in the optimized parameters at all, since fitting the model to the data gives one value of each parameter, not a sample of several values for each parameter.

I think most computer programs estimate the statistical distributions of the model parameters by using a linear approximation to the model, assuming the measured data values tell us the correct location about which to make the linear approximation. However, a linear approximation to log(ax + b) + c will have different coefficients than a linear approximation to sin(ax + b) + c. So how general is the claim that the square roots of the diagonal elements of the covariance matrix are (good estimators of) the standard errors of the parameters?
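As an illustration of that linearization (a sketch, not what any particular program is guaranteed to do internally): SciPy's curve_fit covariance can typically be reproduced from a numerical Jacobian of the model at the fitted point, scaled by the residual variance.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, c):
    return a * np.log(x + b) + c

rng = np.random.default_rng(4)
x = np.linspace(1, 10, 60)
y = model(x, 2.0, 1.0, 0.5) + rng.normal(0, 0.1, x.size)

popt, pcov = curve_fit(model, x, y, p0=[1, 1, 0])

# Rebuild the covariance from a forward-difference Jacobian at the optimum:
eps = 1e-6
J = np.empty((x.size, len(popt)))
for j in range(len(popt)):
    p_step = popt.copy()
    p_step[j] += eps
    J[:, j] = (model(x, *p_step) - model(x, *popt)) / eps
resid = y - model(x, *popt)
s2 = resid @ resid / (x.size - len(popt))    # residual variance
pcov_lin = s2 * np.linalg.inv(J.T @ J)       # linear-approximation covariance
print(np.allclose(pcov, pcov_lin, rtol=1e-2))   # typically True
```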
 
  • #6
Thank you all! You have convinced me that it is not appropriate to divide by sqrt(n). I find Stephen Tashi's comment interesting. I also wonder how the statistical errors given by e.g. scipy compare to experimental values found when bootstrapping or jackknifing your data. I guess I will have to run some experiments... I still find the standard errors quite large, though, but I appreciate the comments on this by mjc123 and BvU. I envisage the following scenario: you take data and fit parameters according to your model. In reality, though, you are often only interested in one single parameter (such as b in y = a * b^x + c). So after you obtain your parameters (say a = 2, b = 3 and c = 4), you do a new fit according to the model y = 2 * b^x + 4. You will now presumably get b = 3 again, but with a standard error that does not depend on any other parameters.

Would this work?
 
  • #7
Yes, it would work. But it depends on a being 2 and c being 4. The standard error you get is not a function of a and c as variables; it is a value that is only valid for those particular values of a and c.
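A sketch of the scenario from the previous post (made-up data, assuming the exponential model y = a·b^x + c and SciPy's curve_fit): freezing a and c at their fitted values and refitting only b yields a smaller, conditional standard error.

```python
import numpy as np
from scipy.optimize import curve_fit

def full(x, a, b, c):
    return a * b**x + c

rng = np.random.default_rng(5)
x = np.linspace(0, 3, 40)
y = full(x, 2.0, 3.0, 4.0) + rng.normal(0, 2.0, x.size)

popt, pcov = curve_fit(full, x, y, p0=[1, 2, 1])
a_hat, b_hat, c_hat = popt

# Refit with a and c frozen at their fitted values, leaving only b free:
def restricted(x, b):
    return a_hat * b**x + c_hat

popt_b, pcov_b = curve_fit(restricted, x, y, p0=[b_hat])
print(np.sqrt(pcov[1, 1]))    # standard error of b from the full fit
print(np.sqrt(pcov_b[0, 0]))  # smaller: a conditional error, valid only for these a, c
```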
 
  • #8
Jonas Hall said:
I also wonder if the statistical errors given by e.g. scipy etc compares to experimental values found when bootstrapping/jackknifeing (spelling?) your data.

The best experiment would be to compare all those methods to the correct answer.

Assume the correct model for the data has the form ##Y = G(X,a,b,\dots)##, where ##Y## and ##X## are random variables and ##a,b,\dots## are specific values of parameters. A particular fitting algorithm produces estimates of ##a,b,\dots## that are functions of the sample data.

I'll denote this by:
##\hat{a} = f_1(X_s)##
##\hat{b} = f_2(X_s)##
...

##\hat{a},\hat{b},\dots## are random variables, since they depend on the random values in a sample.

If we simulate a lot of samples ##X_s##, we get samples of ##\hat{a},\hat{b},...##. We can estimate the distributions of those random variables.

From that, we can estimate confidence intervals. However, we may have to do this by looking at the distribution of ##\hat{a}## in detail, not merely at the parameters of that distribution. For example, ##\hat{a}## may not be an unbiased estimator of ##a##. In that case, knowing the standard deviation of ##\hat{a}## doesn't let us compute "confidence" by assuming ##a## is at the center of the interval. (It's also possible that ##\hat{a}## is an unbiased estimator of ##a## but not normally distributed.)

A limitation of such experiments is that the answer depends on the particular choices of ##a,b,\dots##, so the size of a confidence interval may vary considerably as the magnitudes of ##a,b,\dots## vary.
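A minimal sketch of such a simulation experiment (made-up model and noise level, using SciPy's curve_fit):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(b * x)

rng = np.random.default_rng(6)
x = np.linspace(0, 2, 30)
a_true, b_true = 2.0, 0.7

a_hats = []
for _ in range(2000):                     # many simulated samples X_s
    y = model(x, a_true, b_true) + rng.normal(0, 0.3, x.size)
    popt, _ = curve_fit(model, x, y, p0=[1, 1])
    a_hats.append(popt[0])
a_hats = np.array(a_hats)

# Look at the estimator's distribution directly, not just its spread:
print(a_hats.mean() - a_true)               # bias of a-hat
print(np.percentile(a_hats, [2.5, 97.5]))   # empirical 95% interval
```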
 

1. What does "Conf. intervals for fitted parameters: divide by sqrt(n)?" mean?

The question is whether the standard errors of fitted parameters, read off from the covariance matrix of a fit, should additionally be divided by the square root of the sample size (n) before building a confidence interval, the way a sample standard deviation is divided by √n to obtain the standard error of a mean.

2. Is it necessary to divide by sqrt(n) when calculating confidence intervals for fitted parameters?

No. The diagonal elements of the covariance matrix already scale inversely with n, so their square roots are standard errors, not standard deviations. Dividing them by √n again would understate the uncertainty; in effect, the √n division has already been performed by the fitting routine.

3. How do you interpret a 95% confidence interval for a fitted parameter?

It is a range of values constructed so that, over many repeated experiments, about 95% of such intervals would contain the true value of the parameter. For a normally distributed estimate, it is the fitted value plus or minus 1.96 standard errors.

4. Can you explain the relationship between the sample size and the width of the confidence interval?

They have an inverse relationship. As the sample size increases, the standard error decreases (roughly as 1/√n for a fixed design), so the confidence interval narrows and the estimate of the parameter becomes more precise.

5. Are there any assumptions or limitations behind these confidence intervals?

Yes. They assume that the errors are approximately normally distributed and independent, that the sample is representative, and that there are no influential outliers. For nonlinear models they additionally rely on a linear approximation around the fitted values, so they may be unreliable for small samples or strongly nonlinear fits.
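A short sketch illustrating point 4 (made-up straight-line data, assuming SciPy's curve_fit): quadrupling n roughly halves the width of the 95% interval for the slope.

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, m, c):
    return m * x + c

rng = np.random.default_rng(7)
for n in (25, 100, 400):                        # quadrupling n each time
    x = np.linspace(0, 10, n)
    y = line(x, 2.0, 1.0) + rng.normal(0, 1.0, n)
    _, pcov = curve_fit(line, x, y)
    width = 2 * 1.96 * np.sqrt(pcov[0, 0])      # full width of the 95% CI for m
    print(n, width)                             # roughly halves each step
```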
