Estimating the standard deviation

In summary, the conversation discusses a linear problem where the parameters and data are represented by p and d, respectively, and the observation matrix is represented by M. The problem is overdetermined and an example is given where the standard deviation of the data is estimated using the formula sigma^2_{estimate} = 1/N * sum_i (d_i - (Mp)_i)^2. This estimation method is different from the usual mean-based method, but it is justified in the context of linear regression and assuming a well-fitted p vector and identical distribution of errors. The conversation ends with the asker expressing their gratitude for the explanation.
  • #1
Niles
Hi

Say I have a linear problem Mp = d, where d is our data, p our parameters and M our "observation matrix" (see http://en.wikipedia.org/wiki/Inverse_problem#Linear_inverse_problems). So what we are dealing with is an overdetermined problem.

Now, I have read an example where we have a vector of data d whose standard deviation we don't know. Then we try and estimate it, and the estimate is given by

[tex]
\sigma ^2 _{estimate} = \frac{1}{N}\sum\limits_i {\left( {d_i - \left( {Mp} \right)_i } \right)^2 }
[/tex]

My question is: how can they estimate the standard deviation like this? Usually we would use the mean, but they use [itex] {(Mp)}_i [/itex], and I can't quite see why this yields an estimate.
 
  • #2
Niles said:
How can they estimate the standard deviation like this? Usually we would use the mean,

Suppose you are doing linear regression and the fitted equation is y = Ax + B. If you assume the random variable y_observed has mean Ax + B at each x value, and also that the distribution of errors about the mean is the same at each x value, then you can estimate the standard deviation of y_observed from the quantities (y_observed - (Ax + B))^2, since Ax + B is the mean value of y at that particular value of x. In other words, the fitted value plays the role that the sample mean plays in the usual formula.

The example may assume that the p vector is fitted well enough that [itex] {(Mp)}_i [/itex] is the mean value of [itex] d_i [/itex], and that the distribution of errors about the mean is identical for all [itex] i [/itex].
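As a rough illustration, here is a minimal Python sketch for the straight-line case y = Ax + B. It is not taken from the example Niles read; the helper names `fit_line` and `residual_variance`, the synthetic data, and the noise level `true_sigma` are all made up for the demonstration. It fits the line by ordinary least squares and then averages the squared residuals, which is the 1/N formula from post #1 with Mp replaced by the fitted line.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = A*x + B; returns (A, B)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residual_variance(xs, ys, slope, intercept, n_params=0):
    """Sum of squared residuals divided by (N - n_params).

    n_params=0 reproduces the 1/N formula quoted in the thread.
    """
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return rss / (len(xs) - n_params)

# Synthetic data (an assumption for this demo): a known line plus
# Gaussian noise with the same standard deviation at every x.
rng = random.Random(0)
true_sigma = 0.5
xs = [0.01 * i for i in range(1000)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, true_sigma) for x in xs]

slope, intercept = fit_line(xs, ys)

# Estimate with the 1/N formula from the thread.
sigma_hat = residual_variance(xs, ys, slope, intercept) ** 0.5

# Dividing by N - 2 instead accounts for the two fitted parameters.
sigma_hat_unbiased = residual_variance(xs, ys, slope, intercept, n_params=2) ** 0.5

print(slope, intercept, sigma_hat, sigma_hat_unbiased)
```

With many data points the two estimates are nearly identical and both land close to `true_sigma`. The difference only matters when N is not much larger than the number of fitted parameters, in which case dividing by N minus the number of parameters gives an unbiased estimate of the variance.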
 
  • #3
Ah, I see. Thanks for taking the time to help me!
 

What is the standard deviation and why is it important in research?

The standard deviation is a measure of how much the data values vary from the mean. It is an important statistical tool in research because it helps to understand the spread or variability of the data, which is crucial for making accurate conclusions and predictions.

How do you calculate the standard deviation?

The standard deviation is calculated by taking the square root of the variance, which is the average squared difference between each data point and the mean. The steps are: find the mean, subtract the mean from each data point, square the differences, average the squared differences, and finally take the square root. When the data are a sample rather than the whole population, the sum of squared differences is usually divided by N - 1 instead of N.
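The steps can be sketched in a few lines of Python. The data values here are invented for illustration; the cross-check uses `statistics.pstdev` and `statistics.stdev` from the standard library.

```python
import math
import statistics

# Made-up data for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Step 1: find the mean.
mean = sum(data) / len(data)

# Steps 2 and 3: subtract the mean from each point and square the result.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 4: average the squared differences (population variance, divide by N).
variance = sum(squared_diffs) / len(data)

# Step 5: take the square root.
std_dev = math.sqrt(variance)

# Cross-check against the standard library.
population_sd = statistics.pstdev(data)  # divides by N
sample_sd = statistics.stdev(data)       # divides by N - 1

print(mean, variance, std_dev, population_sd, sample_sd)
```

For this data set the mean is 5, the population variance is 4, and the population standard deviation is 2. The sample version is slightly larger because it divides by N - 1.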

What is the relationship between standard deviation and variance?

Standard deviation and variance are both measures of variability in a data set. The standard deviation is the square root of the variance, so they are closely related. However, the standard deviation is easier to interpret as it is in the same units as the original data, whereas the variance is in squared units.

Why is it important to estimate the standard deviation?

Estimating the standard deviation is important because it provides a measure of the accuracy of our estimates. It allows us to understand the variability in our data and make more accurate predictions and conclusions. Furthermore, it helps us to identify outliers or extreme values that may affect our results.

What are some common methods for estimating the standard deviation?

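The most common estimator is the sample standard deviation, which takes the square root of the sum of squared deviations from the sample mean divided by N - 1. In regression problems the deviations are taken from the fitted values instead of the mean, as in the thread above. A quick approximation is the range rule of thumb, which estimates the standard deviation as roughly one quarter of the range. The empirical rule (approximately 68% of normally distributed data fall within one standard deviation of the mean, 95% within two, and 99.7% within three) and Chebyshev's theorem are not estimators themselves; they describe how much of the data lies within a given number of standard deviations once the standard deviation is known.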
