Why do we use a square and square root method in the standard deviation formula?

In summary: The root mean square method is used because it measures spread as a Euclidean-style distance of the data from the mean, because the square is differentiable everywhere while the absolute value is not, and because the resulting quantity is the spread parameter that appears in the normal distribution.
  • #1
gsingh2011
I would like to understand the theory behind the standard deviation formula. The way it was explained to me, you subtract the mean from each value and square the result so that the negatives don't cancel out the positives. After multiplying and dividing by the correct frequencies, we take the square root of the sum to undo the effect of squaring. This explanation doesn't make sense to me, because we could have just used absolute values to avoid the positive/negative problem, and that seems more accurate. So why do we use this square and then square root method?
 
  • #2
It is, basically, measuring the distance from the data to the mean, just like [itex]\sqrt{(x- a)^2+ (y- b)^2+ (z- c)^2}[/itex] for distance in three-dimensional space.

Of course, you could measure "distance" by |x- a|+ |y- b|+ |z- c|, and you could calculate such a number for a set of data, but "absolute value" is not a very nice function: it is not differentiable at 0.

The most important reason for using the "root mean square" definition is that it is the one that shows up in the normal distribution.
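
As an illustration of why differentiability matters (a sketch added for clarity, with the data points [itex]x_1, \dots, x_n[/itex] and the candidate center [itex]c[/itex] introduced here as assumed notation): the sum of squared distances can be minimized with ordinary calculus, and the minimizer turns out to be the mean.

[tex]
\frac{d}{dc}\sum_{i=1}^{n}(x_i - c)^2 = -2\sum_{i=1}^{n}(x_i - c) = 0
\quad\Longrightarrow\quad
c = \frac{1}{n}\sum_{i=1}^{n} x_i
[/tex]

The corresponding derivative of [itex]\sum_{i=1}^{n}|x_i - c|[/itex] is [itex]-\sum_{i=1}^{n}\operatorname{sgn}(x_i - c)[/itex], which is undefined whenever [itex]c[/itex] equals one of the data points. That sum is minimized by a median rather than by the mean, so the squared version is the one that pairs naturally with the mean.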
 
  • #3
HallsofIvy said:
It is, basically, measuring the distance from the data to the mean, just like [itex]\sqrt{(x- a)^2+ (y- b)^2+ (z- c)^2}[/itex] for distance in three-dimensional space.

Of course, you could measure "distance" by |x- a|+ |y- b|+ |z- c|, and you could calculate such a number for a set of data, but "absolute value" is not a very nice function: it is not differentiable at 0.

The most important reason for using the "root mean square" definition is that it is the one that shows up in the normal distribution.

How does it "show up" in the normal distribution? I know there is a relation, like one standard deviation on each side of the mean covering 68% of the data, but why does there have to be a root mean square in order to have this relationship?
 
  • #4
The normal distribution is actually a family of distributions with two parameters. These are the mean and the standard deviation (or, equivalently, the variance).

The fact that about 68% of the probability lies within one standard deviation of the mean is simply a consequence of the shape of the normal distribution.
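
To make the two-parameter statement concrete, here is the standard form of the normal density, written with the usual symbols [itex]\mu[/itex] for the mean and [itex]\sigma[/itex] for the standard deviation (notation assumed here, not taken from the posts above):

[tex]
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
\sigma^2 = \int_{-\infty}^{\infty}(x-\mu)^2 f(x)\,dx
[/tex]

The parameter [itex]\sigma[/itex] that sets the width of the curve is exactly the root of the mean squared deviation from [itex]\mu[/itex], and the 68% figure follows from the shape of the density:

[tex]
P(|X-\mu| \le \sigma) = \operatorname{erf}\!\left(\tfrac{1}{\sqrt{2}}\right) \approx 0.6827
[/tex]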
 
  • #5


The standard deviation is a measure of how much the data values deviate from the mean. It gives us a sense of the spread or variability of the data set. We square the differences from the mean and then take the square root because squaring gives more weight to larger deviations while keeping every contribution non-negative, and the square root returns the result to the original units of the data.

Using absolute values instead of squaring the differences gives a different measure of spread, the mean absolute deviation, which weights every deviation in proportion to its size. For example, for a data set with values of 1, 2, and 3, the mean is 2, so the absolute deviations from the mean are 1, 0, and 1, and the mean absolute deviation is 2/3, or approximately 0.67. Squaring the differences also gives 1, 0, and 1; averaging these gives a variance of 2/3, and taking the square root gives a standard deviation of approximately 0.82 (or exactly 1 if the sum is divided by n - 1 = 2, as in the sample standard deviation). The standard deviation is never smaller than the mean absolute deviation, because squaring gives proportionally more weight to the larger deviations.
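
For anyone who wants to check the 1, 2, 3 example numerically, here is a minimal Python sketch. The function names `mean_absolute_deviation` and `population_std` and the second data set with an outlier are illustrative choices made here, not anything from the thread.

```python
import math


def mean_absolute_deviation(values):
    """Average of |x - mean| over the data."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


def population_std(values):
    """Square root of the average squared deviation (divides by n)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


data = [1, 2, 3]
mad = mean_absolute_deviation(data)
sd = population_std(data)
print(f"MAD = {mad:.4f}, SD = {sd:.4f}")  # MAD = 0.6667, SD = 0.8165

# Hypothetical data with one large value, to show the extra weight
# that squaring gives to big deviations.
with_outlier = [1, 2, 3, 10]
mad_out = mean_absolute_deviation(with_outlier)
sd_out = population_std(with_outlier)
print(f"MAD = {mad_out:.4f}, SD = {sd_out:.4f}")  # MAD = 3.0000, SD = 3.5355
```

Both numbers are legitimate measures of spread; in both data sets the standard deviation is the larger of the two, and the single large value pulls it up by more than it pulls up the mean absolute deviation.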

In summary, the square and square root in the standard deviation formula give more weight to larger deviations from the mean while keeping every contribution non-negative, and the result is expressed in the same units as the data. This is what makes the standard deviation a convenient way to describe and compare the variability of data sets.
 

1. What is the purpose of the Standard Deviation Formula?

The standard deviation measures how spread out a set of data is around its mean. It quantifies the variability within a dataset and is commonly used in statistical analysis.

2. How is the Standard Deviation Formula calculated?

The standard deviation is calculated by taking the square root of the variance. The variance is found by summing the squared differences between each data point and the mean, and then dividing by the total number of data points n (for a whole population) or by n - 1 (when estimating from a sample).

3. Why is the Standard Deviation Formula important in research?

The standard deviation is important in research because it quantifies the amount of variability in a dataset. This is crucial for analyzing and interpreting data accurately and for drawing informed conclusions.

4. Can the Standard Deviation Formula be used for any type of data?

No. The standard deviation requires numerical data for which differences from the mean are meaningful (interval or ratio scale). It cannot be computed directly for categorical or other non-numerical data, because such data have no mean to measure deviations from.

5. What is considered a high or low standard deviation?

A high standard deviation indicates that the data points are spread out over a wider range from the mean, while a low standard deviation indicates that the data points are closer to the mean. The specific interpretation of a high or low standard deviation depends on the context and the range of data being analyzed.
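
The calculation described under question 2 can be sketched in a few lines of Python. The function name `std_dev`, its `sample` flag, and the `scores` data are made up for illustration; the cross-check uses `statistics.pstdev` and `statistics.stdev` from the Python standard library.

```python
import math
import statistics


def std_dev(values, sample=False):
    """Standard deviation: square root of the variance.

    Divides the sum of squared deviations by n for a population,
    or by n - 1 when `sample` is True.
    """
    n = len(values)
    mean = sum(values) / n
    squared_diffs = [(v - mean) ** 2 for v in values]
    divisor = n - 1 if sample else n
    variance = sum(squared_diffs) / divisor
    return math.sqrt(variance)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean is 5

pop_sd = std_dev(scores)                  # divides by n
samp_sd = std_dev(scores, sample=True)    # divides by n - 1

print(pop_sd)   # 2.0
print(math.isclose(pop_sd, statistics.pstdev(scores)))   # True
print(math.isclose(samp_sd, statistics.stdev(scores)))   # True
```

The sample version is always slightly larger than the population version for the same data, and the gap shrinks as the number of data points grows.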
