Standard deviation revised by removing a sample

In summary, to find the new standard deviation you need to know the previous standard deviation, the previous mean, the original number of samples, and the value of the sample being removed. The calculation first recovers the old sum of squares from those quantities, subtracts the square of the removed value, and then uses the result together with the new mean to compute the new standard deviation. If you do not know the original number of samples, it is not possible to determine the new standard deviation.
  • #1
cdux
I'd like to revise a standard deviation after removing a sample, but knowing only the previous standard deviation, the previous mean, and the sample to be removed.

Does anyone know how to do it?
 
  • #2
Hi cdux,

If I'm understanding the question, I think you would also need to know the original number of samples.
 
  • #3
This is correct; you also need to know the sample size.
 
  • #4
I had eventually found equations that would do that (what the OP suggests).
 
  • #5
cdux,

Do you mean you can find the revised standard deviation without knowing how many samples there are to begin with? If so, perhaps I'm misunderstanding your original question. Would you post the equations?
 
  • #6
It should be similar to the next to last here:

en.wikipedia.org/wiki/Standard_deviation
(after "Similarly for sample standard deviation:")

After working out the new mean as "((num_of_samples X old_mean) - removed_value) / (num_of_samples - 1)", it should be possible to work out the new 's' by first solving for the "old" sum of the squares and then replacing it with "old sum of the squares minus the square of the removed value". (The main problem is that we don't know the individual squares, since we don't know the values, but we can find their sum.)
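
Written out, the idea above might look like the following. This is only a sketch for the sample standard deviation; the symbols are assumptions introduced here, not notation from the thread: n is the original number of samples, \bar{x} the old mean, s the old standard deviation, x_r the removed value, and Q the old sum of squares.

```latex
% Recover the old sum of squares from the old sample variance
s^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2\right)
\quad\Longrightarrow\quad
Q = \sum_{i=1}^{n} x_i^2 = (n-1)\,s^2 + n\,\bar{x}^2

% New mean and new sum of squares after removing x_r
\bar{x}' = \frac{n\,\bar{x} - x_r}{n-1},
\qquad
Q' = Q - x_r^2

% New sample variance over the remaining n-1 values
s'^2 = \frac{Q' - (n-1)\,\bar{x}'^2}{n-2}
```

Every step uses n, and the last one needs n of at least 3 so that two values remain.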
 
  • #7
If you know the old standard deviation, the old mean, the original number of samples, and the sample to remove, it's possible to find the new standard deviation.

There was just some confusion because you did not mention knowing the original number of samples in the problem statement. If you do not know the original number then you cannot determine the new standard deviation.
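
As a rough illustration of that point, here is a minimal Python sketch for the sample standard deviation. The function name `revised_sample_std` and its parameters are hypothetical, chosen only for this example; it assumes the old statistics were computed with the n - 1 divisor.

```python
import math


def revised_sample_std(old_sd, old_mean, n, removed):
    """Return (new_mean, new_sd) after removing one value from n samples.

    Hypothetical helper: assumes old_sd is the sample standard
    deviation (n - 1 divisor) of the original n values.
    """
    if n < 3:
        raise ValueError("need at least 3 original samples so that 2 remain")

    # Recover the old sum of squares from s^2 = (sum_sq - n*mean^2) / (n - 1).
    old_sum_sq = (n - 1) * old_sd ** 2 + n * old_mean ** 2

    new_n = n - 1
    new_mean = (n * old_mean - removed) / new_n
    new_sum_sq = old_sum_sq - removed ** 2

    # Clamp tiny negative values caused by floating-point rounding.
    new_var = max((new_sum_sq - new_n * new_mean ** 2) / (new_n - 1), 0.0)
    return new_mean, math.sqrt(new_var)


# Same old_sd, old_mean and removed value, but different n:
# the two calls give different results, so n cannot be left out.
print(revised_sample_std(2.0, 5.0, 8, 9.0))
print(revised_sample_std(2.0, 5.0, 20, 9.0))
```

Because the sum of squares is reconstructed rather than stored, this approach can lose precision when the mean is large compared with the spread of the data.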
 
  • #8
Oh, I'm sorry, you're right about that.
 

1. What is the purpose of removing a sample when calculating standard deviation?

A sample is usually removed from a data set because it is suspected to be an outlier or otherwise not representative of the overall data. Such a value can skew both the mean and the standard deviation, so removing it may give a more representative measure of spread.

2. How is standard deviation revised by removing a sample?

When the remaining data points are available, the sample standard deviation can be recomputed directly: 1) Calculate the mean of the remaining data points, 2) Calculate the squared difference between each remaining data point and that new mean, 3) Sum up all of these squared differences, 4) Divide the sum by the number of remaining data points minus one, 5) Take the square root of the result to get the revised standard deviation.
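
The five steps above can be sketched in Python as follows. The function names `sample_std` and `std_without` are hypothetical, and the sketch assumes the full list of values is on hand, unlike the situation discussed in the thread.

```python
import math


def sample_std(values):
    """Sample standard deviation (n - 1 divisor) of a list of numbers."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 values")
    mean = sum(values) / n                              # step 1
    squared_diffs = [(x - mean) ** 2 for x in values]   # step 2
    total = sum(squared_diffs)                          # step 3
    return math.sqrt(total / (n - 1))                   # steps 4 and 5


def std_without(values, removed):
    """Recompute the standard deviation after dropping one occurrence of `removed`."""
    remaining = list(values)
    remaining.remove(removed)  # raises ValueError if `removed` is not present
    return sample_std(remaining)


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(std_without(data, 9))
```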

3. Can removing a sample affect the overall interpretation of the data?

Yes, removing a sample can significantly affect the overall interpretation of the data. The revised standard deviation may be lower or higher than the original, which can impact the conclusions drawn from the data. It's important to carefully consider the reasons for removing a sample and how it may affect the results.

4. Are there any limitations to removing a sample when calculating standard deviation?

Yes, there are limitations to removing a sample when calculating standard deviation. One limitation is that the sample may not be a true outlier and may be important to include in the analysis. Another limitation is that removing a sample can change the shape of the data distribution, which may impact the appropriateness of using standard deviation as a measure of variability.

5. When is it appropriate to remove a sample when calculating standard deviation?

It is appropriate to remove a sample when calculating standard deviation if there is a clear reason for doing so and the sample is truly an outlier or not representative of the overall data. This should be done carefully and with consideration for the potential impact on the results and interpretation of the data.
