
#1
Jan24-13, 11:33 PM

P: 56

I realize you can transform the data in many ways, but I really can't find a method to solve this particular scenario anywhere online:
Suppose you are given the mean and standard deviation of a set of data. Further suppose you take away (get rid of) "x" elements in the original data set, and now you're given the new mean. Is it possible to find the new standard deviation? It seems like it's something very simple, but I can't seem to come up with a solution. If it is indeed possible, how would I go about doing this? Numbers aren't important, but perhaps a numerical example to illustrate what I'm asking: A set of data has a mean of 20 and a standard deviation of 2. Suppose you take four elements from the set of data, and the new mean is now 50. What is the new standard deviation? 



#2
Jan25-13, 12:08 AM

P: 771

Why don't you just calculate the standard deviation of the new dataset?
Do you mean that you don't have the dataset, only the new mean? I would strongly suspect that the question is ill-posed, then. I could likely remove points in multiple ways to create the same mean but different SDs.
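
A minimal sketch of that last point, using made-up numbers (the data set and the removed pairs below are purely illustrative assumptions, not taken from the thread). Three different pairs with the same sum are removed from the same set, so the new mean is identical each time, yet the new population SD differs.

```python
from statistics import mean, pstdev

# Hypothetical data set, chosen only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]


def remove_values(values, to_remove):
    """Return a copy of `values` with one occurrence of each item in `to_remove` dropped."""
    remaining = list(values)
    for v in to_remove:
        remaining.remove(v)
    return remaining


# Each pair sums to 10, so the mean of what is left is the same in all three cases.
removals = [(4, 6), (2, 8), (1, 9)]

results = []
for pair in removals:
    remaining = remove_values(data, pair)
    results.append((pair, mean(remaining), pstdev(remaining)))

for pair, new_mean, new_sd in results:
    print(f"removed {pair}: new mean = {new_mean}, new SD = {new_sd:.4f}")
```

Knowing only the old mean, the old SD, the number of points removed and the new mean cannot distinguish between these three cases.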



#3
Jan25-13, 01:58 PM

P: 56

Yes, sorry for not clarifying that earlier. I only have the mean and s.d. of the set; I don't know the size of the set or the individual points. The question is trivial if I have the data set. Does that mean there is not enough information to infer the standard deviation of the new data set?
I'm not sure how you would go about removing points in multiple ways to get the same mean but different standard deviations, given that the original data already has a particular mean and standard deviation. For instance, in the example I gave there is a big jump in the mean after removing four points (from 20 to 50), but the deviation of the original set is 2. Is it really possible to alter the mean so drastically in multiple ways? What prevents the existence of a unique way? In any case, I'm just interested in the "new standard deviation", if it is possible to compute.

Edit: Upon further reflection, I don't think there is a systematic way in general, and my example was flawed. It implied:

20(n + 4) - s = 50n

where n + 4 = total number of elements initially, and s = sum of the elements removed. This gives:

80 = 30n + s

The only positive integer solutions are n = 1 (s = 50) and n = 2 (s = 20). If n + 4 = 5 and the mean is 20, and one element is 50, then I don't think it's possible to have a s.d. of 2. I'm assuming it's likewise for n = 2, so the example was flawed.
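
One way to check the suspicion in the edit above, as a rough sketch: the population variance of the whole set can never be smaller than the spread between the group means (kept group vs. removed group). The helper name `min_population_sd` is made up for this illustration, and the bound assumes the s.d. is the population form (dividing by the number of elements).

```python
from math import sqrt


def min_population_sd(n_kept, mean_kept, n_removed, sum_removed, mean_all):
    """Lower bound on the population SD of the full set.

    Uses only the between-group part of the variance, i.e. it pretends
    every kept value equals mean_kept and every removed value equals
    the mean of the removed group. Any real data set has an SD at least this large.
    """
    mean_removed = sum_removed / n_removed
    total = n_kept + n_removed
    between = (
        n_kept * (mean_kept - mean_all) ** 2
        + n_removed * (mean_removed - mean_all) ** 2
    ) / total
    return sqrt(between)


# Positive integer solutions of 80 = 30n + s (s > 0, as assumed in the post).
solutions = [(n, 80 - 30 * n) for n in range(1, 4) if 80 - 30 * n > 0]

bounds = [
    min_population_sd(n_kept=n, mean_kept=50, n_removed=4, sum_removed=s, mean_all=20)
    for n, s in solutions
]

for (n, s), bound in zip(solutions, bounds):
    print(f"n = {n}, s = {s}: original SD must be at least {bound:.2f}")
```

Under these assumptions both candidate values of n force an original s.d. far above 2, which agrees with the conclusion that the example cannot occur.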



#4
Jan31-13, 02:29 AM

P: 571

Transformed Standard Deviation? 



#5
Jan31-13, 08:14 AM

P: 1,784

Do you know the values of the points that were removed?




#6
Jan31-13, 04:08 PM

P: 56

Assuming the "example" is plausible (i.e. there is an n such that the given mean and standard deviation are attainable), it seems that you're only able to determine the new standard deviation if only 1 value is removed.

=================(my workings, if anyone is interested)================

(I apologize for being incompetent in LaTeX, so the work might be difficult to read.)

r = number of values removed, where r < n.

initial mean = (1 / n) * sum[i = 1 to n] x_i
new mean = (1 / (n - r)) * sum[i = 1 to n - r] x_i

You're given both quantities, as well as r, so it is possible to solve for sum[i = 1 to n - r] x_i.

Let new mean = u
Let old mean = u_0

new s.d. = sqrt[ sum[i = 1 to n - r] (x_i - u)^2 / (n - r) ]

Expanding the square and simplifying gives:

new s.d. = sqrt[ (sum[i = 1 to n - r] (x_i^2) - (n - r)u^2) / (n - r) ]

So all you need is sum[i = 1 to n - r] (x_i^2).

sum[i = 1 to n] x_i = sum[i = 1 to n - r] x_i + v

where v = x_(n-r+1) + x_(n-r+2) + ... + x_n is the sum of the removed values. You can solve for v, since the other two quantities are known.

initial s.d. = sqrt[ (sum[i = 1 to n] (x_i^2) - n*u_0^2) / n ]

You're given n, u_0, and the initial s.d., so you can solve for sum[i = 1 to n] (x_i^2), and:

sum[i = 1 to n - r] (x_i^2) + (x_(n-r+1))^2 + (x_(n-r+2))^2 + ... + (x_n)^2 = sum[i = 1 to n] (x_i^2)

Since v = x_(n-r+1) + x_(n-r+2) + ... + x_n (and you know the value of v), the problem now just involves figuring out:

(x_(n-r+1))^2 + (x_(n-r+2))^2 + ... + (x_n)^2

If you can find that, then you can find the new standard deviation, or perhaps there might be another trick which makes the problem easier.

This means that if one value is removed (and you're not given the value), then it'd just be v^2, and you can solve for the new standard deviation.

If two values are removed, you know v = x_(n-1) + x_n, and you have a numeric value for v. Then it comes down to finding x_(n-1)^2 + x_n^2, which means you somehow need to account for the 2(x_(n-1) * x_n) term when you square v. It gets harder to solve as r gets bigger.
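
The r = 1 case described above can be sketched in a few lines. This is only an illustration under the post's assumptions: n is known, and the s.d. is the population form (dividing by the number of elements). The function name and the sample data are made up for the check.

```python
from math import isclose, sqrt
from statistics import mean, pstdev


def new_sd_after_removing_one(n, old_mean, old_sd, new_mean):
    """Population SD of the remaining n - 1 values when a single unknown value is removed."""
    # The removed value v is fixed by the two sums.
    v = n * old_mean - (n - 1) * new_mean
    # Sum of squares of the original set, from old_sd^2 = sum_sq / n - old_mean^2.
    sum_sq_old = n * (old_sd ** 2 + old_mean ** 2)
    # With one value removed, its square is simply v^2.
    sum_sq_new = sum_sq_old - v ** 2
    variance_new = sum_sq_new / (n - 1) - new_mean ** 2
    # Guard against tiny negative values caused by floating-point rounding.
    return sqrt(max(variance_new, 0.0))


# Hypothetical data, used only to check the formula against a direct calculation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
remaining = data[:-1]  # remove the last value (9)

predicted = new_sd_after_removing_one(
    n=len(data),
    old_mean=mean(data),
    old_sd=pstdev(data),
    new_mean=mean(remaining),
)
print(predicted, pstdev(remaining))
```

For r = 2 or more the same approach stalls, because v only fixes the sum of the removed values and not the sum of their squares.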



#7
Feb15-13, 06:04 AM

P: 239

The answer is no.
To get the new SD uniquely, you necessarily require the SD of the removed group, along with the sizes and means of two of the three groups (old, removed, new).
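
As a rough sketch of what that requirement buys you: once the size, mean and SD of the removed group are known alongside those of the old set, the new SD follows from subtracting sums and sums of squares. The function name and sample data below are assumptions for illustration, and population SDs (dividing by the group size) are assumed throughout.

```python
from math import isclose, sqrt
from statistics import mean, pstdev


def sd_after_removal(n_old, mean_old, sd_old, n_rem, mean_rem, sd_rem):
    """Population mean and SD of the values left after removing a group with known statistics."""
    n_new = n_old - n_rem
    # Mean of the remaining values, from the difference of the two sums.
    mean_new = (n_old * mean_old - n_rem * mean_rem) / n_new
    # Sum of squares of each group: size * (sd^2 + mean^2).
    sum_sq_old = n_old * (sd_old ** 2 + mean_old ** 2)
    sum_sq_rem = n_rem * (sd_rem ** 2 + mean_rem ** 2)
    variance_new = (sum_sq_old - sum_sq_rem) / n_new - mean_new ** 2
    # Guard against tiny negative values caused by floating-point rounding.
    return mean_new, sqrt(max(variance_new, 0.0))


# Hypothetical data, used only to compare against a direct calculation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
removed = [7, 9]
kept = [2, 4, 4, 4, 5, 5]

new_mean, new_sd = sd_after_removal(
    len(data), mean(data), pstdev(data),
    len(removed), mean(removed), pstdev(removed),
)
print(new_mean, new_sd)
```

Without the SD of the removed group, `sum_sq_rem` is unknown and the last step cannot be carried out.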

