Two Stats questions for Math nerds (std. deviation, mean, subsets)

In summary, a random subset of 50 classes from a school offering 3200 courses, where the class size averages 50 with a standard deviation of 2, is expected to have an average near 50 and a standard deviation near 2, with the subset average uncertain by about 2/√50 ≈ 0.28. For a sample of 10 university classes with an average class size of 48 and a standard deviation of 12, assuming the sample is representative and class size is a Gaussian random variable, samples of 40 and 160 classes are expected to average 48 with a standard deviation still near 12, while the standard deviation of the average shrinks to 12/√40 ≈ 1.90 and 12/√160 ≈ 0.95, respectively. The second solution follows the same logic as the first.
  • #1
ski
1. If a school offers 3200 separate courses and a survey of these courses determines that the class size is 50 with a standard deviation of 2, what would one expect for the average and standard deviation of a subset of 50 of these classes selected randomly?

2. In a survey to estimate the average size (number of students) of a class at a university, ten courses are picked at random and the class size of each is determined. The result is 48 students in a class with a standard deviation of 12 students. Assuming that the sample of 10 classes is a representative subset of the whole and that class size is a Gaussian random variable, what would one expect the average and standard deviation to be for samples of 40 classes and 160 classes?

It's been so long since I've taken stats... REALLY long :confused: Any help on one or both of these would help me this week :smile:
 
  • #2
Here's what was suggested to me:

1. We'd expect an average of 50 +- 2.01*2/sqrt(50), where 2.01 is the 95% t-value for 49 degrees of freedom.
The standard deviation of the mean (the standard error) is 2/sqrt(50) ≈ 0.28.

2. Not as sure about this one, but following the logic of the last one:
s = sqrt(10)*12
and then stdv = s/sqrt(40)
and s/sqrt(160)
where the mean = 48 +- 2.021*stdv and 48 +- 1.97*stdv, respectively.



Is the first solution really that easy?
 
  • #3


1. For the first question, the survey covered all 3200 courses, so 50 and 2 are the population mean and standard deviation. The figure 3200 only says that the population is much larger than the subset; it does not enter the mean. A random subset of 50 classes should therefore have a sample mean near 50 and a sample standard deviation near 2, where the sample standard deviation is s = √(∑(x − x̄)² / (n − 1)), with x the individual class sizes, x̄ the sample mean, and n = 50 the sample size. What shrinks with sample size is the uncertainty of the average: the standard deviation of the mean is 2/√50 ≈ 0.28, so the subset average would typically fall within about 0.3 of 50. So yes, the first solution really is that simple.

2. For the second question, the same reasoning applies, with the sample values x̄ = 48 and s = 12 (from n = 10) serving as estimates of the population mean and standard deviation. Because 12 is already the standard deviation of individual class sizes, it does not need to be rescaled by √10. A larger sample estimates the same population, so samples of 40 or 160 classes should still average about 48 with a standard deviation of about 12. What changes is how tightly the sample average clusters around the population mean: the standard deviation of the mean is the population standard deviation divided by the square root of the sample size. Since class size is assumed Gaussian, the sample mean is itself normally distributed; for other populations with finite variance, the Central Limit Theorem says this holds approximately for large samples. For a sample of 40 classes, the standard deviation of the mean is 12/√40 ≈ 1.90, and for 160 classes it is 12/√160 ≈ 0.95. Quadrupling the sample size halves the uncertainty of the average.
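The arithmetic for both questions can be checked with a short Python sketch. The helper name `standard_error` and the optional finite-population correction are additions for illustration, not something from the thread; the correction is there only to show that drawing 50 classes out of 3200 without replacement barely changes the result.

```python
import math

def standard_error(sd, n, population_size=None):
    """Standard deviation of the mean of n values drawn at random.

    sd is the standard deviation of the individual class sizes.
    If population_size is given, apply the finite-population correction
    for sampling without replacement (an assumption added here).
    """
    se = sd / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# Question 1: population mean 50, sd 2, subset of 50 out of 3200 courses.
q1 = standard_error(2, 50)
q1_fpc = standard_error(2, 50, population_size=3200)

# Question 2: the sample sd of 12 is taken as the estimate of the population sd.
q2 = {n: standard_error(12, n) for n in (40, 160)}

print(f"Q1: mean 50, sd of the mean {q1:.3f} (with correction {q1_fpc:.3f})")
for n, se in q2.items():
    print(f"Q2: n={n}: mean 48, sd of the mean {se:.3f}")
```

In both questions the expected sample mean and the expected spread of individual class sizes stay put; only the standard deviation of the mean depends on the sample size.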
 

1. What is standard deviation and how is it calculated?

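Standard deviation is a measure of how much the data values vary from the mean (average). For a full population, it is calculated by averaging the squared differences between each data point and the mean and then taking the square root. For a sample, the sum of squared differences is divided by n − 1 rather than n, which compensates for using the sample mean in place of the unknown population mean.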
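As a minimal sketch of that calculation, the two common conventions (dividing by n for a whole population, by n − 1 for a sample) can be written out by hand and compared with Python's `statistics` module. The class sizes in `sizes` are made up for illustration.

```python
import math
import statistics

def population_sd(values):
    # Average the squared deviations from the mean, then take the square root.
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))

def sample_sd(values):
    # Same idea, but divide by n - 1 when the values are a sample.
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (len(values) - 1))

sizes = [46, 48, 50, 52, 54]  # hypothetical class sizes

print(population_sd(sizes), statistics.pstdev(sizes))
print(sample_sd(sizes), statistics.stdev(sizes))
```

The sample version is always slightly larger than the population version for the same data, and the gap narrows as n grows.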

2. How is standard deviation different from mean?

Mean is a measure of central tendency, representing the average of a set of data. Standard deviation, on the other hand, measures the spread or variability of the data around that mean. Both are computed from every data point, and both are sensitive to extreme values; the standard deviation more so, because the differences are squared.

3. Can subsets affect the standard deviation of a dataset?

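Yes. The standard deviation of a subset can differ from that of the full dataset. A randomly chosen subset tends to have a standard deviation close to the full dataset's, while a subset picked from one part of the range will usually have a smaller one. Conversely, when subsets with very different means are combined, the gap between them adds to the overall variability, so the pooled standard deviation can be much higher than the standard deviation within each subset.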
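A small sketch of the pooling effect, using two made-up groups of class sizes (`small_classes` and `large_classes` are hypothetical data, not from the thread): each group is tightly clustered, but their means are far apart.

```python
import statistics

small_classes = [18, 20, 22]  # hypothetical seminar sizes
large_classes = [78, 80, 82]  # hypothetical lecture sizes
pooled = small_classes + large_classes

sd_small = statistics.pstdev(small_classes)
sd_large = statistics.pstdev(large_classes)
sd_pooled = statistics.pstdev(pooled)

print(f"within small: {sd_small:.2f}")
print(f"within large: {sd_large:.2f}")
print(f"pooled:       {sd_pooled:.2f}")
```

The pooled spread is dominated by the distance between the two group means rather than by the spread inside either group.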

4. How can standard deviation be used in data analysis?

Standard deviation can be used to identify outliers in a dataset, to compare the variability between different datasets, and to assess the precision of data. It can also be used to create confidence intervals and to perform hypothesis testing in statistics.
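As one example, a confidence interval for a mean can be sketched from the numbers in the thread (mean 48, standard deviation 12, 160 classes). This version uses the normal approximation via `statistics.NormalDist`; the function name `mean_confidence_interval` is made up here, and for small samples a t-value such as the 2.021 quoted earlier would be the more careful choice.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, confidence=0.95):
    # Two-sided interval: mean +/- z * sd / sqrt(n), normal approximation.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

low, high = mean_confidence_interval(48, 12, 160)
print(f"95% interval for the mean: {low:.2f} to {high:.2f}")
```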

5. Is it possible for a dataset to have a standard deviation of 0?

Yes. A standard deviation of 0 occurs exactly when all the data values are identical, so there is no variability or spread in the data. In measured data this is uncommon, so an unexpected 0 is worth checking against the data or the calculation.
