Statistics and Tchebysheff's theorem

  • Thread starter: major_maths
  • Tags: Statistics

Homework Help Overview

The discussion revolves around Tchebysheff's theorem in statistics, specifically focusing on demonstrating that for any set of n measurements, the fraction of data points within a specified interval around the mean is at least (1-1/k²). Participants are exploring the implications of the theorem and its mathematical foundations.

Discussion Character

  • Exploratory, Conceptual clarification, Mathematical reasoning

Approaches and Questions Raised

  • One participant expresses confusion about the problem's requirements and how to initiate a solution. Another participant attempts to manipulate the sum of squared deviations to establish a relationship between the number of measurements exceeding a threshold and the overall fraction of measurements within the interval. There is also a question regarding the correctness of an inequality in the context of the problem.

Discussion Status

The discussion is active, with participants sharing their thoughts and attempts at reasoning through the problem. Some guidance has been offered regarding the mathematical manipulation of the sum of squared deviations, but there is no explicit consensus on the approach or final interpretation of the theorem.

Contextual Notes

Participants are navigating the complexities of statistical definitions and theorems, questioning assumptions about the relationship between sample means and population means. There is an acknowledgment of the need for clarity in the mathematical expressions used in the discussion.

major_maths

Homework Statement



Let [itex]k \geq 1[/itex]. Show that, for any set of [itex]n[/itex] measurements, the fraction included in the interval [itex]\overline{y} - ks[/itex] to [itex]\overline{y} + ks[/itex] is at least [itex]1 - 1/k^2[/itex].

[Hint: [itex]s^2 = \frac{1}{n-1}\sum(y_i - \overline{y})^2[/itex]. In this expression, replace all deviations for which [itex]|y_i - \overline{y}| \geq ks[/itex] with [itex]ks[/itex]. Simplify.] This result is known as Tchebysheff's theorem.

Homework Equations

See the hint above.

The Attempt at a Solution



I've got no clue what the problem wants, much less how to start a solution.
 
major_maths said:

Homework Statement



Let [itex]k \geq 1[/itex]. Show that, for any set of [itex]n[/itex] measurements, the fraction included in the interval [itex]\overline{y} - ks[/itex] to [itex]\overline{y} + ks[/itex] is at least [itex]1 - 1/k^2[/itex].

[Hint: [itex]s^2 = \frac{1}{n-1}\sum(y_i - \overline{y})^2[/itex]. In this expression, replace all deviations for which [itex]|y_i - \overline{y}| \geq ks[/itex] with [itex]ks[/itex]. Simplify.] This result is known as Tchebysheff's theorem.

Let there be [itex]M[/itex] measurements where [itex]| y_i - \overline{y}| \geq ks[/itex].
If in the sum [itex]\sum(y_i -\overline{y})^2[/itex] we replace each of those [itex]M[/itex] deviations by [itex]ks[/itex] and leave out the other [itex]n-M[/itex] terms, the sum cannot get larger. The resulting sum is [itex]M (ks)^2[/itex].

Hence

[tex]s^2 = \frac{1}{n-1} \sum(y_i - \overline{y})^2 \geq \frac{1}{n-1} M (ks)^2[/tex]

Since [itex]\frac{1}{n-1} > \frac{1}{n}[/itex] and [itex]M(ks)^2 \geq 0[/itex]

[tex]s^2 \geq \frac{1}{n-1}M(ks)^2 \geq \frac{1}{n}M(ks)^2[/tex]
[tex]s^2 \geq \frac{1}{n}M(ks)^2[/tex]

The "fraction of measurements" that [itex]M[/itex] constitutes is [itex]\frac{M}{n}[/itex] and the above inequality can be used to bound it.

The original problem concerns the fraction of measurements other than those M measurements, so that fraction is [itex]1.0 - \frac{M}{n}[/itex].
That needs to be bounded by using the bound for [itex]\frac{M}{n}[/itex].
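
A quick numerical sanity check of the claim (not a proof) might look like the sketch below. The function name `tchebysheff_check` and the data values are made up for illustration; `statistics.stdev` is assumed to match the hint's definition of [itex]s[/itex], since it uses the [itex]n-1[/itex] divisor.

```python
import statistics

def tchebysheff_check(data, k):
    """Return (M, fraction_inside, lower_bound) for the interval y_bar +/- k*s."""
    n = len(data)
    y_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation, divisor n - 1
    # M counts the measurements with |y_i - y_bar| >= k*s
    M = sum(1 for y in data if abs(y - y_bar) >= k * s)
    fraction_inside = 1 - M / n      # fraction strictly inside the interval
    lower_bound = 1 - 1 / k**2       # bound claimed by the theorem
    return M, fraction_inside, lower_bound

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
for k in (1.5, 2):
    M, frac, bound = tchebysheff_check(data, k)
    print(f"k={k}: M={M}, fraction inside={frac:.3f}, bound={bound:.3f}, holds={frac >= bound}")
```

For this data set the observed fraction sits well above the bound, which is typical: the theorem only gives a guaranteed minimum, not an estimate of the actual fraction.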
 
Thank you, Stephen. That was the part of my homework I was struggling with. I wonder which school the OP goes to :-).

To be really pedantic, should the last equation not have a > sign instead of >=?
 
When we look at the definition of theorem number two, we see that it refers to the standard deviation of the possible sample means computed from all possible random samples. Theorem number one is similar in that it says that, for any population, the average value of all possible sample means computed from all possible random samples of a given size from the population equals the population mean. What does that mean? Does it mean that the mean of my sample will automatically be equal to the population mean?
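
One way to explore this question is to enumerate every possible sample from a tiny population and compare the numbers. The sketch below assumes simple random sampling without replacement. The population values and the helper name `all_sample_means` are invented for illustration.

```python
from itertools import combinations
import statistics

def all_sample_means(population, size):
    """Means of every possible sample of the given size (without replacement)."""
    return [sum(sample) / size for sample in combinations(population, size)]

population = [1, 3, 5, 11]  # hypothetical population, mean 5
means = all_sample_means(population, 2)

print("population mean:", statistics.mean(population))
print("individual sample means:", means)
print("average of all sample means:", statistics.mean(means))
```

In this example the average over all possible sample means matches the population mean, yet no single sample mean equals it. This suggests the statement concerns the whole collection of possible samples, not any one sample you happen to draw.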
 
