-aug.16.data list from table intervals


Discussion Overview

The discussion revolves around statistical analysis of a frequency table representing height intervals. Participants explore how to derive various statistical measures such as mean, median, mode, variance, and standard deviation from the given data. The conversation includes questions about handling intervals and the implications for calculations.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Participants discuss how to create a list of numbers from height intervals to calculate statistical measures, with some suggesting using midpoints of intervals.
  • There is a question about the calculation of the mean, with differing methods proposed.
  • One participant suggests that the median is 12, while another challenges the mode calculation, indicating that no one has a height of 7.
  • Discussion on the mode leads to a proposal that the mode could be 13 based on frequency.
  • Participants explore the concept of variance and standard deviation, with formulas provided and corrections made regarding the substitution of values in calculations.
  • There is an ongoing discussion about the interquartile range and how to determine quartiles from the frequency table, with some disagreement on the method of calculation.

Areas of Agreement / Disagreement

Participants express differing views on how to calculate the mean, mode, and quartiles from the frequency table. While some calculations are agreed upon, others remain contested, indicating that no consensus exists on certain statistical interpretations.

Contextual Notes

Some calculations depend on the interpretation of intervals and the choice of midpoints, leading to potential variations in results. The discussion highlights the complexity of statistical measures when applied to grouped data.

Who May Find This Useful

This discussion may be useful for students or individuals interested in statistics, particularly those learning about data analysis from frequency tables and the implications of using intervals in calculations.

karush
in creating a list of numbers to find mean, median, mode, range and some other questions with this table

$$\begin{array}{c|c}
\text{height in meters} & \text{frequency} \\ \hline
8 \le h < 10 & 6 \\
10 \le h < 12 & 5 \\
12 \le h < 14 & 7 \\
14 \le h < 16 & 4
\end{array}$$

how do we make a list when you have intervals? or do you just use the number in between like $$8 \le h < 10$$ would be $$\{9,9,9,9,9,9\}$$
 
let me ask a different question would the mean of this table be
$$\frac{4+5+6+7}{4} = \frac{11}{2} $$
 
Hi karush! :)

karush said:
in creating a list of numbers to find mean, median, mode, range and some other questions with this table

$$\begin{array}{c|c}
\text{height in meters} & \text{frequency} \\ \hline
8 \le h < 10 & 6 \\
10 \le h < 12 & 5 \\
12 \le h < 14 & 7 \\
14 \le h < 16 & 4
\end{array}$$

how do we make a list when you have intervals? or do you just use the number in between like $$8 \le h < 10$$ would be $$\{9,9,9,9,9,9\}$$

Yes. You would use the number in the middle as you suggest.
karush said:
let me ask a different question would the mean of this table be
$$\frac{4+5+6+7}{4} = \frac{11}{2} $$

So no.
The mean would be $$\frac{6\cdot 9 + 5 \cdot 11 + 7 \cdot 13 + 4 \cdot 15}{6+5+7+4}$$.
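
As a rough check of the midpoint approach, here is a minimal Python sketch. The names `table`, `midpoints`, `freqs`, and `heights` are only illustrative, and treating every height in an interval as its midpoint is an approximation, not the true data.

```python
# Frequency table from the thread: (lower bound, upper bound, frequency).
table = [(8, 10, 6), (10, 12, 5), (12, 14, 7), (14, 16, 4)]

# Represent each interval by its midpoint (an approximation of the real heights).
midpoints = [(lo + hi) / 2 for lo, hi, _ in table]
freqs = [n for _, _, n in table]

# Expanded list: each midpoint repeated as many times as its frequency.
heights = [m for m, n in zip(midpoints, freqs) for _ in range(n)]

# Weighted mean: sum of (frequency * midpoint) divided by the total frequency.
grouped_mean = sum(n * m for m, n in zip(midpoints, freqs)) / sum(freqs)

print(len(heights))            # 22
print(round(grouped_mean, 2))  # 11.82
```

Averaging the expanded list `heights` directly gives the same value, which is why the weighted formula and the "list of midpoints" idea agree.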
 
The mean would be $$\frac{6\cdot 9 + 5 \cdot 11 + 7 \cdot 13 + 4 \cdot 15}{6+5+7+4}$$.

so would the median of this be from

$$\{9,11,13,15\} = 12$$

and the mode be $$7$$ since it has the highest frequency

do I have to start a new OP if I continue to ask more Q on this table?
 
karush said:
...
do I have to start a new OP if I continue to ask more Q on this table?

As long as your additional questions pertain to the data already provided, it is best to ask further questions regarding it here in this topic. :D
 
karush said:
so would the median of this be from

$$\{9,11,13,15\} = 12$$

The median is the height where half is smaller and the other half is taller.
At height 12, you have 6+5=11 people smaller, and 7+4=11 people taller.
So indeed the median is 12.

and the mode be $$7$$ since it has the highest frequency

The mode is the height that occurs most... but no one has height 7. ;)
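
The median argument above can be checked with a short sketch, assuming (as earlier in the thread) that each interval is represented by its midpoint. The variable names are illustrative only.

```python
from statistics import median

midpoints = [9, 11, 13, 15]  # interval midpoints
freqs = [6, 5, 7, 4]         # frequencies from the table

# Expanded list of 22 approximate heights.
heights = [m for m, n in zip(midpoints, freqs) for _ in range(n)]

# Count how many people fall below and above a height of 12.
below = sum(1 for h in heights if h < 12)
above = sum(1 for h in heights if h > 12)

print(below, above)     # 11 11
print(median(heights))  # 12.0
```

With an even number of values, `statistics.median` averages the two middle ones (here 11 and 13), which matches the "half below, half above" reasoning.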
 
I like Serena said:
The mode is the height that occurs most... but no one has height 7.

so the most frequent is $$12\leq h < 14$$ or 13 for mode.

my next question is standard deviation and variance
from Wikipedia
In statistics and probability theory, standard deviation (represented by the symbol sigma, $$\sigma$$) shows how much variation or dispersion exists from the average (mean), or expected value.

So I would presume $$\sigma$$ here is 2 since that is the size of the intervals

I read variance in Wikipedia but not sure if it applies to this table.
so how is variance derived?
 
karush said:
so the most frequent is $$12\leq h < 14$$ or 13 for mode.

Right. :)
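
A small sketch of picking the modal class, under the same midpoint assumption. The point is that the frequency (7) only identifies the class; the mode itself is a height.

```python
# Frequency table from the thread: (lower bound, upper bound, frequency).
table = [(8, 10, 6), (10, 12, 5), (12, 14, 7), (14, 16, 4)]

# The modal class is the interval with the largest frequency.
lo, hi, n = max(table, key=lambda row: row[2])

# Report the midpoint of that interval as the mode.
modal_midpoint = (lo + hi) / 2

print(lo, hi, n, modal_midpoint)  # 12 14 7 13.0
```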
my next question is standard deviation and variance
from Wikipedia
In statistics and probability theory, standard deviation (represented by the symbol sigma, $$\sigma$$) shows how much variation or dispersion exists from the average (mean), or expected value.

So I would presume $$\sigma$$ here is 2 since that is the size of the intervals

I read variance in Wikipedia but not sure if it applies to this table.
so how is variance derived?

Not quite.

Let's start with variance.
In your case the formula is:
$$\text{Variance} = \frac{\sum n_i \times (x_i - \text{mean})^2}{\sum n_i}$$
where $n_i$ is the frequency of each category, $x_i$ is the mid value of each height interval, and $\text{mean}$ is the value you already found.

Variance is often denoted as $\sigma^2$.
Standard deviation (denoted as $\sigma$) is the square root of the variance.
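
The formula can be sketched in Python as follows. This assumes the midpoints stand in for the heights, and it divides by $\sum n_i$ exactly as written above (the population form; a sample variance would divide by $\sum n_i - 1$ instead).

```python
import math

midpoints = [9, 11, 13, 15]  # x_i: mid value of each height interval
freqs = [6, 5, 7, 4]         # n_i: frequency of each interval

total = sum(freqs)  # sum of n_i

# Mean of the grouped data.
mean = sum(n * x for n, x in zip(freqs, midpoints)) / total

# Variance: frequency-weighted average of squared deviations from the mean.
variance = sum(n * (x - mean) ** 2 for n, x in zip(freqs, midpoints)) / total

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(variance, 5), round(std_dev, 5))  # 4.60331 2.14553
```

Note that each frequency multiplies the squared deviation of its own midpoint; swapping the two roles changes the result.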
 
I like Serena said:
Let's start with variance.
In your case the formula is:
$$\text{Variance} = \frac{\sum n_i \times (x_i - \text{mean})^2}{\sum n_i}$$
where $n_i$ is the frequency of each category, $x_i$ is the mid value of each height interval, and $\text{mean}$ is the value you already found.

Variance is often denoted as $\sigma^2$.
Standard deviation (denoted as $\sigma$) is the square root of the variance.

by $$\sum n_i $$ would this mean $$6+5+7+4=22$$

If so then
$$\frac{6 \times (6-11.82)^2 +5 \times (5-11.82)^2 +7 \times (7-11.82)^2 +4 \times (4-11.82)^2}{22}=variance$$
or is this composed wrong?
 
  • #10
karush said:
by $$\sum n_i $$ would this mean $$6+5+7+4=22$$

Yes!

If so then
$$\frac{6 \times (6-11.82)^2 +5 \times (5-11.82)^2 +7 \times (7-11.82)^2 +4 \times (4-11.82)^2}{22}=variance$$
or is this composed wrong?

Almost.
But you have substituted the frequencies instead of the mid interval values for $x_i$.
 
  • #11
I like Serena said:
Almost.
But you have substituted the frequencies instead of the mid interval values for $x_i$.
How about this?
$$\frac{9 \times (9-11.82)^2 +11 \times (11-11.82)^2 +13 \times (13-11.82)^2 +15 \times (15-11.82)^2}{22}=11.307$$ or $$ \sigma^2$$

thus standard deviation would be $$\sqrt{11.307}=3.3626$$
 
  • #12
karush said:
How about this?
$$\frac{9 \times (9-11.82)^2 +11 \times (11-11.82)^2 +13 \times (13-11.82)^2 +15 \times (15-11.82)^2}{22}=11.307$$ or $$ \sigma^2$$

thus standard deviation would be $$\sqrt{11.307}=3.3626$$

Hold on.
Now you have substituted the mid interval values for the frequencies $n_i$.
Check where it says $n_i$ and where it says $x_i$.

Btw, the meaning of variance is the average of the squared deviations from the mean.
 
  • #13
I like Serena said:
Hold on.
Now you have substituted the mid interval values for the frequencies $n_i$.
Check where it says $n_i$ and where it says $x_i$.

Btw, the meaning of variance is the average of the squared deviations from the mean.

$$\frac{6 \times (9-11.82)^2 +5 \times (11-11.82)^2 +7 \times (13-11.82)^2 +4 \times (15-11.82)^2}{22}=4.60331 $$ or $$ \sigma^2$$

so if correct then $$\sqrt{4.60331} = 2.14553$$ or $$\sigma$$
 
  • #14
Yep. That looks right.
 
  • #15
there's still more ??

Number of Data
$$= 22$$ assume sum of frequencies

Interquartile range
assume we could go off the intervals

so $$Q_1=10 \ \ Q_2=12 \ \ Q_3=14$$

then $$14-10=4$$

range
$$16-8=8$$
 
  • #16
karush said:
there's still more ??

Number of Data
$$= 22$$ assume sum of frequencies

Correct.
Interquartile range
assume we could go off the intervals

so $$Q_1=10 \ \ Q_2=12 \ \ Q_3=14$$

then $$14-10=4$$

You're not supposed to work from the intervals.
$Q_1$ is the height such that 25 percent is below.
Since 25% of 22 persons is 5.5, the $Q_1$ height is somewhere in the interval 8-10, which contains 6 persons.
There can be some discussion where that height actually is when talking about intervals, but let's keep it simple and say that $Q_1=9$, which is the middle of the lowest interval.
Similarly $Q_3$ is the height with 75% below.
Keeping it simple, that is the middle of the third interval. So $Q_3=13$.

Anyway, your interquartile range comes out the same.
range
$$16-8=8$$

Right.
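
Here is a hedged sketch of the quartile reasoning. `midpoint_quartile` follows the "keep it simple" choice above; `interpolated_quartile` is one common textbook alternative (linear interpolation inside the class) that was not used in this thread and is shown only to illustrate why results can differ. All function names are hypothetical.

```python
# Frequency table from the thread: (lower bound, upper bound, frequency).
table = [(8, 10, 6), (10, 12, 5), (12, 14, 7), (14, 16, 4)]
total = sum(n for _, _, n in table)

def quartile_class(fraction):
    """Return (lo, hi, n, count_before) for the class containing the given fraction."""
    target = fraction * total
    before = 0
    for lo, hi, n in table:
        if before + n >= target:
            return lo, hi, n, before
        before += n
    raise ValueError("fraction must be between 0 and 1")

def midpoint_quartile(fraction):
    # Simple choice: use the middle of the class that contains the quartile.
    lo, hi, _, _ = quartile_class(fraction)
    return (lo + hi) / 2

def interpolated_quartile(fraction):
    # Alternative (an assumption, not from the thread): interpolate linearly
    # within the class, as if heights were spread evenly across it.
    lo, hi, n, before = quartile_class(fraction)
    return lo + (fraction * total - before) / n * (hi - lo)

q1, q3 = midpoint_quartile(0.25), midpoint_quartile(0.75)
print(q1, q3, q3 - q1)             # 9.0 13.0 4.0
print(table[-1][1] - table[0][0])  # range: 8
```

The interpolated version places $Q_1$ and $Q_3$ elsewhere inside the same classes, so its interquartile range is not exactly 4.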
 
  • #17
thanks everyone for your help. it was a new topic for me
sure I'll be back with more
 
