Brief introduction to the use of statistics

  • Context: High School 
  • Thread starter: scottdave
  • Tags: Introduction, Statistics

Discussion Overview

The discussion revolves around the use of statistics, particularly in the context of introductory materials and articles aimed at data scientists. Participants express their views on the clarity and presentation of statistical concepts, as well as their personal experiences with learning statistics.

Discussion Character

  • Exploratory
  • Debate/contested
  • Conceptual clarification

Main Points Raised

  • Some participants share a link to an article on basic statistics concepts, expressing varying levels of agreement with its content.
  • Concerns are raised about the clarity of explanations in the article, particularly regarding the definitions of median, mean, and quartiles.
  • One participant emphasizes the importance of how well introductory texts explain familiar concepts as a measure of their trustworthiness for new material.
  • Another participant critiques the article for being vague, particularly in its treatment of the Poisson distribution compared to the Normal distribution, noting the lack of distinction between discrete and continuous distributions.
  • Some participants suggest combining applied and theoretical approaches to learning statistics, referencing various textbooks and resources they have found helpful.
  • A participant mentions a website that offers problems related to bioinformatics and statistics, suggesting it as a practical resource for applying statistical concepts.

Areas of Agreement / Disagreement

Participants express disagreement regarding the clarity and effectiveness of the article's presentation of statistical concepts. There is no consensus on the quality of the article, with some finding it vague and others not identifying glaring errors.

Contextual Notes

Participants highlight limitations in the article's explanations, such as the lack of context for terms like "IQR" and the vague descriptions of statistical distributions. These points indicate a need for clearer definitions and examples in introductory materials.

scottdave
I'm not really sure where to put this. I came across this nice article on Medium: “The 5 Basic Statistics Concepts Data Scientists Need to Know” by George Seif https://link.medium.com/APtnCOapOV
 
What do you like about that article? I haven't read it all (I gave up at "the first quartile is essentially the 25th percentile"), but personally I disagree with the way almost everything I have read so far is presented.
 
Hi @pbuk, I would certainly like to know if something I'm reading or sharing is wrong. It has been years since I've taken a statistics course, but I am now taking a course which uses some. I was looking for something to brush up with and came across this. I didn't notice anything glaringly wrong. Please let me know.
 
What has worked for me is to combine applied and theoretical approaches. I used D. Montgomery's Statistics for Engineers and Schaum's for the applied side, and my old Freund textbook for the theory. Then I go over a lot of sites with stats content.
 
  • Informative
Likes   Reactions: scottdave
It's not that I have noticed anything wrong (although as I say I haven't read it all), it's just the way the material is presented - for instance "[the] Median is used over the mean since it is more robust to outlier values". When is the median used in preference to the mean? What even is the mean? Or the median (it is explained as "the line in the middle!")? What does the symbol "IQR" on the chart mean? (It is the "inter-quartile range" of course, and the author talks about the box-plot being "short" or "tall" (in relation to what?) without referring to this label.) And then the wonderful tautology I have already quoted - "the first quartile is essentially the 25th percentile". If I don't know what a quartile is, how on Earth am I going to know what a percentile is?

When I am looking for an introductory or refresher text, a key indicator is how well it explains the things I already know. If it does a good job, I am inclined to trust the author to explain new material. If it doesn't, then I turn elsewhere.
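
For readers who want those terms made concrete, here is a minimal sketch using only Python's standard `statistics` module. The data values and the `summarize` helper are made up for illustration and do not come from the article. The quartiles are computed with the module's "inclusive" interpolation method; other conventions exist and can give slightly different cut points.

```python
import statistics


def summarize(values):
    """Return the mean, median, quartiles, and inter-quartile range (IQR)."""
    # quantiles(n=4) returns the three cut points that split the sorted
    # data into four equal-sized groups: Q1, Q2 (the median), and Q3.
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),      # sum of values / number of values
        "median": statistics.median(values),  # middle value of the sorted data
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,                       # spread of the middle 50% of the data
    }


# Hypothetical data: nine typical values, then the same values plus one outlier.
typical = [21, 22, 23, 24, 25, 26, 27, 28, 29]
with_outlier = typical + [500]

base = summarize(typical)
skewed = summarize(with_outlier)

# The 25th percentile is the 25th of the 99 cut points that split the data
# into 100 groups, so it is the same point as the first quartile.
p25 = statistics.quantiles(typical, n=100, method="inclusive")[24]

print("without outlier:", base)
print("with outlier:   ", skewed)
print("25th percentile:", p25, "first quartile:", base["q1"])
```

With this data the single outlier drags the mean from 25 to 72.5, while the median only moves from 25 to 25.5. That is the sense in which the median is "more robust to outlier values". The IQR is the height of the box in a box plot, so a "short" or "tall" box means a small or large spread in the middle half of the data.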
 
  • Like
Likes   Reactions: lomidrevo, WWGD and PeroK
pbuk said:
It's not that I have noticed anything wrong (although as I say I haven't read it all), it's just the way the material is presented...
That makes sense. Thanks.
 
Having had just a very quick look at the article, I found it quite vague. For example:
"A Poisson Distribution is similar to the Normal but with an added factor of skewness."
This doesn't explain anything about the Poisson distribution. And there is not a single mention that it is a discrete distribution, in contrast to the Normal, which is continuous.
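
To illustrate the discrete versus continuous point, here is a small sketch in plain Python using the textbook formulas for the Poisson probability mass function and the Normal density. The rate `lam = 3.0` and the function names are chosen only for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam.

    Probability mass sits only on the integers 0, 1, 2, ...;
    every other value has probability zero.
    """
    if k < 0 or k != int(k):
        return 0.0
    k = int(k)
    return math.exp(-lam) * lam**k / math.factorial(k)


def normal_pdf(x, mu, sigma):
    """Density of a Normal(mu, sigma^2) variable, defined for every real x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


lam = 3.0  # hypothetical rate, e.g. an average of 3 events per interval
# A Poisson variable has mean = variance = lam, so compare it with a
# Normal that has the same mean and variance.
mu, sigma = lam, math.sqrt(lam)

# Discrete: the probabilities of the integer outcomes sum to 1.
total_mass = sum(poisson_pmf(k, lam) for k in range(60))

for x in (1, 2.5, 5):
    print(x, poisson_pmf(x, lam), normal_pdf(x, mu, sigma))
print("total Poisson mass:", total_mass)
```

The Poisson assigns zero probability to 2.5 and to any negative number, while the Normal density is positive at both. The Normal is also symmetric about its mean (equal density at 1 and 5 here), whereas the Poisson values at 1 and 5 differ, which is the skewness the article alludes to.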
 
lomidrevo said:
Having just very quick view on the article, I found it quite vague. For example:

This doesn't explain anything about Poisson...
After rereading it, I agree. It seems like "here are some things to go learn more about."

In doing some searching I came across the site http://rosalind.info, which has problems to solve related to bioinformatics, many of which use statistics concepts. For some of them, writing a program is helpful, as they give a 5-minute time limit and then the data changes. It reminds me of Project Euler, in a way.
 
