Help me understand skewness in QQ-plots please

  • Context: Undergrad 
  • Thread starter: bremenfallturm
SUMMARY

This discussion focuses on the interpretation of skewness in QQ plots against the normal distribution N(0,1). The original poster reads the points below the reference line at the low end of a left-skewed QQ plot as values that are smaller than a normal distribution would predict, then doubts this reading because it seems to contradict the shape of the actual distribution. The reply confirms that the first reading is correct: a left-skewed distribution has a long left tail, so its lowest quantiles are smaller than the corresponding normal quantiles, and the plot is exactly as expected. The poster's alternative explanation, based on how the axes are scaled, is rejected.

PREREQUISITES
  • Understanding of QQ plots and their construction
  • Familiarity with normal distribution N(0,1)
  • Basic knowledge of statistical skewness
  • Experience with data visualization techniques
NEXT STEPS
  • Study the properties of skewness in statistical distributions
  • Learn how to create and interpret QQ plots using Python's Matplotlib library
  • Explore the relationship between quantiles and distribution shapes
  • Investigate other methods for assessing normality, such as the Shapiro-Wilk test
USEFUL FOR

Statisticians, data analysts, and anyone involved in data visualization who seeks to deepen their understanding of QQ plots and the interpretation of skewness in statistical distributions.

bremenfallturm
TL;DR
I am trying to understand how QQ plots work, but I have a hard time understanding how to interpret skewness. Specifically, it is "the other way around" than I expect. See the post for an explanation.

Let me explain.

From what I understand, in a QQ plot we split both the normal distribution (typically ##N(0,1)##) and the dataset into ##n## quantiles, where ##n## is the number of datapoints. We sort the dataset and plot each datapoint against the corresponding normal quantile. For example, if we have 10 ordered datapoints ##a_1, a_2, \dots, a_{10}## and have computed 10 normal quantiles ##n_1, n_2, \dots, n_{10}##, we would plot ##(n_1, a_1)##, ##(n_2, a_2)##, and so on, with the normal quantiles on the horizontal axis.
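
The construction described above can be sketched in a few lines of Python using only the standard library. This is a rough illustration, not the method of any particular plotting package: the sample values are made up, `qq_pairs` is a hypothetical helper name, and the plotting positions ##(i - 0.5)/n## are one common convention assumed here (other tools use slightly different positions).

```python
from statistics import NormalDist


def qq_pairs(data):
    """Return (normal quantile, ordered data value) pairs for a QQ plot."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Assumed convention: plotting positions (i - 0.5) / n, which keeps
    # every probability strictly inside (0, 1).
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    theoretical = [std_normal.inv_cdf(p) for p in probs]
    return list(zip(theoretical, ordered))


# Made-up sample of 10 datapoints, for illustration only.
sample = [2.1, -0.3, 0.8, 1.5, -1.2, 0.1, 0.4, -0.7, 1.0, 0.2]
pairs = qq_pairs(sample)

# Each row is one plotted point: normal quantile (x), data value (y).
for n_i, a_i in pairs:
    print(f"{n_i:+.3f}  {a_i:+.3f}")
```

The smallest datapoint is paired with the smallest normal quantile, the second smallest with the second smallest, and so on, which is why both coordinates increase from left to right along the plot.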

Now, here is where I don't understand how to interpret the skewness.
Consider the left-skewed case, for example (https://anasrana.github.io/ems-practicals/qq-plot.html):
[Attachment: 1746684655365.webp, the QQ plot for the left-skewed case]

If I look at the plot, my first intuition is this: it looks to me that all points below the line (the points between -4 and around -1 of the normal distribution's quantiles) are smaller than expected. This is because they are below the line. Therefore, the points would be drawn from a distribution where smaller values are more probable. Of course, looking at the actual distribution, we can see that it is the other way around.
My second idea is then this: if we have many large datapoints (as in the image above), the graph axes are going to be scaled such that the smaller values fall below the line, and thus, we have a distribution with a tendency towards large datapoints. Does any of this make sense? Could you help me deepen my understanding?
 
bremenfallturm said:
If I look at the plot, my first intuition is this: it looks to me that all points below the line (the points between -4 and around -1 of the normal distribution's quantiles) are smaller than expected. This is because they are below the line. Therefore, the points would be drawn from a distribution where smaller values are more probable.
True.
bremenfallturm said:
Of course, looking at the actual distribution, we can see that it is the other way around.
No, it isn't. It's exactly as expected.
bremenfallturm said:
My second idea is then this: if we have many large datapoints (as in the image above), the graph axes are going to be scaled such that the smaller values fall below the line, and thus, we have a distribution with a tendency towards large datapoints. Does any of this make sense?
No, none of this makes sense.
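
To see numerically why the first reading is the right one, here is a small Python sketch under stated assumptions. It is not the data behind the linked plot: the left-skewed sample is simulated as the negative of an exponential variable, and the data are standardised with the sample mean and standard deviation so that the reference line is simply ##y = x## (the linked page may draw its line differently, for example through the quartiles). A negative residual means the point lies below the line.

```python
import random
from statistics import NormalDist, fmean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable
n = 2000

# Assumed left-skewed sample: the negative of an exponential variable
# has a long tail towards small (very negative) values.
data = sorted(-random.expovariate(1.0) for _ in range(n))

# Standardise so that the reference line becomes y = x.
mean, sd = fmean(data), pstdev(data)
z = [(x - mean) / sd for x in data]

# Sample skewness: mean of the cubed standardised values.
skewness = fmean([v ** 3 for v in z])

std_normal = NormalDist()
theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Residual = sample quantile minus normal quantile (negative: below the line).
residuals = [zi - ti for zi, ti in zip(z, theoretical)]

print(f"sample skewness: {skewness:.2f}")
for label, i in [("lowest", 0), ("median", n // 2), ("highest", n - 1)]:
    print(f"{label:8s} normal={theoretical[i]:+.2f} "
          f"sample={z[i]:+.2f} residual={residuals[i]:+.2f}")
```

With this setup the lowest points fall well below the line: the smallest observations are smaller than a normal distribution with the same mean and spread would produce, which is exactly what a long left tail means. The highest points also fall below the line, because the short right tail stops well before the normal quantiles do, while the middle of the sample sits slightly above it. Together this gives the curved shape typical of a left-skewed QQ plot.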
 
