Is It a Skewness or Kurtosis Issue in My Data Distribution?

  • Context: Graduate 
  • Thread starter kimberley
SUMMARY

The discussion centers on the distinction between skewness and kurtosis in a data distribution with 377 data points, an arithmetic mean of 8.56, a standard deviation of 0.77, skewness of -0.37, and excess kurtosis of 0.4. The Jarque-Bera test statistic of 11.23 indicates a significant departure from normality. The primary concern is whether the 15 data points lying more than 2 standard deviations below the mean represent a skewness issue or a kurtosis issue. The original poster's view is that skewness is the more critical factor affecting the distribution's normality, despite the presence of leptokurtic characteristics; the single reply notes that those 15 points contribute to both the third-order (skewness) and fourth-order (kurtosis) moments.

PREREQUISITES
  • Understanding of statistical concepts: skewness and kurtosis
  • Familiarity with normal distribution properties
  • Knowledge of the Jarque-Bera test for normality
  • Basic statistical analysis skills using software like R or Python
NEXT STEPS
  • Explore the implications of skewness in data distributions
  • Learn about the Jarque-Bera test and its application in assessing normality
  • Investigate methods for addressing skewness in datasets
  • Study the effects of kurtosis on statistical analysis and interpretation
USEFUL FOR

Statisticians, data analysts, researchers conducting experiments, and anyone involved in data distribution analysis will benefit from this discussion.

kimberley
I've been conducting a series of natural experiments and examining their distributions for normality/departures therefrom. One distribution, in particular, resulted in a conversation with a friend and some resulting confusion about whether its primary infirmity is a skewness problem or a kurtosis problem. The question at hand is likely to be pedestrian to most of you, but it's an important basic distinction that I obviously need to grasp and, therefore, your comments will be very appreciated.

The distribution at issue has 377 data points (N=377). The arithmetic mean is 8.56. The standard deviation from the arithmetic mean is .77. Skewness is -.37. Kurtosis is .4. The distribution's Jarque-Bera test statistic is 11.23--thus seriously challenging the likelihood of normality at about the Chi-square (.005;2df) critical level.
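As a rough check on these figures, the Jarque-Bera statistic can be rebuilt from the reported skewness and excess kurtosis and split into its two terms. This is only a sketch: it assumes the standard form JB = n/6 · (S² + K²/4) with K the excess kurtosis, it uses the rounded values quoted in the post, and the function name `jarque_bera` is illustrative rather than any package's API.

```python
import math


def jarque_bera(n, skew, excess_kurt):
    """Return (JB, skewness term, kurtosis term, asymptotic p-value).

    Assumes the standard form JB = n/6 * (S^2 + K^2 / 4),
    where K is the excess kurtosis.
    """
    skew_term = n / 6.0 * skew ** 2
    kurt_term = n / 24.0 * excess_kurt ** 2
    jb = skew_term + kurt_term
    # A chi-square variable with 2 degrees of freedom has
    # survival function exp(-x / 2), so no stats library is needed.
    p_value = math.exp(-jb / 2.0)
    return jb, skew_term, kurt_term, p_value


# Rounded descriptives quoted in the post.
jb, skew_term, kurt_term, p_value = jarque_bera(377, -0.37, 0.4)
critical_0005 = -2.0 * math.log(0.005)  # chi-square(0.005; 2 df) critical value

print(f"JB = {jb:.2f} (skewness term {skew_term:.2f}, kurtosis term {kurt_term:.2f})")
print(f"asymptotic p-value = {p_value:.4f}")
print(f"critical value at 0.005 = {critical_0005:.2f}")
```

With the rounded inputs this gives a statistic of about 11.1 rather than the reported 11.23, presumably because the post's skewness and kurtosis are rounded. The skewness term (about 8.6) is more than three times the kurtosis term (about 2.5), so on these numbers most of the test statistic comes from asymmetry.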

With these descriptives in mind, the confusion that I speak of relates to a particular feature of the distribution--the number of data points (15) that lie more than 2 standard deviations below the arithmetic mean. That is, there are 7 data points above 10.1, which is 2 standard deviations ABOVE the arithmetic mean. As noted, however, there are 15 data points that are below 7.02, which is 2 standard deviations BELOW the arithmetic mean. In a normal distribution, with skewness and excess kurtosis of 0, I understand that we'd expect about 19 (377 x .05) total data points beyond +/- 2 standard deviations, with 9 or 10 at each extreme. In this distribution, we have slightly more, with 22 total data points located more than 2 standard deviations above or below the arithmetic mean, but we also have a "non-normal" number of data points at the lower extreme.
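The expected tail counts can be made a little more precise. The post rounds the two-sided probability beyond 2 standard deviations to .05; the exact normal value is about .0455. The sketch below computes the expected count per tail and, as a rough gauge only, the binomial probability of seeing 15 or more points in one tail. It assumes the sample mean and standard deviation can be treated as the true parameters, which is an approximation because they were estimated from the same data. The helper names are illustrative.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def binomial_upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    below_k = sum(
        math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k)
    )
    return 1.0 - below_k


# Descriptives quoted in the post.
n, mean, sd = 377, 8.56, 0.77

lower_cut = mean - 2.0 * sd  # 7.02
upper_cut = mean + 2.0 * sd  # 10.10

p_tail = normal_upper_tail(2.0)  # probability in ONE tail beyond 2 sd
expected_per_tail = n * p_tail

# Rough gauge (assumes known mean/sd and independent observations):
# chance of 15 or more points in a single tail under normality.
p_15_or_more = binomial_upper_tail(n, p_tail, 15)

print(f"cutoffs: {lower_cut:.2f} and {upper_cut:.2f}")
print(f"expected per tail: {expected_per_tail:.1f}, both tails: {2 * expected_per_tail:.1f}")
print(f"P(at least 15 in one tail) = {p_15_or_more:.3f}")
```

Under these assumptions a normal distribution would put roughly 8.6 points in each tail (about 17 in total), so the 7 points in the upper tail are close to expectation while the 15 in the lower tail are well above it.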

In the discussion I reference above, I expressed the view that the normality of this distribution is most challenged by its skewness as opposed to being leptokurtic ("fat tails"). Surely, it is leptokurtic, as shown by its positive kurtosis of .4, and 3 additional data points at the extremes (22 as opposed to 19 in total), but I don't think we would have discussed this distribution if there were 11 and 11 data points +/- 2 standard deviations respectively.

So, with all of this in mind, in the most definitional sense (if not all others), are the 15 data points that lie more than 2 standard deviations below the arithmetic mean a skewness problem or a kurtosis (lepto-"fat tails") problem? Beyond that, I'd also be really interested to know what additional thoughts, if any, you have about this distribution based on the descriptives.

Thank you again.

Kimberley
 
Those 15 are a 4th-order problem in addition to contributing to a 3rd-order problem.
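One way to see why the same points bear on both orders: a point with standardized score z adds z³/n to the sample skewness and z⁴/n to the sample (non-excess) kurtosis, so the third-order term keeps the sign of z while the fourth-order term is always positive. The sketch below is illustrative only. The thread does not give the actual z-scores, so it places every tail point at exactly ±2, which is the smallest magnitude those points could have; `tail_contribution` is a hypothetical helper name.

```python
def tail_contribution(count, z, n):
    """Contribution of `count` points, all at z-score `z`, to the
    third and fourth standardized sample moments of n observations."""
    third = count * z ** 3 / n   # signed: feeds skewness
    fourth = count * z ** 4 / n  # always positive: feeds kurtosis
    return third, fourth


n = 377

# ASSUMPTION: each tail point sits exactly at +/- 2 sd (a lower bound
# on its distance from the mean; the real values are not given).
lower_third, lower_fourth = tail_contribution(15, -2.0, n)
upper_third, upper_fourth = tail_contribution(7, 2.0, n)

print(f"lower tail: skewness {lower_third:+.3f}, kurtosis {lower_fourth:+.3f}")
print(f"upper tail: skewness {upper_third:+.3f}, kurtosis {upper_fourth:+.3f}")
print(f"net from tails: skewness {lower_third + upper_third:+.3f}, "
      f"kurtosis {lower_fourth + upper_fourth:+.3f}")
```

Even at this minimum distance, the 15 lower-tail points pull the third moment negative by more than the 7 upper-tail points push it positive, and both groups add to the fourth moment. That matches the reply's point that the lower tail is a third-order and a fourth-order effect at once.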
 
