Statistics - What should I conclude about this data?

  • Thread starter: musicgold
  • Tags: Data, Statistics

Homework Help Overview

The discussion revolves around interpreting data related to earthquake magnitudes and their associated death tolls. Participants are examining the implications of outliers and the organization of the data presented in a statistics slide.

Discussion Character

  • Exploratory, Assumption checking, Conceptual clarification

Approaches and Questions Raised

  • Participants question the validity of conclusions drawn from the data, particularly regarding the relationship between earthquake magnitude and fatalities. There are discussions about the influence of population density and building codes on the data interpretation. Some suggest excluding outliers and normalizing data based on population.

Discussion Status

The conversation is ongoing, with various perspectives being explored. Some participants have offered insights into potential data normalization methods and the significance of correlation coefficients, while others express caution about drawing general conclusions from the limited dataset.

Contextual Notes

There is a mention of the data being poorly organized and the challenges posed by varying population densities and building codes in different areas. The discussion highlights the complexity of establishing meaningful correlations with a small sample size.

musicgold
Hi,

This is not really a homework question. Attached is a slide from a statistics presentation I found on the web.

I am not sure what conclusion I can draw from this data if I ignore the outlier (1906). Saying "higher magnitude earthquakes result in fewer deaths" seems totally counteractive.

Thanks.
 

[Attachment: slide from the statistics presentation (image not included)]

It does look funny, but the data are not well organized. The high magnitude low death datapoint is for an area with very sparse population, so should be excluded. A better graph would compare quakes in similar population areas with similar building codes.
 
Ok. Thanks.
 
musicgold said:
I am not sure what conclusion I can draw from this data if I ignore the outlier (1906). Saying "higher magnitude earthquakes result in fewer deaths" seems totally counteractive.

I think the word you are looking for is 'counterintuitive' rather than 'counteractive'.
 
berkeman said:
It does look funny, but the data are not well organized. The high magnitude low death datapoint is for an area with very sparse population, so should be excluded. A better graph would compare quakes in similar population areas with similar building codes.

You might be able to do a bit better than that. If you knew the population density in each area, you could normalise the data by taking the deaths as a fraction of the population. It looks like population density is influencing the numbers rather more than the severity of the earthquake is. The question is what area to take around each site. Ideally, it would be some kind of integral with respect to radius from the epicentre, something like ##\int_{r=0}\frac{\text{density}(r)}{1+k r^n}\,r\,dr##, but maybe you could just fix on a sufficiently large circle to encompass all deaths.
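
As a rough illustration of the weighted-population idea, here is a minimal Python sketch that evaluates that kind of integral numerically with the trapezoidal rule. The function name `weighted_exposure`, the cutoff radius `r_max`, the uniform density of 100 people per unit area, and the values of `k` and `n` are all assumptions made up for the example; none of them come from the slide. The integrand is taken as written in the post, with no factor of ##2\pi##.

```python
import math


def weighted_exposure(density, k, n, r_max, steps=10_000):
    """Trapezoidal estimate of the integral from 0 to r_max of
    density(r) / (1 + k * r**n) * r dr.

    r_max is an assumed cutoff: the post gives no upper limit and only
    suggests "a sufficiently large circle".
    """
    h = r_max / steps

    def integrand(r):
        return density(r) / (1.0 + k * r ** n) * r

    total = 0.5 * (integrand(0.0) + integrand(r_max))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


def uniform_density(r):
    # Hypothetical population density, constant with distance.
    return 100.0


# k = 0 switches the distance weighting off (plain population count).
flat = weighted_exposure(uniform_density, k=0.0, n=2, r_max=10.0)
# k > 0 discounts people who live far from the epicentre.
decayed = weighted_exposure(uniform_density, k=0.01, n=2, r_max=10.0)

print(f"unweighted exposure: {flat:.1f}")
print(f"distance-weighted exposure: {decayed:.1f}")
```

Dividing each quake's death toll by its own exposure figure would give a normalised rate of the kind suggested here. How to choose `k`, `n` and the density profile is left open.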
 
Also, you could investigate what size of correlation coefficient is significant, for such a small sample size.

Or even whether the notion of "correlation" is meaningful at all, with so few data points. To take a ridiculously extreme example, if you only have two data points, you will always get a correlation of +1 or -1.
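
Both points can be checked numerically. The sketch below, in plain Python, first confirms that two points with distinct x and y values always give a Pearson correlation of +1 or -1, and then estimates by simulation how large |r| has to be before it is unusual for uncorrelated data. The sample size of 7, the 5% level, the trial count, and the number pairs are assumptions chosen for illustration, not values from the slide.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Undefined (division by zero) if either sequence has zero variance.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Part 1: with only two points, r is always +1 or -1.
# The numbers are arbitrary; any two distinct points behave the same way.
two_point_cases = [
    ((1.0, 2.0), (5.0, 9.0)),
    ((6.1, 7.8), (3000.0, 12.0)),
    ((0.0, 1.0), (-4.0, -4.5)),
]
two_point_r = [pearson_r(xs, ys) for xs, ys in two_point_cases]
print("two-point correlations:", two_point_r)


# Part 2: Monte Carlo estimate of the |r| exceeded by chance only
# a fraction alpha of the time when x and y are truly independent.
def critical_r(n, alpha=0.05, trials=20_000, seed=0):
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        samples.append(abs(pearson_r(xs, ys)))
    samples.sort()
    return samples[int((1.0 - alpha) * trials)]


crit_7 = critical_r(7)
print(f"|r| needed at the 5% level for n = 7: about {crit_7:.2f}")
```

For n = 7 the simulated threshold should land near the tabulated two-sided 5% critical value of about 0.75, so a correlation much weaker than that is hard to distinguish from noise in a sample this small.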

"If you knew the density of population in each area you could normalise the data by taking the deaths as a fraction of the population."
Maybe ... but the relevant building codes would be different in a low population density rural area, compared with skyscrapers in a city center. And earthquakes don't necessarily happen where planners think they are most likely to happen.

The danger of going down this route is that you do a lot of research and end up with 7 different "stories" about 7 different events, but you still can't really draw any general conclusions because these events don't have much in common except that they were all "earthquakes".
 
