- #1

iambasil

Hello,

I'm looking at some sporting data (similar to goals in a match) and trying to figure out what distribution applies to their count per match.

Poisson is typically used in the industry to model this distribution. Against the historical events Poisson isn't too bad, but it tends to overestimate the probability of both the low counts and the high counts. In reality the observations cluster more tightly around the mean number of 'goals' per match, and the kurtosis of the pdf derived from the observations is higher than Poisson would suggest.
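To make the comparison concrete, here is a minimal Python sketch of the kind of check described above: it sets the observed share of each count against the Poisson probability at the same mean, and reports the variance-to-mean ratio (a ratio below 1 means the counts are tighter than Poisson allows). The `sample` list and the function names are made up for illustration; they are not the data in the worksheet.

```python
import math
from collections import Counter


def poisson_pmf(k, mean):
    """Poisson probability of observing exactly k events."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def compare_to_poisson(counts):
    """Return (mean, variance, rows), where each row is
    (k, observed share, Poisson share) for k from 0 to max(counts)."""
    n = len(counts)
    mean = sum(counts) / n
    # Sample variance (n - 1 in the denominator).
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    freq = Counter(counts)
    rows = [(k, freq[k] / n, poisson_pmf(k, mean))
            for k in range(max(counts) + 1)]
    return mean, variance, rows


# Hypothetical per-match counts, invented for this example only.
sample = [2, 3, 3, 2, 3, 4, 3, 2, 3, 3, 4, 2, 3, 3, 1, 3, 4, 3, 2, 3]

mean, variance, rows = compare_to_poisson(sample)
print(f"mean={mean:.2f} variance={variance:.2f} ratio={variance / mean:.2f}")
for k, observed, expected in rows:
    print(f"{k}: observed={observed:.3f} poisson={expected:.3f}")
```

For a true Poisson process the variance equals the mean, so the ratio is a quick one-number summary of how far the data sit from Poisson before looking at the individual counts.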

I've attached a worksheet (zipped, as it was 125k) with the data and my analysis. The conclusion above is based on four years' worth of data (804 events). In the attachment I also break this down by year, and the same pattern of Poisson error holds in each individual year.

What I would really like is to learn how to create a pdf based on a transformed Poisson, where the transformation reflects what I've learned from the historical data. I'll be applying the distribution to various other scenarios with different criteria (e.g. E(x) is very different when it is raining), so a transform of Poisson would be more useful to me than simply using the pdf derived from historical observations around a single mean.
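As an illustration of what such a 'transformed' Poisson could look like, the sketch below uses the Conway-Maxwell-Poisson family, where P(X = k) is proportional to lam^k / (k!)^nu. This is only one candidate and an assumption on my part, not something fitted to the attached data: nu = 1 gives the ordinary Poisson and nu > 1 gives a distribution that is tighter around its centre. The function names, the truncation at `max_k`, and the parameter values are all hypothetical.

```python
import math


def com_poisson_pmf(lam, nu, max_k=100):
    """Return [P(X=0), ..., P(X=max_k)] for weights lam**k / (k!)**nu,
    normalised over the truncated range 0..max_k."""
    # Work in logs so that k! does not overflow for large k.
    log_w = [k * math.log(lam) - nu * math.lgamma(k + 1)
             for k in range(max_k + 1)]
    peak = max(log_w)
    weights = [math.exp(v - peak) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_variance(pmf):
    """Mean and variance of a pmf given as a list indexed by k."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return mean, var


# Illustrative parameter values only.
poisson_like = com_poisson_pmf(lam=2.8, nu=1.0)  # nu = 1: ordinary Poisson
tighter = com_poisson_pmf(lam=2.8 ** 2, nu=2.0)  # nu > 1: narrower spread

m1, v1 = mean_and_variance(poisson_like)
m2, v2 = mean_and_variance(tighter)
print(f"nu=1: mean={m1:.3f} variance={v1:.3f}")
print(f"nu=2: mean={m2:.3f} variance={v2:.3f}")
```

Under this assumption, one way to reuse the shape across scenarios would be to estimate nu once from the historical data and then adjust lam so that the mean matches the expected count for each scenario (rain, for example). Note that lam is not itself the mean once nu differs from 1, so it would have to be solved for numerically.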

It was many, many years ago that I was a student, but I'm working on shaking off the rust. Forgive me if I've misused words such as 'transform'. I really appreciate your help!

Many thanks,

Basil
