- #1
LightbulbSun
I had a debate tonight with someone about winning projections in baseball. I'll try to condense this as much as possible so it isn't a long read.
Basically, I proposed that if I were MLB Commissioner I would do away with divisions, go back to a plain AL-NL league format, and have the top 4 winning percentages in each league clinch a postseason berth and play the current three-round format.
To balance things out under this new system, I would also even out the scheduling disparity by having every AL team face every other AL team, and every NL team face every other NL team, an equal number of times (12 games each).
The dispute came up when I said the 2006 St. Louis Cardinals (83-78), who were World Series Champions that year, would have been eliminated from the postseason under my system. He argued that I can't assume that because of the new scheduling, and he said their record would improve under the new scheduling, enough to make the postseason under my system.
To prove his point, he took St. Louis' winning percentages against each NL division from that year and extrapolated them under my scheduling system. It looked like this:
[(.516)(5) + (.482)(6) + (.677)(5)] / 16
5 teams in the East, 6 in the Central, and 5 in the West. The 16 is the total number of teams in the National League.
He said that got him 90 wins, which would have put them in the playoffs under my system, but there are some flaws to this method.
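For anyone who wants to check the arithmetic, here is a minimal sketch of his calculation, read as a team-count-weighted average of the three division winning percentages. That reading is my assumption, as is the 162-game season length used to turn the percentage into wins; the function and variable names are made up for illustration.

```python
# Division winning percentages and team counts as quoted in the post.
division_wpct = {"East": 0.516, "Central": 0.482, "West": 0.677}
division_teams = {"East": 5, "Central": 6, "West": 5}


def weighted_wpct(wpct, teams):
    """Average the division percentages, weighting each by its team count."""
    total_teams = sum(teams.values())
    return sum(wpct[d] * teams[d] for d in wpct) / total_teams


def projected_wins(wpct, games=162):
    """Convert a winning percentage to wins (162 games is an assumption)."""
    return wpct * games


blended = weighted_wpct(division_wpct, division_teams)
print(f"blended WPct: {blended:.4f}")
print(f"projected wins: {projected_wins(blended):.1f}")
```

Under those assumptions the blended percentage comes out a little under .554, which rounds to 90 wins over 162 games, so the sketch at least matches the total he quoted.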
First of all, the disparity in games played against each division is huge: the Central total is nearly triple that of either other division. In 2006, the Cards played 81 games against the Central, 34 against the West and 31 against the East. Under my scheduling system, everything is pretty much neutralized. The Cards would play 60 games against the East, 72 against the Central and 60 against the West. So we're roughly doubling the sample size for two divisions, while the Central sample size is slightly reduced.
So you would expect the .677 WPct against the West to regress over a much larger sample. You wouldn't expect as much change in the Central record (.482), since it came from a larger sample. An 81-game sample isn't that large either, but it's much larger than a 34-game sample.
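To put a rough number on the sample-size point, here is a small sketch of how much a winning percentage can wobble by chance over samples of these sizes. It assumes each game is an independent coin flip with a fixed win probability (a simple binomial model, which real baseball only approximates), and the .500 "true" winning percentage is a hypothetical value picked for illustration, not anything measured.

```python
import math


def wpct_standard_error(true_wpct, games):
    """Standard deviation of an observed winning percentage over `games`
    games, assuming independent games with a fixed win probability."""
    return math.sqrt(true_wpct * (1 - true_wpct) / games)


# Games played against each division in 2006, as quoted in the post.
games_2006 = {"West": 34, "East": 31, "Central": 81}

for division, games in games_2006.items():
    se = wpct_standard_error(0.500, games)  # .500 is a hypothetical true WPct
    print(f"{division}: {games} games, standard error about {se:.3f}")
```

Under this model the chance variation over 34 games is noticeably larger than over 81 games, which is the sense in which the West record is the shakier of the two numbers.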
So I told him you have to expect a regression for the West record, which is why the 90-win total is skewed: he's assuming that win percentage stays static despite doubling the sample size against the West. He then said that you would also have to expect an increase in win percentage against the Central, and I don't see why he assumes that.
Who was more correct in their methodology? My conclusion was that even under my scheduling system, the Cards' record wouldn't have changed all that much. He's saying it improves their record by 7 wins.