Hi All,

I am doing a small data project that consists of classifying baseball players as either overvalued or undervalued. I have two valuations, V1 and V2, for each player, but they are in different "currencies", and I am trying to work out how to express both in the same currency. I have been going over M. Lewis' book "Moneyball", but I don't want to copy his ideas (or, more accurately, the ideas he describes in the book).

The first valuation, V1, is based on salary alone: the player's average salary over the previous 3 years. The second, V2, is a weighted sum of player statistics and has units of "runs". The statistics I am considering are batting average, OBA, number of HRs, HR per at-bat, etc. The key idea is that baseball is about runs: a team wins a game by scoring more runs than its opponent, so the statistics that correlate highly with run scoring or run prevention (preventing the opposing team from scoring runs) get a high weight in the player's value V2. I then want to compute the ratio V1/V2 of the two indices. But I want this ratio to be unit-free, meaning both valuations must be expressed in the same units. Unfortunately, V2 is in "runs", so I want to transform it in a reasonable way into $, which are the units of V1.
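To make the V2 index concrete, here is a minimal sketch of the weighted sum. The weights and the player line are made-up numbers for illustration only (they are not fitted to any data), and each weight is assumed to be expressed in "runs per unit of the statistic" so that the sum comes out in runs.

```python
# Hypothetical weights, in runs per unit of each statistic (illustrative only).
weights = {"avg": 100.0, "oba": 150.0, "hr": 1.4, "hr_per_ab": 200.0}

# Hypothetical season line for one player.
player_stats = {"avg": 0.280, "oba": 0.350, "hr": 25, "hr_per_ab": 0.05}

def run_value(stats, weights):
    """Weighted sum of a player's statistics: V2, in runs."""
    return sum(weights[name] * stats[name] for name in weights)

v2 = run_value(player_stats, weights)
print(f"V2 = {v2:.1f} runs")
```

In practice the weights would come from how strongly each statistic relates to runs scored or prevented, for example from a regression of team runs on team statistics.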

My idea was to transform the second score (the weighted sum of selected statistics) into the units of the first, calling the result V2', i.e., to monetize the weighted-sum index by regressing V1 against V2 and using the fitted line to convert runs into dollars (assuming the regression is significant, i.e., that we are confident enough that the slope of the regression line is not 0).
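A minimal sketch of this regression idea, using only the Python standard library. The five data points are invented for illustration, salaries are assumed to be in $ millions, and the helper name fit_line is my own. It fits V1 = a + b*V2 by ordinary least squares, reports the t-statistic of the slope, and takes the fitted values as V2'.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x.

    Returns (a, b, t), where t is the t-statistic of the slope b.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual sum of squares and standard error of the slope.
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(ssr / (n - 2) / sxx)
    t = b / se_b if se_b > 0 else math.inf
    return a, b, t

# Hypothetical data: V2 in runs, V1 in $ millions (3-year average salary).
v2_runs = [20.0, 45.0, 60.0, 80.0, 95.0]
v1_salary = [1.0, 2.5, 2.8, 4.6, 4.9]

a, b, t = fit_line(v2_runs, v1_salary)

# V2' is the salary the fitted line assigns to each player's run value.
v2_prime = [a + b * r for r in v2_runs]

print(f"dollars per run (slope): {b:.4f}, t-statistic: {t:.2f}")
```

The t-statistic would be compared against a t distribution with n - 2 degrees of freedom to judge whether the slope differs from 0. Note that with an intercept in the model, V1/V2' > 1 is the same as a positive residual whenever V2' is positive, so the ratio test amounts to asking whether a player sits above or below the regression line.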

Does this regression idea make sense? If you are not familiar with Baseball, I think we can do something very similar with Soccer.

Ultimately, the decision on how fairly a player is valued would be given by:

1) If V1/V2' > 1, then the player is overvalued.

2) If V1/V2' = 1, then the player is accurately valued.

3) If V1/V2' < 1, then the player is undervalued.
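The three cases above could be coded as follows. This is only a sketch: the tolerance band tol is my own addition (not part of the rule as stated), included because a ratio computed from real data will almost never equal 1 exactly. Setting tol=0 recovers the rule as written.

```python
def classify(v1, v2_prime, tol=0.05):
    """Classify a player from salary v1 and monetized run value v2_prime.

    tol is a hypothetical tolerance around 1 for "accurately valued".
    """
    if v2_prime <= 0:
        raise ValueError("V2' must be positive for the ratio to be meaningful")
    ratio = v1 / v2_prime
    if ratio > 1 + tol:
        return "overvalued"
    if ratio < 1 - tol:
        return "undervalued"
    return "accurately valued"

# Hypothetical values, both in the same $ units.
print(classify(4.6, 4.0))
print(classify(2.8, 3.1))
```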

Any ideas on how to monetize the index V2 into V2', so that the quotient V1/V2' is unit-free?

So far I have thought of regressing one index against the other.
