Correlation to Winning question

  • Context: MHB
  • Thread starter: vballer
  • Tags: Correlation
SUMMARY

This discussion focuses on analyzing volleyball statistics to determine which metrics best predict winning a game. The metrics in the sample are Points (Pts), Errors (Err), Differential (Diff), and Efficiency Percentage (EFF%). In the sample data, EFF% is the best predictor: values of 0.25 or more go with a win and values of 0.2 or less with a loss, leaving the range between 0.2 and 0.25 undetermined. Diff also separates wins from losses perfectly in the sample, though its undetermined range is relatively wider.

PREREQUISITES
  • Understanding of statistical correlation and regression analysis
  • Familiarity with volleyball statistics and their significance
  • Knowledge of data analysis tools such as Python or R
  • Experience with data visualization techniques to interpret results
NEXT STEPS
  • Learn how to perform correlation analysis using Python's Pandas library
  • Explore logistic regression techniques to predict binary outcomes
  • Investigate data visualization tools like Matplotlib or Seaborn for better insights
  • Study the concept of confidence intervals to assess prediction reliability
USEFUL FOR

Data analysts, sports statisticians, volleyball coaches, and anyone interested in using statistical methods to improve game performance and strategy.

vballer
I have a bunch of data for volleyball and I am trying to figure out how correlated certain stats are to winning a game. Here is a small example of the data set.

[TABLE="width: 500"]
[TR][TD]Pts[/TD][TD]Err[/TD][TD]Diff[/TD][TD]EFF%[/TD][TD]Win[/TD][/TR]
[TR][TD]21[/TD][TD]10[/TD][TD]11[/TD][TD].360[/TD][TD]1[/TD][/TR]
[TR][TD]21[/TD][TD]10[/TD][TD]11[/TD][TD].350[/TD][TD]1[/TD][/TR]
[TR][TD]18[/TD][TD]11[/TD][TD]7[/TD][TD].250[/TD][TD]1[/TD][/TR]
[TR][TD]14[/TD][TD]6[/TD][TD]8[/TD][TD].280[/TD][TD]1[/TD][/TR]
[TR][TD]19[/TD][TD]10[/TD][TD]9[/TD][TD].380[/TD][TD]1[/TD][/TR]
[TR][TD]17[/TD][TD]6[/TD][TD]11[/TD][TD].300[/TD][TD]1[/TD][/TR]
[TR][TD]12[/TD][TD]9[/TD][TD]3[/TD][TD].200[/TD][TD]0[/TD][/TR]
[TR][TD]14[/TD][TD]10[/TD][TD]4[/TD][TD].100[/TD][TD]0[/TD][/TR]
[TR][TD]11[/TD][TD]8[/TD][TD]3[/TD][TD].050[/TD][TD]0[/TD][/TR]
[/TABLE]

I am trying to determine which of these stats are most correlated with winning (the last column), where a win is recorded as 1 and a loss as 0. In addition to identifying the most useful stats for predicting a win, I would like to know what level each of the first four columns needs to reach to produce a win at a given confidence level.

Any help with this is greatly appreciated.

Thanks
Jamie
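
As a rough illustration of the "most correlated" part of the question, here is a minimal sketch that computes the Pearson correlation between each stat and the 0/1 Win column (with a binary column this is the point-biserial correlation). It uses only the nine sample rows from the post; the names `ROWS`, `COLUMNS`, and `pearson` are made up for this example, and nine rows are far too few to draw firm conclusions.

```python
from math import sqrt

# Sample rows from the post: (Pts, Err, Diff, EFF%, Win)
ROWS = [
    (21, 10, 11, 0.360, 1),
    (21, 10, 11, 0.350, 1),
    (18, 11, 7, 0.250, 1),
    (14, 6, 8, 0.280, 1),
    (19, 10, 9, 0.380, 1),
    (17, 6, 11, 0.300, 1),
    (12, 9, 3, 0.200, 0),
    (14, 10, 4, 0.100, 0),
    (11, 8, 3, 0.050, 0),
]
COLUMNS = ["Pts", "Err", "Diff", "EFF%"]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


wins = [row[4] for row in ROWS]
correlations = {
    name: pearson([row[i] for row in ROWS], wins)
    for i, name in enumerate(COLUMNS)
}

# Print the stats from most to least strongly correlated with Win.
for name, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:5s} r = {r:+.3f}")
```

On these nine rows Diff and EFF% come out close together with the strongest correlations, and Err is near zero. A high correlation is not the same thing as a usable win/loss threshold, so this ranking is only a first look.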
 
I'm not sure that you strictly mean "correlated". The best predictor of winning for this data is EFF%: EFF% >= 0.25 predicts a win, EFF% <= 0.2 predicts a loss, and the range between 0.2 and 0.25 is a no-man's land.

Diff is also a perfect predictor for this data (Diff >= 7 for every win, Diff <= 4 for every loss), but its no-man's land is relatively wider.
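
One way the thresholds and the "no-man's land" above could be read off the sample is sketched below. For each stat it takes the highest value seen in a loss and the lowest value seen in a win; if the first is below the second, the stat separates the sample perfectly and the gap between them is the undetermined range. The sketch assumes higher values favor a win, and "relative" width is taken here as the gap divided by the stat's observed range, which is only one possible reading of the remark. The names `ROWS`, `COLUMNS`, `separation`, and `results` are hypothetical.

```python
# Sample rows from the original post: (Pts, Err, Diff, EFF%, Win)
ROWS = [
    (21, 10, 11, 0.360, 1),
    (21, 10, 11, 0.350, 1),
    (18, 11, 7, 0.250, 1),
    (14, 6, 8, 0.280, 1),
    (19, 10, 9, 0.380, 1),
    (17, 6, 11, 0.300, 1),
    (12, 9, 3, 0.200, 0),
    (14, 10, 4, 0.100, 0),
    (11, 8, 3, 0.050, 0),
]
COLUMNS = ["Pts", "Err", "Diff", "EFF%"]


def separation(values, wins):
    """Return (highest value among losses, lowest value among wins)."""
    loss_max = max(v for v, w in zip(values, wins) if w == 0)
    win_min = min(v for v, w in zip(values, wins) if w == 1)
    return loss_max, win_min


wins = [row[4] for row in ROWS]
results = {}
for i, name in enumerate(COLUMNS):
    values = [row[i] for row in ROWS]
    loss_max, win_min = separation(values, wins)
    if loss_max < win_min:
        # Width of the no-man's land relative to the stat's observed range.
        relative_gap = (win_min - loss_max) / (max(values) - min(values))
    else:
        relative_gap = None  # wins and losses overlap; no clean threshold
    results[name] = (loss_max, win_min, relative_gap)

for name, (loss_max, win_min, relative_gap) in results.items():
    if relative_gap is None:
        print(f"{name}: wins and losses overlap")
    else:
        print(f"{name}: loss <= {loss_max}, win >= {win_min}, "
              f"relative gap {relative_gap:.2f}")
```

On the sample, only EFF% and Diff separate wins from losses. The Diff gap (4 to 7) covers 3/8 of its observed range, while the EFF% gap (0.2 to 0.25) covers about 0.15 of its range, which matches the "relatively wider" remark.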

Without knowing what the real question is, there is little more worth saying.

CB
 
Let me try this again with the actual data set attached.

Most importantly, I am not looking for the actual answer so much as how to derive it. When you say "predictor", how did you determine this?

To recap, I have 7 columns of stats, and the 8th column is 1 if the game was won and 0 if it was lost. How do I determine which columns are the best predictors of a win?

Thanks

View attachment 148
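
The attached data set is not reproduced in the thread, so the following is only a sketch of one common way to approach the "how to derive it" question: a one-variable logistic regression, fitted here on the nine EFF% values from the first post. It estimates a win probability as a function of the stat and can be inverted to ask what level gives, say, a 90% estimated win probability. Everything in it is an assumption for illustration: the function names, the plain gradient descent, and the small L2 penalty `l2=0.01`, which is needed because the sample is perfectly separated and an unpenalized fit would have no finite solution. The numbers it produces depend on that penalty and should not be read as real confidence levels.

```python
from math import exp, log

# EFF% and Win columns from the sample in the first post.
EFF = [0.360, 0.350, 0.250, 0.280, 0.380, 0.300, 0.200, 0.100, 0.050]
WIN = [1, 1, 1, 1, 1, 1, 0, 0, 0]


def sigmoid(t):
    return 1.0 / (1.0 + exp(-t))


def fit_logistic(xs, ys, l2=0.01, lr=0.5, steps=5000):
    """Fit p(win) = sigmoid(slope * x + intercept) by gradient descent.

    The stat is standardized internally; l2 penalizes the weight only.
    """
    n = len(xs)
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    zs = [(x - mean) / std for x in xs]
    w = b = 0.0
    for _ in range(steps):
        errs = [sigmoid(w * z + b) - y for z, y in zip(zs, ys)]
        grad_w = sum(e * z for e, z in zip(errs, zs)) / n + l2 * w
        grad_b = sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # Convert the standardized coefficients back to raw units.
    return w / std, b - w * mean / std


slope, intercept = fit_logistic(EFF, WIN)


def win_probability(eff):
    """Estimated probability of a win at a given EFF%."""
    return sigmoid(slope * eff + intercept)


def eff_for_probability(p):
    """EFF% at which the fitted model gives win probability p."""
    return (log(p / (1.0 - p)) - intercept) / slope


print(f"P(win | EFF% = 0.30) = {win_probability(0.30):.2f}")
print(f"EFF% for 90% estimated win probability: {eff_for_probability(0.9):.3f}")
```

With all 7 stats, the same idea extends to one weight per column, and comparing the fitted models (or single-stat fits like this one) on held-out games is one way to judge which stats predict best. A statistics package would normally be used for that instead of hand-written gradient descent.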
 
