# What proportion of total variation is accounted for by explained variation?

In summary, the thread discusses how to calculate explained variation in a study of the relationship between "emotional stability" and performance in college. The information provided is the mean and standard deviation of "emotional stability" and of the college average, the sample size, and the Pearson r value. One reply suggests ANOVA methods, but these do not apply here because nothing is known about the subjects being divided into groups. The other replies point to the correlation coefficient: for a linear regression, the proportion of total variation that is explained is r^2, which is 0.5^2 = 0.25 for this study.
scolty
Hi, I've come across a question in a stats book which asks the following:

Q: A study was undertaken to find the relationship between "emotional stability" and performance in college. The following results were obtained:
Emotional stability, Mean = 49, Standard Dev = 12
College Average, Mean = 1.35, Standard Dev = 0.5
pearson r = 0.5
n = 60

What proportion of total variation is accounted for by explained variation?

As far as I was aware, I need to know the actual values in order to calculate this (i.e. the summation of (Yprime - Ymean)^2), but the above is the only information supplied. Have I missed something? If so, I'd appreciate it if someone could point it out. Thanks.

Partitioning total variance among different sources is accomplished by Analysis of Variance (ANOVA) methods. In your case, there is only one named predictor variable, so you would partition the total variance between "emotional stability" and "other". ANOVA also compares variance within groups with variance between groups.

http://www.sjsu.edu/faculty/gerstman/StatPrimer/anova-a.pdf

scolty said:
As far as I was aware, I need to know the actual values in order to calculate this (i.e. the summation of (Yprime - Ymean)^2), but the above is the only information supplied. Have I missed something? If so, I'd appreciate it if someone could point it out. Thanks.

Wikipedia currently has an interesting article on "explained variation" and mentions that some people consider it to be the square of the correlation coefficient. You could find the square of the correlation coefficient, since you know r.

I don't think you'll make any progress trying to use ANOVA calculations on the given information, since you don't have any information about the subjects being divided into groups.

100r^2 is the percentage of total variance explained by linear regression. With r = 0.5, that is 100 × 0.25 = 25%.
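As a rough sketch, the arithmetic for the figures in the question can be written out as follows. The variable names are made up for illustration, and the sums of squares assume the quoted standard deviation uses the n - 1 denominator, which the question does not state. The proportion itself does not depend on that assumption.

```python
# Summary statistics quoted in the question.
r = 0.5        # Pearson correlation
n = 60         # sample size
sd_y = 0.5     # standard deviation of the college average

# Proportion of total variation explained by the linear regression.
proportion_explained = r ** 2

# Assumption: sd_y is a sample SD with an (n - 1) denominator.
ss_total = (n - 1) * sd_y ** 2
ss_explained = proportion_explained * ss_total
ss_unexplained = ss_total - ss_explained

print(f"proportion explained = {proportion_explained:.2f}")
print(f"SS total = {ss_total:.4f}")
print(f"SS explained = {ss_explained:.4f}")
print(f"SS unexplained = {ss_unexplained:.4f}")
```

Only r is needed for the proportion; the standard deviation and n come in only if the sums of squares themselves are wanted.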

I understand your confusion about the proportion of total variation being accounted for by explained variation. In this context, explained variation refers to the variation in the dependent variable (college performance) that can be explained by the independent variable (emotional stability). The proportion of total variation accounted for by explained variation is also known as the coefficient of determination, or r-squared.
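For reference, here is a short derivation of why this ratio equals r^2 in simple linear regression. The notation is introduced here for illustration and does not come from the thread: b is the least-squares slope, s_X and s_Y are the sample standard deviations (n - 1 denominator assumed; the result is the same with n), and \hat{Y}_i is the predicted value, the "Yprime" of the question.

```latex
\hat{Y}_i - \bar{Y} = b\,(X_i - \bar{X}), \qquad b = r\,\frac{s_Y}{s_X}

\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2
  = b^2 \sum_{i=1}^{n} (X_i - \bar{X})^2
  = r^2\,\frac{s_Y^2}{s_X^2}\,(n-1)\,s_X^2
  = r^2\,(n-1)\,s_Y^2
  = r^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2
```

Dividing both sides by the total sum of squares gives explained variation / total variation = r^2.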

This value can be calculated from the information given. In simple linear regression, the ratio of the explained sum of squares to the total sum of squares equals the square of the Pearson correlation coefficient, so the individual observations are not needed. With r = 0.5, the proportion is 0.5^2 = 0.25, i.e. 25% of the variation in college average is accounted for by emotional stability.

It is also worth noting that a Pearson correlation coefficient of 0.5 does not indicate a particularly strong relationship between emotional stability and college performance: the remaining 75% of the variation in college performance is not accounted for by the linear relationship.

In summary, the proportion of total variation accounted for by explained variation in this study is r^2 = 0.25. The means, standard deviations, and sample size are not needed to obtain this ratio.

## 1. What is the "proportion of total variation"?

The proportion of total variation refers to the fraction of the variability in a response variable that can be explained by a particular factor or predictor variable. It is a statistical measure used to assess the strength of the relationship between variables.

## 2. How is the "proportion of total variation" calculated?

The proportion of total variation is calculated by dividing the explained variation (sum of squared differences between the predicted values and the mean) by the total variation (sum of squared differences between the observed values and the mean).
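A minimal sketch of this calculation, using a small data set invented purely for illustration (the thread supplies no raw data). It fits a least-squares line by hand, forms the two sums of squares, and compares their ratio with the squared Pearson correlation.

```python
from math import sqrt

# Hypothetical data, invented for illustration only.
x = [35, 42, 48, 50, 55, 61, 66]          # e.g. emotional stability scores
y = [1.0, 1.1, 1.6, 1.2, 1.5, 1.4, 1.9]   # e.g. college averages

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)   # total variation
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares line and its predicted values.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Explained variation: squared differences between predictions and the mean.
ss_explained = sum((pi - mean_y) ** 2 for pi in predicted)
proportion = ss_explained / syy

# Pearson correlation, for comparison.
r = sxy / sqrt(sxx * syy)

print(f"explained / total = {proportion:.6f}")
print(f"r squared         = {r ** 2:.6f}")
```

The two printed values agree up to floating-point rounding, which is why the proportion can be obtained from r alone when the raw data are not available.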

## 3. Why is it important to know the "proportion of total variation"?

Knowing the proportion of total variation can help us understand how much of the variability in a data set can be attributed to a specific factor or variable. This can provide insight into the strength of the relationship between variables and help in making informed decisions.

## 4. What does a high "proportion of total variation" indicate?

A high proportion of total variation indicates a strong linear relationship between the variables being analyzed. This means that a large portion of the variability in the data can be explained by the factor or variable being studied.

## 5. Can the "proportion of total variation" be negative?

No, the proportion of total variation cannot be negative, as it is a ratio of two sums of squares, neither of which can be negative. For a least-squares linear regression it is always between 0 and 1 inclusive, with higher values indicating a stronger relationship between the variables.