Covariance between data stats problem

Summary: the original poster is studying statistics in class and is using a formula for the covariance between two sets of data. They ask how the covariance changes when every element of one set is multiplied by a constant. They are unsure whether they have handled the summation notation correctly, and a reply confirms that their reasoning is valid.
  • #1
danago
Hi. At the moment in class we are going over statistics :yuck:

Anyway, the formula I've been using for covariance between two sets of data is:

[tex]
s_{xy} = \frac{1}{n}\sum\limits_{i = 1}^n {x_i y_i } - \overline x \overline y
[/tex]


Now, if I were to get a question such as:

"If all the elements of set 'x' are multiplied by 'a', what is the new covariance?"

Would this be valid in mathematical terms:

[tex]
\begin{array}{c}
s_{(ax)y} = \frac{1}{n}(ax_1 y_1 + ax_2 y_2 + \cdots + ax_n y_n ) - a\overline x \, \overline y \\
= a\left[\frac{1}{n}(x_1 y_1 + x_2 y_2 + \cdots + x_n y_n ) - \overline x \, \overline y \right] \\
= a\left[\frac{1}{n}\sum\limits_{i = 1}^n {x_i y_i } - \overline x \, \overline y \right] \\
= a\,s_{xy} \\
\end{array}
[/tex]

Therefore, if the elements of one set of data are multiplied by a constant, is the covariance also changed by that same factor? Has my working been valid in terms of the summation notation? The reason I ask is that we've never worked much with summations, so I am not 100% sure how to deal with them.

Thanks,
Dan.
 
  • #2
Your reasoning is fine. Summations are just sums.
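
For reference, the same argument can be sketched entirely in summation notation. The label s_{(ax)y} for the covariance after scaling is notation assumed here for illustration. The only facts used are that a constant factor can be pulled out of a sum and that the mean of the scaled data is a times the original mean:

[tex]
\begin{array}{l}
\overline{ax} = \frac{1}{n}\sum\limits_{i = 1}^n a x_i = a\,\frac{1}{n}\sum\limits_{i = 1}^n x_i = a\,\overline x \\
s_{(ax)y} = \frac{1}{n}\sum\limits_{i = 1}^n (a x_i) y_i - \overline{ax}\,\overline y = a\,\frac{1}{n}\sum\limits_{i = 1}^n x_i y_i - a\,\overline x\,\overline y = a\,s_{xy} \\
\end{array}
[/tex]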
 
  • #3
Alright, thanks for the confirmation. I just thought that maybe I was overlooking something and had possibly made a mathematical error.

Thanks again.
Dan.
 

1. What is covariance and how is it calculated?

Covariance measures how two variables vary together. It is calculated by multiplying, for each data point, the deviations of the two variables from their respective means, summing those products, and dividing by the number of data points (or by n − 1 for the sample covariance).
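
As a rough illustration, the sketch below computes the covariance (dividing by n, as in the formula used in the thread) in two ways: from the deviations-from-the-mean definition, and from the shortcut form (1/n)Σx_i y_i − x̄ȳ. The function names and the sample data are made up for this example.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def covariance_definition(xs, ys):
    """Average product of deviations from the means (divides by n)."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)


def covariance_shortcut(xs, ys):
    """(1/n) * sum(x_i * y_i) - mean(x) * mean(y), the form used in the thread."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs) - mean(xs) * mean(ys)


# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 9.0]

print(covariance_definition(xs, ys))  # 2.75
print(covariance_shortcut(xs, ys))    # 2.75
```

Note that some libraries divide by n − 1 by default (NumPy's `numpy.cov` does, for example), so their results can differ from the divide-by-n values here.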

2. How is covariance interpreted?

A positive covariance indicates that the two variables have a direct relationship, meaning that when one variable increases, the other tends to increase as well. A negative covariance indicates an inverse relationship, meaning that when one variable increases, the other tends to decrease.

3. What is the difference between covariance and correlation?

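Covariance indicates the direction of the linear relationship between two variables, but its magnitude depends on the units of the data. Correlation is the covariance divided by the product of the two standard deviations, so it is unitless, always lies between −1 and 1, and measures both the strength and the direction of the linear relationship.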
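
The difference shows up directly in the scaling question from the thread. The sketch below, with made-up data and a hypothetical factor a = 3, multiplies every x value by a: the covariance is multiplied by a, while the correlation is unchanged (for a positive a).

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Covariance dividing by n, as in the thread's formula."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs) - mean(xs) * mean(ys)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical sample data and scale factor, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 9.0]
a = 3.0
scaled_xs = [a * x for x in xs]

print(covariance(xs, ys), covariance(scaled_xs, ys))    # second value is a times the first
print(correlation(xs, ys), correlation(scaled_xs, ys))  # the two values agree
```

For a negative a the correlation keeps its size but changes sign, because the standard deviation of the scaled data is |a| times the original.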

4. How is covariance used in data analysis?

Covariance is used to gauge whether two variables tend to move together linearly, and in which direction. Because its magnitude depends on the units of the data, comparing the strength of relationships across different pairs of variables is better done with correlation. Covariances also feed into regression and other multivariate methods.

5. What are the limitations of using covariance?

Covariance does not indicate causation between variables, and it can be influenced by outliers in the data. Additionally, it can be difficult to interpret the magnitude of covariance as it is not standardized. Therefore, it is important to use covariance in conjunction with other statistical measures to fully understand the relationship between variables.
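
To make the outlier sensitivity concrete, here is a small sketch with made-up data: five points with a clearly negative covariance, to which a single extreme point is added. The data values and the function name are assumptions for illustration.

```python
def covariance(xs, ys):
    """Covariance dividing by n: (1/n) * sum(x_i * y_i) - mean(x) * mean(y)."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - (sum(xs) / n) * (sum(ys) / n)


# Hypothetical data: y decreases as x increases.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 4.0, 3.0, 2.0, 1.0]

# The same data with one extreme point (20, 20) appended.
xs_outlier = xs + [20.0]
ys_outlier = ys + [20.0]

print(covariance(xs, ys))                  # negative
print(covariance(xs_outlier, ys_outlier))  # positive: one point flips the sign
```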
