Calculating Covariance in Bivariate Normal Distribution - Step by Step Guide

  • Thread starter jimmy1
  • Tags
    Covariance
In summary, the thread discusses how to find the covariance or correlation between two dependent, normally distributed random variables, X and Y, when only their means and variances are known. The covariance formula Cov(X,Y) = E(XY) - E(X)E(Y) requires E(XY), but computing E(XY) from the bivariate normal distribution requires the correlation coefficient, which is the very parameter being sought. The thread considers ways of obtaining the correlation, including estimating it from empirical data and using copulas. It concludes that using any multivariate distribution requires knowing the correlation or covariance between the random variables, and that this must come either from empirical data or from given theoretical information.
  • #1
jimmy1
I have 2 normally distributed dependent random variables, X and Y, and I have the mean and variance of both of them, and I want to find the covariance (or correlation) between X and Y.

Now the formula for the covariance is Cov(X,Y) = E(XY) - E(X)E(Y). So I tried to calculate E(XY) via the bivariate normal distribution, but it seems that to use the bivariate normal I need to provide the correlation coefficient as a parameter, and this is the very parameter that I'm trying to find.

So how would I find an expression for the covariance of X and Y? To find E(XY), it seems you need the joint distribution P(X,Y), but to use this bivariate distribution you need to provide the covariance (or correlation coefficient). So how do I get around this problem?
 
  • #2
You can estimate the covariance numerically by taking paired observations of the two variables and multiplying the values in each pair. The sample mean of these products is an estimate of E(XY). For example, let X = daily humidity and Y = daily temperature. If I measure humidity and temperature over 100 days (or at 100 locations), I will have 100 ordered pairs (X,Y), from which I can estimate E(XY).
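As a rough sketch of this empirical approach, the snippet below estimates Cov(X,Y) from paired observations using the sample version of E(XY) - E(X)E(Y). The data are simulated with NumPy from a distribution whose covariance (0.6) is an assumed value chosen only so the estimate can be checked; real measurements would take the place of `xy`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired observations (for example humidity and temperature
# measured on the same days). They are simulated here with a known
# covariance of 0.6 so that the estimate can be compared with the truth.
true_cov = np.array([[1.0, 0.6],
                     [0.6, 1.0]])
xy = rng.multivariate_normal(mean=[2.0, 5.0], cov=true_cov, size=100_000)
x, y = xy[:, 0], xy[:, 1]

# Sample analogue of Cov(X,Y) = E(XY) - E(X)E(Y)
e_xy = np.mean(x * y)
cov_hat = e_xy - x.mean() * y.mean()
print(cov_hat)
```

With only 100 pairs, as in the humidity example, the same calculation applies but the estimate is noticeably noisier.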
 
  • #3
Yes, with observations from the two distributions I could calculate E(XY), but I need an expression for the covariance of X and Y, not an empirical result. That is, given that X and Y are both normally distributed with means (m1, m2) and standard deviations (s1, s2), how do I find the covariance of X and Y when they are dependent?

What really confuses me is that for all these bivariate distributions you need to supply a correlation coefficient or some sort of covariance parameter, yet there seems to be no way of actually obtaining these covariances for dependent variables.

So how does one actually use any of these bivariate distributions if it's not possible to theoretically get the covariances??
 
  • #4
If you think that two variables are jointly distributed but you only know the marginal distributions, the simplest way to obtain the joint distribution is to estimate the covariance empirically. Other than that, there are copulas: http://en.wikipedia.org/wiki/Copula_(statistics)
 
  • #5
jimmy1 said:
I have 2 normally distributed dependent random variables, X and Y, and I have the mean and variance of both of them, and I want to find the covariance (or correlation) between X and Y.

Now the formula for the covariance is Cov(X,Y) = E(XY) - E(X)E(Y). So I tried to calculate E(XY) via the bivariate normal distribution, but it seems that to use the bivariate normal I need to provide the correlation coefficient as a parameter, and this is the very parameter that I'm trying to find.

So how would I find an expression for the covariance of X and Y? To find E(XY), it seems you need the joint distribution P(X,Y), but to use this bivariate distribution you need to provide the covariance (or correlation coefficient). So how do I get around this problem?

When dealing with the theoretical distribution only (no empirical data), there is no way around the "problem". A bivariate normal distribution is defined by the means, the variances, and the covariance (or an equivalent alternative, such as the correlation). If you don't have one of these, the distribution remains undefined.
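A small simulation may make this concrete. The sketch below fixes the marginal means and standard deviations (the values of `m1`, `m2`, `s1`, `s2` are assumed for illustration, not taken from the thread) and draws from bivariate normals with three different correlations, using Cov(X,Y) = rho * s1 * s2. The marginals come out the same each time while the covariance differs, so the marginals alone cannot pin the covariance down.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative marginal parameters (assumed values, not from the thread).
m1, m2 = 0.0, 1.0
s1, s2 = 1.0, 2.0

def sample_bivariate_normal(rho, n=200_000):
    """Draw n pairs from a bivariate normal with the fixed marginals above."""
    cov = np.array([[s1**2, rho * s1 * s2],
                    [rho * s1 * s2, s2**2]])
    return rng.multivariate_normal([m1, m2], cov, size=n)

results = {}
for rho in (-0.5, 0.0, 0.8):
    xy = sample_bivariate_normal(rho)
    x, y = xy[:, 0], xy[:, 1]
    results[rho] = {
        "mean_x": x.mean(), "std_x": x.std(),
        "mean_y": y.mean(), "std_y": y.std(),
        "cov": np.cov(x, y)[0, 1],
    }
    # Means and standard deviations agree across rho; only the covariance moves.
    print(rho, results[rho])
```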
 
  • #6
mathman said:
When dealing with the theoretical distribution only (no empirical data), there is no way around the "problem". A bivariate normal distribution is defined by the means, the variances, and the covariance (or an equivalent alternative, such as the correlation). If you don't have one of these, the distribution remains undefined.

So would I be right in concluding that when using any sort of multivariate distribution you have to know the correlation (covariance) between the random variables, and the only way to get these covariances is empirically?
 
  • #7
jimmy1 said:
So would I be right in concluding that when using any sort of multivariate distribution you have to know the correlation (covariance) between the random variables, and the only way to get these covariances is empirically?

Yes - unless there is some given theoretical information you can use.
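As one hedged illustration of what "given theoretical information" could look like, suppose a model for the dependence is known, say Y = a*X + Z with Z normal and independent of X. This model and every number in it are assumptions made up for the example. The covariance then follows from the model alone, Cov(X,Y) = a * Var(X), and a simulation is used only to check the derivation.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed model, for illustration only: Y = a*X + Z with Z independent of X.
a = 1.5
m1, s1 = 2.0, 3.0      # mean and standard deviation of X
s_z = 2.0              # standard deviation of the independent noise Z

# Covariance derived from the model, with no data involved:
# Cov(X, Y) = Cov(X, a*X + Z) = a * Var(X)
cov_theory = a * s1**2
var_y = a**2 * s1**2 + s_z**2
rho_theory = cov_theory / (s1 * np.sqrt(var_y))

# Simulation used only as a check on the derivation.
x = rng.normal(m1, s1, size=200_000)
y = a * x + rng.normal(0.0, s_z, size=200_000)
cov_sim = np.cov(x, y)[0, 1]
print(cov_theory, cov_sim, rho_theory)
```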
 
  • #8
http://www.math.ethz.ch/%7Estrauman/preprints/pitfalls.pdf

Where not otherwise stated, we consider bivariate distributions of the random vector (X, Y)^t.

Fallacy 1. Marginal distributions and correlation determine the joint distribution.

This is true if we restrict our attention to the multivariate normal distribution or the elliptical distributions. For example, if we know that (X, Y)^t have a bivariate normal distribution, then the expectations and variances of X and Y and the correlation r(X, Y) uniquely determine the joint distribution. However, if we only know the marginal distributions of X and Y and the correlation then there are many possible bivariate distributions for (X, Y)^t. The distribution of (X, Y)^t is not uniquely determined by F1, F2 and r(X, Y). We illustrate this with examples, interesting in their own right.
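A standard textbook construction (not taken from the linked paper) shows the fallacy in a few lines: take X standard normal and set Y = S*X, where S is an independent random sign. Both marginals are standard normal and the correlation is zero, yet X and Y are clearly dependent because |Y| = |X|. A bivariate normal pair with zero correlation would be independent, so this joint distribution is not bivariate normal. The sample size and seed below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

x = rng.standard_normal(n)
# Independent random sign, +1 or -1 with equal probability.
sign = rng.choice([-1.0, 1.0], size=n)
y = sign * x  # Y is also standard normal, by symmetry of the normal density

corr_xy = np.corrcoef(x, y)[0, 1]                   # close to 0
corr_abs = np.corrcoef(np.abs(x), np.abs(y))[0, 1]  # essentially 1, since |Y| = |X|
print(corr_xy, corr_abs)
```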
 

1. What is covariance and why is it important in statistics?

Covariance is a measure of how two variables vary together. Its sign indicates the direction of the linear relationship: positive when the variables tend to move together, negative when they tend to move in opposite directions, and zero when there is no linear relationship. Its magnitude depends on the units of the variables, so the strength of the relationship is usually judged from the correlation instead. Covariance is important in statistics because it underlies correlation, regression, and the specification of multivariate distributions.

2. How do you calculate covariance?

The formula for the sample covariance is: cov(X,Y) = Σ [ (xi - x̄) * (yi - ȳ) ] / (n - 1), where xi and yi are the individual data points, x̄ and ȳ are the sample means of X and Y, and n is the number of data points. Alternatively, you can use software such as Excel or statistical packages like SPSS to calculate covariance.
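The calculation can be sketched in Python with NumPy. The four data points are made up purely for illustration; the manual sum of products of deviations is compared against `np.cov`, which also divides by n - 1 by default.

```python
import numpy as np

# Small made-up data set, for illustration only.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

n = len(x)
# Sum of products of deviations from the sample means, divided by n - 1.
cov_manual = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# np.cov returns the 2x2 covariance matrix; entry [0, 1] is cov(X, Y).
cov_numpy = np.cov(x, y)[0, 1]
print(cov_manual, cov_numpy)
```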

3. What is the difference between covariance and correlation?

Covariance and correlation are both measures of the linear relationship between two variables. Covariance indicates the direction of the relationship, but its magnitude depends on the units of measurement of the variables. Correlation is the covariance divided by the product of the two standard deviations, which makes it a unitless, standardized measure of both direction and strength. Correlation ranges from -1 to 1, while covariance can take any value from negative infinity to positive infinity.
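The unit dependence is easy to see in a short sketch. The height and weight data below are simulated from an invented relationship. Expressing height in centimetres instead of metres multiplies the covariance by 100 but leaves the correlation unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated heights in metres and weights in kilograms (made-up relationship).
height_m = rng.normal(1.70, 0.10, size=1_000)
weight_kg = 60.0 + 50.0 * (height_m - 1.70) + rng.normal(0.0, 5.0, size=1_000)

# The same heights expressed in centimetres.
height_cm = 100.0 * height_m

cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]
corr_m = np.corrcoef(height_m, weight_kg)[0, 1]
corr_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(cov_m, cov_cm)    # covariance is 100 times larger in centimetres
print(corr_m, corr_cm)  # correlation is the same in both units
```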

4. How is covariance used in data analysis?

Covariance is commonly used in data analysis to describe how two variables vary together and to identify linear patterns and trends. It is also used in regression analysis: in simple linear regression, the slope of the fitted line is the covariance of the two variables divided by the variance of the independent variable.

5. What are some limitations of using covariance?

Covariance can be affected by the scale of measurement of the variables, making it difficult to compare across different datasets. It also does not indicate causality, so even if a strong covariance is found between two variables, it does not necessarily mean that one causes the other. Additionally, it can be impacted by outliers in the data, making it less reliable in those cases.
