What do uncorrelatedness and independence imply?

In summary, independence implies uncorrelatedness, but not the reverse; two variables can agree on means and variances yet differ in higher moments. One reply recommends the book "Principal Component Analysis", 2nd edition, I.T. Jolliffe, Springer, 2002, ISBN 0-387-95442-2, which covers PCA in depth, including why one would use it for purposes such as regression rather than, say, ordinary least squares, and which will likely answer many of the questions raised below.
  • #1
Wenlong
Dear all,

I'm currently reading papers on statistical modelling, and I have encountered the concepts of uncorrelatedness and independence. I understand the definitions, but I am wondering what real effect they have in statistical analysis.

For example, I have a dataset and I use a certain technique to separate it into a set of uncorrelated vectors. On the other hand, I can separate the dataset into a set of independent vectors. What is the difference between these two sets of vectors? What property of uncorrelatedness and independence makes such a difference?

Many thanks in advance.

Best wishes
Wenlong
 
  • #2
First, independence implies uncorrelatedness, but not the reverse. If you are dealing only with means and standard deviations, it makes no difference. However, higher moments could be different.
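As a concrete, made-up illustration of "higher moments could be different": the pair below (Y = S·X with an independent random sign S) is uncorrelated and has the same marginals and second moments as an independent pair, but its fourth-order joint moment E[X²Y²] is quite different. A minimal numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Uncorrelated but dependent: Y = S*X with an independent random sign S.
x = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)
y = s * x                               # |Y| equals |X|, so X and Y are dependent

# Independent pair with the same N(0, 1) marginals, for comparison.
x_ind = rng.standard_normal(n)
y_ind = rng.standard_normal(n)

print(np.corrcoef(x, y)[0, 1])          # ~0: uncorrelated
print(np.corrcoef(x_ind, y_ind)[0, 1])  # ~0: uncorrelated
print(np.mean(x**2 * y**2))             # ~3: E[X^2 Y^2] = E[X^4] for the dependent pair
print(np.mean(x_ind**2 * y_ind**2))     # ~1: E[X^2] E[Y^2] for the independent pair
```

Both pairs look identical through second moments; only the joint higher moments reveal the dependence.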
 
  • #3
Hi, Mathman

Thank you very much for your reply.

Actually, I'm using some statistical techniques to analyse the dataset (like Principal Component Analysis). Some of the techniques result in a set of linearly uncorrelated vectors (a basis); the others result in a set of statistically independent vectors. Then, from plots and histograms, I find they are different from each other. I know that independence implies uncorrelatedness, but not vice versa. But what property of independent and uncorrelated vectors essentially makes such a difference?

Many thanks in advance.

All bests
Wenlong
 
  • #4
And why is independence a stronger property than uncorrelatedness? In other words, what is the advantage of independence over uncorrelatedness?

Thank you very much.

Wenlong
 
  • #5
Wenlong said:
And why is independence a stronger property than uncorrelatedness? In other words, what is the advantage of independence over uncorrelatedness?

Thank you very much.

Wenlong

Hey Wenlong and welcome to the forums.

I would recommend a book to you: "Principal Component Analysis", 2nd edition, Springer, I.T. Jolliffe, 2002, ISBN 0-387-95442-2.

This book covers PCA in depth, as well as why you would want to use it for various purposes, including regression, as opposed to, say, ordinary least squares.

It will probably answer a great many of your questions, and I urge you to check it out.
 
  • #6
A simple example might help. Suppose you observe instances of a pair of random variables X and Y, and plotting them reveals a uniform scatter in a circle around the origin. The covariance would be zero, so by definition X and Y are uncorrelated. But they're clearly not independent. Whether the dependence matters depends on what decision you're trying to make.
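A hedged numpy sketch of that picture (illustration only): points scattered uniformly in the unit disk have essentially zero sample correlation, yet the spread of Y clearly depends on X, so the two are not independent.

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw points uniformly in the unit disk by rejection sampling from the square.
pts = rng.uniform(-1.0, 1.0, size=(1_000_000, 2))
pts = pts[(pts ** 2).sum(axis=1) <= 1.0]
x, y = pts[:, 0], pts[:, 1]

print(np.corrcoef(x, y)[0, 1])          # ~0 by symmetry: X and Y are uncorrelated

# Yet the spread of Y depends strongly on X, so they are not independent.
centre = np.abs(x) < 0.1                # vertical strip near the centre of the disk
edge = np.abs(x) > 0.9                  # vertical strip near the edge of the disk
print(y[centre].std(), y[edge].std())   # Y spreads much more near the centre than the edge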
 
  • #7
Wenlong said:
And why is independence a stronger property than uncorrelatedness? In other words, what is the advantage of independence over uncorrelatedness?

Suppose you are writing a simulation of a phenomenon that involves random variables X and Y, each with a known distribution. If X and Y are independent random variables, they can be simulated by sampling their distributions independently. If you only know that X and Y are uncorrelated random variables, you don't know exactly how to simulate them, except in special cases where this information is enough to determine their joint distribution.

Analyzing a phenomenon by breaking it up into components that are uncorrelated random variables is not a powerful technique unless this analysis tells you how to "put things back together" and use the components to simulate the phenomenon. If you assume the random variables are from special families (like multivariate normals), then knowing the correlations does tell you how to simulate the phenomenon from the components. However, if you are not in such a special situation, the correlations do not tell you how to re-create the phenomenon from the components.
 
  • #8
In data analysis, one often measures the average value of a sample of a variable Y over a small range of sample values of another variable X. This is a way to see whether knowledge of X gives you information about Y. If the random variables are independent, these conditional expectations are constant, that is, they do not depend on the values of X. The values of X tell you nothing about the expected values of Y.

As X varies, the conditional expectations of Y given X form a function of X. This function is known as the regression of Y on X. In general this function could have any shape, but in the case where X and Y are jointly normal it is linear. In this special case, the slope of the regression line is the correlation of X and Y times the ratio of their standard deviations, σ_Y/σ_X. So correlation describes the dependence of the conditional expectations of Y given X in the case of a bivariate normal.

While the bivariate normal may seem like a special case, in practice many random variables are close to normal, and if they are not, one can form new random variables by taking sample averages, whose distributions will be closer to normal. If the two sample averages are then jointly normal, the regression of one on the other will be linear.

Also, in practice one may want to extract the linear part of a complex relationship between X and Y. A linear regression is an attempt to locate that linear part. Sometimes the linear part may only be valid for small values of X and fail miserably for large values. This happens in security prices, where large values can mean extreme and unusual events, such as a country defaulting on its debt, when the business-as-usual relationships no longer hold.

Often the relationship between two variables may be weak and difficult to detect, but if there are many weakly related variables it may be possible to discover significant relationships between aggregates of the variables, e.g. weighted averages. Principal component analysis is a way of selecting such aggregates and works well for nearly jointly normal random variables. Again, these components may only be valid for small values of the underlying random variables. For instance, one might decompose short-term stock returns into principal components that are each represented by large baskets of stocks. These components may work well in stable markets but fall apart when a country defaults on its debt.

Abstractly, two random variables can be uncorrelated yet completely dependent, and the dependence can be arbitrarily complicated. In practice, one generally hopes for a linear relationship, at least in some range of the variables, because non-linear relationships are difficult to estimate and require large amounts of data.
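A hedged simulation sketch of the bivariate-normal case described above (the parameter values are made up): binned conditional means of Y line up with the regression line of slope ρ·σ_Y/σ_X.

```python
import numpy as np

rng = np.random.default_rng(3)

# Made-up parameters for a bivariate normal (X, Y).
rho, sigma_x, sigma_y = 0.6, 2.0, 3.0
cov = [[sigma_x ** 2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y ** 2]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=500_000).T

# Conditional means of Y over small ranges of X trace out the regression line,
# whose slope for a bivariate normal is rho * sigma_y / sigma_x (= 0.9 here).
slope = rho * sigma_y / sigma_x
edges = np.linspace(-4.0, 4.0, 9)
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (x >= lo) & (x < hi)
    mid = 0.5 * (lo + hi)
    print(f"E[Y | X ~ {mid:+.1f}] ≈ {y[mask].mean():+.3f}   line predicts {slope * mid:+.3f}")
```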
 
  • #9
lavinia said:
In data analysis, one often measures the average value of a sample of a variable Y over a small range of sample values of another variable X. This is a way to see whether knowledge of X gives you information about Y. If the random variables are independent, these conditional expectations are constant; they do not depend on the values of X. [...]

Hi, Lavinia

Thank you very much for your reply, clear and easy to understand. Can I ask a further question based on this?

I am looking into different analysis techniques, such as PCA and ICA, and I find that they use moments of different orders to compute the components: PCA uses the 2nd moment (covariance), while ICA uses the 4th (kurtosis) or even higher.

By observing the resulting principal components and independent components, I find that ICA is better at representing local features, while PCA is good at representing global trends.

May I then deduce that using higher-order moments means the method focuses more on local features? Is there any theoretical result that could test and verify my assumption?

Many thanks in advance.

Best wishes
Wenlong
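
For readers comparing the two techniques, here is a small sketch (assuming scikit-learn is available; the mixing matrix is made up) of the point about moments: PCA, which uses only second moments, merely decorrelates a linear mixture of two independent non-Gaussian sources, while FastICA, which exploits higher-order statistics such as kurtosis, recovers components close to the original independent sources, up to sign and order.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(4)

# Two independent, non-Gaussian (uniform) sources, linearly mixed.
S = rng.uniform(-1.0, 1.0, size=(50_000, 2))
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])               # hypothetical mixing matrix
X = S @ A.T

# PCA looks only at second moments: its components are uncorrelated,
# but each one is still a mixture of the two sources.
pca_comp = PCA(n_components=2).fit_transform(X)

# FastICA goes beyond second moments (non-Gaussianity, e.g. kurtosis)
# and recovers components close to the original sources, up to sign and order.
ica_comp = FastICA(n_components=2, random_state=0).fit_transform(X)

# Cross-correlations with the true sources: for ICA each row has one entry
# near +/-1; for PCA the sources remain mixed.
print(np.corrcoef(pca_comp.T, S.T)[:2, 2:])
print(np.corrcoef(ica_comp.T, S.T)[:2, 2:])
```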
 

FAQ: What do uncorrelatedness and independence imply?

1. What is uncorrelatedness?

Uncorrelatedness refers to the absence of a linear relationship between two variables: their covariance (and hence their Pearson correlation) is zero. A non-linear relationship between them may still exist.

2. What does it mean for two variables to be independent?

Two variables are independent if their joint distribution factorizes into the product of their marginal distributions. Equivalently, knowing the value of one variable gives no information about the distribution of the other.

3. How are uncorrelatedness and independence related?

Uncorrelatedness and independence are related in that if two variables are independent, they are also uncorrelated. However, two variables can be uncorrelated but not independent if there is a non-linear relationship between them.

4. What are the implications of uncorrelatedness and independence in statistical analysis?

Uncorrelatedness and independence matter in statistical analysis because many techniques rely on them; for example, standard regression models assume independent errors, and independence assumptions justify simulating variables separately. Checking for correlation and dependence also helps to identify patterns and relationships between variables.

5. Can two variables be correlated but still be independent?

No. A non-zero correlation implies some degree of linear relationship between the variables, while independence implies no relationship at all, so correlated variables cannot be independent. It is, however, possible for two variables to be uncorrelated yet still dependent.
