How to evaluate quality of correlation?

  • Context: Undergrad
  • Thread starter: simpleton
  • Tags: Correlation Quality
SUMMARY

This discussion concerns how to evaluate the quality of a correlation using a k by k correlation matrix of R-values, which range from -1 to 1. The asker was advised to consider only R-values of at least 0.7, but interpreting a higher value such as 0.85 requires attention to the nature and context of the data. The reply stresses that correlation values should not be interpreted in isolation: how the data look when plotted and possible non-linear relationships must be taken into account before drawing conclusions.

PREREQUISITES
  • Understanding of correlation coefficients, specifically R-values.
  • Familiarity with correlation matrices and their construction.
  • Knowledge of data visualization techniques to assess graphical relationships.
  • Awareness of data context and domain-specific considerations.
NEXT STEPS
  • Research methods for interpreting correlation coefficients in various contexts.
  • Learn about rank-based correlation measures, such as Spearman's rank correlation, which can capture monotonic non-linear relationships.
  • Explore data visualization tools in R, such as ggplot2, for graphical analysis.
  • Study the implications of residual analysis in regression models.
USEFUL FOR

Data analysts, statisticians, and researchers who need to evaluate and interpret correlation relationships in datasets with multiple features.

simpleton
Hi,

I would like to know how to evaluate the quality of a correlation.

Specifically, I have a set of N datapoints, each represented by k features. I want to know how the k features correlate with each other, so I created a k by k correlation matrix. I am using the R-value, so the values range from -1 to 1. I was told to only look at R-values of at least 0.7, because anything lower than that does not mean much. However, how do I evaluate a correlation value of, say, 0.85? If a feature x has an R-value of 0.7 with y and an R-value of 0.9 with z, what quantitative conclusions can I draw and what kind of statements can I make? For example, can I say how much better the correlation between (x,z) is than the one between (x,y)?

Thank you very much!
 
Hey simpleton.

The answer to your question is highly subjective.

You need to take into account not only the correlation value but also the nature of the data, including what it looks like when plotted and the domain the sample comes from.

If there is a non-linear trend with seemingly low residuals, then the linear correlation coefficient will not fully capture the true relationship between the variables.
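
To make that concrete, here is a minimal sketch in Python (standard library only). The data are synthetic, and the helper names `pearson`, `ranks`, and `spearman` are made up for this illustration rather than taken from the thread. It assumes an exponential trend with small noise and compares the linear (Pearson) coefficient with a rank-based (Spearman) one.

```python
import math
import random


def pearson(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank each value (1 = smallest). Assumes no ties, which holds for this continuous toy data."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Synthetic example (an assumption for illustration): y grows exponentially with x.
random.seed(0)
x = [random.uniform(0.0, 5.0) for _ in range(200)]
y = [math.exp(a) + random.gauss(0.0, 0.5) for a in x]

pearson_r = pearson(x, y)
spearman_rho = spearman(x, y)
print(f"Pearson r    = {pearson_r:.3f}")
print(f"Spearman rho = {spearman_rho:.3f}")
```

On data like this the rank-based value should come out close to 1 while the linear one is noticeably lower, even though the relationship is almost deterministic. A scatter plot of x against y would show the same thing at a glance.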

The above points vary depending on what kind of data it is, where it came from, what it's used for, and what questions it's trying to answer.
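
For the specific 0.7 versus 0.9 comparison, one commonly used handle is r squared: in a simple linear fit of one variable on the other, it is the fraction of variance accounted for. The sketch below is only that arithmetic. The function name `shared_variance` is hypothetical, and whether r squared is the right yardstick still depends on the caveats about the data and its context.

```python
def shared_variance(r):
    """Return r squared, the fraction of variance explained by a simple linear fit."""
    return r * r


r_xy = 0.7  # correlation between x and y, taken from the question
r_xz = 0.9  # correlation between x and z, taken from the question

for label, r in (("x,y", r_xy), ("x,z", r_xz)):
    print(f"({label}): r = {r:.2f}, r^2 = {shared_variance(r):.2f}")
```

So a linear fit accounts for about 49% of the variance for (x,y) and about 81% for (x,z). That does not make the second correlation a fixed number of times "better"; it only compares explained variance under a linear model.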
 
