How accurate is the correlation coefficient for log-log data?

  • Thread starter: Old Guy
  • Tags: Fit
The discussion concerns whether the correlation coefficient is a valid goodness-of-fit measure for log-log transformed data in linear regression. The poster has calculated the slope from the log-transformed data set but asks whether a correlation coefficient computed on that data is appropriate. A response says that the correlation coefficient and the R² statistic are appropriate for the transformed data, but only for the transformed data, provided it is clear which data they describe.
Old Guy
I am familiar with linear regression and the correlation coefficient. My current problem involves a data set that is pretty linear on a log-log plot. I have calculated the slope by taking logs of all my x's and y's, and doing the linear regression on the transformed data set. The resulting slope appears to be correct, and I'm happy with that.

The question is, how good a fit do I have on the data? I planned to simply calculate the correlation coefficient on the transformed data, but a coworker challenged this, saying that the transformations alter the measurements in a way that makes the typical correlation coefficient calculation invalid.

Is that correct? And if it is, what is the correct measure of goodness of fit for log-log data? Thanks.
 
Old Guy said:
Is that correct? And if it is, what is the correct measure of goodness of fit for log-log data?

If the data are linear after a log-log transform, then the correlation coefficient describes the log-log data set, with correspondingly transformed expectations. As long as it's clear which data the coefficient refers to, I don't see a problem.
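
To spell out what "transformed expectations" could mean here, the following is a sketch under an assumed power-law model ##y = a x^b## with multiplicative scatter. The symbols ##a##, ##b##, ##u##, ##v## and ##\varepsilon## are introduced for illustration and do not appear in the original post. Writing ##u = \log x## and ##v = \log y##, the straight line being fitted is

$$ v = \log a + b\,u + \varepsilon $$

and the usual Pearson coefficient is computed on the pairs ##(u_i, v_i)##:

$$ r = \frac{\sum_i (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_i (u_i - \bar{u})^2 \, \sum_i (v_i - \bar{v})^2}} $$

For a straight-line least-squares fit with an intercept, ##R^2 = r^2##. Under these assumptions ##r## measures how linear ##\log y## is in ##\log x##; it says nothing directly about how linear ##y## is in ##x##.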

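EDIT: I forgot to answer your second question. If you get a linear plot from a standard regression on the transformed data, the R^2 statistic should be appropriate for the transformed data (but only for the transformed data): it describes how much of the variance in log y is explained by log x, not how well the back-transformed curve fits y on the original scale.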
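
As a rough illustration of that distinction, here is a minimal Python sketch that fits a line to logged data and reports R^2 on both scales. The function name `loglog_fit`, the synthetic data, and the chosen parameters (2.5, 1.7, noise of 0.3) are all made up for this example and are not from the thread; it uses natural logs and only the standard library.

```python
import math
import random


def loglog_fit(xs, ys):
    """Fit log(y) = intercept + slope * log(x) by ordinary least squares.

    Returns (slope, intercept, r, r2_log, r2_orig). All inputs must be > 0.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(y) for y in ys]
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n

    # Sums of squares and cross-products of the logged data
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_vv = sum((vi - v_bar) ** 2 for vi in v)
    s_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))

    slope = s_uv / s_uu
    intercept = v_bar - slope * u_bar
    r = s_uv / math.sqrt(s_uu * s_vv)  # Pearson r of (log x, log y)

    # R^2 on the log-log scale: residuals measured in log(y)
    v_hat = [intercept + slope * ui for ui in u]
    r2_log = 1.0 - sum((vi - vh) ** 2 for vi, vh in zip(v, v_hat)) / s_vv

    # The same fitted curve, back-transformed and scored against y itself
    y_bar = sum(ys) / n
    ss_res = sum((yi - math.exp(vh)) ** 2 for yi, vh in zip(ys, v_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in ys)
    r2_orig = 1.0 - ss_res / ss_tot

    return slope, intercept, r, r2_log, r2_orig


# Synthetic power-law data with multiplicative noise (illustrative only)
random.seed(0)
xs = [10 ** (3 * i / 49) for i in range(50)]
ys = [2.5 * x ** 1.7 * math.exp(random.gauss(0.0, 0.3)) for x in xs]

slope, intercept, r, r2_log, r2_orig = loglog_fit(xs, ys)
print(f"slope = {slope:.3f}, r (log-log) = {r:.4f}")
print(f"R^2 on log-log scale  = {r2_log:.4f}")
print(f"R^2 on original scale = {r2_orig:.4f}")
```

On the log-log scale the R^2 equals the square of the Pearson coefficient of the logged data. The original-scale figure is a different quantity: it tends to be dominated by the largest y values, so the two numbers generally do not agree, and any reported value should say which scale it refers to.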
 
