• MATLAB
honeysyd

I have two sets of dat as below. A problem is that MATLAB and excel returns a different value of correlation coefficient (i.e., 0.801 from Excel and 0.786 from Matlab).

For Matlab, I used the function: corrcoef(DataA, DataB). For excel, I used the function "Correl".

Would you please advise me which one is correct? I found that the excel uses different function from the one used in Matlab. I do not know how to write a function here, so I linked them as below.

http://office.microsoft.com/en-us/excel-help/correl-HP005209023.aspx [Broken]
http://www.mathworks.com.au/help/techdoc/ref/corrcoef.html

Data A
0.56
0.63
0.54
0.59
0.57
0.69

Data B
1.16
1.68
1.24
1.33
1.28
1.52

Last edited by a moderator:

The MATLAB function returns a matrix. Reveal its other entries to us.

Try using MATLAB on the numerical example given in the Excel documentation.

As to "right" vs "wrong", it is always possible for computer code to a have errors. But the difference may also be due to the fact that statistics has subjective aspects. To see that, you must understand the usual scenario for "estimation"

That scenario is:
You have a probability distributions with some unknown parameter (such as the correlation coefficient in a joint distribution of 2 random variables)

You have data.

You do a computation on the data to estimate one of the unknown parameters. The function you use is called "an estimator". For a given parameter, there can be different estimators. Since the data is random, an estimator is a random variable. Different esimatators can have different aspects that argue in favor of using them. It is a subjective decision, which estimator you want to use. The odds are that any estimator will be "wrong" (i.e. not exactly equal to the parameter you are trying to estimate).

To use proper teminology, you should distinguish between an unknown parameter and its estimator and any "standard" related statistic computed from the sample. For example, there is a "population variance", several different "estimators for the population variance" and there is the "sample variance" (which can be defined two different ways, depending on which textbook you use).

Last edited:
honeysyd

Thank you very much Stephen.

It was a great help. Beside your reply, I found another reason.

Data A appears to be two decimal numbers in excel, but actually Data A is four decimal numbers.

When I copy and paste Data A from Excel to Matlab, only two decimal numbers were copied and pasted, while Excel calculated the correlation coefficient with four decimal number of Data A.

I hope this helps others having a same problem.

Number Nine

It's hard to say exactly what is happening, but if both Matlab and Excel are calculating the same statistic, the difference may be due to rounding errors. Excel is notorious for things like this.