jordanstreet
Calculating "match" between two data sets
Hey guys, I'm developing a program for comparing the effects of various terms in a Monte Carlo experiment. Right now you can visually see the effect of switching terms on and off, and I need a way of quantifying how much two lines "match".
-----
What I need is to compare two data sets and get a number that represents how closely they "match". Here are the methods I have tried and how well each one worked.
1) Average(Absolute Value(Difference between the two sets at each index)) - this gives me a number, but the number doesn't mean much on its own because it depends on the scale of the two data sets. This led me to my next attempt.
2) Average(Absolute Value(Percentage difference between the two sets at each index)) - this was better, but the percentages can still range well over 100%.
- also, both of the above strategies give me an average difference, which I would still need to convert somehow into a percentage match.
3) Correlation coefficient - this looked promising, but I then realized it only tells me how linear the relationship between the two sets is. The two sets could hold totally different values, but as long as they are linearly related they would show a 100% correlation.
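
To make the three attempts concrete, here is a minimal Python sketch of each one. The function names and the sample data are made up for illustration, and the percentage version assumes the first set is the reference and contains no zeros.

```python
from math import sqrt

def mean_abs_diff(a, b):
    """Attempt 1: average absolute difference at each index."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def mean_abs_pct_diff(a, b):
    """Attempt 2: average absolute percentage difference, relative to a.
    Assumes no element of a is zero."""
    return 100.0 * sum(abs(x - y) / abs(x) for x, y in zip(a, b)) / len(a)

def pearson(a, b):
    """Attempt 3: Pearson correlation coefficient."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical data: b holds very different values from a,
# but is an exact linear function of it.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0 * x + 5.0 for x in a]

print(mean_abs_diff(a, b))      # depends on the units of the data
print(mean_abs_pct_diff(a, b))  # well over 100 here
print(pearson(a, b))            # 1.0 even though the values differ
```

With this sample data the first number changes if both sets are rescaled, the second is far above 100%, and the third reports a perfect correlation for two sets that clearly do not match.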
-------
Basically the goal is to calculate a percentage match where 100% means the two sets are identical and 0% means they are infinitely different. Any help would be greatly appreciated. Thanks!
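
For reference, one mapping with those two endpoints might look like the sketch below. This is only an assumption about what could work, not an established method: `match_percent` is a hypothetical name, the first set is treated as the reference, and the average difference is divided by the average magnitude of that reference so the result does not depend on units.

```python
def match_percent(reference, other):
    """Hypothetical score: 100 / (1 + normalized mean absolute difference).

    Gives 100 when the sets are identical and approaches 0 as the
    difference grows without bound. Not symmetric in its arguments.
    """
    if len(reference) != len(other) or not reference:
        raise ValueError("sets must be non-empty and the same length")
    n = len(reference)
    diff = sum(abs(x - y) for x, y in zip(reference, other)) / n
    scale = sum(abs(x) for x in reference) / n
    if scale == 0.0:
        raise ValueError("reference set is all zeros; no scale to normalize by")
    return 100.0 / (1.0 + diff / scale)

print(match_percent([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 100.0
print(match_percent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 50.0
```

The choice of `1 / (1 + d)` is arbitrary; any decreasing function that maps 0 to 1 and infinity to 0 would satisfy the same two endpoints, so the values in between only have meaning relative to each other.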