# Value of increasing or decreasing of the set

1. Feb 17, 2006

### Alteran

Simple statistical question.

Let's say that we have a set of values:
45
40
30
36
11
10
75
102
113
125
137
140
149

You can see that it has a clear tendency to increase (even though it decreases in the middle). But in this set:
68
33
21
31
7
7
13
123
21
33
67
64
29
9
5
87
15
20
5
9

everything is mixed up. I am looking for a good method to determine whether a set is increasing or decreasing, and how fast.

Thanks

2. Feb 17, 2006

### Alteran

Sorry, I have posted it in the wrong thread. Moderators, please feel free to move it to statistics.
Thank you.

3. Feb 17, 2006

### Hurkyl

Staff Emeritus
The first thing that comes to mind is counting inversions: that is, pairs of indices (i, j) where i < j, but the i-th element is larger than the j-th element.

To be honest, I wouldn't really say the first set has much of a tendency to decrease -- it actually looks like two disconnected pieces: a small decreasing portion and a large increasing portion!
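The inversion count described in this post can be sketched directly from the definition. This is only a brute-force illustration; the function name `count_inversions` and the variable `first_set` are made up for the example, and ties are assumed not to count as inversions.

```python
def count_inversions(values):
    """Count pairs (i, j) with i < j and values[i] > values[j]."""
    n = len(values)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Strict comparison: equal elements are not treated as inverted.
            if values[i] > values[j]:
                total += 1
    return total


# First set from the opening post.
first_set = [45, 40, 30, 36, 11, 10, 75, 102, 113, 125, 137, 140, 149]
print(count_inversions(first_set))  # 14
```

All 14 inversions come from the short decreasing run at the start; the increasing tail contributes none.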

4. Feb 17, 2006

### Alteran

About the first set: if we look at it as a whole, it increases from 45 to 149. I want to know whether it is really increasing in that way (and derive some sort of coefficient for that increase or decrease). One solution could be to compare the first and last elements, but suppose the set were:
15
20
30
36
11
10
75
102
123
145
157
199
80

We see an overall increase and a fast decrease at the end (only 2 values vs 12), but I would say that the set is increasing.

5. Feb 17, 2006

### Hurkyl

Staff Emeritus
The reason I mentioned inversions is that I've seen them used as a measure of "near sorting" -- that is, to tell how good of a job an approximate sorting algorithm has done.

It sounds like your problem is close to that: you want to tell how close to sorted (i.e. increasing) your data is. I'm not sure if you want to ascribe any relevance to the actual magnitudes of the numbers, or if their ranking is all that matters.

6. Feb 17, 2006

### 0rthodontist

Maybe you are looking for the correlation coefficient between the data and the indices of the data. Try mathworld.wolfram.com for correlation.
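As a rough sketch of that suggestion, the Pearson correlation between each value and its position can be computed by hand. The name `index_correlation` is hypothetical, and zero-based positions are an assumption of this example (the result does not depend on where the index starts).

```python
import math


def index_correlation(values):
    """Pearson correlation between the positions 0..n-1 and the values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean_x = (n - 1) / 2          # mean of the indices 0..n-1
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    syy = sum((y - mean_y) ** 2 for y in values)
    if syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


# The two sets from the opening post.
first_set = [45, 40, 30, 36, 11, 10, 75, 102, 113, 125, 137, 140, 149]
second_set = [68, 33, 21, 31, 7, 7, 13, 123, 21, 33,
              67, 64, 29, 9, 5, 87, 15, 20, 5, 9]

print(index_correlation(first_set))
print(index_correlation(second_set))
```

A value near +1 suggests a steady increase, near -1 a steady decrease, and near 0 no clear linear trend. Unlike inversion counting, this measure depends on the magnitudes of the numbers, not just their ranking.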

7. Feb 18, 2006

### Alteran

Thank you for your support. The most helpful suggestion here is counting inversions. That way I can definitely say whether the list is in increasing order or not, and calculate the quality of that increase: we know the maximum possible number of inversions (when the list consists of unique items and is sorted in decreasing order) and the number of inversions in our case, so we can derive a percentage.
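One possible way to turn this into a percentage is to divide the inversion count by the maximum n(n-1)/2, which is reached when n distinct items are in strictly decreasing order. The function names and the "1 minus the inverted fraction" convention below are assumptions made for illustration.

```python
def count_inversions(values):
    """Brute-force count of pairs (i, j) with i < j and values[i] > values[j]."""
    n = len(values)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if values[i] > values[j])


def increasing_score(values):
    """Fraction of pairs that are NOT inverted: 1.0 = sorted, 0.0 = reversed."""
    n = len(values)
    if n < 2:
        return 1.0  # a list of 0 or 1 items is trivially sorted
    max_inversions = n * (n - 1) // 2
    return 1.0 - count_inversions(values) / max_inversions


# First set from the opening post: 14 inversions out of a possible 78.
first_set = [45, 40, 30, 36, 11, 10, 75, 102, 113, 125, 137, 140, 149]
print(f"{increasing_score(first_set):.1%}")  # 82.1%
```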

8. Feb 18, 2006

### 0rthodontist

Which method is better depends on what you are trying to do. What is the reason you want to find out if the ordered set is increasing or decreasing?

9. Feb 20, 2006

### EnumaElish

Counting inversions is a "qualitative" measure; it will not tell you how much each inversion sets back the sequence. Your original post sounded like you need something like the average difference ("difference" being the discrete version of the derivative).

10. Feb 20, 2006

### Hurkyl

Staff Emeritus
Well, the average difference works out to simply being the difference of the first and last numbers, divided roughly by the number of numbers. So, it's not all that useful. (Taken literally, anyway.)
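To spell out why, write the values as $x_1, \dots, x_n$ (notation assumed here, not from the thread). The consecutive differences telescope, so every intermediate value cancels:

$$
\frac{1}{n-1}\sum_{i=1}^{n-1}\left(x_{i+1}-x_i\right) = \frac{x_n - x_1}{n-1}
$$

For the first set in the opening post this gives $(149 - 45)/12 \approx 8.67$, regardless of what happens in the middle.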

My first thought on tweaking this measure was to weight an inversion by how far apart the numbers were.
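A minimal sketch of that tweak, assuming "how far apart" means the difference in value between the two inverted elements (distance in position would be another reasonable reading). The name `weighted_inversions` is made up for this example.

```python
def weighted_inversions(values):
    """Sum of values[i] - values[j] over all inverted pairs (i < j, values[i] > values[j])."""
    n = len(values)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] > values[j]:
                # Each inversion contributes its size instead of a flat 1.
                total += values[i] - values[j]
    return total


print(weighted_inversions([10, 20, 30, 40, 35, 45, 55, 65]))  # 5
print(weighted_inversions([10, 20, 30, 40, 25, 35, 45, 55]))  # 25
```

A fully increasing list still scores 0, but a large drop now costs more than a small one.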

But of course you're right, we can't say for sure what the best thing is until we know the application. (Of course we still might not be able to, but we'll have a better idea)

This I don't understand though. I mean, I get that the actual numbers don't matter, just their relative ranking. But the sequence

10 20 30 40 35 45 55 65

only has one inversion (40:35), whereas

10 20 30 40 25 35 45 55

has three inversions. (30:25, 40:25, 40:35)

So we do get a measure of how far "back" you get taken.

By the way, Alteran, if you have a big data set, there's an O(n log n) algorithm for counting the number of inversions. I don't know it offhand, but it's somewhere in Knuth's Art of Computer Programming.
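One well-known O(n log n) approach counts inversions while merge sorting. The sketch below is not necessarily the algorithm from Knuth; the name `count_inversions_fast` is hypothetical, and equal elements are assumed not to count as inversions.

```python
def count_inversions_fast(values):
    """Count inversions in O(n log n) by counting cross pairs during merge sort."""

    def sort_and_count(seq):
        if len(seq) <= 1:
            return seq, 0
        mid = len(seq) // 2
        left, left_inv = sort_and_count(seq[:mid])
        right, right_inv = sort_and_count(seq[mid:])
        merged = []
        cross_inv = 0
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left,
                # so it forms an inversion with each of them.
                merged.append(right[j])
                cross_inv += len(left) - i
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, left_inv + right_inv + cross_inv

    _, total = sort_and_count(list(values))
    return total


print(count_inversions_fast([10, 20, 30, 40, 35, 45, 55, 65]))  # 1
print(count_inversions_fast([10, 20, 30, 40, 25, 35, 45, 55]))  # 3
```

The input list is copied and left unmodified; only the count is returned.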