- #1

ChrisVer

Gold Member

- 3,337

- 440

## Main Question or Discussion Point

Hi, I was wondering (and I am somewhat non-confident about my time measurement)... if I have an object that has several numbers (associated with each event): [itex]A = (A_0, A_1, A_2,...,A_{N-1})[/itex]... what's the best way to find if there are duplicates in them?

At first I thought the simplest piece of code:

If the size of A is [itex]N[/itex] as I gave above, this code is like going to take N*(N-1)/2 time to finish (so it scales with [itex]N^2[/itex]). Thus, having millions of numbers in the array can be problematic.

Another way would be the sorting of the array A and the binary search... The sort of the form:

is taking N*log(N) and the binary search is taking log(N) ... so the search for a duplicate is taking time [itex] (N+1)*log(N) [/itex].

Are there any faster ways? (Like applying some binary function of

? (avoids the binary search as then I will only need the length of the resulting array)

One problem which I haven't yet mentioned is that I don't have the array in an array format (I'll have to create it, so I will get +N more time to do that).

At first I thought the simplest piece of code:

Code:

```
for (int i =0 ; i < A.size() ; i++ ){
current = A[i];
for(int j=i+1 ; j< A.size() ; j++){
rest = A[j];
if(rest == current) { cout << "Duplicate found"; }
}
}
```

Another way would be the sorting of the array A and the binary search... The sort of the form:

Code:

`std::sort( A.begin() , A.end() )`

Are there any faster ways? (Like applying some binary function of

Code:

`std::sort( A.begin() , A.end() , [](int a, int b) { return a==b; }`

One problem which I haven't yet mentioned is that I don't have the array in an array format (I'll have to create it, so I will get +N more time to do that).