Algorithm Complexity: Sorted Arrays into Sorted Array

AI Thread Summary
The discussion centers on the complexity of merging k sorted arrays, each containing n elements, using a naive algorithm that merges them one at a time. The original poster first estimates each merge at n log n comparisons, then corrects this: merging two sorted lists is linear in their combined length. Because the accumulated result grows by n elements with every merge, the i-th array is merged into a result that already holds (i-1)n elements, and the costs sum to O(k^2*n), not O(k*n). An optimized approach, similar to merge sort, can achieve a complexity of O(log(k)*k*n). The conversation also raises the question of whether copying and comparing operations should be counted separately in complexity analysis.
sodiumbromate

Homework Statement



We have k >= 1 sorted arrays, each one containing n >= 1 elements (all equal length). We want to combine all of them into a single sorted array with kn elements.
We have a "naive" algorithm: merge the first two arrays, then merge the third array into the result, then merge the fourth array into the result, and so on.
What's the complexity of the naive algorithm, as a function of k and n?

The attempt at a solution

Assuming k > 1. When comparing first two arrays, worst-case requires (n log n) comparisons.
Adding a third array to the resulting array has a worst-case of (2n log 2n) comparisons.
Similarly, adding a fourth has (3n log 3n) comparisons. This suggests a general worst-case equation for the algorithm: (k-1)n log (k-1)n

I don't think this is right... the next part of the question asks us to talk about a less expensive implementation, but (k-1)n log(k-1)n is already in big theta of n log n, which is plenty efficient.
 
You could try some simple cases and count the number of operations. I'm not sure if there's supposed to be a complexity factor for "comparing" versus "copying" data. As an example of a simple case, here are two sets of integers to be merged:

{1, 2, 5, 8, 9} and {3, 4, 6, 7, 10}

What would the complexity be if each set had only 4 integers instead of 5? What if each set had only 3 integers?
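One way to try the simple case is to count the comparisons directly. Below is a minimal Python sketch of a standard two-way merge; the function name `merge_count` and the choice to count only comparisons (not copies) are assumptions made for illustration.

```python
def merge_count(a, b):
    """Merge two sorted lists; return (merged list, number of comparisons)."""
    merged = []
    comparisons = 0
    i = j = 0
    while i < len(a) and j < len(b):
        comparisons += 1  # one compare per element taken while both lists are non-empty
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One list is exhausted: the remainder is copied without any comparisons.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged, comparisons


merged, comparisons = merge_count([1, 2, 5, 8, 9], [3, 4, 6, 7, 10])
print(merged)       # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(comparisons)  # 9, i.e. 2n - 1 for n = 5
```

The count depends on how the values interleave, but it can never exceed the combined length minus one, so a single merge is linear in the number of elements.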
 
Well, merging will happen k-1 times. So the merging complexity would be (k-1)n? Which is in big theta of n? (The merging part of mergesort is linear, not logarithmic, my bad).
 
http://www.wolframalpha.com/input/?i=sum+i=2+to+k+(i*n)

Reason:
The first merge will take O(n+n) operations (go through both lists once). The second merge will take 2*n + n = 3n operations, since you have to go through the long list once and through the 3rd list once. In general, merging in the i-th array costs i*n operations, and summing this from i=2 to k results in an O(k^2*n) complexity.
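To see the growth concretely, here is a small Python sketch of the naive algorithm that tallies how many elements get written across all merges. The helper names `merge` and `naive_merge_all`, and the choice to count writes rather than comparisons, are assumptions for illustration only.

```python
def merge(a, b):
    """Standard two-way merge of two sorted lists."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def naive_merge_all(arrays):
    """Fold the arrays into one result, left to right.

    Returns (result, total elements written over all merges).
    """
    result = list(arrays[0])
    copied = 0
    for arr in arrays[1:]:
        result = merge(result, arr)
        copied += len(result)  # every element of the new result was written once
    return result, copied


k, n = 4, 3
# k sorted arrays of n elements each: [0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]
arrays = [list(range(start, start + k * n, k)) for start in range(k)]
result, copied = naive_merge_all(arrays)
print(copied)                    # 27 = 2n + 3n + 4n
print(n * (k * k + k - 2) // 2)  # 27, the closed form of the same sum
```

Doubling k roughly quadruples the tally, which is the k^2 behaviour; the early elements are recopied in every later merge.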
 
sodiumbromate said:
Well, merging will happen k-1 times. So the merging complexity would be (k-1)n? Which is in big theta of n?
Is this the optimized version or the naive version? In the optimized version where k arrays are merged into an output array, all k n elements are copied to the output array, but you're doing multiple compares for every element copied to the output array. If this is the naive version, you copy n+n data the first time, then 2n + n data the second time, ..., and I assume you're supposed to figure out the number of compares.

I don't know how copies and compares are factored into complexity or big theta.
 
rcgldr said:
I don't know how copies and compares are factored into complexity or big theta.

It doesn't matter, we don't care about constants in asymptotic notation.
 
Max.Planck said:
http://www.wolframalpha.com/input/?i=sum+i=2+to+k+(i*n)

Reason:
The first merge will take O(n+n) operations (go through both lists once). The second merge will take 2*n + n = 3n operations, since you have to go through the long list once and through the 3rd list once. In general, merging in the i-th array costs i*n operations, and summing this from i=2 to k results in an O(k^2*n) complexity.

Where do you get k^2 from? Don't fully follow.
 
Shouldn't it just be kn? That's what the pattern seems to suggest.
 
sodiumbromate said:
Where do you get k^2 from? Don't fully follow.

Easy:

$$\sum_{i=2}^{k} i \cdot n = n \sum_{i=2}^{k} i = n \, \frac{k^2+k-2}{2} \in O(k^2 n)$$
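The middle step is the usual formula for 1 + 2 + ... + k with the missing i = 1 term subtracted. Spelled out:

```latex
\sum_{i=2}^{k} i \;=\; \sum_{i=1}^{k} i \;-\; 1 \;=\; \frac{k(k+1)}{2} - 1 \;=\; \frac{k^2 + k - 2}{2}
```

For example, k = 4 gives 2 + 3 + 4 = 9 = (16 + 4 - 2)/2.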
 
rcgldr said:
I don't know how copies and compares are factored into complexity or big theta.

Max.Planck said:
It doesn't matter, we don't care about constants in asymptotic notation.
OK, but in one type of optimized case, there are k n copies done and, in the worst case, k (k-1) n = (k^2 - k) n compares done. How do you balance the copies versus compares for O(...) in this case?
 
rcgldr said:
OK, but in one type of optimized case, there are k n copies done and, in the worst case, k (k-1) n = (k^2 - k) n compares done. How do you balance the copies versus compares for O(...) in this case?

How did you get these numbers?

Merging two lists of size n takes at most 2*n-1 comparisons. In the first round you do this for k/2 pairs of lists.
 
rcgldr said:
OK, but in one type of optimized case, there are k n copies done and, in the worst case, k (k-1) n = (k^2 - k) n compares done. How do you balance the copies versus compares for O(...) in this case?

Max.Planck said:
How did you get these numbers?
By merging k lists at the same time (one type of optimized case), k n elements copied, k-1 compares (worst case) for each element copied. This would be one of the "less expensive" implementations mentioned at the end of the original post.
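A rough Python sketch of this "merge all k at once" idea, assuming a plain linear scan over the current head of each array to find the minimum. The name `kway_merge_scan` and the separate copy and compare counters are hypothetical, added only to make the two tallies visible.

```python
def kway_merge_scan(arrays):
    """Merge all arrays at once by scanning the current heads for the minimum.

    Returns (merged list, elements copied, comparisons made).
    """
    pos = [0] * len(arrays)  # next unread index in each array
    merged = []
    compares = 0
    total = sum(len(a) for a in arrays)
    while len(merged) < total:
        best = None  # index of the array currently holding the smallest head
        for idx, arr in enumerate(arrays):
            if pos[idx] == len(arr):
                continue  # this array is exhausted
            if best is None:
                best = idx  # the first candidate costs no comparison
            else:
                compares += 1
                if arr[pos[idx]] < arrays[best][pos[best]]:
                    best = idx
        merged.append(arrays[best][pos[best]])
        pos[best] += 1
    return merged, len(merged), compares


k, n = 4, 3
arrays = [list(range(start, start + k * n, k)) for start in range(k)]
merged, copies, compares = kway_merge_scan(arrays)
print(copies)                         # 12, each element is copied exactly once
print(compares <= (k - 1) * k * n)    # True: at most k-1 compares per element copied
```

Each element is written only once, but finding it costs up to k-1 comparisons, so the comparison count is still on the order of k^2*n with this scan.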
 
You can also merge them in the same way merge sort does, that will give you something like O(log(k)*k*n) worst case.
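A minimal sketch of that merge-sort-style approach in Python, assuming the arrays are merged in pairs, round by round, until one remains. The names `merge` and `merge_pairwise` and the copy counter are illustrative assumptions, not something specified in the thread.

```python
def merge(a, b):
    """Standard two-way merge of two sorted lists."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def merge_pairwise(arrays):
    """Merge arrays in rounds of pairs (requires at least one array).

    Returns (result, total elements written, number of rounds).
    """
    level = [list(a) for a in arrays]
    copied = 0
    rounds = 0
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            merged = merge(level[i], level[i + 1])
            copied += len(merged)
            nxt.append(merged)
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd one out moves up a round without being copied
        level = nxt
        rounds += 1
    return level[0], copied, rounds


k, n = 8, 2
arrays = [list(range(start, start + k * n, k)) for start in range(k)]
result, copied, rounds = merge_pairwise(arrays)
print(rounds)  # 3 = log2(8)
print(copied)  # 48 = k * n * log2(k)
```

Every round touches all k*n elements once and halves the number of arrays, so there are about log2(k) rounds, which is where the log(k)*k*n figure comes from.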
 