# Algorithm Complexity: Sorted Arrays into Sorted Array

## Homework Statement

We have k >= 1 sorted arrays, each one containing n >= 1 elements (all equal length). We want to combine all of them in to a single sorted array with kn elements.
We have a "naive" algorithm: merge the first two arrays, then merge the third array into the result, then merge the fourth array into the result, and so on.
What's the complexity of the naive algorithm, as a function of k and n?

2. The attempt at a solution

Assuming k > 1. When comparing first two arrays, worst-case requires (n log n) comparisons.
Adding a third array to the resulting array has a worst-case of (2n log 2n) comparisons.
Similarly, adding a fourth has (3n log 3n) comparisons. This suggests a general worst-case equation for the algorithm: (k-1)n log (k-1)n

I don't think this is right... the next part of the question asks us to talk about a less expensive implementation, but (k-1)n log(k-1)n is already in big theta of n log n, which is plenty efficient.

Related Engineering and Comp Sci Homework Help News on Phys.org
rcgldr
Homework Helper
You could try some simple cases and count the number of operations. I'm not sure if there's supposed to be a complexity factor for "comparing" versus "copying" data. As an example of a simple case, here are two sets of integers to be merged:

{1, 2, 5, 8, 9} and {3, 4, 6, 7, 10}

What would the complexity be if each set only had 4 integers instead 5? What if each set only had 3 integers instead of 5?

Well, merging will happen k-1 times. So the merging complexity would be (k-1)n? Which is in big theta of n? (The merging part of mergesort is linear, not logarithmic, my bad).

http://www.wolframalpha.com/input/?i=sum+i=2+to+k+(i*n)

Reason:
The first merge will take O(n+n) operations (go through both lists once). The second merge
will take (2*n + n = 3n) operations (since you will have to go through the long list and once through the 3rd list. This results in an O(k^2*n) complexity.

rcgldr
Homework Helper
Well, merging will happen k-1 times. So the merging complexity would be (k-1)n? Which is in big theta of n?
Is this the optimized version or the naive version? In the optimized version where k arrays are merged into an output array, all k n elements are copied to the output array, but you're doing multiple compares for every element copied to the output array. If this is the naive version, you copy n+n data the first time, then 2n + n data the second time, ..., and I assume you're supposed to figure out the number of compares.

I don't know how copies and compares are factored into complexity or big theta.

I don't know how copies and compares are factored into complexity or big theta.
It doesn't matter, we don't care about constants in asymptotic notation.

http://www.wolframalpha.com/input/?i=sum+i=2+to+k+(i*n)

Reason:
The first merge will take O(n+n) operations (go through both lists once). The second merge
will take (2*n + n = 3n) operations (since you will have to go through the long list and once through the 3rd list. This results in an O(k^2*n) complexity.
Where do you get k^2 from? Don't fully follow.

Shouldn't it just be kn? That's what the pattern seems to suggest.

Where do you get k^2 from? Don't fully follow.
Easy:

$$\sum_{i=2}^k in = n\sum_{i=2}^k i = n\frac{k^2+k-2}{2} \in O(k^2*n)$$

rcgldr
Homework Helper
I don't know how copies and compares are factored into complexity or big theta.
It doesn't matter, we don't care about constants in asymptotic notation.
OK, but in one type of optimized case, there are k n copies done and in the worst case, (k) (k-1) n = (k2 - k) n compares done, how do you balance the copies versus compares for O(...) in this case?

OK, but in one type of optimized case, there are k n copies done and in the worst case, (k) (k-1) n = (k2 - k) n compares done, how do you balance the copies versus compares for O(...) in this case?
How did you get these numbers?

Merging two lists of size n takes at most 2*n-1 comparisons. In the first round you do
this for k/2 lists.

rcgldr
Homework Helper
OK, but in one type of optimized case, there are k n copies done and in the worst case, (k) (k-1) n = (k2 - k) n compares done, how do you balance the copies versus compares for O(...) in this case?
How did you get these numbers?
By merging k lists at the same time (one type of optimized case), k n elements copied, k-1 compares (worst case) for each element copied. This would be one of the "less expensive" implementations mentioned at the end of the original post.

You can also merge them in the same way merge sort does, that will give you something like O(log(k)*k*n) worst case.