Loading [MathJax]/jax/output/CommonHTML/jax.js

CommonLounge Archive

Hands-on: K-sort (with progressive 'hint-by-hint' solution)

July 20, 2018

We’ve just implemented the heap data structure and the heap-sort algorithm. Now, we’ll solve the K-sort problem using heaps. You can work in any programming language of your choice - Python, C++, Java, or anything else you want. Let’s get started!

Problem Statement

We’re given k sorted arrays. We need to combine the sorted arrays into a single sorted array in O(nlogk) time, where n is the total number of elements in all the arrays combined.

Note: The desired complexity is O(nlogk), not O(nlogn).

Constraints: n5000000 and k50.

IO description

Input description: The first line contains the number of test cases T. Each test case begins with a line with a single number k (number of sorted arrays). Each of the next k lines describes a sorted array. The first number in the line is ni, the number of elements in array i. Which is then followed by ni numbers which make the sorted array.

Sample Input:

2
2
3 16 30 126
2 21 113
4
2 95 95
3 16 30 126
2 21 113
1 42

Output description: Output should contain T lines. Each line should contain the answer for the respective test case - the combined sorted array.

Sample output:

16 21 30 113 126
16 21 30 42 95 95 113 126

Explanation:

The first test case consists of 2-sorted arrays that need to be combined. The second test case consists of 4-sorted arrays that need to be combined.

Grading your solution

You can find all the required files to grade your solution here.

First run your solution with the provided input

{COMMAND TO RUN Python/C++/Java CODE} < input > output

Then, grade your output using the grader and answer provided.

python grader.py output answer

Progressive ‘hint-by-hint’ solution

Only look at the hints if you can’t solve the problem for at-least 20-minutes. Read only one hint that you could not figure out on your own. Then spend at-least 20-minutes trying to solve the problem again.

Hint 1:

We are given k sorted arrays. Somehow, we need to use the sorted-ness of the given arrays to do sorting in O(n log k) instead of O(n log n).

Hint 2:

When we were sorting a general array of size n, we had to add all the elements to a heap, and hence the heap operations took log n time each. Now, it’s possible for us to ensure the total size of the heap is at-most k, so each operation takes time log k.

Hint 3:

We essentially need to perform the merge step from merge-sort, but with k arrays instead of 2. We can use a heap to do this efficiently.

Full solution

Here is the official solution for this assignment. The code executes in ~10 seconds. If you implemented your solution in C++ or Java, you should expect your solution to run at-least 5x faster.

A note on the run-time

The above heap-based solution executes in ~10 seconds.

If we simply combine all the elements into a single array and sort the array using quick-sort, then we get the following run-times.

  1. Our quick-sort implementation: 47 seconds.
  2. Our in-place quick-sort implementation: 30 seconds.
  3. Python’s optimized quick-sort implementation: 6 seconds.

As we can see, Pythons optimized implementation for quick-sort is faster than the optimization we achieved because of O(logk) vs O(logn), however we achieve a significant speed-up when we use unoptimized quick-sort implementation, even if we use in-place quick-sort.


© 2016-2025. All rights reserved.