Hands-on: Implementing Heaps and Heap-sort
June 30, 2018
Now that we have learnt about heap and heap-sort, it’s time to implement them. You can work in any programming language of your choice - Python, C++, Java, or anything else you want. Let’s get started!
Implementing Heap and Heap-sort
Python
Below, we provide a short Python code template for you to implement heap and heap-sort.
# heap_sort.py
# calculate index of parent, left child, right child
ROOT = 1
def parent(index):
return # TODO
def left_child(index):
return # TODO
def right_child(index):
return # TODO
# Heap() class
class Heap():
# heap data is stored in self.data
# first item of self.data is None so that the actual first item is at position 1
def __init__(self):
self.data = [None]
def insert(self, element):
# TODO: add element to end of heap
# TODO: upward-swaps
return True
def pop(self):
assert len(self.data) > 1
# store original root
result = self.data[ROOT]
# TODO: move last element to root
# TODO: downward-swaps
# return original root
return result
def heap_sort(array):
heap = Heap()
# insert everything into heap
for element in array:
heap.insert(element)
# pop everything from heap
result = list()
for ii in range(len(array)):
element = heap.pop()
result.append(element)
return result
# test
array = [5, 10, 8, 7, 3, 7, 6, 12, 2, 7]
result = heap_sort(array)
assert result == [2, 3, 5, 6, 7, 7, 7, 8, 10, 12], result
C++, Java, etc
If you’re not using Python, have your implementation accept test cases from standard input, and print results to standard output.
Input description: The first line contains the number of test cases T. This is followed by T lines. Each line starts with a number N, the size of the array. On the same line, the N numbers of the array are listed one after the other.
Sample input:
2
3 227 198 19
10 458 687 945 378 479 421 33 295 922 840
Output description: Output should contain T lines. Each line should just contain the numbers in sorted order, separated by spaces.
Sample output:
19 198 227
33 295 378 421 458 479 687 840 922 945
Notes
The trickiest part about implementing heaps is the upward and downward swaps. Specifically, be careful about when to stop swapping, and which child to swap with (when swapping downwards). Also, debugging these things needs patience.
Grading your solution
Python
You can use the following file to grade your solution.
# grader.py
from __future__ import print_function
import datetime
import random
from heap_sort import heap_sort
npassed, nfailed = 0, 0
for itest in range(1, 13):
print('Test case:', itest)
nelements = random.randint(int((10**(itest/2.))/2), int(10**(itest/2.)))
print('Array size:', nelements)
array = [random.randint(1, 100*nelements) for _ in range(nelements)]
tic = datetime.datetime.now()
submission = heap_sort(array)
toc = datetime.datetime.now()
answer = sorted(array)
correct = len(submission) == len(answer) and (submission == answer)
if correct:
print('PASSED (Runtime = %.2f seconds)' % (toc-tic).total_seconds())
npassed += 1
else:
print('FAILED')
nfailed += 1
print(array)
print(submission)
print(answer)
print('='*100)
print('TOTAL PASSED: %d. TOTAL FAILED: %d.' % (npassed, nfailed))
Then, run the following command:
python grader.py
and you should see the results of running the grader.
C++, Java, etc
You’ll find all the files required to check your solution here: Algorithms - Sorting. In particular, the folder includes 3 files - input
, answer
and grader.py
.
Do the following to check your solution.
{COMMAND TO RUN C++/JAVA CODE} < input > output
python grader.py output answer
Solution
The solution for this assignment can be found here: heap_sort.py. The code takes about 10 seconds to run for input sizes around 700,000. Note that if you implemented the solution in C++ / Java, your code should be executing about > 5x faster.