Introduction
Modern interview questions often tweak the basic heap usage to solve specific selection and ordering problems. Below we explore several common variants and how to adapt a min/max-heap template for each. These include maintaining a running k-th largest element, selecting top-k frequent elements, finding medians in a data stream with two heaps, and merging multiple sorted lists.
Heap-Based Problem Variants
Find K-th Largest Element in a Stream
Modified Template: Use a min-heap of fixed size k to store the k largest elements seen so far. The smallest element in this min-heap is always the k-th largest overall. For each new number in the stream, if the heap has fewer than k elements, simply push the number. Once the heap size reaches k, compare the new number with the heap’s minimum (root). If the new number is larger, pop the smallest and push the new number; otherwise, do nothing. This way, the heap retains only the largest k elements at all times, and the heap’s root gives the current k-th largest.
Example Problem: LeetCode 703 – Kth Largest Element in a Stream. Design a class that continuously maintains the k-th largest number as new integers are added. The solution uses a min-heap of size k to track the top k elements. Initially, insert the first k elements from the stream (if available) into the heap, removing any smaller surplus. Then for each incoming value:
- If the heap has fewer than
kelements (first few additions), push the new value. - Otherwise, compare the new value to the min of the heap (current k-th largest). If the new value is larger, remove the min and add the new value; if it’s smaller, ignore it.
The k-th largest element is then simply the heap’s root (minimum of the stored elements). Each insertion thus takes O(log k) time, and the heap never grows beyond size k.
For example, if k=3 and the stream starts with [4, 5, 8, 2], the initial 3rd largest is 4 (heap contains [4,5,8]). If a new number 3 arrives, it’s smaller than the heap’s min (4), so the heap stays [4,5,8] and the 3rd largest remains 4. If next a 10 arrives, it’s larger than the min (4), so 4 is popped and 10 is pushed, heap becomes [5,8,10] and the 3rd largest updates to 5.
Code (Python):
import heapq
class KthLargest:
def __init__(self, k: int, nums: list[int]):
self.k = k
self.minheap = []
# Build initial heap of at most k largest elements
for num in nums:
heapq.heappush(self.minheap, num)
if len(self.minheap) > k:
heapq.heappop(self.minheap) # pop smallest, keep size k
def add(self, val: int) -> int:
if len(self.minheap) < self.k:
# Fill heap until it has k elements
heapq.heappush(self.minheap, val)
elif val > self.minheap[0]:
# New value is larger than current kth largest, replace smallest
heapq.heapreplace(self.minheap, val)
# The smallest in heap is the kth largest overall
return self.minheap[0]
Top K Frequent Elements
Modified Template: To find the k most frequent items, use a hash map to count frequencies, then a heap to select the top frequencies. One common approach is a min-heap of size k storing (frequency, element) pairs. Iterate over the frequency map and push each pair onto the heap; if the heap grows larger than k, remove the smallest-frequency entry. This way, the heap only contains the k highest frequencies at any time.
Example Problem: LeetCode 347 – Top K Frequent Elements. Given an array of integers, return the k numbers that appear most frequently. The heap solution uses a frequency map and a min-heap to filter out the top k.
- Frequency count: First count occurrences of each number using a dictionary or
collections.Counter. For example, innums = [1,1,1,2,2,3]withk=2, we get counts{1: 3, 2: 2, 3: 1}. - Heap selection: Push each
(count, num)into a min-heap. Once the heap size exceedsk, remove the smallest count. After processing all numbers, the heap will contain[(2, 2), (3, 1)](frequencies 2 and 3) which correspond to the two most frequent elements{1, 2}. - Result extraction: The heap now holds the top
kfrequent elements. Extract the elements from the heap to return them.
Step-by-step example: For nums=[1,1,1,2,2,3], k=2, frequency map is {1:3, 2:2, 3:1}. We push (3,1), (2,2), (1,3) onto the heap. When heap size exceeds 2, smallest frequency pair is removed. Ultimately heap contains [(2,2), (3,1)]. Extracting numbers gives [2, 1].
Code (Python):
import heapq
from collections import Counter
def topKFrequent(nums: list[int], k: int) -> list[int]:
freq = Counter(nums)
heap = []
for num, count in freq.items():
heapq.heappush(heap, (count, num))
if len(heap) > k:
heapq.heappop(heap) # remove smallest frequency
# Extract the elements from the remaining heap entries
return [num for (count, num) in heap]
Median Finder (Two Heaps)
Modified Template: To continuously find the median of a growing stream, maintain two heaps for the two halves of the numbers – a max-heap for the lower half and a min-heap for the upper half. The max-heap (left side) holds values ≤ current median, and the min-heap (right side) holds values ≥ median. The median is either one of their tops or the average. After each insertion, re-balance heaps so their sizes differ by at most one.
Example Problem: LeetCode 295 – Find Median from Data Stream. Design a data structure with addNum(val) and findMedian().
addNum(x):
- Add to max-heap (lower half).
- If max-heap's top > min-heap's top, move max-heap's top to min-heap.
- Re-balance sizes: if one heap has >1 more elements, move its top to the other.
findMedian():
- If sizes equal, median is average of tops.
- Else, median is top of larger heap.
Example: Stream [5, 15, 1, 3].
- Add 5: maxH=[-5], minH=[]. Median: 5.
- Add 15: maxH=[-5], minH=[15]. Median: (5+15)/2=10.
- Add 1: maxH=[-1], minH=[5,15] (after balancing). Then rebalance sizes: maxH=[-1,-5], minH=[15]. Median: 5.
- Add 3: maxH=[-1,-3], minH=[5,15] (after balancing). Median: (3+5)/2=4.
Code (Python):
import heapq
class MedianFinder:
def __init__(self):
self.maxheap = [] # stores negative values
self.minheap = []
def addNum(self, num: int) -> None:
heapq.heappush(self.maxheap, -num)
if self.minheap and -self.maxheap[0] > self.minheap[0]:
val = -heapq.heappop(self.maxheap)
heapq.heappush(self.minheap, val)
if len(self.maxheap) > len(self.minheap) + 1:
val = -heapq.heappop(self.maxheap)
heapq.heappush(self.minheap, val)
elif len(self.minheap) > len(self.maxheap) + 1: # Corrected this line
val = heapq.heappop(self.minheap)
heapq.heappush(self.maxheap, -val)
def findMedian(self) -> float:
if len(self.maxheap) == len(self.minheap):
if not self.maxheap: return 0 # Or handle as an error/None for empty stream
return (-self.maxheap[0] + self.minheap[0]) / 2.0
elif len(self.maxheap) > len(self.minheap):
return -self.maxheap[0]
else:
return self.minheap[0]
Merge K Sorted Lists (Min-Heap)
Modified Template: To merge multiple sorted sequences, a min-heap efficiently tracks the next smallest element among the heads of each list. Use a tuple (value, list_index, element_index) in the heap. Initially, push the first element of each list. Repeatedly pop the heap’s top, append to result, and push that element’s next sibling from its originating list (if any).
Example Problem: LeetCode 23 – Merge k Sorted Lists. Merge k sorted linked lists (or arrays) into one.
Example Lists: L1:[1,3,5], L2:[2,4,6], L3:[0,9]
- Initialization: Heap:
[(0,2,0), (1,0,0), (2,1,0)]. Result:[]. - Pop
(0,2,0). Result:[0]. Push(9,2,1)from L3. Heap:[(1,0,0), (2,1,0), (9,2,1)]. - Pop
(1,0,0). Result:[0,1]. Push(3,0,1)from L1. Heap:[(2,1,0), (3,0,1), (9,2,1)]. - Continue until heap empty.
Final Result: [0,1,2,3,4,5,6,9] (adjusted for example lists).
Code (Python):
import heapq
def mergeKSortedLists(lists: list[list[int]]) -> list[int]:
minheap = []
# Initialize heap with first element from each list
for i, arr in enumerate(lists):
if arr: # list is non-empty
heapq.heappush(minheap, (arr[0], i, 0))
result = []
while minheap:
val, list_idx, element_idx_in_list = heapq.heappop(minheap) # Renamed variables for clarity
result.append(val)
# If there is a next element in the same list, push it
if element_idx_in_list + 1 < len(lists[list_idx]):
next_val = lists[list_idx][element_idx_in_list + 1]
heapq.heappush(minheap, (next_val, list_idx, element_idx_in_list + 1))
return result
# Example usage:
# lists = [[1,3,5,7], [2,4,6,8], [0,9,10,11]]
# print(mergeKSortedLists(lists))
# Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]