patternDSA/K-way Merge
Core Pattern → Pattern 23

K-way Merge

One min-heap as a frontier across k sorted lists. Push the smallest from each list. Pop the global min, push its successor. Heap size stays at k. O(N log k) — every element placed exactly once.

Frequency:★★★★☆
·~25 min·Intermediate–Hard·5 problems
The Core Shock

Sequential merge = O(kN) — early lists merged k times. Min-heap frontier = O(N log k) — each element placed exactly once.

Sequential (list1+list2+list3…)
O(kN)
Merge list1 into result. Merge list2 into result. Each element in list1 participates in k-1 merges.
Min-heap frontier
O(N log k)
One heap, always k candidates. Pop min globally. Push successor. Each element placed exactly once.

The insight: you only ever need to compare k candidates at once — the current front of each list. A min-heap of size k always gives you the global minimum in O(log k). Pop it, push the next element from the same list, repeat. Total: N pops × O(log k) each = O(N log k).

The Heap Tuple: (value, list_idx, elem_idx)

The three-part tuple is the entire design of this pattern.

value
Primary sort key — min-heap orders by this. The globally smallest unplaced value is always at heap[0].
list_idx
Which list this value came from. After popping, we advance the pointer in THIS list to get the next candidate.
elem_idx
Where in list[list_idx] we are. The next candidate is list[list_idx][elem_idx + 1].
For linked lists: use (node.val, list_idx, node) — store the node itself instead of elem_idx
For sorted matrix: use (value, row, col) — advancing right = col+1 in same row
Tie-breaking: when values are equal, Python compares list_idx next — always include list_idx to avoid TypeError on equal values

The 3-Step Framework

Three steps, all problems use the same loop.

01
Push the first element of each list

Initialize the heap with one (value, list_idx, elem_idx) tuple per list. Only push if the list is non-empty. This creates a frontier of k candidates — the current minimum of each list.

for i, lst in enumerate(lists):
    if lst:
        heapq.heappush(heap, (lst[0], i, 0))
# Heap size = k (or fewer if some lists empty)
heapq.heapify would also work: [(lst[0],i,0) for i,lst...] followed by heapify. heappush loop is clearer and handles empty lists naturally.
02
Pop min → place → push successor

While heap is not empty: pop (val, list_i, elem_i) — this is the globally smallest remaining element. Add val to result (or count toward kth). Push the next element from the same list: (lists[list_i][elem_i+1], list_i, elem_i+1).

while heap:
    val, li, ei = heapq.heappop(heap)
    result.append(val)
    if ei + 1 < len(lists[li]):
        heapq.heappush(heap,
            (lists[li][ei+1], li, ei+1))
The successor check (ei+1 < len(lists[li])) is crucial — don't push past the end of a list. When a list is exhausted, it simply has no more candidates in the heap.
03
For kth: stop at the kth pop

For 'kth smallest' problems, don't collect all results — just count pops. When count reaches k, return the current val. Same loop, different termination condition. The heap does all the work of finding the global order.

count = 0
while heap:
    val, li, ei = heapq.heappop(heap)
    count += 1
    if count == k: return val  # done!
    if ei+1 < len(lists[li]):
        heapq.heappush(heap, (lists[li][ei+1], li, ei+1))
The kth pop gives the kth smallest across all lists combined. This works because the heap always gives the global minimum — so the sequence of pops is a globally sorted traversal.

Pattern Recognition

'k sorted lists/arrays' → this pattern immediately.

Use K-way Merge When
"Merge k sorted lists/arrays into one"
"Kth smallest across k sorted lists"
"Kth smallest in a sorted matrix"
"Smallest range covering k lists"
K pairs with largest/smallest sums
External sort: merge k sorted file chunks
Find median from sorted matrix
Any problem with multiple sorted sequences to interleave
Common Pitfalls
Forgetting list_idx in tuple — can't know which list to advance
Pushing past end of list (no bounds check on elem_idx+1)
Equal values: heap compares list_idx next — include it to avoid TypeError
Sorted matrix: pushing both right AND down neighbors → duplicates
Smallest range: forgetting to update current_max when pushing new element
Smallest range: breaking when any list exhausted (can't span all k)
K pairs: seeding with all (i,0) pairs instead of (0,j) — wrong traversal
Linked list version: storing node.next instead of the node itself