Topological Sort
Introduction
Concept: Topological sort produces a linear ordering of nodes in a directed acyclic graph (DAG) such that each node comes before all nodes it points to. If there’s an edge U → V, then U appears earlier than V. This is only possible if the graph has no cycles.
Typical Use Cases: Scheduling problems (e.g. course prerequisite ordering, build systems), task dependency resolution. Interview problems like Course Schedule or Build Order are classic examples.
Keywords/Cues: “Prerequisites,” “dependency,” “precede,” “order of tasks,” “before/after relationship,” or explicit mention of a DAG.
Core Idea / Mental Model
Think of ordering tasks by prerequisites – always pick a task with no remaining prerequisites (no incoming edges), do it, then remove it and update prerequisites for others. This repeats until all tasks are ordered. A mental model is “repeatedly remove nodes with no incoming edges.”
Analogy: Building a house: you can’t paint walls before the foundation is laid. Maintain a list of tasks ready to do, do one, and update which new tasks become free.
Common methods:
- Kahn’s Algorithm (BFS): Track in-degrees. Pick nodes with in-degree 0, output, remove their outgoing edges (reducing neighbors' in-degrees). Detects cycles if no nodes have in-degree 0 but unprocessed nodes remain.
- DFS-based: Perform DFS. When finishing a node, add to a stack (or front of list). Produces reverse post-order, a topological order.
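The DFS-based approach can be sketched as follows. This is a minimal version (function name topo_sort_dfs is my own) that uses three-state coloring so it also detects cycles; the graph format matches the Kahn's template below, a dict of adjacency lists.

```python
def topo_sort_dfs(graph):
    # graph: dict mapping node -> list of adjacent nodes
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, in progress, finished
    color = {node: WHITE for node in graph}
    order = []

    def dfs(node):
        color[node] = GRAY
        for nei in graph[node]:
            if color[nei] == GRAY:  # back edge to an in-progress node
                raise ValueError("Graph is not a DAG - cycle detected")
            if color[nei] == WHITE:
                dfs(nei)
        color[node] = BLACK
        order.append(node)  # append on finish -> post-order

    for node in graph:  # handles disconnected components
        if color[node] == WHITE:
            dfs(node)
    return order[::-1]  # reverse post-order is a topological order

# topo_sort_dfs({"A":["B","C"], "B":["D"], "C":["D"], "D":[]})
# gives one valid order, e.g. ['A', 'C', 'B', 'D']
```

Note the difference from Kahn's: nodes are recorded when the DFS finishes them, so the list must be reversed at the end.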
Base Template (Python - Kahn's Algorithm)
from collections import deque
def topo_sort(graph):
    # graph: dict mapping node -> list of adjacent nodes
    indegree = {node: 0 for node in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1
    queue = deque([node for node in graph if indegree[node] == 0])
    topo_order = []
    while queue:
        node = queue.popleft()
        topo_order.append(node)
        for nei in graph[node]:
            indegree[nei] -= 1
            if indegree[nei] == 0:
                queue.append(nei)
    if len(topo_order) != len(graph):
        raise ValueError("Graph is not a DAG - cycle detected")
    return topo_order
# Example: graph = {"A":["B","C"], "B":["D"], "C":["D"], "D":[]}
# order = topo_sort(graph) -> ['A', 'B', 'C', 'D'] (or A,C,B,D)
Example walkthrough with graph = {"A":["B","C"], "B":["D"], "C":["D"], "D":[]}:
- In-degrees: {A:0, B:1, C:1, D:2}. Queue: [A]. Order: [].
- Dequeue A. Order: [A]. B in-degree: 0, C in-degree: 0. Enqueue B, C. Queue: [B,C].
- Dequeue B. Order: [A,B]. D in-degree: 1. Queue: [C].
- Dequeue C. Order: [A,B,C]. D in-degree: 0. Enqueue D. Queue: [D].
- Dequeue D. Order: [A,B,C,D]. Queue: [].
- Result: [A,B,C,D].
Time and Space Complexity
Time: O(V + E) for both Kahn’s and DFS-based; each node and edge is visited once.
Space: O(V + E) for graph, in-degree map/visited set O(V), queue/stack O(V).
Common Pitfalls
- Forgetting cycle checks (topo sort only for DAGs).
- Not initializing all nodes (especially isolated ones).
- Queue/stack misuse (e.g., using wrong data structure or wrong DFS order).
- Incorrectly updating in-degrees.
- Assuming a unique topological order (multiple can exist).
- Handling disconnected graphs (ensure all components are processed).
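On the non-uniqueness pitfall: when a problem asks for the lexicographically smallest topological order, Kahn's algorithm can be run with a min-heap in place of the FIFO queue, so the smallest available node is always processed first. A minimal sketch (function name topo_sort_smallest is my own):

```python
import heapq

def topo_sort_smallest(graph):
    # Kahn's algorithm, but always process the smallest zero in-degree node
    indegree = {node: 0 for node in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1
    heap = [node for node in graph if indegree[node] == 0]
    heapq.heapify(heap)
    order = []
    while heap:
        node = heapq.heappop(heap)  # smallest available node
        order.append(node)
        for nei in graph[node]:
            indegree[nei] -= 1
            if indegree[nei] == 0:
                heapq.heappush(heap, nei)
    if len(order) != len(graph):
        raise ValueError("Graph is not a DAG - cycle detected")
    return order

# {"A":["D"], "C":["B"], "B":["D"], "D":[]} has several valid orders
# (e.g. C,A,B,D); the heap picks ['A', 'C', 'B', 'D']
```

The only changes from the base template are heapify/heappush/heappop in place of deque operations; complexity becomes O((V + E) log V).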
Dijkstra’s Algorithm
Introduction
Concept: Finds shortest path distances from a single source to all other nodes in a weighted graph (non-negative edge weights). Produces a shortest-path tree.
Typical Use Cases: Pathfinding in maps (GPS), network routing, puzzle/game AI. Problems like minimum cost path in a grid or cheapest flight itinerary.
Keywords/Cues: “Shortest path,” “least cost,” “minimum distance,” weighted graph with positive weights. If negative weights, Dijkstra is not suitable (use Bellman-Ford).
Core Idea / Mental Model
A greedy extension of BFS for weighted graphs. Explores outward from source, always taking the next “closest” unvisited node. Maintains tentative distances (initially ∞, source 0). Pick node with smallest current distance, finalize it, relax (update) neighbors’ distances.
Mental Model: Water rippling out from source, reaching closest nodes first.
Priority Queue (Min-Heap): Efficiently selects the next closest node (distance, node).
Relaxation: If dist[U] + weight(U,V) < dist[V], update dist[V].
Base Template (Python)
import heapq
def dijkstra(graph, start):
    # graph: node -> list of (neighbor, weight)
    dist = {node: float('inf') for node in graph}
    dist[start] = 0
    pq = [(0, start)]  # (distance, node)
    visited = set()
    while pq:
        cur_dist, node = heapq.heappop(pq)
        if node in visited:
            continue  # stale entry, node already finalized
        visited.add(node)
        for neighbor, weight in graph[node]:
            new_dist = cur_dist + weight
            if new_dist < dist[neighbor]:
                dist[neighbor] = new_dist
                heapq.heappush(pq, (new_dist, neighbor))
    return dist
# graph = {'A':[('B',1),('C',4)], 'B':[('C',2),('D',5)], 'C':[('D',1)], 'D':[]}
# dijkstra(graph, 'A') -> {'A':0, 'B':1, 'C':3, 'D':4}
- Init: dist={A:0, B:inf, C:inf, D:inf}. PQ=[(0,A)]. Visited={}.
- Pop (0,A). Visited={A}. Relax A's neighbors: dist[B]=1, PQ=[(1,B)]; dist[C]=4, PQ=[(1,B),(4,C)].
- Pop (1,B). Visited={A,B}. Relax B's neighbors: dist[C]=min(4, 1+2=3), PQ=[(3,C),(4,C)]; dist[D]=1+5=6, PQ=[(3,C),(4,C),(6,D)].
- Pop (3,C). Visited={A,B,C}. Relax C's neighbors: dist[D]=min(6, 3+1=4), PQ=[(4,C),(4,D),(6,D)].
- Pop (4,C) (stale, C visited).
- Pop (4,D). Visited={A,B,C,D}. No unvisited neighbors. PQ=[(6,D)].
- Pop (6,D) (stale, D visited). PQ empty.
- Result: {A:0, B:1, C:3, D:4}.
Time and Space Complexity
Time: O((V + E) log V) with a binary min-heap, often quoted as O(E log V) for connected graphs. Without a heap (linear scan for the minimum), O(V²).
Space: O(V + E) for graph, dist map O(V), priority queue O(V) or O(E), visited set O(V).
Common Pitfalls
- Using with negative edge weights (incorrect results).
- Not using a heap (O(V²) instead of O(E log V)).
- Handling stale entries in priority queue incorrectly.
- Improper initialization of distances (source to 0, others to ∞).
- Early stopping logic errors if only one target is needed.
- Forgetting to reconstruct path if required (need predecessor map).
- Not considering disconnected graph components.
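Two of the pitfalls above, path reconstruction and early stopping for a single target, can be handled with a small variant of the base template. A sketch (function name dijkstra_path is my own): a predecessor map records how each node was reached, and the loop can break as soon as the target is popped, since its distance is final at that point.

```python
import heapq

def dijkstra_path(graph, start, target):
    # graph: node -> list of (neighbor, weight); returns (distance, path)
    dist = {node: float('inf') for node in graph}
    dist[start] = 0
    prev = {start: None}  # predecessor map for path reconstruction
    pq = [(0, start)]
    visited = set()
    while pq:
        cur_dist, node = heapq.heappop(pq)
        if node in visited:
            continue  # stale entry
        visited.add(node)
        if node == target:
            break  # safe early stop: target's distance is now final
        for neighbor, weight in graph[node]:
            new_dist = cur_dist + weight
            if new_dist < dist[neighbor]:
                dist[neighbor] = new_dist
                prev[neighbor] = node
                heapq.heappush(pq, (new_dist, neighbor))
    if target not in visited:
        return float('inf'), []  # target unreachable from start
    path = []
    node = target
    while node is not None:  # walk predecessors back to the source
        path.append(node)
        node = prev[node]
    return dist[target], path[::-1]

# graph = {'A':[('B',1),('C',2)], 'B':[('C',2),('D',5)], 'C':[('D',1)], 'D':[]}
# dijkstra_path(graph, 'A', 'D') -> shortest distance and one shortest path
```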
Minimum Spanning Tree (MST)
Introduction
Concept: A subset of edges in a weighted, undirected, connected graph that connects all vertices with the minimum possible total edge weight. It's a spanning tree (no cycles, all vertices connected).
Typical Use Cases: Network design (minimal cost cabling), VLSI chip design, clustering. Problems like "connect all points with minimum total distance."
Keywords/Cues: “Connect all... with minimum cost,” “network connecting all nodes,” “minimum infrastructure,” “least total distance.” Do not confuse with shortest path.
Core Idea / Mental Model
MST algorithms are greedy. Two main types:
- Kruskal’s Algorithm: Sort all edges by weight. Add edges from smallest weight, skipping if it forms a cycle (use Union-Find). Continue until V-1 edges are added.
- Prim’s Algorithm: Start from an arbitrary vertex. Grow the tree by adding the smallest weight edge that connects a new vertex to the current tree. Repeat until all vertices are included.
Mental Model: Pick cheapest connections while avoiding loops, until all nodes are connected. Kruskal's is global; Prim's expands locally.
Base Template (Python - Prim's Algorithm)
import heapq
def prim_mst(graph, start):
    # graph: node -> list of (neighbor, weight) for undirected graph
    visited = set([start])
    mst_edges = []
    total_cost = 0
    edges_heap = []  # (weight, u, v) where u in visited, v not
    for neighbor, weight in graph[start]:
        heapq.heappush(edges_heap, (weight, start, neighbor))
    while edges_heap and len(visited) < len(graph):
        weight, u, v = heapq.heappop(edges_heap)
        if v in visited:
            continue  # avoid cycle
        visited.add(v)
        mst_edges.append((u, v, weight))
        total_cost += weight
        for neighbor_of_v, weight_v_n in graph[v]:
            if neighbor_of_v not in visited:
                heapq.heappush(edges_heap, (weight_v_n, v, neighbor_of_v))
    return mst_edges, total_cost
# graph = {1:[(2,2),(3,10)], 2:[(1,2),(3,3)], 3:[(1,10),(2,3)]}
# prim_mst(graph, 1) -> ([(1,2,2),(2,3,3)], 5)
- Visited={1}. Heap=[(2,1,2), (10,1,3)]. MST=[], Cost=0.
- Pop (2,1,2). Add 2 to visited. MST=[(1,2,2)], Cost=2. Add edges from 2: (3,2,3) to heap. Heap=[(3,2,3), (10,1,3)].
- Pop (3,2,3). Add 3 to visited. MST=[(1,2,2),(2,3,3)], Cost=5. Add edges from 3: (10,3,1) - 1 is visited, skip. Heap=[(10,1,3)].
- All vertices visited, so the loop stops; the stale edge (10,1,3) left in the heap is never used.
- Result: Edges [(1,2,2),(2,3,3)], Cost 5.
Time and Space Complexity
Time: O(E log V) for Prim’s (with heap) and Kruskal’s (sorting edges).
Space: O(V + E) for graph, heap/edge list, visited set/union-find structure.
Common Pitfalls
- Forgetting cycle checks (Kruskal: union-find; Prim's: visited set).
- Handling disconnected graphs (a spanning tree only exists for a connected graph; applied per component, the result is a minimum spanning forest).
- Edge cases like single/two-node graphs.
- Not using weights correctly (must pick min-weight valid edge).
- Confusing MST with Shortest Path problems.
- Incorrect Union-Find implementation for Kruskal.
- Ending with incorrect number of edges (MST has V-1 edges).
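For contrast with the Prim's template, the Kruskal's approach from the Core Idea section can be sketched with a minimal Union-Find (path compression only; union by rank omitted for brevity; function names are my own):

```python
def kruskal_mst(num_vertices, edges):
    # edges: list of (weight, u, v) for an undirected graph,
    # vertices numbered 0..num_vertices-1
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression (halving)
            x = parent[x]
        return x

    mst_edges = []
    total_cost = 0
    for weight, u, v in sorted(edges):  # consider edges from smallest weight
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # u and v already connected: edge would form a cycle
        parent[ru] = rv  # union the two components
        mst_edges.append((u, v, weight))
        total_cost += weight
        if len(mst_edges) == num_vertices - 1:
            break  # an MST has exactly V-1 edges
    return mst_edges, total_cost

# Same graph as the Prim's example, with vertices 1..3 renumbered 0..2:
# kruskal_mst(3, [(2,0,1), (3,1,2), (10,0,2)]) -> ([(0,1,2), (1,2,3)], 5)
```

If fewer than V-1 edges are collected by the end, the graph is disconnected and no spanning tree exists (last pitfall above).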