Top Interview Questions for Graphs

Top Graph Traversal Variants & Patterns

BFS, DFS, and Common Applications in Interviews

Introduction to Graph Traversal

Graph traversal algorithms like Breadth-First Search (BFS) and Depth-First Search (DFS) are fundamental tools for solving a wide array of problems in coding interviews. These algorithms provide systematic ways to explore nodes and edges in a graph. Understanding their core mechanics, common modifications, and typical applications such as pathfinding, cycle detection, and component analysis is crucial for success.

Graph Traversal Variants & Patterns

BFS Traversal Variants

Breadth-First Search (BFS) explores a graph or grid level by level using a queue. Common BFS interview patterns include starting from multiple sources simultaneously, traversing grid-based graphs with boundary conditions, and terminating early when a target is found.

Multi-Source BFS (Simultaneous Sources)

Modified Template: Instead of enqueuing a single start node, initialize the BFS queue with all starting nodes at distance 0. The algorithm then proceeds as usual, expanding neighbors level by level from all sources in parallel. Key differences from single-source BFS:

  • Initialization: Enqueue all source nodes initially and mark them visited. They form level 0 of the traversal.
  • Parallel Expansion: During BFS, neighbors of any source (or subsequent layer) are enqueued, allowing a "wavefront" to expand from multiple points at once.

This pattern is useful for scenarios like spreading influence (fire, rot, distance) from multiple origins simultaneously.

Example Problem: Rotting Oranges (LeetCode 994) – Given a grid with fresh oranges (1) and rotten oranges (2), find the minimum minutes until all oranges rot (adjacent oranges rot each minute). All initially rotten oranges act as multiple BFS sources spreading rot in parallel.

Walkthrough: Imagine a grid where 2 marks rotten oranges at time 0 and 1 marks fresh ones:
2 1 1  
1 1 1  
1 1 2

Initially, two oranges (top-left and bottom-right) are rotten. We enqueue both positions as starting points (minute 0). At minute 1, all fresh oranges adjacent to any rotten orange become rotten (the rot spreads one step out from each source):

2 2 1  
2 1 1  
1 2 2

At minute 2, the rot spreads one more step from all newly rotten oranges:

2 2 2  
2 2 2  
2 2 2

Now all oranges are rotten. The BFS terminated after 2 minutes, which is the answer. The multi-source BFS ensured all rotten sources expanded simultaneously, covering the grid in the minimum time.

# Python Code
from collections import deque

def orangesRotting(grid):
    rows, cols = len(grid), len(grid[0])
    q = deque()
    fresh = 0
    # Initialize queue with all rotten oranges
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 2:
                q.append((r, c))
            elif grid[r][c] == 1:
                fresh += 1
    
    minutes = 0
    dirs = [(1,0),(-1,0),(0,1),(0,-1)]
    
    # BFS from all sources
    while q and fresh > 0:
        for _ in range(len(q)):  # iterate over current layer
            x, y = q.popleft()
            for dx, dy in dirs:
                nx, ny = x+dx, y+dy
                if 0 <= nx < rows and 0 <= ny < cols and grid[nx][ny] == 1:
                    grid[nx][ny] = 2       # rot the fresh orange
                    fresh -= 1
                    q.append((nx, ny))
        minutes += 1
        
    return minutes if fresh == 0 else -1

Here all rotten oranges were enqueued initially, and the BFS expanded from all of them in parallel. This yields the minimum time for the rot to reach all cells.

Grid BFS Traversal (Matrix Traversal)

Modified Template: When performing BFS on a 2D grid, treat each cell as a graph node with up to 4 neighbors (or 8 for diagonals, depending on the problem). The template adds boundary checks and typically uses a visited matrix (or in-place marking) instead of a visited set for efficiency. Differences from basic BFS:

  • Use loops or direction arrays to iterate over neighbor coordinates (e.g., up, down, left, right).
  • Check bounds to ensure the next cell is within the grid and not yet visited (or not an obstacle).
  • Mark cells as visited when enqueuing to avoid repeats.

Grid BFS often solves problems like flood fill, shortest path in a matrix, or connected regions.

Example Problem: Shortest Path in Binary Matrix (LeetCode 1091) – Given a binary matrix where 0 is open and 1 is blocked, find the shortest path from the top-left to the bottom-right cell (8-directional moves allowed). This is an unweighted grid, so BFS finds the shortest distance.

Walkthrough: Consider a grid:
0 0 1  
0 1 0  
0 0 0

Start at (0,0). BFS explores neighbors in 8 directions. Level by level expansion:

  1. Step 1: from (0,0) reach (0,1) and (1,0) (distance 1).
  2. Step 2: from those, reach (1,1) (blocked, skip), (0,2) (blocked), (1,2), (2,0), (2,1) (distance 2).
  3. Step 3: continue until the bottom-right (2,2) is reached. The first time we reach (2,2) is via a shortest path – 3 moves here (LeetCode 1091 reports the path length in cells, which would be 4).

Key Points: Use a queue of coordinates. Mark visited cells (e.g., set them to 1 or use a separate visited matrix) when enqueuing. Include boundary checks (0 <= nx < n and 0 <= ny < m) and skip cells that are out of bounds, already visited, or blocked. BFS guarantees the first visit to the target is via the shortest route.
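The grid BFS described above can be sketched as follows, assuming the LeetCode 1091 convention of reporting path length in cells (the walkthrough counts moves; the cell count is one more):

```python
from collections import deque

def shortestPathBinaryMatrix(grid):
    n = len(grid)
    # No path if either endpoint is blocked
    if grid[0][0] == 1 or grid[n-1][n-1] == 1:
        return -1
    # 8-directional neighbors
    dirs = [(-1,-1),(-1,0),(-1,1),(0,-1),(0,1),(1,-1),(1,0),(1,1)]
    q = deque([(0, 0, 1)])  # (row, col, path length in cells)
    grid[0][0] = 1          # mark visited in place
    while q:
        r, c, dist = q.popleft()
        if r == n-1 and c == n-1:
            return dist     # first arrival is via a shortest path
        for dr, dc in dirs:
            nr, nc = r+dr, c+dc
            if 0 <= nr < n and 0 <= nc < n and grid[nr][nc] == 0:
                grid[nr][nc] = 1  # mark visited when enqueuing
                q.append((nr, nc, dist + 1))
    return -1
```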

Early-Exit BFS (Target Search)

Modified Template: When searching for a specific target node/state in an unweighted graph, BFS can terminate as soon as the target is found (since BFS finds the shortest path to it). The difference from full traversal is the inclusion of a check to break out early:

  • Each time you dequeue a node, check if it’s the target. If yes, you can immediately return the current depth or solution (no need to explore further levels).
  • Alternatively, if you are tracking levels externally, you can break out of the loop when the target is reached.

This pattern is common in puzzles and pathfinding where you only need the shortest distance or path to a particular goal, not traversal of the entire graph.

Example Problem: Open the Lock (LeetCode 752) – You start at combination "0000" on a lock and have a target combination to reach. You can rotate wheels to change digits, and certain combinations are "deadends" (forbidden). Find the minimum number of moves to reach the target. BFS explores combinations by simulating wheel turns.

Walkthrough: Model each lock state as a node in a graph, where an edge connects states that differ by one wheel turn. Starting from "0000", BFS generates all combinations 1 move away (e.g., "1000","9000","0100","0900","0010","0090","0001","0009", excluding deadends). These form level 1. The search continues level by level. The moment the target combination is dequeued, BFS can terminate and return the depth (number of moves). For example, if the target is "0202", BFS might find it in 6 moves; once found, we stop without exploring the remaining states at that level or beyond.
# Python Code Snippet
from collections import deque

def minTurns(start, target, deadends):
    dead = set(deadends)
    if start in dead: return -1
    
    q = deque([(start, 0)]) # (state, moves)
    visited = {start}
    
    while q:
        state, moves = q.popleft()
        
        if state == target:
            return moves  # Early exit on target
            
        # Generate neighbors by turning each wheel one step
        for i in range(4): # Assuming 4-digit lock
            original_digit = int(state[i])
            for diff in (-1, 1):
                new_digit = (original_digit + diff + 10) % 10 # Handle wrap around
                new_state = list(state)
                new_state[i] = str(new_digit)
                new_state_str = "".join(new_state)
                
                if new_state_str not in visited and new_state_str not in dead:
                    visited.add(new_state_str)
                    q.append((new_state_str, moves + 1))
                    
    return -1 # Target not reachable

Here we return as soon as state == target, ensuring we don’t waste time exploring states beyond the target’s depth. In an interview, always mention that BFS naturally finds the shortest solution, so early exit is valid. (Bidirectional BFS is another variant in which two searches, one from the start and one from the target, run toward each other and meet in the middle, effectively halving the search depth. It is often used to optimize target-search problems like Word Ladder – see Shortest Path patterns below for details.)

DFS Traversal Variants

Depth-First Search (DFS) explores as far as possible along a branch before backtracking. It’s commonly implemented with recursion or an explicit stack. Interview patterns for DFS often involve using extra state during recursion or exploring all possible paths (backtracking).

DFS with Backtracking (Extra State per Path)

Modified Template: In backtracking problems, a DFS recursion carries additional state (such as the current path, running sum, index, etc.) and may undo changes ("backtrack") when returning up the call stack. The base DFS template (dfs(node): mark visited; for each neighbor: if not visited: dfs(neighbor)) is modified to:

  • Pass an extra parameter in the recursive function (e.g., current path list, current depth, remaining quota, etc.).
  • Before diving into a recursive call, set up state (mark visited or choose an option). After coming back (backtracking), undo the state (un-mark visited or remove the last choice from the path).

This allows exploring all paths or combinations without permanent effects on other branches. It’s essential for puzzles, pathfinding with constraints, and combinatorial search.

Example Problem: Word Search (LeetCode 79) – Given a 2D board of letters and a word, determine if the word exists in the grid by tracing an adjacent path of letters. We use DFS with backtracking to explore possible paths for the word.

Walkthrough: Consider a board:
A B C  
D E F  
G H I

and the word "ABE". We start DFS from any cell that matches the first letter 'A'. From (0,0) = A, we look for 'B' among its 4-directional neighbors. We go right to (0,1) = B. Next, we need 'E' – from B's neighbors, down to (1,1) = E is a match. We found the word. During this search, we mark cells as visited on the current path to avoid revisiting (e.g., mark (0,0) and (0,1) as used). If a path fails, we backtrack: un-mark the cell and try a different direction.

# Python Code Snippet (Conceptual for Word Search)
def dfs(board, word, r, c, idx, visited):
    if idx == len(word):
        return True # Found the whole word
    
    rows, cols = len(board), len(board[0])
    if not (0 <= r < rows and 0 <= c < cols) or \
       (r,c) in visited or board[r][c] != word[idx]:
        return False
    
    visited.add((r,c)) # Mark as visited for current path
    
    # Explore all 4 directions
    found = (dfs(board, word, r+1, c, idx+1, visited) or
             dfs(board, word, r-1, c, idx+1, visited) or
             dfs(board, word, r, c+1, idx+1, visited) or
             dfs(board, word, r, c-1, idx+1, visited))
             
    visited.remove((r,c)) # Backtrack: un-mark for other paths
    return found

# Main function: start DFS from every cell matching the first letter
def exist(board, word):
    rows, cols = len(board), len(board[0])
    for r in range(rows):
        for c in range(cols):
            if board[r][c] == word[0]:
                if dfs(board, word, r, c, 0, set()):
                    return True
    return False

We start dfs from every cell that matches the first letter. The backtracking ensures each cell can be reused in new paths after a path is abandoned. This pattern (DFS + backtracking) is common for solving maze-like puzzles, generating permutations/combinations, and other scenarios where we need to try all possibilities and revert changes when a branch fails.

Recursive DFS with State (Coloring or Markers)

Modified Template: Sometimes we augment DFS with a state that persists across the recursion, such as coloring nodes, tracking a recursion stack, or recording an extra property like depth. This is used to handle constraints that simple visited sets can’t. The template typically involves:

  • A global or external structure (array/map) to store state per node (e.g., color, timestamp, component id).
  • The DFS function updates that state when visiting a node and uses it to decide actions for neighbors.

Example Problem: Is Graph Bipartite? (LeetCode 785) – Determine if an undirected graph can be colored with 2 colors such that no edge connects nodes of the same color. We use DFS with a coloring state.

Walkthrough: We iterate over all nodes, and for each uncolored component, start a DFS and assign the first node color 0 (say, Red). For each edge, the neighbor gets the opposite color 1 (Blue). We propagate this assignment through DFS. If we ever encounter a conflict (trying to color a neighbor with a color different from the required one), the graph isn’t bipartite.

For example, take a graph with edges: 0-1, 1-2, 2-3, 3-0 (a 4-cycle). DFS from node 0: color 0=Red. Neighbor 1 gets Blue; from 1, neighbor 2 gets Red; from 2, neighbor 3 gets Blue. Back at 0, neighbor 3 must be Blue, and it already is – consistent, no conflict. If we had an odd cycle (say a triangle 0-1-2-0), a conflict would arise.

# Python Code Snippet
def isBipartite(graph):
    N = len(graph)
    colors = {}  # node -> 0 or 1

    def dfs(u, curr_color):
        colors[u] = curr_color
        for v in graph[u]:
            if v not in colors:
                if not dfs(v, 1 - curr_color): 
                    return False
            elif colors[v] == colors[u]:
                return False  # Adjacent same color -> not bipartite
        return True

    for node in range(N):
        if node not in colors:
            if not dfs(node, 0): # Start coloring with color 0
                return False
    return True

Here the extra state is the colors map carried through the DFS recursion. This pattern (DFS with a global map/array for states) is also seen in cycle detection for directed graphs (using states like unvisited/visiting/visited), or marking nodes with IDs in flood fill, etc.

Iterative DFS (Using an Explicit Stack)

Modified Template: This is an implementation variant where recursion is replaced by a stack data structure to avoid deep recursion or to have more control over the traversal order. The algorithm pushes the start node onto a stack and loops while the stack is non-empty, simulating the call stack:

  • Pop the top of the stack as the current node; if it is not yet visited, mark and process it.
  • Push its neighbors onto the stack (often in reverse order of desired processing since stack is LIFO).

This achieves the same order as recursive DFS would (if neighbors are pushed in correct sequence).

Use Case: Iterative DFS is often used when recursion depth might be too large (risking stack overflow), or when you need to modify traversal (for example, DFS that finds connected components iteratively). While not typically a separate “pattern” asked in interviews, it’s good to understand for completeness. For instance, performing DFS on a binary tree iteratively using a stack is a common exercise (in-order, pre-order traversals, etc.). The logic remains the same as recursive DFS, just explicitly managing the stack.
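The stack-based template can be sketched as follows (a minimal version, assuming the graph is an adjacency dict; the function name is illustrative):

```python
def dfs_iterative(graph, start):
    """Iterative DFS returning nodes in visit (pre-order) order."""
    visited = set()
    stack = [start]
    order = []
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbors in reverse so they are visited in listed order,
        # matching what the recursive version would produce
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in visited:
                stack.append(neighbor)
    return order
```

Because nodes may be pushed more than once before being visited, the visited check happens on pop, not on push.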

Cycle Detection (Directed & Undirected)

Detecting cycles in graphs is a fundamental task. Approaches differ for undirected vs directed graphs.

Cycle Detection in Undirected Graph (DFS)

Modified Template: In an undirected graph, a cycle exists if a DFS traversal finds an adjacent node that is already visited and is not the direct parent of the current node. The DFS algorithm is modified to carry a parent parameter:

  • Mark nodes as visited during DFS.
  • For each neighbor, if it’s not visited, DFS on it with current node as parent. If it’s visited and not equal to the parent, then we found a back-edge (cycle).

This works because in undirected graphs, an already visited neighbor that isn’t the node we came from indicates a loop.

Example Problem: Graph Valid Tree (LeetCode 261) – Determine if a given undirected graph with n nodes and edges is a valid tree (which by definition means no cycles and exactly n-1 edges). We can use DFS cycle detection to check for any cycle.

Walkthrough: Suppose n=5 and edges = [[0,1],[1,2],[2,3],[3,1],[4,0]]. A DFS from node 0:
  1. Visit 0, go to 1 (parent 0).
  2. From 1, go to 2 (parent 1).
  3. From 2, go to 3 (parent 2).
  4. From 3, explore neighbors: it sees neighbor 1 which is visited and is not 2 (its parent) – this indicates a cycle (3->1->2->3). We can report a cycle found.
# Python Code Snippet
def hasCycleUndirectedDFS(graph, n):
    visited = set()
    # graph should be an adjacency list (e.g., dict mapping node -> neighbors)
    def dfs(u, parent):
        visited.add(u)
        for v in graph.get(u, []):
            if v == parent:
                continue  # Skip the edge back to parent
            if v in visited:
                return True   # Cycle found (back-edge to an already visited node that's not parent)
            if dfs(v, u):
                return True
        return False

    # For disconnected graphs, iterate through all nodes
    for i in range(n):
        if i not in visited:
            if dfs(i, -1): # -1 indicates no parent for the starting node of a DFS traversal
                return True # Cycle found in this component
    return False # No cycle found in any component

Here we return True if a cycle is found. In a valid tree, hasCycleUndirectedDFS would return False, and additionally we’d check that all n nodes got visited (to ensure connectivity).

Cycle Detection in Undirected Graph (Union-Find)

Modified Template: The Union-Find (Disjoint Set Union) data structure can detect cycles by trying to union connected nodes. If an edge connects two nodes that are already in the same set (i.e., already connected), then adding that edge would create a cycle. Steps:

  • Initialize Union-Find with all nodes separate.
  • For each edge (u, v), check the find-set of u and v.
  • If they are the same, a cycle is detected (this edge is redundant).
  • If not, union the sets (connect the components).

This is efficient (almost O(1) per edge) and often used in tree validation or finding an extra edge.

Example Problem: Redundant Connection (LeetCode 684) – You’re given a graph that started as a tree (n nodes, n-1 edges) and then had one extra edge added, forming a cycle. Find the extra edge (the redundant one that creates a cycle). Using Union-Find:

Walkthrough: Suppose the edges are [[1,2],[2,3],[3,4],[1,4],[4,5]]. We process edges one by one:
  1. Union(1,2) – no cycle (1 and 2 were separate).
  2. Union(2,3) – no cycle (2 and 3 were separate).
  3. Union(3,4) – no cycle.
  4. Union(1,4) – now, find(1) == find(4) because 1-2-3-4 are connected, so this edge connects nodes already in the same component. Cycle detected! Edge [1,4] is the redundant connection. We return [1,4]. (We could stop here.)
# Python Code Snippet
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n + 1)) # Assuming nodes are 1 to n or 0 to n-1
        self.rank = [0] * (n + 1)

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x]) # Path compression
        return self.parent[x]

    def union(self, x, y):
        rootX = self.find(x)
        rootY = self.find(y)
        if rootX == rootY:
            return False  # Cycle detected or already in same set
        
        # Union by rank
        if self.rank[rootX] < self.rank[rootY]:
            self.parent[rootX] = rootY
        elif self.rank[rootX] > self.rank[rootY]:
            self.parent[rootY] = rootX
        else:
            self.parent[rootY] = rootX
            self.rank[rootX] += 1
        return True

# Example usage for Redundant Connection (LeetCode 684)
def findRedundantConnection(edges):
    n = 0
    for u, v in edges:
        n = max(n, u, v)
    uf = UnionFind(n)
    for u, v in edges:
        if not uf.union(u, v):
            return [u, v]
    return []  # Should not happen per the problem statement

Union-Find provides an alternative to DFS for cycle detection in undirected graphs and is particularly handy in scenarios like this where we want to identify the specific cycle-forming edge.

Cycle Detection in Directed Graph (DFS & Recursion Stack)

Modified Template: In directed graphs, a cycle exists if we encounter a node that is already in the current DFS recursion stack (i.e., an ancestor in the DFS tree that we haven’t finished exploring yet). We modify DFS by maintaining a visiting state:

  • Use an array or set to mark nodes as visiting (in current path) and visited (completely processed).
  • When DFS visits a neighbor:
    • If the neighbor is visiting, we found a cycle (back-edge to an ancestor currently in stack).
    • If the neighbor is not visited, do DFS on it and propagate cycle detection.
  • After exploring all neighbors of a node, mark it as done (move from visiting to visited).

Example Problem: Course Schedule (LeetCode 207) – You’re given prerequisites for courses in a directed graph. Determine if it’s possible to finish all courses (i.e., no cycle in the prerequisite graph). Use DFS to detect cycles.

Walkthrough: Suppose you have courses and dependencies: 0->1, 1->2, 2->3, 3->1. This has a cycle (1→2→3→1). Using DFS:
  1. Start DFS at course 0: mark 0 visiting, then DFS(1), mark 1 visiting, DFS(2), mark 2 visiting, DFS(3), mark 3 visiting. Now from 3, neighbor is 1 which is already visiting – cycle found. We can stop and report a cycle.

If no cycle is found after DFS from all components, then all courses can be finished.

# Python Code Snippet
def canFinish(numCourses, prerequisites):
    graph = [[] for _ in range(numCourses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    # States: 0 = unvisited, 1 = visiting, 2 = visited
    state = [0] * numCourses

    def hasCycle(node):
        state[node] = 1  # Mark as visiting
        for neighbor in graph[node]:
            if state[neighbor] == 1:  # Cycle detected
                return True
            if state[neighbor] == 0:  # If unvisited, DFS
                if hasCycle(neighbor):
                    return True
        state[node] = 2  # Mark as visited (finished processing)
        return False

    for course in range(numCourses):
        if state[course] == 0:  # If unvisited, start DFS
            if hasCycle(course):
                return False # Cycle detected, cannot finish
    return True # No cycle found

Here hasCycle returns True if a cycle is detected. We mark nodes as visiting (state 1) when we enter them, and visited (state 2) when fully done. A back edge to any visiting node signals a directed cycle. This is the classic algorithm for cycle detection in directed graphs. (Alternatively, one could perform a topological sort via Kahn’s algorithm and check whether all nodes were output – if not, a cycle exists. That BFS approach is effectively another variant: if the number of courses processed < total courses, a cycle is detected.)

Topological Sort Variants

Topological sort is an ordering of vertices in a directed acyclic graph (DAG) such that each directed edge (u -> v) implies u comes before v in the order. Variants revolve around how the sort is produced (DFS vs BFS) and handling of multiple valid orders.

Topological Sort via DFS (Post-order Stack)

Modified Template: Perform a DFS on the directed graph and use a stack or list to collect nodes after exploring all their neighbors (post-order). Then reverse that list to get the topological order. Key aspects:

  • Use a DFS with cycle detection (as above) to ensure no cycle (if a cycle exists, topological sort is not possible).
  • When a node finishes (all neighbors are visited), push it onto a result stack.
  • The final stack (reversed) yields one valid ordering. If multiple orders are possible (graph is not a single chain), any DFS order is acceptable unless a specific order is required.

Example Problem: Course Schedule II (LeetCode 210) – Return a possible order to finish all courses given prerequisite pairs, or an empty list if impossible. Using DFS:

Walkthrough: Suppose numCourses = 4, prerequisites = [[1,0],[2,0],[3,1],[3,2]]. The graph is: 0->1, 0->2, 1->3, 2->3. A valid course order is [0,2,1,3] (many others are also valid). DFS approach:
  1. Start at 0: DFS to 1, then to 3. Node 3 has no further neighbors; add 3 to result stack and backtrack. Add 1 to stack after 3.
  2. Still in DFS(0): next neighbor 2 (since 1 is done). DFS to 2, then to 3, but 3 is already visited, so return. Add 2 to stack.
  3. Add 0 to stack. Nodes finish in the order 3, 1, 2, 0; reversing this finish order gives the output [0, 2, 1, 3].
# Python Code Snippet
def findOrderDFS(numCourses, prerequisites):
    graph = [[] for _ in range(numCourses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)

    state = [0] * numCourses # 0: unvisited, 1: visiting, 2: visited
    topo_order = []

    def dfs(node):
        state[node] = 1 # Mark as visiting
        for neighbor in graph[node]:
            if state[neighbor] == 1: # Cycle
                return False 
            if state[neighbor] == 0:
                if not dfs(neighbor):
                    return False
        state[node] = 2 # Mark as visited
        topo_order.append(node) # Add to order (will be reversed later)
        return True

    for course in range(numCourses):
        if state[course] == 0:
            if not dfs(course):
                return [] # Cycle detected
    
    return topo_order[::-1] # Reverse for correct topological order

This yields one topological ordering (if it exists). In the example, one such order is [0, 2, 1, 3]. DFS-based topo sort is useful and straightforward; just remember to handle cycles.

Topological Sort via BFS (Kahn’s Algorithm)

Modified Template: Kahn’s algorithm uses indegrees and a queue:

  • Compute the indegree (number of incoming edges) for every node.
  • Enqueue all nodes with indegree 0 (no prerequisites).
  • Repeatedly dequeue a node, add it to the topo order, and reduce the indegree of its neighbors by 1 (simulating "removing" that node’s edges). If any neighbor’s indegree drops to 0, enqueue it.
  • Continue until the queue is empty. If all nodes are output, you have a valid topological sort; if not (some nodes never got to indegree 0), a cycle exists.

Example Problem: Course Schedule II (revisited) – We can solve the same example with Kahn’s BFS approach.

Walkthrough: For the graph 0->1, 0->2, 1->3, 2->3:
  1. Initial indegrees: deg(0)=0, deg(1)=1, deg(2)=1, deg(3)=2. Start with queue = [0] (only course 0 has no prerequisites).
  2. Dequeue 0, output [0]. Reduce indegree of 1 and 2 (neighbors of 0): now deg(1)=0, deg(2)=0, deg(3)=2. Enqueue [1,2].
  3. Dequeue 1, output [0,1]. Reduce indegree of 3: deg(3)=1. (2 is still in queue.)
  4. Dequeue 2, output [0,1,2]. Reduce indegree of 3: deg(3)=0. Enqueue [3].
  5. Dequeue 3, output [0,1,2,3]. Done. We outputted all 4 courses. (If some courses were not output, it would mean a cycle prevented a complete ordering.)
# Python Code Snippet
from collections import deque

def findOrderBFS(numCourses, prerequisites):
    graph = [[] for _ in range(numCourses)]
    indegree = [0] * numCourses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        indegree[course] += 1

    q = deque([i for i in range(numCourses) if indegree[i] == 0])
    topo_order = []

    while q:
        node = q.popleft()
        topo_order.append(node)
        for neighbor in graph[node]:
            indegree[neighbor] -= 1
            if indegree[neighbor] == 0:
                q.append(neighbor)
    
    return topo_order if len(topo_order) == numCourses else []

In our example, one BFS topological order obtained is [0,1,2,3] (note that [0,2,1,3] is equally valid; the order of 1 and 2 might differ since both became 0-indegree at the same time).

Insight: Both DFS and BFS methods produce valid topological sorts. BFS (Kahn’s) is often easier to implement without recursion and can naturally detect cycles by count of output nodes. DFS is convenient when you want to leverage recursion and get a reverse post-order. Some problems (e.g., alien dictionary) might impose additional ordering requirements (like lexicographically smallest order), which can be handled by choosing the smallest next node (using a priority queue for Kahn’s, or sorting neighbors for DFS). But the core patterns remain the two above.
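The priority-queue variant mentioned above can be sketched like this (function name hypothetical): replacing Kahn’s FIFO queue with a min-heap always dequeues the smallest available node, yielding the lexicographically smallest valid order.

```python
import heapq

def lexSmallestTopoOrder(numCourses, prerequisites):
    graph = [[] for _ in range(numCourses)]
    indegree = [0] * numCourses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        indegree[course] += 1

    # Min-heap of all currently available (indegree-0) nodes
    heap = [i for i in range(numCourses) if indegree[i] == 0]
    heapq.heapify(heap)

    order = []
    while heap:
        node = heapq.heappop(heap)  # always take the smallest available node
        order.append(node)
        for neighbor in graph[node]:
            indegree[neighbor] -= 1
            if indegree[neighbor] == 0:
                heapq.heappush(heap, neighbor)

    # Incomplete output means a cycle prevented a full ordering
    return order if len(order) == numCourses else []
```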

Connected Components & Island Counting

This pattern involves finding and counting disjoint connected subgraphs (components) in a graph or grid. Variants differ in the graph structure (generic graph vs grid) and the metrics collected (just count, or size, or other properties of each component).

Connected Components in an Undirected Graph

Modified Template: Use either DFS/BFS or Union-Find to find how many connected components (clusters of connected nodes) an undirected graph has. If using DFS/BFS:

  • Loop through all nodes. If a node is not yet visited, that means a new component – perform a DFS/BFS from that node marking all reachable nodes as visited, and increment the component count.
  • Continue until all nodes are visited.

If using Union-Find:

  • Initialize each node as its own parent. Union each edge’s nodes.
  • In the end, count how many unique parents (roots) remain, which gives the number of components.

Example Problem: Number of Connected Components in an Undirected Graph (LeetCode 323) – Given n nodes labeled 0..n-1 and a list of undirected edges, return how many connected components the graph has.

Walkthrough: Suppose n = 5, edges = [[0,1],[1,2],[3,4]]. The graph has two components: one connecting {0,1,2} and another connecting {3,4}.

DFS/BFS approach: Start at node 0, visit 1 and 2 (through edges). Mark {0,1,2} visited, count = 1. Nodes 1 and 2 are then skipped since they are already visited. Move to node 3 (not yet visited), mark component 2, and visit 4. Now all nodes are visited; answer = 2 components.

Union-Find approach: Initially 5 separate sets. Process edge 0-1 (union them), 1-2 (union them, now 0-1-2 are in one set), 3-4 (union them). Now check roots: perhaps root(0)=0 (for {0,1,2}) and root(3)=3 (for {3,4}), and root(4) also 3. We see two distinct roots, so 2 components.

# Python Code Snippet (Union-Find method)
def countComponents(n, edges):
    parent = list(range(n))
    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])
        return parent[x]

    def union(x, y):
        rootX = find(x)
        rootY = find(y)
        if rootX != rootY:
            parent[rootY] = rootX
            return 1 # A merge happened
        return 0 # Already in same set

    num_components = n
    for u, v in edges:
        num_components -= union(u, v)
    
    return num_components

# Alternative: DFS-based counting
def countComponentsDFS(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    visited = [False] * n
    count = 0
    def dfs(u):
        visited[u] = True
        for v in adj[u]:
            if not visited[v]:
                dfs(v)
    for i in range(n):
        if not visited[i]:
            dfs(i)
            count += 1
    return count

Either approach works; DFS/BFS is straightforward to implement, while Union-Find is very efficient for repeated merge operations.

Number of Islands (Grid Connected Components)

Modified Template: Treat a 2D grid of 1s and 0s as a graph (often 4-directionally connected). Count how many groups of adjacent 1 cells exist. This is effectively counting connected components in a matrix. Usually solved with DFS or BFS flood-fill:

  • Loop over all cells. When an unvisited land cell (1) is found, increment count and perform a DFS/BFS to mark all reachable land in that island as visited (e.g., change 1 to 0 or use a visited set).
  • Continue scanning the grid for any other unvisited 1 to find other islands.

Example Problem: Number of Islands (LeetCode 200) – Given a binary grid (1 = land, 0 = water), count the number of distinct islands.

Walkthrough: Consider a grid:
1 1 0  
1 0 0  
0 0 1

Here, there are 2 islands. Starting at cell (0,0): it's land, start a DFS/BFS. It will visit (0,1) and (1,0) as they are connected land. Mark all as visited. That is island #1. Continue scanning: (0,1),(1,0) skipped as visited; (1,1) is water; … eventually reach (2,2): land not visited, that's a new island (#2). Mark it. No more unvisited land cells remain. Answer = 2.

Key considerations: Use a visited set or modify the grid in-place (e.g., set visited land to 0) to avoid recounting. Ensure you check bounds for neighbors. This pattern is fundamental for flood-fill and has variations (8-directional connectivity, bigger grids, etc.).

# Python Code Snippet (DFS for islands)
def numIslands(grid):
    if not grid: return 0
    R, C = len(grid), len(grid[0])
    count = 0

    def dfs(r, c):
        if not (0 <= r < R and 0 <= c < C and grid[r][c] == '1'):
            return
        grid[r][c] = '0'  # Mark as visited (or use a separate visited set)
        dfs(r + 1, c)
        dfs(r - 1, c)
        dfs(r, c + 1)
        dfs(r, c - 1)

    for i in range(R):
        for j in range(C):
            if grid[i][j] == '1':
                count += 1
                dfs(i, j)
    return count

BFS could be used similarly (using a queue). The result is the number of connected land components.
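As a sketch, the BFS version could look like this (numIslandsBFS is a name chosen here for illustration; it follows the same string-grid convention as the DFS version and likewise marks visited cells in place):

```python
from collections import deque

def numIslandsBFS(grid):
    if not grid:
        return 0
    R, C = len(grid), len(grid[0])
    count = 0
    for i in range(R):
        for j in range(C):
            if grid[i][j] == '1':
                count += 1
                grid[i][j] = '0'          # mark visited when enqueued
                q = deque([(i, j)])
                while q:
                    r, c = q.popleft()
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < R and 0 <= nc < C and grid[nr][nc] == '1':
                            grid[nr][nc] = '0'   # mark before enqueueing
                            q.append((nr, nc))
    return count
```

Marking cells as visited at enqueue time (rather than at dequeue time) prevents the same cell from being added to the queue twice.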

Max Area of Island (Component Size Calculation)

Modified Template: This is an extension where instead of just counting components, you calculate a property (size, perimeter, etc.) for each component. You still use DFS/BFS to traverse each component fully, but you maintain a counter during the traversal. Differences from basic component count:

  • Initialize a counter when a new component is found. Increment it for each cell/node visited in that component’s traversal.
  • Track a global maximum or accumulate results as needed.

In grid problems, this often means counting the cells of each island.

Example Problem: Max Area of Island (LeetCode 695) – Find the size (number of cells) of the largest island in a grid of 1s and 0s.

Walkthrough: Using a similar grid example:
1 1 0  
1 0 0  
0 1 1

Island1 (top-left cluster) has area 3 (cells at (0,0),(0,1),(1,0)). Island2 (bottom-right cluster) has area 2 ((2,1),(2,2)). The answer would be 3. During DFS of each island, we count cells:

  1. Start at (0,0): area = 0. DFS marks and counts (0,0), (0,1), (1,0) -> area becomes 3. Update max = 3.
  2. Later, at (2,1): area = 0. DFS counts (2,1),(2,2) -> area = 2, max remains 3.

# Python Code Snippet (DFS area count)
def maxAreaOfIsland(grid):
    if not grid: return 0
    R, C = len(grid), len(grid[0])
    max_area = 0

    def dfs(r, c):
        if not (0 <= r < R and 0 <= c < C and grid[r][c] == 1):
            return 0
        grid[r][c] = 0  # Mark as visited
        area = 1
        area += dfs(r + 1, c)
        area += dfs(r - 1, c)
        area += dfs(r, c + 1)
        area += dfs(r, c - 1)
        return area

    for i in range(R):
        for j in range(C):
            if grid[i][j] == 1:
                current_area = dfs(i, j)
                max_area = max(max_area, current_area)
    return max_area

The only change from basic island counting is that dfs now returns a size, and we accumulate it. Variants of this pattern can appear as computing per-component metrics (e.g., size, perimeter, enclosing boundary, etc.).

Variations: Other related problems include “Closed Islands” (count islands that are completely surrounded by water, requiring a pre-processing step to ignore edge-connected lands) and “Connected Components in Directed Graph” (strongly connected components, which uses algorithms like Kosaraju or Tarjan). Those are more specialized, but the general idea of iterating over nodes and doing a contained DFS/BFS for each component remains consistent.

Shortest Path Variants

For shortest path problems, the approach depends on whether the graph is weighted or unweighted (or special cases like weights 0/1). BFS is the go-to for unweighted graphs (each edge has equal cost), while Dijkstra’s algorithm is standard for weighted graphs with non-negative weights. Key patterns include multi-source scenarios, bidirectional search, and optimizations for special weight structures.

Unweighted Shortest Path (Single-Source BFS)

Modified Template: For an unweighted graph (or grid), BFS from the start node naturally finds the shortest path to every reachable node. The template is a standard BFS, with the addition that we can keep a distance array or map to record distances (or use level count). Differences from plain BFS:

  • Maintain a distance for each node (initialize start node distance = 0).
  • When enqueuing a neighbor, set dist[neighbor] = dist[curr] + 1.
  • You can terminate early if you reach a target node and only need that shortest distance (early exit as discussed). Otherwise, BFS will eventually compute min distances to all reachable nodes.

Example Problem: Word Ladder (LeetCode 127) – Given a beginWord, endWord, and a dictionary of words, find the shortest transformation sequence from begin to end, changing one letter at a time and using only valid dictionary words. This can be modeled as an unweighted graph: each word is a node, and an edge connects words that differ by one letter. BFS from beginWord yields the shortest transformation length.

Walkthrough: begin = "hit", end = "cog", wordList = ["hot","dot","dog","lot","log","cog"]. Construct the implicit graph of connections (e.g., "hit" -> "hot" -> "dot" -> "dog" -> "cog" is one path). BFS exploration by levels:
  1. Level 0: {"hit"}
  2. Level 1: {"hot"} (hit -> hot)
  3. Level 2: {"dot","lot"} (hot -> dot,lot)
  4. Level 3: {"dog","log"} (dot -> dog, lot -> log)
  5. Level 4: {"cog"} (dog -> cog, or log -> cog) – found endWord.

The shortest path "hit→hot→dot→dog→cog" is 5 words, so the transformation length (number of edges) is 4. BFS ensures that when we first reach "cog", we are at the minimum number of steps.

Note: Implementing this efficiently requires generating neighbors (words differing by one letter) without checking all dictionary words each time. A common trick is precomputing wildcard patterns (e.g., "*it": {"hit"}, "*ot": {"hot","dot","lot"}, etc.) to find adjacent words in O(Alphabet) time. The BFS loop itself is standard.

# Python Code Snippet (BFS for word ladder)
from collections import deque

def wordLadderLength(beginWord, endWord, wordList):
    wordSet = set(wordList)
    if endWord not in wordSet:
        return 0

    q = deque([(beginWord, 1)]) # (word, length_of_sequence)
    visited = {beginWord}

    while q:
        current_word, length = q.popleft()
        if current_word == endWord:
            return length
        
        # Find neighbors (differ by one letter)
        for i in range(len(current_word)):
            original_char = current_word[i]
            for char_code in range(ord('a'), ord('z') + 1):
                char = chr(char_code)
                if char == original_char:
                    continue
                
                next_word = list(current_word)
                next_word[i] = char
                next_word_str = "".join(next_word)
                
                if next_word_str in wordSet and next_word_str not in visited:
                    visited.add(next_word_str)
                    q.append((next_word_str, length + 1))
    return 0 # endWord not reachable

The inner loops generate every one-letter variant of the current word and keep those that appear in the word set. The first time we dequeue endWord, we return the distance. This pattern of BFS for shortest path in unweighted scenarios applies to grids (minimum steps in a maze), to puzzles (like Open the Lock), and more.
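The wildcard-pattern preprocessing mentioned above could be sketched as follows (a sketch; wordLadderLengthWildcard and the patterns dictionary are illustrative names, assuming all words share the same length):

```python
from collections import defaultdict, deque

def wordLadderLengthWildcard(beginWord, endWord, wordList):
    wordSet = set(wordList)
    if endWord not in wordSet:
        return 0

    # Preprocess: map each wildcard pattern (e.g. "*ot") to the words matching it
    patterns = defaultdict(list)
    for word in wordSet | {beginWord}:
        for i in range(len(word)):
            patterns[word[:i] + '*' + word[i+1:]].append(word)

    q = deque([(beginWord, 1)])
    visited = {beginWord}
    while q:
        word, length = q.popleft()
        if word == endWord:
            return length
        # Neighbors are all words sharing a wildcard pattern with this word
        for i in range(len(word)):
            for nxt in patterns[word[:i] + '*' + word[i+1:]]:
                if nxt not in visited:
                    visited.add(nxt)
                    q.append((nxt, length + 1))
    return 0
```

Each word of length L belongs to exactly L pattern buckets, so neighbor lookup no longer scans the whole dictionary.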

Multi-Source BFS for Shortest Distance Map

Modified Template: Similar to the multi-source BFS discussed earlier, but here the goal is often to compute the minimum distance of every node from the nearest source (rather than a single target). The template:

  • Initialize queue with all source nodes (distance 0).
  • BFS outward, assigning distance = current_distance+1 to neighbors when first encountered.

This effectively fills a distance matrix or array where each cell/node gets the distance to the closest source.

Example Problem: 01 Matrix (LeetCode 542) – Given a binary matrix of 0s and 1s, find the distance of each cell to the nearest 0. Here every 0 in the grid is a source with distance 0, and we want the shortest distance to a 0 for each cell.

Walkthrough: Input matrix:
1 0 1  
0 1 1  
1 1 1

Treat all 0 cells as starting points. Initially, push coordinates of the two 0s (0,1) and (1,0) into the queue with distance 0. The BFS wave will compute distances:

Cells adjacent to a 0 get distance 1: (0,0) and (1,1) (each adjacent to both zeros), (0,2) (adjacent to the 0 at (0,1)), and (2,0) (adjacent to the 0 at (1,0)).

Cells two steps away get distance 2, and so on: (1,2) and (2,1) end up with distance 2, and (2,2) with distance 3.

The output distance matrix would be:

1 0 1  
0 1 2  
1 2 3

Key idea: By starting from all zeros, the first time we reach a cell, it is via the shortest path from the closest 0. This multi-source BFS is more efficient than running BFS from each source separately. It’s widely used in grid problems like this, as well as scenarios like finding nearest guard/exit in a maze from multiple start points, etc.
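A minimal sketch of this multi-source BFS for 01 Matrix (updateMatrix is an assumed name; -1 marks cells whose distance is not yet known):

```python
from collections import deque

def updateMatrix(mat):
    R, C = len(mat), len(mat[0])
    dist = [[-1] * C for _ in range(R)]
    q = deque()

    # All zeros are sources at distance 0
    for r in range(R):
        for c in range(C):
            if mat[r][c] == 0:
                dist[r][c] = 0
                q.append((r, c))

    # Standard BFS wave: first visit to a cell is via the nearest 0
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < R and 0 <= nc < C and dist[nr][nc] == -1:
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist
```

Running it on the walkthrough grid reproduces the distance matrix above.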

Bidirectional BFS (Two-End Search)

Modified Template: In cases where the state space is very large and you have a single source and a single target, bidirectional BFS can greatly reduce the search time. The idea is to run two BFS searches simultaneously – one forward from the source, and one backward from the target (or another forward from target in an undirected scenario) – and stop when the two meet in the middle. Differences:

  • Maintain two frontiers (two queues, two visited sets/distances for forward and backward searches).
  • Alternate expanding one step from one side, then one step from the other, or expand the smaller frontier to balance.
  • Check at each expansion if a node from one side is already visited by the other side – if so, a connection is found and the shortest path length is the sum of depths from both sides.

Example Problem: Word Ladder (revisited) – The standard BFS solution can be slow if the word list is large (because the branching factor is high). Bidirectional BFS from beginWord and endWord cuts the search depth roughly in half.

Walkthrough: Using the earlier example (hit -> cog), a bidirectional approach would:
  1. Start from "hit" and "cog" simultaneously. Initially: forward frontier = {"hit"}, backward frontier = {"cog"}.
  2. Expand forward: {"hit"} -> {"hot"}. Expand backward: {"cog"} -> {"dog","log"} (words that lead into "cog").
  3. Forward frontier = {"hot"}, backward = {"dog","log"}. No intersection yet.
  4. Expand forward: {"hot"} -> {"dot","lot"}. Expand backward: {"dog","log"} -> "dog" yields "dot", "log" yields "lot" (already-visited words like "cog" are skipped). Now backward frontier = {"dot","lot"}.
  5. We see that "dot" and "lot" are in forward’s visited or frontier (forward visited = {"hit","hot","dot","lot"} by now). The searches meet at "dot"/"lot". We can reconstruct that path length: ("hit"->"hot"->"dot") + ("dot"->"dog"->"cog") = 5 words, which matches the answer. We found the connection at half the depth.

Notes: Implementing bidirectional BFS requires careful bookkeeping (two visited dicts with levels). Typically you alternate one step at a time (or always expand the smaller frontier for efficiency). When a meeting is found, you calculate the total distance. The complexity can drop roughly from O(b^d) to O(b^(d/2) + b^(d/2)) for branching factor b and depth d, which is a significant improvement for large d.
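The bookkeeping described above could be sketched like this for Word Ladder (bidirectionalLadder is an illustrative name; frontiers are kept as sets so membership checks are O(1)):

```python
def bidirectionalLadder(beginWord, endWord, wordList):
    wordSet = set(wordList)
    if endWord not in wordSet:
        return 0

    front, back = {beginWord}, {endWord}
    visited = {beginWord, endWord}
    length = 1  # number of words on the path found so far

    while front and back:
        # Always expand the smaller frontier to keep the search balanced
        if len(front) > len(back):
            front, back = back, front
        next_front = set()
        for word in front:
            for i in range(len(word)):
                for c in 'abcdefghijklmnopqrstuvwxyz':
                    cand = word[:i] + c + word[i+1:]
                    if cand in back:
                        return length + 1   # the two searches meet
                    if cand in wordSet and cand not in visited:
                        visited.add(cand)
                        next_front.add(cand)
        front = next_front
        length += 1
    return 0
```

Because both sides share one visited set, each word is expanded at most once, and the meeting check happens before the visited check so a connection is never missed.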

Use Cases: Besides word ladder, two-end BFS is useful in puzzles (like sliding puzzle, where start and target configurations are known), or scenarios like finding the shortest transformation between two states.

Dijkstra’s Algorithm (Weighted Shortest Path)

Modified Template: For graphs with positive weights, BFS is not sufficient (edges have different costs). Dijkstra’s algorithm is the classic solution for single-source shortest paths in a weighted graph with non-negative weights:

  • Use a min-priority queue (min-heap) that stores (distance, node). Initialize with (0, start).
  • Pop the smallest distance node, if not visited (or if this distance is the current best).
  • Relax all its outgoing edges: if dist[u] + w(u,v) < dist[v], update dist[v] and push (dist[v], v) into the heap.
  • Continue until the queue is empty or until all reachable nodes have their shortest distance finalized.

Example Problem: Network Delay Time (LeetCode 743) – You are given times (edges with travel times) for a directed weighted graph and a starting node K. Return how long it takes for a signal to reach all nodes (i.e., the longest shortest-path from K to any node), or -1 if some node is unreachable. Dijkstra’s can find shortest paths from K to all nodes, then we take the max.

Walkthrough: Suppose we have 4 nodes, starting at 1, and times (edges): 1→2 (1), 1→3 (4), 2→3 (2), 2→4 (5), 3→4 (1). The graph weights:
  1. From 1: the direct distances known initially: dist(1)=0, dist(2)=1, dist(3)=4, dist(4)=∞.
  2. Pop node 1 (dist 0). Relax edges: update dist(2)=1, dist(3)=4. Push (1,2),(4,3).
  3. Pop node 2 (dist 1). Relax edges from 2: to 3, possible new dist = 1+2=3 which is better than current 4, so update dist(3)=3 (decrease key in heap); to 4, new dist = 1+5=6, update dist(4)=6. Push (3,3),(6,4).
  4. Pop node 3 (dist 3). Relax edges from 3: to 4, possible new dist = 3+1=4 which is better than current 6, update dist(4)=4. Push (4,4).
  5. Pop node 4 (dist 4). No outgoing edges (or none that improve anything). Done.

Distances: {1:0, 2:1, 3:3, 4:4}. The time for the signal to reach all nodes is max(dist) = 4.

# Python Code Snippet
import heapq

def networkDelayTime(times, n, k):
    adj = {i: [] for i in range(1, n + 1)}
    for u, v, w in times:
        adj[u].append((v, w))

    dist = {i: float('inf') for i in range(1, n + 1)}
    dist[k] = 0
    
    min_heap = [(0, k)]  # (distance, node)
    
    while min_heap:
        d, u = heapq.heappop(min_heap)
        
        # If we found a shorter path already, skip
        if d > dist[u]:
            continue
            
        for v, weight in adj[u]:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                heapq.heappush(min_heap, (dist[v], v))
                
    max_delay = 0
    for node_dist in dist.values():
        if node_dist == float('inf'):
            return -1 # Node not reachable
        max_delay = max(max_delay, node_dist)
        
    return max_delay

The snippet already takes the maximum of the computed distances (returning -1 if any node is unreachable). Dijkstra’s runs in O((E+V) log V) and is the standard for weighted graphs with non-negative edge weights.

Variants: If edges can have weight 0 or 1, a specialized deque-based BFS (often called 0-1 BFS) can be used to optimize (using a deque: push 0-weight neighbors to front, 1-weight to back). For graphs with possible negative weights (but no negative cycles), Bellman-Ford or SPFA algorithms are used (though these are less common in coding interviews unless specifically looking for understanding of negatives). For extremely large graphs or specific heuristic-driven searches, A* algorithm might come up, but typically BFS and Dijkstra cover the majority of shortest path interview scenarios.
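A minimal sketch of 0-1 BFS under the assumptions above (zeroOneBFS is an illustrative name; adj is an adjacency list of (neighbor, weight) pairs with weights in {0, 1}):

```python
from collections import deque

def zeroOneBFS(n, adj, src):
    INF = float('inf')
    dist = [INF] * n
    dist[src] = 0
    dq = deque([src])
    while dq:
        u = dq.popleft()
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                if w == 0:
                    dq.appendleft(v)   # 0-weight edge: same level, push to front
                else:
                    dq.append(v)       # 1-weight edge: next level, push to back
    return dist
```

The deque keeps nodes ordered by distance without a heap, giving O(V + E) instead of Dijkstra’s O((E+V) log V) for this special weight structure.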


© Graph Traversal Variants Explained. Content adapted for educational purposes.
