DSA Interview Course | Crack Data Structures & Algorithms

Introduction

Binary search is a highly efficient search algorithm that finds the position of a target value within a sorted array. It operates on the principle of "divide and conquer" by repeatedly dividing the search interval in half. This approach yields a logarithmic time complexity (O(log n)), making it significantly faster than a linear scan (O(n)) for large datasets.

Due to its efficiency and fundamental nature, binary search is a staple topic in coding interviews and a crucial technique for optimizing search operations, especially when dealing with sorted lists or identifying thresholds under monotonic conditions.

Where it Appears

Variations of binary search are common in interview questions. Examples include searching in a rotated sorted array, finding the first or last occurrence of a number, or locating a peak in a "mountain" array. These problems typically involve exploiting a sorted order or a monotonic relationship to efficiently narrow down the search space.

Key Trigger Cues – When to Consider Binary Search

Sorted or Monotonic Data: The input data is sorted, or the problem involves finding a boundary in a monotonic sequence (e.g., an increasing or decreasing function). If you can compare a midpoint to decide which half contains the answer, binary search is a strong candidate.
Large Input Size (Efficiency Matters): When the dataset is large, an O(n) linear scan is often too slow. Binary search offers an O(log n) solution, which is vastly more efficient. For instance, searching 1,000,000 elements takes ~20 steps with binary search versus 1,000,000 with linear search.
Looking for a Specific Value or Threshold: If the problem asks to find a specific target, boundary, or condition within an ordered range (e.g., "find the first index where condition X is true"), binary search is often the way to go.

Core Idea / Mental Model

At its heart, binary search leverages the sorted order of elements to eliminate half of the search space in each step. Instead of a linear, one-by-one scan, it jumps to the middle of the current range. This midpoint acts as a pivot to decide which half of the data to continue searching in. By repeatedly discarding half of the remaining possibilities, the algorithm rapidly converges on the target.

Real-world Analogy

Imagine looking for a word in an alphabetical dictionary. You wouldn’t read every page. Instead, you'd open the book near the middle. If your target word (e.g., "ostrich") comes after that page, you focus on the latter half; if before, the earlier half. By repeatedly halving the relevant section, you quickly find the word. This is precisely how binary search operates on a sorted list.

Thought Process

Maintain two pointers (or indices), often named low and high, to define the current search range. Compare the target with the element at the midpoint of this range:

If the mid element matches the target, you've found it.
If the mid element is less than the target, the target must be in the right half (since the array is sorted). Discard the left half by moving low to mid + 1.
If the mid element is greater than the target, the target must be in the left half. Discard the right half by moving high to mid - 1.

Repeat this on the shrinking range until the target is found or the range becomes empty (target not in array).

Base Template (Iterative Python)

Below is a canonical iterative implementation of binary search in Python. It uses left and right indices to track search bounds and narrows the range based on midpoint comparisons:

def binary_search(arr, target):
    left, right = 0, len(arr) - 1   # Initialize search bounds
    while left <= right:           # Continue while valid search space
        mid = (left + right) // 2  # Choose the midpoint
        if arr[mid] == target:
            return mid             # Found target at index mid
        elif arr[mid] < target:
            left = mid + 1         # Target is larger, search right half
        else:
            right = mid - 1        # Target is smaller, search left half
    return -1                      # Target not found

Dry Run Example: Suppose arr = [1, 3, 5, 7, 9] and target = 7.

Initial bounds: left = 0, right = 4.
Iteration 1: mid = (0+4)//2 = 2. arr[2] = 5. Since 5 < 7, target is in the right half. Update left = mid + 1 = 3.
Iteration 2: Now left = 3, right = 4. mid = (3+4)//2 = 3. arr[3] = 7. Target found at index 3. Algorithm returns 3.

Interactive Animation

Target:

Checking mid at index 4 (value: 50).

Time & Space Complexity

Time Complexity: O(log n)

In the worst and average cases. The algorithm halves the search range at each step, so comparisons grow logarithmically with input size. For 1 million elements, it's ~20 checks. Best case is O(1) if the target is the first midpoint.

Space Complexity: O(1)

For the iterative version, as it only uses a few variables. A recursive implementation would incur an O(log n) space cost due to the recursion call stack.

Common Pitfalls

While conceptually simple, binary search implementation can be tricky. Here are common pitfalls:

1. Unsorted Data: Applying binary search to unsorted or non-monotonic data will fail. It relies on sorted order to discard halves.

Tip: Always ensure data is sorted or the condition is monotonic. If not, binary search isn't suitable.

2. Off-by-One Errors (Boundary Issues): Incorrect loop boundaries (left < right vs left <= right) or mid-calculation updates can lead to missed targets or infinite loops.

Tip: Define your search interval carefully (e.g., inclusive left <= right). Test edge cases like empty/small arrays.

3. Midpoint Calculation Overflow: In languages like C++/Java, low + high can overflow for large numbers. Python handles large integers, but it's good practice to know the fix: mid = low + (high - low) // 2.

Tip: Use the safe midpoint formula, especially in languages with fixed-size integers.

4. Neglecting Edge Cases: Forgetting empty arrays, single-element arrays, or when the target is outside the array's range can cause errors.

Tip: Test these scenarios. Ensure a clear "not found" indicator (e.g., -1). Be mindful if first/last occurrences of duplicates are needed.

5. Excessive Recursion Depth: Recursive binary search can hit recursion limits on very large arrays in some languages due to stack frame accumulation.

Tip: Prefer iterative binary search unless recursion is specifically required. Ensure a base case to prevent infinite recursion.