Introduction
A Trie (pronounced "try") is a tree-based data structure specialized for storing and retrieving strings by their prefixes. Each node in a trie represents a character of a string, and paths from the root form words. Tries are typically applied in interview questions that involve efficient prefix lookups or dictionary-word storage.
They are often used to handle scenarios like autocomplete suggestions, spell-checking, or prefix-based searches, where many strings share common starting substrings. In such problems, using a trie can drastically improve query performance.
When to consider a Trie:
Look for keywords or cues such as “prefix”, “autocomplete”, “dictionary of words”, “starts with”, or “longest prefix matching”. Whenever a problem involves checking or storing a large set of strings and repeatedly querying them by prefixes, a trie is likely the ideal approach.
Core Idea / Mental Model
Think of a trie as a tree of letters that grows downward for each word. The root node is an empty starting point, and each edge to a child represents adding a character. As you spell out a word from the root, you traverse a path in the tree. Words with common prefixes share the initial part of the path, meaning the trie merges shared prefixes rather than storing them repeatedly.
A helpful way to picture a trie is like an autocomplete decision tree: each step down the tree narrows the possibilities. By the time you traverse to a node corresponding to a prefix, all words with that prefix reside in its subtree. This structure lets tries quickly answer questions like "what words start with X?".
Base Template (Python Implementation)
Below is a basic implementation of a Trie in Python. Each node contains a dictionary of children and a flag indicating if a word ends at that node.
class TrieNode:
def __init__(self):
self.children = {}
self.is_end_of_word = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word: str):
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char]
node.is_end_of_word = True
def search(self, word: str) -> bool:
node = self.root
for char in word:
if char not in node.children:
return False
node = node.children[char]
return node.is_end_of_word
Code Explanation:
insert(word): Traverses characters, creating child nodes if needed. Marks the final node as is_end_of_word = True.
search(word): Traverses characters. If a path is missing, word not found. If path exists, returns the is_end_of_word status of the final node.
trie = Trie()
trie.insert("app")
trie.insert("apple")
trie.search("app") # True
trie.search("ap") # False (prefix, not a full word)
trie.search("apple") # True
trie.search("apply") # False
Interactive Animation (Trie Construction & Search)
The animation below demonstrates how a Trie is constructed by inserting words and how search operations traverse the structure. Example words: "ape", "apple", "ban".
- Start: Empty trie (root node).
- Insert "ape": Root -> 'a' -> 'p' -> 'e' (mark 'e' as end-of-word).
- Insert "apple": Root -> 'a' -> 'p' (shared). New path from 'p': -> 'p' -> 'l' -> 'e' (mark final 'e' as end-of-word).
- Insert "ban": Root -> 'b' (new branch) -> 'a' -> 'n' (mark 'n' as end-of-word).
Time & Space Complexity
Let 𝑚 be the length of the word/prefix and 𝑁 be the number of words, 𝐿 be the average length of words.
- Insertion: O(𝑚) - proportional to word length.
- Search (exact match): O(𝑚)
- Prefix Check (startsWith): O(𝑚) - proportional to prefix length.
Space Complexity: O(𝑁 · 𝐿) in the worst case (no shared prefixes). Each node can have up to 𝐾 children pointers (𝐾 = alphabet size). Tries save space with overlapping prefixes. Using dictionary for children is more space-efficient for sparse children than fixed-size arrays.
Common Pitfalls
- Not initializing child links: Ensure
node.children = {}(or equivalent) in constructor. - Forgetting to mark word endings: Crucial to set
is_end_of_word = True. - Mishandling prefix queries: Prefix check shouldn't rely on
is_end_of_word. - Unnecessary node creation: Only create new nodes if path doesn't exist.
- Ignoring space-time tradeoff: Tries are fast but can be memory-intensive. Mention this.
- Case sensitivity and input normalization: Clarify input constraints (e.g., lowercase only, or normalize).
- Handling duplicates or counts: Basic tries don't count. Add a counter if frequency is needed.