Data Structures & Complexities
As engineers, we spend a lot of time reasoning about which data structure to use. The choice isn't arbitrary—it directly impacts how efficiently our code runs. This post is a cheat sheet I keep coming back to when working through algorithmic problems or designing systems.
Arrays
Arrays are the workhorses of programming. They're ordered, support random access by index, and allow duplicates. If you need to grab an element at a specific position, arrays have you covered.
Time Complexities
| Operation | Complexity |
|---|---|
| Access by index | O(1) |
| Set by index | O(1) |
| Insert at end | O(1) amortized |
| Insert at middle | O(n) |
| Delete from end | O(1) |
| Delete from middle | O(n) |
| Search (unsorted) | O(n) |
| Sort | O(n log n) |
When to Use Arrays
Arrays shine when you need fast random access. If you're constantly reaching for elements by their position, an array will almost always outperform linked structures.
Questions to Ask
Before settling on an array approach, consider these alternatives:
- Does it need sorting? Sorting adds O(n log n) overhead. Sometimes the sort itself is the real solution.
- Can multiple pointers help? The two-pointer technique unlocks efficient solutions for many problems:
  - Fast and slow pointers, when you need to find a meeting point or detect cycles
  - Both starting at zero, for problems involving subarrays
  - Opposite ends, when searching for pairs that sum to a target
  - More than two pointers, when merging k sorted arrays
- Would a sliding window work? Whether you need a fixed-size window or one that expands and contracts dynamically, this pattern solves a surprising number of subarray problems.
- Can a hash set help? Converting an array to a set makes lookups O(1) on average and guarantees uniqueness.
- Can a hash map help? The right key-value mapping can turn an O(n²) brute force into O(n):
  - What becomes the key? An index? A value? Some computed property?
  - What becomes the value? A count? An index? A list of indices?
- Would a stack help? Perfect for LIFO problems: matching brackets, undo operations, or DFS traversal.
- Would a queue help? FIFO semantics for BFS or any order-based processing.
- Would a tree help? When the relationships between elements matter, a tree structure might be the right abstraction.
- Does divide and conquer apply? Break the problem into subproblems, solve each, and combine the results.
- Does backtracking apply? Explore a path, hit a dead end, then back up and try another direction.
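As an illustration of the opposite-ends pattern above, here is a minimal sketch of a pair-sum search over a sorted array (the function name `pair_with_sum` is my own, not from any library):

```python
def pair_with_sum(nums, target):
    """Find indices of two values in a sorted list that sum to target."""
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        s = nums[lo] + nums[hi]
        if s == target:
            return lo, hi
        if s < target:
            lo += 1   # sum too small: advance the left pointer
        else:
            hi -= 1   # sum too large: retreat the right pointer
    return None       # no such pair exists
```

Because each step moves one pointer inward, the scan is a single O(n) pass instead of the O(n²) all-pairs check.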
Sets (Hash Sets)
Sets are unordered collections with no duplicates. They're optimized for one thing: checking whether something exists.
Time Complexities
| Operation | Complexity |
|---|---|
| Membership check | O(1) average |
| Add element | O(1) average |
| Union | O(m + n) |
| Intersection | O(min(m, n)) |
When to Use Sets
Reach for a set when you're frequently asking "is this in the collection?"—especially when you don't care about order or need to eliminate duplicates.
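A minimal sketch of the classic use case, a duplicate check that trades O(n) extra space for O(1) average-time membership tests:

```python
def has_duplicates(items):
    """Return True if any value appears more than once. O(n) overall."""
    seen = set()
    for item in items:
        if item in seen:      # O(1) average membership check
            return True
        seen.add(item)        # O(1) average insert
    return False
```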
Hash Maps
Hash maps store key-value pairs and give you O(1) access to values by their keys.
Time Complexities
| Operation | Complexity |
|---|---|
| Get by key | O(1) average |
| Set by key | O(1) average |
| Delete by key | O(1) average |
| Iteration | O(n) |
| Copy | O(n) |
When to Use Hash Maps
Hash maps are the right choice when you need to look things up by something other than position. The classic example: counting frequencies, tracking seen values, or memoization.
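Here is a small frequency-counting sketch using Python's `collections.Counter` (a dict subclass); `first_unique` is an illustrative name of my own:

```python
from collections import Counter

def first_unique(s):
    """Return the first character that appears exactly once, else None."""
    counts = Counter(s)      # first O(n) pass: character -> frequency
    for ch in s:             # second O(n) pass, in original order
        if counts[ch] == 1:
            return ch
    return None
```

Two linear passes with O(1) average lookups replace the O(n²) approach of rescanning the string for every character.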
Linked Lists
Linked lists trade random access for efficient insertions and deletions. Each node points to the next (or previous) node in the sequence.
Single Linked List
```python
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None
```
| Operation | Complexity |
|---|---|
| Access by index | O(n) |
| Insert at head | O(1) |
| Insert after node | O(1) |
| Insert at tail | O(n) (O(1) if you keep a tail pointer) |
| Delete node | O(n) (need to find it first) |
| Delete head | O(1) |
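The O(1) head insert from the table can be sketched with the Node class above (the helper names here are mine):

```python
class Node:  # the singly linked node defined earlier
    def __init__(self, val):
        self.val = val
        self.next = None

def push_front(head, val):
    """O(1) insert at the head; returns the new head."""
    node = Node(val)
    node.next = head
    return node

def to_list(head):
    """Walk the list into a Python list for inspection. O(n)."""
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out

# Build 3 -> 2 -> 1 from an empty list; each push is O(1)
head = None
for v in [1, 2, 3]:
    head = push_front(head, v)
```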
Double Linked List
Adds a prev pointer, making reverse traversal and deletion from the tail O(1).
| Operation | Complexity |
|---|---|
| Access by index | O(n) |
| Insert at head | O(1) |
| Insert at tail | O(1) |
| Delete node | O(1) (given a reference to it) |
| Delete tail | O(1) |
When to Use Linked Lists
Frequent insertions and deletions at known positions. The trade-off: you lose O(1) random access. For most modern use cases, arrays with dynamic resizing (like Python lists) perform just as well or better due to cache locality.
Binary Trees
Each node has at most two children—left and right.
Time Complexities (worst case)
| Operation | Complexity |
|---|---|
| Search | O(n) |
| Insert | O(n) |
| Delete | O(n) |
The worst case happens with degenerate trees that look like linked lists.
Binary Search Trees (BST)
BSTs maintain a sorted order: left subtree values are smaller, right subtree values are larger.
| Operation | Complexity |
|---|---|
| Search | O(h) |
| Insert | O(h) |
| Delete | O(h) |
Here, h is the tree height. A balanced BST has h = log(n). An unbalanced one degrades to O(n).
When to Use BSTs
You need sorted data with efficient search, insert, and delete. The sorted order is the key benefit—range queries become natural.
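A minimal, unbalanced BST sketch showing the O(h) insert and search from the table (class and function names are illustrative):

```python
class BSTNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def insert(root, val):
    """O(h) insert; returns the (possibly new) root."""
    if root is None:
        return BSTNode(val)
    if val < root.val:
        root.left = insert(root.left, val)
    else:
        root.right = insert(root.right, val)
    return root

def contains(root, val):
    """O(h) search: go left for smaller values, right for larger."""
    while root:
        if val == root.val:
            return True
        root = root.left if val < root.val else root.right
    return False
```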
Balanced BST (AVL, Red-Black)
Self-balancing trees automatically maintain near-perfect balance after every operation.
| Operation | Complexity |
|---|---|
| Search | O(log n) |
| Insert | O(log n) |
| Delete | O(log n) |
Why This Matters
When hash collisions pile up, hash table lookup degrades to O(n). Balanced BSTs guarantee O(log n) regardless of input distribution, which is why some hash table implementations fall back on them for collision handling: Java's HashMap, for example, converts long collision chains into red-black trees.
Tries (Prefix Trees)
Tries excel at string operations. Each node represents a prefix, and paths down the tree spell out words.
| Operation | Complexity |
|---|---|
| Insert word | O(m) |
| Search word | O(m) |
| Starts with prefix | O(m) |
Here, m is the length of the word or prefix.
When to Use Tries
Autocomplete, spell checking, IP routing, any prefix-based searching. The shared prefixes mean memory efficiency for large dictionaries.
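A dict-of-dicts trie is perhaps the simplest Python sketch; the `"$"` end-of-word marker is just a convention of this example:

```python
class Trie:
    def __init__(self):
        self.root = {}              # each node maps a character to a child node

    def insert(self, word):
        """O(m) insert for a word of length m."""
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True            # mark that a word ends here

    def search(self, word):
        """O(m): the full word must exist AND end at a word marker."""
        node = self._walk(word)
        return node is not None and "$" in node

    def starts_with(self, prefix):
        """O(m): any path matching the prefix counts."""
        return self._walk(prefix) is not None

    def _walk(self, s):
        node = self.root
        for ch in s:
            if ch not in node:
                return None
            node = node[ch]
        return node
```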
Stacks
LIFO—last in, first out. Push adds to the top, pop removes from the top.
| Operation | Complexity |
|---|---|
| Push | O(1) |
| Pop | O(1) |
| Peek | O(1) |
When to Use Stacks
DFS traversal, expression evaluation, undo mechanisms, matching parentheses.
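The bracket-matching use case fits in a few lines, using a plain Python list as the stack:

```python
def balanced(s):
    """Check that every bracket closes the most recently opened one."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)                           # push: O(1)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:  # pop: O(1)
                return False
    return not stack   # leftover openers mean unbalanced
```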
Queues
FIFO—first in, first out. Enqueue adds to the back, dequeue removes from the front.
| Operation | Complexity |
|---|---|
| Enqueue | O(1) |
| Dequeue | O(1) |
| Peek | O(1) |
When to Use Queues
BFS traversal, task scheduling, any scenario where order matters and you process items sequentially.
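Here is a BFS sketch using `collections.deque`, whose `popleft` provides the O(1) front removal a plain list lacks (the adjacency-dict format is an assumption of this example):

```python
from collections import deque

def bfs_order(graph, start):
    """Visit nodes level by level from start; graph is {node: [neighbors]}."""
    queue = deque([start])
    seen = {start}
    order = []
    while queue:
        node = queue.popleft()              # dequeue from the front: O(1)
        order.append(node)
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)           # enqueue at the back: O(1)
    return order
```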
Heaps
Heaps maintain the heap property: every parent is at least as large (max-heap) or at least as small (min-heap) as its children. The root is always the maximum (or minimum) element; peeking at it is O(1), while push and pop are O(log n).
Priority queues are typically implemented as heaps.
When to Use Heaps
Finding the kth largest/smallest element, implementing priority queues, merge k sorted arrays.
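For the kth-largest case, a common pattern is to keep a size-k min-heap with Python's `heapq` (a min-heap by default); a sketch:

```python
import heapq

def kth_largest(nums, k):
    """Keep a size-k min-heap; its root is the kth largest seen so far."""
    heap = []
    for n in nums:
        heapq.heappush(heap, n)     # O(log k) push
        if len(heap) > k:
            heapq.heappop(heap)     # evict the smallest: O(log k)
    return heap[0]                  # peek: O(1)
```

This runs in O(n log k) with O(k) space, versus O(n log n) for a full sort.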
Putting It Together
Choosing a data structure isn't just about the operations—it's about the access patterns in your specific problem. A few heuristics:
- Random access by position? Array.
- Fast membership checking? Set or Hash Map.
- Maintaining sorted order? BST or balanced BST.
- Prefix matching? Trie.
- Frequent insertions/deletions at ends? Linked list or deque.
- Always need the min/max? Heap.
The best engineers I know don't memorize these tables. They understand why each structure has the complexity it does, which makes the choice obvious once you understand the problem.
Further Reading
- Python Time Complexity —CPython's official complexity reference
- Big O Notation Explained
- Data Structures Visualized
- LeetCode's Linked List Deep Dive
