Data Structures & Complexities

As engineers, we spend a lot of time reasoning about which data structure to use. The choice isn't arbitrary—it directly impacts how efficiently our code runs. This post is a cheat sheet I keep coming back to when working through algorithmic problems or designing systems.


Arrays

Arrays are the workhorses of programming. They're ordered, support random access by index, and allow duplicates. If you need to grab an element at a specific position, arrays have you covered.

Time Complexities

Operation           Complexity
Access by index     O(1)
Set by index        O(1)
Insert at end       O(1) amortized
Insert at middle    O(n)
Delete from end     O(1)
Delete from middle  O(n)
Search (unsorted)   O(n)
Sort                O(n log n)

When to Use Arrays

Arrays shine when you need fast random access. If you're constantly reaching for elements by their position, an array will almost always outperform linked structures.

Questions to Ask

Before settling on an array approach, consider these alternatives:

  • Does it need sorting? Sorting adds O(n log n) overhead. Sometimes the sort itself is the real solution.

  • Can multiple pointers help? The two-pointer technique unlocks efficient solutions for many problems:

    • Fast and slow pointers—when you need to find a meeting point or detect cycles
    • Both starting at zero—for problems involving subarrays
    • Opposite ends—when searching for pairs that sum to a target
    • Multiple pointers, one per array, when merging k sorted arrays
  • Would a sliding window work? Whether you need a fixed-size window or one that expands and contracts dynamically, this pattern solves a surprising number of subarray problems.

  • Can a hash set help? Converting an array to a set gives O(1) lookups and guarantees uniqueness.

  • Can a hash map help? The right key-value mapping can turn an O(n²) brute force into O(n):

    • What becomes the key? An index? A value? Some computed property?
    • What becomes the value? Count? Index? A list of indices?
  • Would a stack help? Perfect for LIFO problems—matching brackets, undo operations, or DFS traversal.

  • Would a queue help? FIFO semantics for BFS or any order-based processing.

  • Would a tree help? When the relationship between elements matters, a tree structure might be the right abstraction.

  • Does divide and conquer apply? Breaking the problem into smaller subproblems, solving each, and combining the results.

  • Does backtracking apply? Explore paths, hit a dead end, then try another direction.
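
Of the patterns above, two pointers from opposite ends is the one I reach for most. A minimal sketch on a sorted array (the function name pair_with_sum is mine, for illustration):

```python
def pair_with_sum(nums, target):
    """Find two values in a sorted array that sum to target, in O(n)."""
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        s = nums[lo] + nums[hi]
        if s == target:
            return (nums[lo], nums[hi])
        if s < target:
            lo += 1   # sum too small: advance the left pointer
        else:
            hi -= 1   # sum too large: retreat the right pointer
    return None
```

Because the array is sorted, each step safely discards one element, so the whole scan is a single pass.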


Sets (Hash Sets)

Sets are unordered collections with no duplicates. They're optimized for one thing: checking whether something exists.

Time Complexities

Operation         Complexity
Membership check  O(1) average
Add element       O(1) average
Union             O(m + n)
Intersection      O(min(m, n))

When to Use Sets

Reach for a set when you're frequently asking "is this in the collection?"—especially when you don't care about order or need to eliminate duplicates.
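
As a quick illustration, here's duplicate detection, exactly the "is this in the collection?" question sets are built for (has_duplicates is an illustrative name):

```python
def has_duplicates(nums):
    """Return True if any value appears more than once. O(n) overall."""
    seen = set()
    for x in nums:
        if x in seen:   # O(1) average membership check
            return True
        seen.add(x)
    return False
```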


Hash Maps

Hash maps store key-value pairs and give you O(1) access to values by their keys.

Time Complexities

Operation      Complexity
Get by key     O(1) average
Set by key     O(1) average
Delete by key  O(1) average
Iteration      O(n)
Copy           O(n)

When to Use Hash Maps

Hash maps are the right choice when you need to look things up by something other than position. The classic example: counting frequencies, tracking seen values, or memoization.
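
The classic brute-force-to-O(n) upgrade is two-sum: the key is a value we've already seen, the value is its index. A sketch (names are illustrative):

```python
def two_sum(nums, target):
    """Return indices of two values summing to target: O(n) instead of
    the O(n^2) nested-loop brute force."""
    seen = {}  # value -> index where it was seen
    for i, x in enumerate(nums):
        if target - x in seen:
            return (seen[target - x], i)
        seen[x] = i
    return None
```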


Linked Lists

Linked lists trade random access for efficient insertions and deletions. Each node points to the next (or previous) node in the sequence.

Singly Linked List

class Node:
    def __init__(self, val):
        self.val = val
        self.next = None

Operation          Complexity
Access by index    O(n)
Insert at head     O(1)
Insert after node  O(1)
Insert at tail     O(n) (O(1) with a tail pointer)
Delete node        O(n) (need to find it first)
Delete head        O(1)

Doubly Linked List

Adds a prev pointer, making reverse traversal and deletion from the tail O(1).

Operation        Complexity
Access by index  O(n)
Insert at head   O(1)
Insert at tail   O(1)
Delete node      O(1) (given a reference to it)
Delete tail      O(1)
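
That O(1) delete is worth seeing in code. A minimal sketch, assuming you already hold a direct reference to the node (DNode and unlink are illustrative names):

```python
class DNode:
    """A doubly linked list node with both prev and next pointers."""
    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None

def unlink(node):
    """Remove node from its list in O(1): rewire neighbors around it."""
    if node.prev:
        node.prev.next = node.next
    if node.next:
        node.next.prev = node.prev
    node.prev = node.next = None
```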

When to Use Linked Lists

Frequent insertions and deletions at known positions. The trade-off: you lose O(1) random access. For most modern use cases, arrays with dynamic resizing (like Python lists) perform just as well or better due to cache locality.


Binary Trees

Each node has at most two children—left and right.

Time Complexities (worst case)

Operation  Complexity
Search     O(n)
Insert     O(n)
Delete     O(n)

The worst case happens with degenerate trees that look like linked lists.


Binary Search Trees (BST)

BSTs maintain a sorted order: left subtree values are smaller, right subtree values are larger.

Operation  Complexity
Search     O(h)
Insert     O(h)
Delete     O(h)

Here, h is the tree height. A balanced BST has h = log(n). An unbalanced one degrades to O(n).

When to Use BSTs

You need sorted data with efficient search, insert, and delete. The sorted order is the key benefit—range queries become natural.
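
A minimal BST sketch, insert and search only (delete is fiddlier); the class and function names are mine, for illustration:

```python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def bst_insert(root, val):
    """Insert val, walking left for smaller and right for larger: O(h)."""
    if root is None:
        return TreeNode(val)
    if val < root.val:
        root.left = bst_insert(root.left, val)
    else:
        root.right = bst_insert(root.right, val)
    return root

def bst_search(root, val):
    """Follow the ordering property down one path: O(h)."""
    while root and root.val != val:
        root = root.left if val < root.val else root.right
    return root is not None
```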


Balanced BST (AVL, Red-Black)

Self-balancing trees automatically maintain near-perfect balance after every operation.

Operation  Complexity
Search     O(log n)
Insert     O(log n)
Delete     O(log n)

Why This Matters

When hash collisions pile up, hash table lookup can degrade to O(n). Balanced BSTs guarantee O(log n) regardless of input distribution, which is why some hash table implementations (Java's HashMap, for one) fall back to a balanced tree for heavily collided buckets.


Tries (Prefix Trees)

Tries excel at string operations. Each node represents a prefix, and paths down the tree spell out words.

Operation           Complexity
Insert word         O(m)
Search word         O(m)
Starts with prefix  O(m)

Here, m is the length of the word or prefix.

When to Use Tries

Autocomplete, spell checking, IP routing, any prefix-based searching. The shared prefixes mean memory efficiency for large dictionaries.
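
A bare-bones trie covering the three operations in the table (class and method names are mine, loosely following the common insert/search/starts_with convention):

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.is_word = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):          # O(m)
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):          # O(m): must end on a word marker
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):   # O(m): any reachable node counts
        return self._walk(prefix) is not None

    def _walk(self, s):
        node = self.root
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node
```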


Stacks

LIFO—last in, first out. Push adds to the top, pop removes from the top.

Operation  Complexity
Push       O(1)
Pop        O(1)
Peek       O(1)

When to Use Stacks

DFS traversal, expression evaluation, undo mechanisms, matching parentheses.
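
Matching parentheses is the canonical stack exercise. A sketch (balanced is an illustrative name):

```python
def balanced(s):
    """Check bracket matching with a stack: O(n) time, O(n) space."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)                 # push every opener
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False                 # closer with no matching opener
    return not stack                         # leftovers mean unclosed openers
```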


Queues

FIFO—first in, first out. Enqueue adds to the back, dequeue removes from the front.

Operation  Complexity
Enqueue    O(1)
Dequeue    O(1)
Peek       O(1)

When to Use Queues

BFS traversal, task scheduling, any scenario where order matters and you process items sequentially.
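
Here's BFS with Python's collections.deque, which gives O(1) pops from the front (bfs_order and the adjacency-dict shape are illustrative):

```python
from collections import deque

def bfs_order(graph, start):
    """Visit nodes level by level from start; graph is {node: [neighbors]}."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()   # O(1) dequeue from the front
        order.append(node)
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return order
```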


Heaps

Heaps maintain the heap property: every parent is at least as large as its children (max-heap) or at most as large (min-heap). The root is always the maximum (or minimum) element. Insert and extract both run in O(log n); peeking at the root is O(1).

Priority queues are typically implemented as heaps.

When to Use Heaps

Finding the kth largest/smallest element, implementing priority queues, merging k sorted arrays.
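
For kth largest, the trick is a min-heap capped at size k. A sketch using the standard-library heapq (kth_largest is an illustrative name):

```python
import heapq

def kth_largest(nums, k):
    """Keep a min-heap of the k largest values seen so far: O(n log k)."""
    heap = []
    for x in nums:
        heapq.heappush(heap, x)
        if len(heap) > k:
            heapq.heappop(heap)   # evict the smallest of the k+1
    return heap[0]                # root = smallest of the k largest
```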


Putting It Together

Choosing a data structure isn't just about the operations—it's about the access patterns in your specific problem. A few heuristics:

  1. Random access by position? Array.
  2. Fast membership checking? Set or Hash Map.
  3. Maintaining sorted order? BST or balanced BST.
  4. Prefix matching? Trie.
  5. Frequent insertions/deletions at ends? Linked list or deque.
  6. Always need the min/max? Heap.

The best engineers I know don't memorize these tables. They understand why each structure has the complexity it does, which makes the choice obvious once you understand the problem.


Further Reading