Data Stores Under the Hood: MySQL

Most of us use MySQL daily without thinking too hard about what's happening under the hood. We've got a table, we write some SQL, we get results. But when you're building systems that need to scale—or when something goes wrong at 2 AM—understanding the internals makes a huge difference.

This is a look at MySQL's architecture, focusing on what matters for application developers.


The Engine

MySQL has pluggable storage engines. InnoDB is the default (and the one you should almost always use). It replaced MyISAM around MySQL 5.5 and brings ACID compliance, row-level locking, and crash recovery.

Other engines exist—Memory, MyISAM, Archive—but they're niche use cases. Stick with InnoDB unless you have a specific reason not to.


Storage Structure

InnoDB stores data in B+ trees. If you squint, it looks like this:

        [50]
       /    \
    [20,30] [70,90]
    / | \   / | \
  10 25 40 60 80 95
   ↓  ↓  ↓  ↓  ↓  ↓
 [data] ... sequential linked list of leaf nodes ...

Key characteristics:

  • Non-leaf nodes hold only keys (indexes), acting as a hierarchical navigation structure
  • Leaf nodes store the actual data, plus a pointer to the next leaf (forming a linked list)
  • This linked list makes range queries efficient—you can scan a sequence of pages without bouncing around the tree
  • Primary key lookups are fast: O(log n) depth with a very high fan-out, typically 3-4 levels even for millions of rows
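
The leaf-level linked list is what makes range scans cheap. As a sketch (the users table and id values are illustrative), both queries below start with one root-to-leaf traversal, but the second then walks sequentially along the leaf chain instead of re-descending the tree:

```sql
-- Point lookup: one root-to-leaf traversal
SELECT * FROM users WHERE id = 42;

-- Range scan: one traversal to locate id = 100, then a sequential
-- walk along the linked leaf pages until id exceeds 200
SELECT * FROM users WHERE id BETWEEN 100 AND 200;
```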

Row vs Column Orientation

MySQL is row-oriented. Each row contains all its columns together on disk. This makes reads of complete rows fast (good for OLTP) but analytical queries scanning millions of rows less efficient.

For analytical workloads, consider columnar stores like ClickHouse or MySQL's own HeatWave.


Indices

Indices are how MySQL finds data fast. Without them, every query would be a full table scan.

Clustered Index

The primary key becomes the clustered index. Data is physically stored in the order of this index. This means:

  • Looking up by primary key is a single B+ tree traversal
  • Inserts with a random primary key (e.g. a random UUID) land at arbitrary leaf pages rather than appending at the end, causing page splits
  • Every InnoDB table has exactly one clustered index

CREATE TABLE users (
  id BIGINT PRIMARY KEY,  -- This becomes the clustered index
  email VARCHAR(255),
  created_at TIMESTAMP
);

Secondary Indices

Any additional indexes you create are secondary indices. They point back to the primary key, not the row directly.

CREATE INDEX idx_email ON users(email);

This means looking up by email requires two B+ tree traversals: one to find the primary key, one to fetch the row. For high-frequency queries, this matters.

Covering Indices

A covering index includes all columns needed by a query, so MySQL never touches the actual table.

-- If this query runs frequently:
SELECT email, created_at FROM users WHERE email = 'user@example.com';

-- This index would be a covering index:
CREATE INDEX idx_email_covering ON users(email, created_at);

ACID Transactions

InnoDB supports full ACID transactions:

Property    | What it means                            | InnoDB behavior
------------|------------------------------------------|-----------------------------------------
Atomicity   | Transactions are all-or-nothing          | Undo logs roll back partial changes
Consistency | DB moves from valid state to valid state | Foreign keys, constraints enforced
Isolation   | Concurrent transactions don't interfere  | MVCC with configurable isolation levels
Durability  | Committed data survives crashes          | Write-ahead logging (redo logs)
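
A minimal transaction showing atomicity in action (the accounts table and amounts are illustrative):

```sql
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- If anything fails before this point, ROLLBACK undoes both updates
-- via the undo log; COMMIT makes both durable via the redo log.
COMMIT;
```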

Isolation Levels

MySQL's default is REPEATABLE READ, but you can configure per-session:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

Level                      | Dirty reads | Non-repeatable reads | Phantom reads
---------------------------|-------------|----------------------|--------------
Read Uncommitted           | Possible    | Possible             | Possible
Read Committed             | Prevented   | Possible             | Possible
Repeatable Read (default)  | Prevented   | Prevented            | Possible*
Serializable               | Prevented   | Prevented            | Prevented

*InnoDB's gap locking makes phantom reads rare even at REPEATABLE READ.


MVCC (Multi-Version Concurrency Control)

MVCC is what makes InnoDB's concurrent reads possible. Instead of locking rows for readers, it keeps older versions in an undo log.

When you start a transaction, you see a snapshot of the data at that point—even if other transactions are modifying rows concurrently. Writers don't block readers.

The magic happens in two hidden columns:

  • DB_TRX_ID — transaction ID that last modified the row
  • DB_ROLL_PTR — pointer to the undo log entry

Old versions get cleaned up by the purge thread when they're no longer needed for active transactions.
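
A sketch of what a reader sees under the default REPEATABLE READ, assuming two concurrent sessions against an illustrative accounts table:

```sql
-- Session A
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;  -- suppose this returns 500

-- Session B, meanwhile
UPDATE accounts SET balance = 0 WHERE id = 1;  -- autocommits

-- Session A, same transaction
SELECT balance FROM accounts WHERE id = 1;  -- still returns 500:
-- InnoDB follows DB_ROLL_PTR into the undo log and reconstructs
-- the row version visible to this transaction's snapshot
COMMIT;
```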


Replication

MySQL replicates using a leader-follower model. The leader writes to a binary log; followers read from it and apply the changes.

Statement-Based vs Row-Based

Two replication formats exist:

Format          | Pros              | Cons
----------------|-------------------|-------------------------------------------------------
SBR (statement) | Smaller log size  | Non-deterministic statements (NOW(), RAND()) can desync
RBR (row)       | Always consistent | Larger logs, harder to audit

Most production setups use RBR or MIXED (MySQL picks per statement).
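
Checking or changing the format looks like this (changing it globally requires privileges and only affects new sessions):

```sql
SHOW VARIABLES LIKE 'binlog_format';
SET GLOBAL binlog_format = 'ROW';  -- or 'STATEMENT', 'MIXED'
```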

Replication Methods

Position-based (traditional):

Leader: writes to mysql-bin.000001
Follower: "start from position 1234 in that file"

GTID-based (modern, recommended):

Leader: assigns a globally unique ID (GTID) to each transaction
Follower: "replay transaction GTID-1234"

GTIDs make replication easier to manage—no need to track file positions, failover is cleaner.
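
Pointing a follower at a leader with GTID auto-positioning looks roughly like this (classic syntax shown; MySQL 8.0.22+ prefers CHANGE REPLICATION SOURCE TO / START REPLICA, and the hostname here is a placeholder):

```sql
-- On the follower: with GTIDs, no log file name or position is needed
CHANGE MASTER TO
  MASTER_HOST = 'leader.example.com',
  MASTER_AUTO_POSITION = 1;
START SLAVE;
```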

Semi-Synchronous Replication

Default MySQL replication is asynchronous: the leader commits and doesn't wait for followers.

Semi-synchronous adds a wait:

-- On the leader (requires the semisync plugin to be installed):
-- wait until at least one follower acknowledges before committing
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;

This trades some performance for stronger durability guarantees.

Multi-Leader

MySQL doesn't natively support multi-leader well. Group Replication is the built-in option (available since MySQL 5.7.17); otherwise there are external tools like Tungsten Replicator.

For most cases: stick with single leader. Multi-leader conflict resolution is harder than it looks.


Failure Recovery

When MySQL crashes, recovery happens in stages:

  1. Redo log replay — Committed transactions get reapplied
  2. Undo log rollback — Incomplete transactions get rolled back
  3. Purge — Old row versions get cleaned up

This is why you set innodb_log_file_size appropriately and why SSDs help—redo log I/O is sequential and critical to recovery speed.
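
Checking the relevant settings (note that innodb_redo_log_capacity replaces innodb_log_file_size in MySQL 8.0.30+):

```sql
SHOW VARIABLES LIKE 'innodb_log_file_size';
SHOW VARIABLES LIKE 'innodb_redo_log_capacity';  -- MySQL 8.0.30+
```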


Schema Design Considerations

Auto-increment vs UUID Primary Keys

-- Auto-increment (recommended for most cases)
id BIGINT AUTO_INCREMENT PRIMARY KEY

-- UUID (watch out for random inserts)
id CHAR(36) PRIMARY KEY  -- Causes B+ tree page splits
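
If you must use UUIDs, MySQL 8.0's UUID_TO_BIN with the swap flag makes them roughly time-ordered and halves storage versus CHAR(36). A sketch (the events table is illustrative; expression defaults require MySQL 8.0.13+):

```sql
-- BINARY(16) with swapped time bits keeps inserts mostly append-order
CREATE TABLE events (
  id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID(), 1)),
  payload JSON
);
```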

Foreign Keys

InnoDB enforces foreign keys. Love them or hate them, they ensure referential integrity. Some teams disable them for performance (PostgreSQL has similar debates).

Character Sets

-- Always use utf8mb4, not utf8
-- utf8 in MySQL is a 3-byte subset; utf8mb4 is true 4-byte UTF-8
CREATE TABLE t (...) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

What's Not There

A few things MySQL doesn't have that you might want:

Feature                | Workaround
-----------------------|------------------------------------------------------------------
No materialized views  | Build at application layer or use triggers
No JSON-first design   | JSON column type exists, but it's not MongoDB
Limited GIS features   | PostGIS is better for serious geospatial
No automatic sharding  | Native table partitioning exists, but shard manually and carefully
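
For example, a trigger-maintained summary table is one way to approximate a materialized view (the schema is illustrative):

```sql
CREATE TABLE order_totals (
  customer_id BIGINT PRIMARY KEY,
  total DECIMAL(12,2) NOT NULL DEFAULT 0
);

CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
FOR EACH ROW
  INSERT INTO order_totals (customer_id, total)
  VALUES (NEW.customer_id, NEW.amount)
  ON DUPLICATE KEY UPDATE total = total + NEW.amount;
```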

Practical Takeaways

  1. Index strategically — Every index has a write cost. Index the columns you filter and join on, not every column you store.

  2. Use auto-increment primary keys unless you have a compelling reason not to.

  3. Understand your isolation level — READ COMMITTED takes fewer gap locks than REPEATABLE READ and can reduce deadlocks.

  4. Use EXPLAIN — Before shipping a query to production:

    EXPLAIN FORMAT=JSON SELECT ... FROM users WHERE email = 'test@example.com';
    
  5. Watch your buffer pool — InnoDB caches everything in the buffer pool. If it doesn't fit, you hit disk constantly.
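
Two quick buffer pool checks (as a rough rule of thumb, a logical-to-physical read ratio well below ~99% on a steady workload suggests the pool is too small):

```sql
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Compare Innodb_buffer_pool_read_requests (logical reads) with
-- Innodb_buffer_pool_reads (reads that had to hit disk)
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';
```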

MySQL is mature, battle-tested, and has a massive ecosystem. Understanding its internals helps you use it well—and debug it when it misbehaves.