Your Database Doesn't Trust the Server. That's Why It Writes Everything Twice.
What Every Backend Engineer Should Know About Write-Ahead Logs
When Systems Lie to You
Let's say your app submits a critical order update to your PostgreSQL database.
The API returns 200 OK. You move on.
But then the server crashes.
You restart the database and check the orders table. No record.
What happened?
It turns out the write was acknowledged before the data had actually reached the disk.
That’s the pain point that Write-Ahead Logs (WAL) solve.
Ignoring WAL = Risky Assumptions
If a system writes data directly and crashes before syncing to disk:
Records can be half-written or corrupted
Acknowledged writes may vanish
Recovery might leave the system inconsistent
This isn't just a rare edge case. It's a recurring risk in any stateful system. Imagine a payment system that marks an order as paid but loses the transaction record after a crash. Or a logistics platform that acknowledges a shipment but forgets to actually persist the dispatch event.
The higher your write throughput, the more frequently this risk shows up, and the more expensive the consequences.
But how does PostgreSQL solve this problem?
The Solution: Write-Ahead Logging
Write-Ahead Logging flips the order of operations. Instead of writing changes directly to the data files, the system first logs the change it intends to make. Let's look at a classic example, one I bet you've used without knowing it.
Here's the general flow (PostgreSQL example):
1. Receive a change request (e.g. INSERT)
2. Write the change to a WAL buffer
3. Flush the WAL buffer to disk (fsync)
4. Acknowledge the write to the client
5. Later, apply changes to the actual data files
Step 3 is the critical point. Until the WAL is safely flushed to disk, nothing is considered durable.
Nothing is acknowledged until the log is safely on disk.
Even if the system crashes right after step 3, PostgreSQL can replay the WAL on startup and restore every change that was committed before the crash.
This mechanism transforms potential data loss into a recoverable state.
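To make the five steps above concrete, here is a minimal sketch of a toy key-value store that follows the same order: log, flush, acknowledge, and replay on startup. The class name `TinyWAL`, the JSON-lines log format, and the in-memory dictionary standing in for the data files are all assumptions made for illustration; PostgreSQL's real implementation is far more involved.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}      # stands in for the "actual data files"
        self._replay()      # crash recovery: rebuild state from the log
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    # Torn final record: it was never acknowledged, so skip it.
                    # (A real system would also truncate the damaged tail.)
                    break
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value})
        self.log.write(record + "\n")   # step 2: write to the log buffer
        self.log.flush()                # hand Python's buffer to the OS
        os.fsync(self.log.fileno())     # step 3: force the OS to write to disk
        self.data[key] = value          # step 5, done inline here for brevity
        return "OK"                     # step 4: acknowledge only after fsync

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "toy.wal")
db = TinyWAL(path)
db.put("order:42", "paid")
db.close()                   # pretend the process died: memory is gone

recovered = TinyWAL(path)    # a fresh start replays the log
print(recovered.data)        # {'order:42': 'paid'}
recovered.close()
```

The important detail is the position of `os.fsync`: the caller only sees "OK" after the log record has been forced to disk, so an acknowledged write can always be replayed.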
WAL Is Your Database's Black Box
When a plane crashes, investigators use the flight recorder (aka "the black box") to reconstruct what happened.
WAL is the equivalent for your database.
It captures every intended change (insert, update, and delete) in a sequential, append-only format.
Even if your heap or index pages are inconsistent due to a crash, the WAL gives the system a source of truth to rebuild from.
That's how PostgreSQL achieves ACID durability guarantees. Without it, every crash would be a data loss lottery.
What's Inside a WAL Entry?
In PostgreSQL, a WAL entry for an INSERT may include:
Table/relation ID
Page number and offset
The tuple data being inserted
WAL entries are not SQL statements. They are low-level descriptions of the physical or logical changes needed to replay the operation.
This structure ensures that recovery doesn't require full query re-execution, just a scan of the WAL entries.
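As a rough illustration of what "a low-level description instead of SQL" can look like, the sketch below packs the three fields listed above into a binary record and replays it onto an in-memory page map. The byte layout, the trailing CRC32 checksum, and the names `InsertRecord` and `replay` are hypothetical and are not PostgreSQL's actual on-disk format.

```python
import struct
import zlib
from dataclasses import dataclass

# Hypothetical header layout: relation id, page number, offset, payload length.
_HEADER = struct.Struct("<IIHI")


@dataclass(frozen=True)
class InsertRecord:
    relation_id: int
    page_number: int
    offset: int
    tuple_data: bytes

    def encode(self) -> bytes:
        header = _HEADER.pack(self.relation_id, self.page_number,
                              self.offset, len(self.tuple_data))
        body = header + self.tuple_data
        # A trailing CRC32 lets recovery detect a torn or corrupted record.
        return body + struct.pack("<I", zlib.crc32(body))

    @classmethod
    def decode(cls, raw: bytes) -> "InsertRecord":
        body, (crc,) = raw[:-4], struct.unpack("<I", raw[-4:])
        if zlib.crc32(body) != crc:
            raise ValueError("corrupt WAL record")
        rel, page, off, length = _HEADER.unpack_from(body)
        data = body[_HEADER.size:_HEADER.size + length]
        return cls(rel, page, off, data)


def replay(record: InsertRecord, pages: dict) -> None:
    """Re-apply an insert by placing the tuple at its recorded location."""
    page = pages.setdefault((record.relation_id, record.page_number), {})
    page[record.offset] = record.tuple_data


original = InsertRecord(16384, 7, 3, b"order-42|paid")
pages = {}
replay(InsertRecord.decode(original.encode()), pages)
print(pages)
```

Note that `replay` never parses or executes a query. It only puts bytes back where the record says they belong, which is why recovery can be a plain scan of the log.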
PostgreSQL stores WAL data in segment files that are 16MB each by default. These files are named using a 24-character hexadecimal format like:
000000010000000300000065
This name breaks down into:
00000001: Timeline ID
00000003: WAL segment log ID
00000065: Segment within the log
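The breakdown can be checked with a few lines of code. This is a small sketch that splits a 24-character name into its three 8-character hexadecimal fields; the function name `parse_wal_filename` and the `WalSegmentName` tuple are made up for this example.

```python
import re
from typing import NamedTuple

_WAL_NAME = re.compile(r"[0-9A-Fa-f]{24}")


class WalSegmentName(NamedTuple):
    timeline_id: int
    log_id: int
    segment: int


def parse_wal_filename(name: str) -> WalSegmentName:
    """Split a 24-character WAL file name into its three hex fields."""
    if not _WAL_NAME.fullmatch(name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return WalSegmentName(
        timeline_id=int(name[0:8], 16),
        log_id=int(name[8:16], 16),
        segment=int(name[16:24], 16),
    )


print(parse_wal_filename("000000010000000300000065"))
# WalSegmentName(timeline_id=1, log_id=3, segment=101)
```

The fields are hexadecimal, so the segment part `00000065` is 101 in decimal, not 65.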
These filenames are critical for:
Streaming replication (so replicas request the right segment)
Archiving (for point-in-time recovery)
Monitoring (to detect lag or growth)
If you see a folder full of these files growing rapidly, the usual causes are a heavy write workload, replication lag, a misconfiguration, or a broken archiving process. Keep an eye on it.
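One simple way to keep an eye on it is to count the segment files and add up their sizes on a schedule. The sketch below does that for any directory; the function name `wal_directory_usage` is hypothetical, and the demo runs against a temporary directory with fake files instead of a real server.

```python
import os
import re
import tempfile

_WAL_NAME = re.compile(r"[0-9A-Fa-f]{24}")


def wal_directory_usage(wal_dir):
    """Return (segment_count, total_bytes) for WAL-named files in wal_dir."""
    count, total = 0, 0
    with os.scandir(wal_dir) as entries:
        for entry in entries:
            # Only files with a 24-hex-character name are counted.
            if entry.is_file() and _WAL_NAME.fullmatch(entry.name):
                count += 1
                total += entry.stat().st_size
    return count, total


# Demo with stand-in files (real segments would be 16MB each by default).
demo_dir = tempfile.mkdtemp()
for name in ("000000010000000300000065", "000000010000000300000066"):
    with open(os.path.join(demo_dir, name), "wb") as f:
        f.write(b"\0" * 1024)

print(wal_directory_usage(demo_dir))   # (2, 2048)
```

In PostgreSQL 10 and later the WAL directory is typically `pg_wal` under the data directory. Recording these two numbers over time and alerting on a steady climb is usually enough to catch a stuck archiver or a lagging replica early.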
WAL Enables Replication
Because WAL is a clean, ordered log of all changes, it's ideal for streaming replication:
PostgreSQL streams WAL records to replicas over a replication connection
Replicas replay the WAL to stay in sync
MongoDB uses its oplog in a similar way
Kafka takes this idea further; its entire architecture is built around logs
Instead of building separate change streams, systems reuse their WALs to power real-time data distribution.
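The idea can be sketched in a few lines: the replica remembers how far into the log it has replayed and asks the primary for everything after that point. The `Primary` and `Replica` classes below are a simplified, in-memory model assumed for illustration, with a list index standing in for a real log position.

```python
class Primary:
    def __init__(self):
        self.log = []    # ordered, append-only list of (key, value) changes
        self.data = {}

    def write(self, key, value):
        self.log.append((key, value))   # log the change first
        self.data[key] = value          # then apply it locally

    def read_from(self, position):
        """Return every log record at or after the given position."""
        return self.log[position:]


class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0   # number of log records replayed so far

    def catch_up(self, primary):
        for key, value in primary.read_from(self.applied):
            self.data[key] = value
            self.applied += 1

    def lag(self, primary):
        """How many records the replica is behind the primary."""
        return len(primary.log) - self.applied


primary, replica = Primary(), Replica()
primary.write("order:1", "paid")
primary.write("order:2", "shipped")
print(replica.lag(primary))            # 2
replica.catch_up(primary)
print(replica.data == primary.data)    # True
```

Because the log is ordered, the replica needs only one number to resume after a disconnect, which is the same property that makes lag easy to measure.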
Things That Can Break
WAL helps, but it's not bulletproof. Watch for:
fsync disabled: Some configurations or test environments skip disk flushing, giving a false sense of safety.
Slow disks: Commit performance is gated by how fast you can flush WALs.
WAL bloat: Without cleanup or archiving, logs can consume large amounts of storage.
Missing archive strategy: For point-in-time recovery, you must safely copy old WAL segments to backup storage.
WAL gives you a recovery plan. But it only works if you configure and monitor it properly.
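A basic configuration check can catch the first and last items on that list. The sketch below assumes you have already read the server settings into a dictionary of strings (for example from PostgreSQL's `pg_settings` view); `fsync`, `synchronous_commit`, and `archive_mode` are real PostgreSQL parameters, while the function `check_wal_settings` and the dictionary shape are hypothetical.

```python
def check_wal_settings(settings):
    """Return a list of warnings for risky durability-related settings."""
    warnings = []
    if settings.get("fsync", "on") != "on":
        warnings.append(
            "fsync is off: acknowledged writes can be lost on a crash")
    if settings.get("synchronous_commit", "on") == "off":
        warnings.append(
            "synchronous_commit is off: recent commits can be lost on a crash")
    if settings.get("archive_mode", "off") == "off":
        warnings.append(
            "archive_mode is off: no WAL archive for point-in-time recovery")
    return warnings


# Example: a test environment that skips disk flushing.
risky = {"fsync": "off", "synchronous_commit": "on", "archive_mode": "off"}
for warning in check_wal_settings(risky):
    print(warning)
```

Slow disks and WAL bloat do not show up in settings, so a check like this complements disk latency and directory size monitoring without replacing either.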
Key Takeaways
WAL records every change before it is applied, which ensures durability and crash safety
Writes are acknowledged only after the log hits persistent storage
WAL is used for crash recovery, replication, and backup
PostgreSQL, MongoDB, Kafka, and many other systems rely on WAL-like designs
You must verify your environment uses fsync, manages disk I/O, and handles WAL archiving
Once you start to see your system as a stream of logged changes, replication, failover, and recovery become easier to reason about.
WALs aren't just a recovery mechanism; they're a design principle.
That's why your database writes everything twice.
Until next time,
— Raul