Staff Prep 02: Reading EXPLAIN ANALYZE Like a Query Planner
Back to Part 01: Schema Design.

Today we go inside the query planner. EXPLAIN ANALYZE is the most powerful diagnostic tool in your Postgres arsenal. Most engineers know it exists. Staff engineers can read it fluently — they spot sequential scans, bad estimates, and nested loop anti-patterns without hesitation. That is what we are building today.
EXPLAIN vs EXPLAIN ANALYZE: the difference
EXPLAIN shows the plan the planner intends to execute — estimated costs,
estimated row counts, node types. It does not actually run the query. Use it when the query is
too expensive to run just for diagnostics.
EXPLAIN ANALYZE actually executes the query and shows both the plan and the
reality — actual rows returned, actual time spent per node. The gap between estimated and actual
is where performance bugs hide. One caution: because it really executes the statement, EXPLAIN ANALYZE on an INSERT, UPDATE, or DELETE modifies data — wrap it in a transaction and ROLLBACK.
-- Basic usage
EXPLAIN SELECT * FROM orders WHERE user_id = 42;
-- With actual execution stats
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 42;
-- Full detail: buffers, WAL usage (PG 13+), non-default settings (PG 12+)
EXPLAIN (ANALYZE, BUFFERS, WAL, SETTINGS, FORMAT TEXT)
SELECT o.id, o.amount, u.name
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.created_at > NOW() - INTERVAL '30 days';
Reading the output: top-down, inside-out
EXPLAIN output is a tree. The deepest (most indented) nodes run first. Results bubble up. You read it bottom-up for execution order, but you read it top-down for the overall plan structure. Here is a real example:
Hash Join (cost=2841.00..9428.50 rows=15000 width=72)
(actual time=45.231..180.447 rows=14823 loops=1)
Hash Cond: (orders.user_id = users.id)
-> Seq Scan on orders (cost=0.00..5500.00 rows=180000 width=40)
(actual time=0.018..55.432 rows=180000 loops=1)
Filter: (created_at > (now() - '30 days'::interval))
Rows Removed by Filter: 20000
-> Hash (cost=1500.00..1500.00 rows=50000 width=32)
(actual time=30.124..30.124 rows=50000 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 3200kB
-> Seq Scan on users (cost=0.00..1500.00 rows=50000 width=32)
(actual time=0.009..18.203 rows=50000 loops=1)
Planning Time: 1.234 ms
Execution Time: 182.891 ms
Key things to read here:
- cost=start..total — start cost is the estimated work before the first row can be returned; total is the cost of full execution. Units are arbitrary planner cost units, not milliseconds.
- rows=estimated vs actual — rows=15000 estimated, rows=14823 actual. Close. When these diverge by 10× or more, the planner has bad statistics.
- Seq Scan — full table scan. Not always bad — on a small table, a seq scan beats an index scan. On a 10M-row table with a 0.1% selectivity filter, it is a problem.
- Hash Join vs Nested Loop vs Merge Join — each optimal in different scenarios. Nested Loop is good for small inner sets; Hash Join for large unsorted sets; Merge Join for pre-sorted large sets.
The four node types you must know
Sequential scan (seq scan)
Reads every row in the table. Expected on small tables or when selectivity is low (fetching >10–20% of rows). A red flag when fetching a tiny fraction of a large table.
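One way to check whether the planner even has an index option is to discourage seq scans for a single transaction and compare the two plans. A sketch, assuming the orders table from the earlier examples:

```sql
-- Compare the seq-scan plan against the best available index plan.
-- SET LOCAL confines the change to this transaction; ROLLBACK undoes everything.
BEGIN;
EXPLAIN SELECT * FROM orders WHERE user_id = 42;  -- planner's first choice
SET LOCAL enable_seqscan = off;                   -- penalise seq scans heavily
EXPLAIN SELECT * FROM orders WHERE user_id = 42;  -- forced index plan, if one exists
ROLLBACK;
```

If the second plan is still a Seq Scan, no usable index exists — enable_seqscan = off only adds a large cost penalty; it cannot conjure an index.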
Index scan
Uses a B-tree index to find rows. Each matching row requires a heap fetch (the actual table row). Fast for high-selectivity queries. If you are fetching many rows, each heap fetch is a random I/O — can be slower than a seq scan past ~5–10% of the table.
Index only scan
Satisfied entirely from the index — no heap fetch needed. Only possible when all columns the query references are in the index (a covering index) and the visibility map marks the pages all-visible; if EXPLAIN ANALYZE shows Heap Fetches greater than zero, a VACUUM will help. The fastest scan type for read-heavy queries.
-- This query can use an Index Only Scan if the index covers all columns
CREATE INDEX idx_orders_user_amount
ON orders (user_id, amount)
INCLUDE (created_at); -- covering: no heap fetch needed
EXPLAIN SELECT user_id, amount, created_at
FROM orders
WHERE user_id = 42;
-- -> Index Only Scan using idx_orders_user_amount
Bitmap heap scan
Used when many rows match the index condition. Postgres first builds a bitmap of matching heap pages (Bitmap Index Scan), then fetches those pages in order (Bitmap Heap Scan). Reduces random I/O by sorting the page fetches. Common for medium-selectivity queries.
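An illustrative plan shape (hand-written, assuming a hypothetical idx_orders_status index — not output from a real run):

```sql
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
--  Bitmap Heap Scan on orders
--    Recheck Cond: (status = 'pending')
--    ->  Bitmap Index Scan on idx_orders_status
--          Index Cond: (status = 'pending')
```

The Recheck Cond line matters: if the bitmap exceeds work_mem it becomes lossy (tracking whole pages instead of individual rows), and every row on a fetched page must be re-checked against the condition.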
Diagnosing bad row estimates
The planner uses table statistics (pg_statistic) to estimate row counts. Stale stats
lead to bad estimates, which lead to wrong join strategies and wrong index choices.
-- Check when stats were last updated
SELECT schemaname, relname, last_analyze, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';
-- Force a stats update if stale
ANALYZE orders;
-- Increase the stats target for a column with a skewed distribution
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders;
The default statistics target is 100. For columns used in WHERE clauses with skewed or high-cardinality distributions (a status column dominated by a few values, or a category_id with thousands of distinct values), bumping the target to 500 gives the planner a finer histogram and more accurate estimates.
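To see what the planner actually knows about a column after ANALYZE, query the pg_stats view (the view and its columns are real; the orders/status example is hypothetical):

```sql
-- n_distinct < 0 means a fraction of the row count (e.g. -1 = all values distinct)
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'status';
```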
Reading actual vs estimated: spotting the problem
-- Bad: estimates wildly off
Nested Loop (cost=0.00..5.00 rows=1 width=40)
(actual time=0.012..8450.234 rows=50000 loops=1)
-- Planner thought 1 row, got 50,000
-- Chose Nested Loop (good for small sets), terrible for 50k rows
-- Should have used Hash Join instead
-- Root cause: missing statistics on a join column, or
-- skewed distribution (99% of rows have user_id = NULL)
When estimated rows are 1 and actual rows are 50,000, the planner chose Nested Loop (optimal for
tiny sets) but got hammered with a large set. The fix: update statistics, add a partial index, or — as a last resort — steer the planner by disabling the offending node type for the session (SET enable_nestloop = off). Postgres has no native query hints, so planner GUCs are the bluntest instrument available.
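The session-level escape hatch looks like this — a sketch using the tables from the earlier example, with SET LOCAL keeping the change scoped to one transaction:

```sql
BEGIN;
SET LOCAL enable_nestloop = off;  -- planner falls back to Hash or Merge Join
EXPLAIN ANALYZE
SELECT o.id, u.name
FROM orders o
JOIN users u ON u.id = o.user_id;
ROLLBACK;  -- the setting reverts with the transaction
```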
The BUFFERS option: finding I/O bottlenecks
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 42;
-- Output includes:
-- Buffers: shared hit=245 read=1820
-- hit = found in shared_buffers (memory) — cheap
-- read = fetched from outside shared_buffers (OS cache or disk) — expensive
-- High read count = your working set exceeds shared_buffers
If shared hit is low and read is high, you have two options:
increase shared_buffers (RAM permitting) or reduce the data scanned (better indexes).
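Before touching shared_buffers, measure the database-wide cache hit ratio from pg_stat_database (a real catalog view; the 0.99 rule of thumb is a working assumption, not a hard threshold):

```sql
SHOW shared_buffers;  -- current setting; the stock default is only 128MB
-- Hit ratio across the whole database; healthy OLTP workloads sit near 0.99
SELECT blks_hit::float / NULLIF(blks_hit + blks_read, 0) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();
```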
Practical workflow: optimising a slow query
-- Step 1: Run EXPLAIN ANALYZE BUFFERS on the slow query
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.id, o.amount, u.name, u.email
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.status = 'pending'
AND o.created_at > NOW() - INTERVAL '7 days'
ORDER BY o.created_at DESC
LIMIT 50;
-- Step 2: Look for:
-- a) Seq Scans on large tables
-- b) Estimated rows << actual rows
-- c) High Buffers: read count
-- d) Nested Loop with large actual row counts
-- Step 3: Create targeted indexes
CREATE INDEX CONCURRENTLY idx_orders_pending_created
ON orders (created_at DESC)
WHERE status = 'pending'; -- partial index: only pending rows, so status
                          -- need not be a key column
-- Step 4: Re-run EXPLAIN ANALYZE and verify the plan changed
Quiz: test your understanding
Before moving on, answer these in your head (or out loud):
- What is the difference between EXPLAIN and EXPLAIN ANALYZE? When would you use one over the other?
- You see Seq Scan on orders (rows=1000000) in a query that filters by user_id = 42. What could cause Postgres to choose a seq scan over an index scan?
- The planner estimates 5 rows but actually returns 80,000. What are the likely causes and how do you fix them?
- What is a Bitmap Heap Scan and when does Postgres prefer it over a plain Index Scan?
- What does Buffers: shared hit=10 read=5000 tell you about your database's memory configuration?
Next up — Part 03: Indexing Strategy. B-tree internals, composite index ordering, partial indexes, and when NOT to index.