Reading Postgres EXPLAIN ANALYZE: the guide I wish I had three years ago
April 4, 2026 · Database · 8 min read

I stared at EXPLAIN ANALYZE output for a year without really understanding it. I knew "Seq Scan bad, Index Scan good" but nothing else. Then a senior DBA walked me through a real slow query for 30 minutes and everything clicked. Here is what I learned — the three numbers that matter, the nodes to look for, and the patterns that signal specific performance problems.

How to run EXPLAIN ANALYZE

sql
-- Basic: plan only (no actual execution)
EXPLAIN SELECT * FROM orders WHERE user_id = 123;

-- Full analysis: executes the query, shows actual vs estimated
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE user_id = 123;

-- For production: wrap in a transaction to rollback writes
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders SET status = 'completed' WHERE id = 123;
ROLLBACK;
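
If your real query uses driver-style bind parameters ($1, $2, …), plain EXPLAIN refuses to run it because the parameters have no values. One workaround is a prepared statement; the statement name here is illustrative:

sql
-- Give the parameter a concrete value via PREPARE/EXECUTE
PREPARE user_orders AS SELECT * FROM orders WHERE user_id = $1;
EXPLAIN (ANALYZE, BUFFERS) EXECUTE user_orders(42);
DEALLOCATE user_orders;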

The three numbers that matter

text
Seq Scan on orders
  (cost=0.00..45231.00 rows=892000 width=156)          <- planner's estimate
  (actual time=0.042..1823.291 rows=892000 loops=1)    <- what actually happened

  cost=0.00..45231.00    estimated startup cost .. total cost (planner cost units)
  rows=892000 (first)    rows the planner predicted
  time=0.042..1823.291   actual startup .. total time, in milliseconds
  rows=892000 (second)   rows actually returned

1. Estimated vs actual rows: when the two differ by 10x or more, in either direction, Postgres picked its plan from bad row estimates, usually stale statistics. Fix: ANALYZE tablename; to refresh them.

2. actual time: The real wall-clock time. Look at the innermost node with the highest time — that's your bottleneck.

3. loops: if loops > 1, actual time is per loop. Multiply the second time value by loops to get the node's total contribution: actual time × loops ≈ total time.
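
Number 1 is also the cheapest problem to verify. Assuming the orders table from the examples, check when its statistics were last refreshed before forcing an update:

sql
-- When did (auto)ANALYZE last run for this table?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';

-- Refresh manually; this samples the table and is usually fast
ANALYZE orders;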

Reading a real slow query plan

text
Hash Join  (cost=18234.56..89234.12 rows=456 width=200)
           (actual time=1823.45..2134.23 rows=12 loops=1)
  Hash Cond: (orders.user_id = users.id)
  Buffers: shared hit=234 read=89432            <- IO: 89K pages read from disk
  ->  Seq Scan on orders                         <- PROBLEM: full table scan
        (cost=0.00..54321.00 rows=15 width=100)
        (actual time=0.05..1234.56 rows=12 loops=1)
        Filter: (status = 'completed')
        Rows Removed by Filter: 891988
  ->  Hash  (cost=1234.56..1234.56 rows=10000 width=100)
             (actual time=45.23..45.23 rows=10000 loops=1)
        Buckets: 16384  Batches: 1  Memory Usage: 1234kB
        ->  Seq Scan on users
              (cost=0.00..1234.56 rows=10000 width=100)
              (actual time=0.05..34.23 rows=10000 loops=1)

What this tells me:

  • The orders table scan is the bottleneck (1234ms of the 2134ms total)
  • 892K rows scanned, only 12 rows returned — 99.9% filter ratio means the index is missing
  • 89,432 disk pages read — IO-bound, not CPU-bound
  • Fix: CREATE INDEX ON orders (user_id) WHERE status = 'completed'; (a partial index that covers both the join key and the filter)
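
Sketching that fix end to end (the index name is my own; CONCURRENTLY avoids blocking writes while the index builds):

sql
-- Partial index: only 'completed' rows are indexed, keyed by the join column
CREATE INDEX CONCURRENTLY idx_orders_completed_user
    ON orders (user_id) WHERE status = 'completed';

-- Then re-run the same EXPLAIN (ANALYZE, BUFFERS) and confirm the
-- Seq Scan on orders has become an Index Scan or Bitmap Heap Scan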

The patterns that indicate specific problems

Pattern 1: Seq Scan + high rows removed by filter

text
Seq Scan on orders
  Rows Removed by Filter: 891988  <- scanning 892K to get 12 rows
Fix: Add an index on the filtered column(s)

Pattern 2: Massive rows estimate mismatch

text
(cost=... rows=10 ...)              <- Postgres estimated 10 rows
(actual time=... rows=89432 ...)    <- Actually got 89K rows
Fix: ANALYZE tablename; or, if the mismatch comes from correlated columns, create extended statistics with CREATE STATISTICS
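
When the bad estimate comes from columns that vary together, per-column statistics cannot see the correlation; extended statistics can. The column names below are illustrative:

sql
-- Record that these two columns are functionally dependent
CREATE STATISTICS orders_city_zip (dependencies)
    ON shipping_city, shipping_zip FROM orders;
ANALYZE orders;  -- statistics objects are only populated by ANALYZE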

Pattern 3: Nested Loop Join with many loops

text
Nested Loop
  -> Seq Scan on users (rows=10000)
  -> Index Scan on orders   <- Runs 10000 times!
       (actual time=0.05..1.2 loops=10000)
Total: 10000 * 1.2ms = 12 seconds
Fix: the inner side is already indexed; the cost is running it 10000 times. Fix the outer row estimate, or raise work_mem so the planner can pick a hash join instead
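
To test whether a hash join would actually win, disable nested loops for one session and compare the two plans. This is a diagnostic trick, not a production setting:

sql
SET enable_nestloop = off;   -- session-local planner toggle
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM users u JOIN orders o ON o.user_id = u.id;
RESET enable_nestloop;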

Pattern 4: Hash Batches > 1 (spilling to disk)

text
Hash  (Batches: 8  Memory Usage: 4096kB)  <- Batches > 1 = spilling to disk
Fix: SET work_mem = '64MB'; in the session running the heavy query (SET is session-local, so the global default stays untouched)
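
A sketch of applying the fix only where it is needed, using the join from earlier:

sql
-- Session-local: other connections keep the server default
SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM users u JOIN orders o ON o.user_id = u.id;
-- Success looks like: Hash (Batches: 1 ...) in the new plan
RESET work_mem;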

Quick diagnostic checklist

  1. Find the innermost node with the highest actual time
  2. Check if it is a Seq Scan — if so, check the filter ratio
  3. Check if rows estimate matches actual rows — big mismatch = stale stats
  4. Check shared read in BUFFERS — high read count = IO bound (needs index or more cache)
  5. Check Hash Batches — anything > 1 means work_mem is too low for this query
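
For item 4, BUFFERS gets much more useful with track_io_timing enabled, which adds real I/O wait times to each node (setting it requires superuser, or SET privilege on the parameter in newer versions):

sql
SET track_io_timing = on;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE status = 'completed';
-- Nodes that touch disk now show lines like: I/O Timings: read=...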

The online tool explain.dalibo.com renders EXPLAIN output as a visual tree with color-coding. Paste your output there and the bottleneck becomes visually obvious.
