Parallel Query and work_mem: Making Postgres Use All Those Cores
When your dashboard query takes 18 seconds on a 16-core box, you’re not CPU-bound—you’re under-tuned. Postgres can parallelize scans, joins, and aggregates, but it won’t do it aggressively unless you set the right knobs and give operators enough memory. This guide shows the exact GUCs to change, what they do, and before/after EXPLAIN (ANALYZE, BUFFERS) plans so you can copy/paste and see the difference today.
Why this matters
- Modern VMs have 8–64 vCPUs; a single serial plan wastes most of them.
- Under-sized work_mem forces disk sorts and hash spills, which kill parallel speedups.
- Parallelism only kicks in if the planner thinks it’s worth the setup cost. Your settings teach it to be bolder.
The core concepts (fast and clear)
1) Parallel query basics
Postgres uses a leader process plus parallel workers. Operators that can go parallel include:
- Parallel Seq Scan
- Parallel Bitmap Heap Scan
- Parallel Hash Join / Hash Aggregate
- Gather / Gather Merge to recombine results
Parallelism triggers when:
- The relation is big enough (thresholds below).
- The planner’s parallel costs justify worker startup.
- The query shape supports parallel-safe functions.
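If a plan stays stubbornly serial and you suspect a function is the culprit, you can check parallel safety directly in the catalog. A minimal sketch; the schema and function name are placeholders for whatever your query actually calls:

-- proparallel: 's' = safe, 'r' = restricted (leader only), 'u' = unsafe (blocks parallelism)
SELECT p.proname, p.proparallel
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname = 'public'              -- adjust to your schema
  AND p.proname = 'my_rollup_helper';   -- placeholder function name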
2) work_mem is per operator, per worker
work_mem isn’t global. A single query can allocate multiple work areas times the number of parallel workers. If you set work_mem = 256MB and the plan runs 6 parallel hash joins, you can exhaust RAM fast. Balance is everything.
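To make that concrete, here is the back-of-the-envelope math for that example, assuming 4 workers plus the leader (the worker count is illustrative):

  256MB work_mem
× 6 hash work areas in the plan
× 5 processes (4 workers + leader)
≈ 7.5 GB peak for a single query

And on Postgres 13+, hash-based nodes can use up to hash_mem_multiplier × work_mem, so the real ceiling can be even higher.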
The practical knobs (what to actually set)
| Knob | What it does | Typical starting point* | Notes |
|---|---|---|---|
| max_parallel_workers | Global pool of parallel workers | min(#cores, 2 × #cores_logical/3) | Don’t exceed the core count. |
| max_parallel_workers_per_gather | Worker cap per Gather node | 4–8 | Bigger tables ⇒ more workers help, up to memory/I/O limits. |
| parallel_leader_participation | Let the leader do work too | on | Usually a free ~10–20% boost. |
| min_parallel_table_scan_size | Minimum table size to consider a parallel scan | 32MB–128MB | Lower it to encourage parallel scans on mid-sized tables. |
| min_parallel_index_scan_size | Same, for index scans | 16MB–64MB | Similar idea. |
| parallel_setup_cost | Cost penalty for starting parallel workers | 100–500 | Reduce it to be more eager. |
| parallel_tuple_cost | Per-tuple cost in parallel paths | 0.01–0.03 | Reduce slightly to tip plans toward parallel. |
| work_mem | Memory per sort/hash operator, per worker | 32–128MB | Enough to avoid spills; see the sizing math below. |
*Use these as starting points, then measure. Defaults vary by version and distro; always SHOW your current values.
Session-only testing (safe):
-- In psql or your session:
SET max_parallel_workers = 16;
SET max_parallel_workers_per_gather = 8;
SET parallel_leader_participation = on;
SET min_parallel_table_scan_size = '64MB';
SET min_parallel_index_scan_size = '32MB';
SET parallel_setup_cost = 100;
SET parallel_tuple_cost = 0.01;
SET work_mem = '64MB';
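To confirm what the session will actually use, a quick sanity check against pg_settings:

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_parallel_workers', 'max_parallel_workers_per_gather',
               'parallel_leader_participation', 'min_parallel_table_scan_size',
               'min_parallel_index_scan_size', 'parallel_setup_cost',
               'parallel_tuple_cost', 'work_mem');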
Before/After: real plan patterns you should expect
We’ll use a realistic rollup over a big fact table:
-- 200M-row events, daily rollup last 30 days
EXPLAIN (ANALYZE, BUFFERS)
SELECT date_trunc('day', created_at) AS day,
count(*) AS n
FROM events
WHERE created_at >= now() - interval '30 days'
GROUP BY 1
ORDER BY 1;
Before (serial, spilling)
You’ll often see:
GroupAggregate
  -> Sort (Sort Method: external merge  Disk: ...kB)
       -> Seq Scan on events (rows=30000000)
Planning Time: ...
Execution Time: 18.2 s
Red flags:
- Seq Scan, not a parallel scan.
- Sort spilling to disk (temp files) ⇒ work_mem too small.
- No Gather anywhere.
After (parallel hash aggregate, no spills)
With the session settings above:
Finalize GroupAggregate
  -> Gather Merge (Workers Planned: 8, Workers Launched: 8)
       -> Sort
            -> Partial HashAggregate
                 -> Parallel Seq Scan on events
Planning Time: ...
Execution Time: 4.1 s
Buffers: shared hit=..., read=...
Green flags:
- Parallel Seq Scan with 8 workers.
- Partial HashAggregate per worker keeps aggregation out of temp files.
- Gather Merge recombines the sorted worker streams efficiently.
Expect 2–6× speedups for scan/aggregate heavy queries on healthy storage. Diminishing returns beyond 8–12 workers are normal due to memory, I/O, and gather overhead.
Another common win: star-schema join
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.country, count(*) AS purchases
FROM fact_purchases p
JOIN dim_users u ON p.user_id = u.id
WHERE p.created_at >= date_trunc('month', now())
GROUP BY u.country;
Bad pattern:
Hash Join (spill)
-> Seq Scan on fact_purchases
-> Hash on dim_users (Disk: temp files)
Execution Time: 12.7 s
Good pattern after tuning:
Finalize HashAggregate
  -> Gather
       -> Partial HashAggregate
            -> Parallel Hash Join
                 -> Parallel Seq Scan on fact_purchases
                 -> Parallel Hash (dim_users fits in memory)
Execution Time: 3.9 s
What changed? Larger work_mem let the hash tables stay in memory, and the planner chose Parallel Hash Join. The Gather overhead is tiny compared to the savings.
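If one big fact table consistently deserves more workers than the planner assigns, you can also pin a per-table value instead of pushing the global knobs harder. A sketch using the fact table from this example:

-- Ask for up to 8 workers when scanning this table
-- (still capped by max_parallel_workers_per_gather and the global worker pool).
ALTER TABLE fact_purchases SET (parallel_workers = 8);
-- Undo with: ALTER TABLE fact_purchases RESET (parallel_workers);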
Sizing work_mem without blowing up RAM
Rule of thumb:
Peak RAM ≈ concurrent_active_queries
× parallel_workers_per_query
× parallel_work_areas_per_worker
× work_mem
- concurrent_active_queries: not connections; think queries running heavy operators at the same time.
- parallel_work_areas_per_worker: usually 1–3 (hash join + aggregate + sort).
- Start with work_mem = 32–64MB for mixed workloads; go to 128MB for batch/reporting windows.
- Use pg_stat_statements plus the logs to confirm spills:
SET log_temp_files = 0; -- log any temp file usage
-- run workload
-- check logs for "temporary file" lines (spills)
If you see frequent temp files, bump work_mem slowly (2× at a time) and re-measure.
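If pg_stat_statements is installed, the same spills show up there too; a sketch for finding the worst offenders (the timing column is total_exec_time on Postgres 13+, total_time on older versions):

SELECT query,
       calls,
       temp_blks_written,
       round(total_exec_time::numeric, 1) AS total_ms
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC
LIMIT 10;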
Quick win checklist (copy/paste)
- Discover current reality:
  SHOW max_parallel_workers;
  SHOW max_parallel_workers_per_gather;
  SHOW parallel_leader_participation;
  SHOW min_parallel_table_scan_size;
  SHOW parallel_setup_cost;
  SHOW parallel_tuple_cost;
  SHOW work_mem;
- Encourage parallelism:
  SET max_parallel_workers = 16;  -- match-ish to CPU
  SET max_parallel_workers_per_gather = 8;
  SET parallel_leader_participation = on;
  SET min_parallel_table_scan_size = '64MB';
  SET min_parallel_index_scan_size = '32MB';
  SET parallel_setup_cost = 100;
  SET parallel_tuple_cost = 0.01;
- Give operators room:
  SET work_mem = '64MB';
- Re-plan and measure:
  - EXPLAIN (ANALYZE, BUFFERS) on your top N slow queries.
  - Look for Parallel operators and no temp-disk notes.
  - Track Execution Time, Rows, and Buffers deltas.
- Promote to postgresql.conf once stable, and reconsider the values for production concurrency (see the ALTER SYSTEM sketch after this list).
- Monitor with pg_stat_bgwriter, pg_stat_statements, and system RAM.
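When the session-level values hold up, one way to promote them without hand-editing postgresql.conf is ALTER SYSTEM; a sketch reusing the starting points above (tune them for your production concurrency first):

ALTER SYSTEM SET max_parallel_workers = 16;
ALTER SYSTEM SET max_parallel_workers_per_gather = 8;
ALTER SYSTEM SET work_mem = '64MB';
SELECT pg_reload_conf();  -- these GUCs apply without a restart;
                          -- only max_worker_processes (the hard cap) needs one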
Best practices & common pitfalls
Best practices
- Target the top queries. Don’t global-tune blindly. Fix the 80/20.
- Partition large facts by time to keep working sets warm and parallelizable (see the sketch after this list).
- Indexes still matter. Parallel scans don’t replace proper index strategies.
- Use Gather Merge wisely. Sorting per worker plus a merge can beat sorting after a plain Gather.
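For the time-partitioning point above, a minimal sketch of a range-partitioned fact table (the table and column names follow the earlier examples and are purely illustrative):

CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

-- One partition per month: recent partitions stay small, warm, and cheap to scan in parallel.
CREATE TABLE events_2024_06 PARTITION OF events
    FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');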
Pitfalls
- RAM blowups from aggressive work_mem × many workers × many concurrent queries.
- Tiny tables won’t go parallel unless you lower the min_parallel_* thresholds.
- Parallel-unsafe functions force serial plans; user-defined functions (including PL/pgSQL) default to PARALLEL UNSAFE unless declared otherwise (see the sketch after this list).
- I/O bottlenecks. If storage is slow, more workers can dogpile and thrash.
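For the parallel-unsafe-function pitfall, the fix is often to declare safety explicitly once you have verified the function has no side effects. A sketch with a hypothetical helper (not a function from this article):

-- User-defined functions default to PARALLEL UNSAFE; mark pure helpers explicitly.
CREATE OR REPLACE FUNCTION bucket_amount(amount numeric)
RETURNS text
LANGUAGE sql
IMMUTABLE
PARALLEL SAFE
AS $$ SELECT CASE WHEN amount >= 100 THEN 'large' ELSE 'small' END $$;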
Internal link ideas (for your site)
- “Understanding Hash Joins and When They Spill”
- “Effective autovacuum for Large Fact Tables”
- “Partitioning Strategies for Time-Series in Postgres”
- “From OLTP to Analytics: When to Add a Columnar Sidecar (ClickHouse/DuckDB)”
Conclusion & takeaways
Parallel query in Postgres isn’t magic—it’s a negotiation between planner costs, object sizes, and memory. Nudge the planner with realistic parallel costs, cap worker counts sensibly, and size work_mem so hash/sort operators stay in RAM. Measure with EXPLAIN (ANALYZE, BUFFERS). Expect 2–6× gains on scan/aggregate heavy queries when you get it right.
Your immediate next step: test the session settings above on one heavy report query, verify Parallel operators appear, and confirm no temp files. Then tune upward, one notch at a time.
Image prompt
“A clean, modern Postgres architecture diagram showing a leader process coordinating 8 parallel workers scanning a large table, parallel hash aggregate, and a Gather Merge node—minimalistic, high-contrast, 3D isometric style with labeled operators.”
Tags
#Postgres #ParallelQuery #work_mem #DatabasePerformance #QueryTuning #EXPLAINANALYZE #DataEngineering #SQL #Scaling #PostgreSQL


