Avoiding the “Too Many Parts” Trap: Insert Strategies that Scale on S3 (ClickHouse)

Meta description (159 chars):
Practical ClickHouse guide to fix “Too many parts” on S3: batch sizes, async inserts, backfill windows, and merge-safe settings for high-throughput ingestion.


The problem (and why you hit it on S3)

You kick off a big backfill from S3. Ten minutes later: DB::Exception: Too many parts (N). Merges are slower than inserts. Your cluster’s busy creating thousands of tiny parts per partition; background merges can’t keep up. Result: throttled inserts, rising storage ops, and dashboards slowing to a crawl. ClickHouse is fast—if you feed it fewer, bigger parts and let merges breathe. (ClickHouse)


Mental model: parts, merges, and the S3 twist

  • Every INSERT creates a new part (per partition key). Background threads later merge small parts into larger ones; that’s how MergeTree stays efficient. (ClickHouse)
  • On S3-backed storage, insert block size ⇒ initial part size. Bigger blocks mean fewer parts, fewer merges, and less I/O churn. (ClickHouse)
  • The “Too many parts” error is gated by two MergeTree settings:
    • parts_to_delay_insert (inserts are slowed once a partition exceeds this many active parts)
    • parts_to_throw_insert (inserts throw an exception above this count) (ClickHouse)

Key: You don’t “fix” the error by just raising thresholds—you reduce part creation and let merges catch up. (ClickHouse)
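Before changing anything, it helps to know where you stand. A minimal, read-only check against system.merge_tree_settings shows the server defaults (table-level overrides, if any, live in the table’s SHOW CREATE TABLE output):

-- Current "too many parts" thresholds (server defaults)
SELECT name, value, changed
FROM system.merge_tree_settings
WHERE name IN ('parts_to_delay_insert', 'parts_to_throw_insert');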


The short list: what actually works

1) Batch like you mean it

ClickHouse has constant overhead per insert. Batch size is the single most important lever. Aim for 10k–100k rows per insert (more is fine if memory allows). (ClickHouse)
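To make that concrete, here is a minimal sketch against the analytics.events table defined later in this post; the rows and values are illustrative, and in practice each statement would carry tens of thousands of rows:

-- A single multi-row INSERT forms one block (up to the block-size limits) and
-- produces one part per partition it touches, versus one new part per statement
-- when the same rows arrive as thousands of single-row INSERTs.
INSERT INTO analytics.events (event_time, user_id, event_type, status, payload) VALUES
    ('2025-11-01 00:00:00.000', 101, 'page_view', 200, '{"path": "/"}'),
    ('2025-11-01 00:00:01.250', 102, 'click',     200, '{"target": "signup"}'),
    ('2025-11-01 00:00:02.500', 103, 'purchase',  201, '{"sku": "A-42"}');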

2) Use server-side batching with async inserts

When many clients send small payloads (logs/metrics), enable:

SET async_insert = 1, wait_for_async_insert = 1;  -- durable batching
-- For fire-and-forget throughput (less strict durability):
-- SET wait_for_async_insert = 0;

Async inserts buffer on the server and coalesce small writes into bigger parts—exactly what you want under fan-in. (ClickHouse)
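If you lean on async inserts, the server-side flush thresholds decide how big the coalesced writes get before they become parts. The values below are illustrative starting points, not documented recommendations; tune them to your payload sizes:

SET async_insert = 1,
    wait_for_async_insert = 1,
    async_insert_max_data_size = 10485760,  -- flush the buffer at ~10 MiB...
    async_insert_busy_timeout_ms = 1000;    -- ...or after 1 s, whichever comes first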

3) Right-size insert blocks for S3

For INSERT … SELECT (e.g., from S3 table functions), tune block formation to emit fewer, bigger parts:

INSERT INTO analytics.events
SELECT *
FROM s3('https://bucket/logs/2025/11/*.parquet', 'AWS_KEY', 'AWS_SECRET')
SETTINGS
    max_insert_threads = 8,
    min_insert_block_size_rows = 0,
    min_insert_block_size_bytes = 256000000; -- ~256 MB target

Larger blocks produce larger initial parts; ClickHouse spends less time merging and more time serving queries. (Adjust sizes to your dataset and memory headroom.) (ClickHouse)

4) Windowed backfills by partition

Because each insert block produces one part per partition it touches, broad scans spanning many partitions explode part counts. Backfill per month/day (matching your PARTITION BY) and run jobs in parallel across partitions, not across the whole table. (ClickHouse)
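A per-window job might look like the following sketch; the bucket path and credentials are placeholders, and the month filter assumes the monthly PARTITION BY used in the table example later in this post:

-- Backfill exactly one monthly partition per job.
INSERT INTO analytics.events
SELECT *
FROM s3('https://bucket/logs/2025/11/*.parquet', 'AWS_KEY', 'AWS_SECRET')
WHERE toYYYYMM(event_time) = 202511
SETTINGS
    max_insert_threads = 8,
    min_insert_block_size_rows = 0,
    min_insert_block_size_bytes = 256000000;
-- Repeat with the next month's path and filter; run windows sequentially or in
-- parallel across different partitions, never as one giant cross-year job.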

5) Treat thresholds as safety rails, not a cure

For planned backfills you can temporarily relax:

ALTER TABLE analytics.events
    MODIFY SETTING parts_to_delay_insert = 500, parts_to_throw_insert = 2000;

…but revert after the load. The real cure is bigger, fewer parts. (ClickHouse)
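Assuming a ClickHouse version that supports ALTER TABLE … RESET SETTING (available in recent releases), reverting to the defaults afterwards is one statement; on older versions, MODIFY SETTING back to the previous values instead:

-- Restore the server defaults once the planned load is done
ALTER TABLE analytics.events
    RESET SETTING parts_to_delay_insert, parts_to_throw_insert;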

6) Don’t rely on OPTIMIZE … FINAL as a broom

It forces heavy merges and isn’t a general-purpose fix; reserve it for admin cases, not routine ingestion. (ClickHouse)


A concrete pattern: S3 → ClickHouse, safely

Table design (example)

CREATE TABLE analytics.events
(
  event_time   DateTime64(3, 'UTC'),
  user_id      UInt64,
  event_type   LowCardinality(String),
  status       UInt16,
  payload      JSON
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_time)
ORDER BY (user_id, event_time);

Why it helps:

  • Monthly partitions make ops (drop/move/TTL) manageable and constrain backfill windows.
  • Sorting by (user_id, event_time) aligns with common filters and keeps merges efficient.

Backfill playbook (repeatable)

  1. Partition windows. Load one YYYYMM at a time.
  2. Bigger blocks. Use min_insert_block_size_*/max_insert_threads to emit fat parts. (ClickHouse)
  3. Async inserts for many producers. (ClickHouse)
  4. Watch part churn as you go:
-- Parts per partition
SELECT partition, sum(rows) rows, count() parts
FROM system.parts
WHERE table = 'events' AND active
GROUP BY partition
ORDER BY partition; -- shrinking 'parts' = healthy

And for async behavior:

SELECT *
FROM system.asynchronous_insert_log
ORDER BY event_time DESC
LIMIT 50;

(Use system.part_log and system.parts to verify merge progress over time.) (ClickHouse)
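For merge progress specifically, system.part_log (if enabled in your server config) records part creations and merges; a sketch of a health query over the last hour:

-- Part creations vs. completed merges; merges catching up is the goal
SELECT
    toStartOfFiveMinutes(event_time)   AS bucket,
    countIf(event_type = 'NewPart')    AS parts_created,
    countIf(event_type = 'MergeParts') AS merges_completed
FROM system.part_log
WHERE table = 'events' AND event_time > now() - INTERVAL 1 HOUR
GROUP BY bucket
ORDER BY bucket;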


S3 specifics: storage, cache, and read path

  • Store MergeTree data on S3 via an S3 disk or storage policy; you can also enable a local filesystem cache to cut repeated GET latency (a quick check of what’s configured follows this list). (ClickHouse)
  • ClickHouse reads/writes parts while the OS/page cache and optional uncompressed block cache keep hot data close; still, the best latency win is fewer, larger parts. (ClickHouse)
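To confirm what the server actually exposes before pointing a table at S3, two read-only system tables help:

-- Disks the server knows about (an S3-backed or cache disk should appear here)
SELECT name, type, path FROM system.disks;

-- Storage policies and the disks each volume uses
SELECT policy_name, volume_name, disks FROM system.storage_policies;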

Tuning checklist (copy/paste)

  • Batching
    • Aim for 10k–100k rows per insert (or higher), or use async_insert. (ClickHouse)
  • Block sizing for S3 loads
    • Tune min_insert_block_size_rows/bytes, max_insert_threads; verify memory headroom. (ClickHouse)
  • Backfill strategy
    • Load by partition windows; avoid cross-month scans in one job. (ClickHouse)
  • Thresholds
    • parts_to_delay_insert, parts_to_throw_insert—use sparingly for migrations. (ClickHouse)
  • Ops visibility
    • Track system.parts, system.part_log, system.asynchronous_insert_log. (ClickHouse)
  • Don’ts
    • Don’t lean on OPTIMIZE … FINAL to “clean up” routine ingestion. (ClickHouse)

Troubleshooting: symptom → action

Symptom → likely cause → fast action:

  • “Too many parts (…); merges slower than inserts” → small inserts and a wide partition span → increase batch/block size; load by partition; consider async inserts. (ClickHouse)
  • Inserts throttled but not failing → parts_to_delay_insert reached → temporarily raise the delay threshold during the backfill; revert after. (ClickHouse)
  • Part count stable but not shrinking → merges starved → pause producers or tighten backfill windows; avoid OPTIMIZE FINAL as a routine fix. (ClickHouse)
  • Many tiny parts on S3 → small insert blocks → increase min_insert_block_size_bytes and threads (watch memory). (ClickHouse)

Best practices & common pitfalls

Best practices

  • Prefer server-side batching (async_insert=1) when you can’t batch in clients. (ClickHouse)
  • Backfill with bounded time windows matching PARTITION BY. (ClickHouse)
  • Set realistic batch targets and validate with system.parts during a dry run. (ClickHouse)

Pitfalls

  • Cranking thresholds and calling it done—it hides symptoms and bites later. (ClickHouse)
  • Loading a full year in one INSERT … SELECT explodes partitions/parts. (ClickHouse)
  • Cleaning with OPTIMIZE … FINAL after every load—expensive and unnecessary. (ClickHouse)

Internal link ideas (official docs)

  • Bulk inserts: why batch size dominates. (ClickHouse)
  • Async inserts: mechanics & monitoring. (ClickHouse)
  • S3 insert/read performance: block → part mechanics. (ClickHouse)
  • MergeTree parts & system tables: system.parts, part_log. (ClickHouse)
  • “Too many parts” thresholds: parts_to_delay_insert, parts_to_throw_insert. (ClickHouse)
  • Avoid Optimize FINAL: when not to use it. (ClickHouse)

Summary & call to action

On S3, the fastest path out of “Too many parts” is simple: bigger blocks ⇒ bigger parts ⇒ fewer merges. Batch aggressively (or enable async inserts), backfill by partition windows, and treat part thresholds as guardrails—not duct tape. Measure part counts and merge progress as you load.

Try this today: pick one hot table, run a one-month backfill with the block settings above, and watch system.parts before/after. If parts/partition drop and insert latency flattens, you’re on the right track. (ClickHouse)


Image prompt

“A clean, modern diagram of ClickHouse on S3 showing client inserts and S3 INSERT … SELECT flows forming large in-memory blocks, creating fewer large parts per partition, background merges consolidating parts, and async inserts buffering small writes — minimalistic, high-contrast, isometric 3D.”

Tags

#ClickHouse #TooManyParts #BulkInserts #S3 #AsyncInserts #MergeTree #DataEngineering #Backfill #Performance #OLAP
