Amazon DynamoDB for Data Engineers: A 2026 Playbook for Modeling, Capacity & Cost

Meta description:
A practical DynamoDB guide for mid-level data engineers: access-pattern modeling, partition keys, GSIs, capacity modes, adaptive capacity, streams, and cost tuning.


Why this matters (a quick story)

Your team just shipped a feature. Traffic spikes like Black Friday, and suddenly a few keys are melting while the rest of the table naps. Reads jitter, costs creep, and dashboards lag. DynamoDB can handle this—if you design for access patterns, distribute load, and choose the right capacity/caching/replication levers.

This playbook shows how.


Core concepts, clearly

The DynamoDB mental model

  • Table → Items → Attributes with a partition key (and optional sort key). Data is stored across internal partitions and replicated across Availability Zones; you don’t manage nodes. (AWS Documentation)
  • Two capacity modes: On-Demand (pay-per-request, scales automatically) and Provisioned (you set RCUs/WCUs, optionally with autoscaling). (AWS Documentation)
  • Adaptive capacity & burst: burst capacity banks up to ~5 minutes (300 seconds) of unused throughput for short spikes, while adaptive capacity shifts throughput toward hot partitions. Together they mask microbursts and uneven access, within limits. (AWS Documentation)

Read/write accounting (the unit math you actually need)

  • RCU: 1 strongly consistent read per second for an item up to 4 KB (eventually consistent reads cost half).
  • WCU: 1 write per second for an item up to 1 KB.
  • Transactions cost double the normal units (2× RCUs/WCUs per transactional read or write). (AWS Documentation)
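This unit math is easy to fumble under pressure, so here is a small sketch of the accounting. The helper names are ours, not an AWS API:

```python
import math

def rcu_per_read(item_kb: float, strongly_consistent: bool = True,
                 transactional: bool = False) -> float:
    """RCUs for one read: rounded up to 4 KB units, halved for
    eventually consistent reads, doubled for transactional reads."""
    units = math.ceil(item_kb / 4)
    if transactional:
        return units * 2
    return units if strongly_consistent else units / 2

def wcu_per_write(item_kb: float, transactional: bool = False) -> float:
    """WCUs for one write: rounded up to 1 KB units, doubled for transactions."""
    units = math.ceil(item_kb / 1)
    return units * 2 if transactional else units

print(rcu_per_read(2))                             # 1 (strongly consistent)
print(rcu_per_read(2, strongly_consistent=False))  # 0.5
print(wcu_per_write(1.2))                          # 2 (rounds up past 1 KB)
```

Multiply the per-request cost by your request rate to size provisioned throughput.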

Indexes (when and how)

  • GSIs give alternate PK/SK; up to 20 per table (default quota).
  • LSIs reuse the table PK and change only the SK; up to 5 per table, and they can only be created at table creation. (AWS Documentation)
  • GSI updates are asynchronous; if a GSI can’t keep up, base table writes can be throttled (GSI back-pressure). Plan capacity. (AWS Documentation)

Streams, caching, and multi-Region

  • DynamoDB Streams captures item-level changes in near real-time (ordered per item). Great for CDC, triggers, and outbox patterns. (AWS Documentation)
  • DAX (DynamoDB Accelerator) adds microsecond read latency for read-heavy, eventually consistent access patterns. (AWS Documentation)
  • Global Tables replicate multi-Region, multi-active. Conflicts resolve last-writer-wins by internal timestamp—design to avoid conflicts. (AWS Documentation)

Practical constraints

  • Max item size = 400 KB. Store big blobs in S3 and keep pointers in DynamoDB. (AWS Documentation)
  • Query/Scan page size is 1 MB; paginate with LastEvaluatedKey. (AWS Documentation)
  • Cost optimization: choose On-Demand for spiky/unpredictable, Provisioned for steady with autoscaling; consider Standard vs Standard-IA table classes for storage-heavy, infrequently accessed data. (AWS Documentation)
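For the 400 KB limit, the S3-pointer pattern can be sketched as below. The bucket name, key scheme, and size threshold are illustrative assumptions, and the actual S3 upload is elided:

```python
MAX_ITEM_BYTES = 400 * 1024  # DynamoDB's hard per-item limit

def payload_attributes(order_id: str, payload: bytes,
                       bucket: str = "example-order-blobs") -> dict:
    """Return the attributes to store on the item: the payload inline
    if it fits comfortably, otherwise a pointer to S3."""
    if len(payload) < MAX_ITEM_BYTES // 2:  # headroom for other attributes
        return {"payload": payload}
    key = f"orders/{order_id}/payload.bin"
    # A real system would also: s3.put_object(Bucket=bucket, Key=key, Body=payload)
    return {"payloadS3Bucket": bucket, "payloadS3Key": key}

print(payload_attributes("o-9812", b"small"))            # stored inline
print(payload_attributes("o-9812", b"x" * (300 * 1024))) # pointer to S3
```

Keep the pointer attributes small and queryable; the blob itself never touches DynamoDB.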

A realistic example: single-table design for Orders

Goal: Quickly retrieve a customer’s recent orders, and list all orders by status for a region.

Keys & items (entity-type prefixes keep access patterns clear):

  • PK = CUST#<customerId>
  • SK = ORDER#<ISO-8601 timestamp, e.g. 2025-11-20T09:00:00Z>
  • GSI1: GSI1PK = REGION#<code>, GSI1SK = STATUS#<status>#<orderId>

Sample item

{
  "PK": "CUST#42",
  "SK": "ORDER#2025-11-20T09:00:00Z",
  "orderId": "o-9812",
  "region": "us-east-1",
  "status": "SHIPPED",
  "totalUSD": 129.50,

  "GSI1PK": "REGION#us-east-1",
  "GSI1SK": "STATUS#SHIPPED#o-9812"
}
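The key formats above are worth centralizing in tiny builder helpers so no call site hand-assembles a string. A sketch (function names are ours):

```python
def pk_for_customer(customer_id: str) -> str:
    """Partition key: one item collection per customer."""
    return f"CUST#{customer_id}"

def sk_for_order(iso_ts: str) -> str:
    """Sort key: ISO-8601 timestamps sort lexicographically = chronologically."""
    return f"ORDER#{iso_ts}"

def gsi1_keys(region: str, status: str, order_id: str) -> dict:
    """GSI1 keys: group by region, sort by status then order id."""
    return {"GSI1PK": f"REGION#{region}",
            "GSI1SK": f"STATUS#{status}#{order_id}"}

print(pk_for_customer("42"))                        # CUST#42
print(gsi1_keys("us-east-1", "SHIPPED", "o-9812"))
```

One place to change a key scheme beats a dozen scattered f-strings.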

Python (boto3) sketch

import boto3
from decimal import Decimal  # boto3's resource layer rejects Python floats

ddb = boto3.resource("dynamodb")
table = ddb.Table("orders")

# Write
table.put_item(Item={
  "PK": "CUST#42",
  "SK": "ORDER#2025-11-20T09:00:00Z",
  "orderId": "o-9812",
  "region": "us-east-1",
  "status": "SHIPPED",
  "totalUSD": Decimal("129.50"),
  "GSI1PK": "REGION#us-east-1",
  "GSI1SK": "STATUS#SHIPPED#o-9812"
})

# Read recent orders for a customer
resp = table.query(
  KeyConditionExpression="#pk=:pk AND begins_with(#sk,:prefix)",
  ExpressionAttributeNames={"#pk":"PK","#sk":"SK"},
  ExpressionAttributeValues={":pk":"CUST#42",":prefix":"ORDER#"},
  Limit=50, ScanIndexForward=False  # newest first
)

# Read all SHIPPED orders in region via GSI
resp2 = table.query(
  IndexName="GSI1",
  KeyConditionExpression="#gpk=:gpk AND begins_with(#gsk,:gsk)",
  ExpressionAttributeNames={"#gpk":"GSI1PK","#gsk":"GSI1SK"},
  ExpressionAttributeValues={":gpk":"REGION#us-east-1",":gsk":"STATUS#SHIPPED#"}
)

Capacity estimation (back-of-napkin)

  • Avg item size ≈ 1.2 KB.
  • Writes: 600/s → base writes = ceil(1.2/1) × 600 = 2 × 600 = 1,200 WCU (a 1.2 KB item rounds up to 2 WCU).
    • Plus GSI (projected attributes): each write also updates the GSI → budget up to another ~1,200 WCU on the GSI if the full item is projected; smaller projections cost less. (AWS Documentation)
  • Reads: 100/s, eventually consistent, 2 KB each → ceil(2/4) × 100 / 2 = 50 RCU. (AWS Documentation)

When uncertain or spiky, start On-Demand then evaluate switching to Provisioned + autoscaling once patterns stabilize. (AWS Documentation)
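The napkin math scripts cleanly; note that ceil rounds a 1.2 KB item up to 2 WCU per write, which doubles the naive estimate:

```python
import math

avg_item_kb = 1.2
writes_per_s, reads_per_s, read_kb = 600, 100, 2

base_wcu = math.ceil(avg_item_kb / 1) * writes_per_s   # 2 * 600 = 1200
gsi_wcu = base_wcu        # worst case: full item projected into the GSI
rcu = math.ceil(read_kb / 4) * reads_per_s / 2         # eventually consistent

print(base_wcu, gsi_wcu, rcu)  # 1200 1200 50.0
```

Rerun this whenever item size or traffic shape changes; the ceil boundaries (1 KB, 4 KB) make small size drifts surprisingly expensive.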


Best practices (and the pitfalls that burn teams)

1) Model access patterns first

  • DynamoDB rewards predictable queries. If you need exploratory analytics, mirror data to something like S3/Glue/Athena or OpenSearch via Streams. (AWS Documentation)

2) Design high-cardinality, uniform partition keys

  • Avoid “celebrity keys” (hot spots). Consider hashing/sharding keys (USER#<hash(uid)>#<uid>) when a natural key is skewed. Adaptive capacity helps, but don’t rely on it to mask systemic skew. (AWS Documentation)
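A minimal write-sharding sketch, assuming 10 shards and SHA-256 for the suffix; readers must fan out across all shards to reassemble one user's data:

```python
import hashlib

NUM_SHARDS = 10  # tune to how hot the skewed key actually is

def sharded_user_pk(uid: str, shards: int = NUM_SHARDS) -> str:
    """Derive a deterministic shard suffix so one celebrity user's
    traffic spreads across several partition keys."""
    shard = int(hashlib.sha256(uid.encode()).hexdigest(), 16) % shards
    return f"USER#{shard}#{uid}"

print(sharded_user_pk("celebrity-1"))  # deterministic, e.g. USER#<0-9>#celebrity-1
```

Determinism matters: the same uid must always map to the same shard, or reads and writes land on different keys.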

3) Keep items small & cohesive

  • Stay under 400 KB; store large blobs in S3 and keep metadata in DynamoDB. (AWS Documentation)

4) Be intentional with indexes

  • Each GSI multiplies write cost and backfill time. Start with the minimal set, use projection to include only the attributes each query needs, and monitor for GSI back-pressure. (AWS Documentation)

5) Prefer Query over Scan, and always paginate

  • Scans chew RCUs and return in 1-MB pages; use LastEvaluatedKey. (AWS Documentation)
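The pagination loop is worth standardizing once. A sketch that works with any callable shaped like boto3's Table.query/scan response, shown here with a stub so it runs without AWS:

```python
def paginate(query_fn, **kwargs):
    """Yield items across DynamoDB's 1 MB pages by threading
    LastEvaluatedKey into ExclusiveStartKey until it disappears."""
    while True:
        resp = query_fn(**kwargs)
        yield from resp.get("Items", [])
        last_key = resp.get("LastEvaluatedKey")
        if not last_key:
            break
        kwargs["ExclusiveStartKey"] = last_key

# Stubbed two-page response to show the control flow:
pages = [
    {"Items": [{"id": 1}, {"id": 2}], "LastEvaluatedKey": {"PK": "x"}},
    {"Items": [{"id": 3}]},
]
fake_query = lambda **kw: pages.pop(0)
print(list(paginate(fake_query)))  # [{'id': 1}, {'id': 2}, {'id': 3}]
```

In production, pass `table.query` (or `table.scan`) as `query_fn` with the same keyword arguments you would give boto3 directly.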

6) Choose the right capacity mode

  • On-Demand for spiky/unpredictable traffic; Provisioned + autoscaling for steady workloads that you can forecast (often cheaper at scale). Re-evaluate periodically. (AWS Documentation)

7) Use Streams for CDC and triggers

  • Ordered per item; fan-out via Lambda or Kinesis consumers. Consider idempotency keys in consumers. (AWS Documentation)
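An idempotency sketch for a stream consumer, keyed on the record's eventID; the in-memory set is a stand-in for a durable dedupe store (a conditional write or a dedupe table):

```python
processed = set()  # stand-in for durable dedupe state

def handle_record(record: dict) -> bool:
    """Process one DynamoDB Streams record at most once, using its
    eventID as the idempotency key. Returns False on a replay."""
    idem_key = record["eventID"]
    if idem_key in processed:
        return False  # duplicate delivery: Lambda retries make these normal
    processed.add(idem_key)
    # ... apply the side effect here (write to S3, call a downstream API)
    return True

rec = {"eventID": "abc123", "eventName": "INSERT", "dynamodb": {}}
print(handle_record(rec))  # True
print(handle_record(rec))  # False (replay is ignored)
```

Consumers must assume at-least-once delivery; the side effect, not the stream, is where exactly-once has to be enforced.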

8) Multi-Region? Respect LWW conflict resolution

  • Global Tables use last-writer-wins. Prevent conflicts with routing (single-writer region per item/tenant) when possible. (AWS Documentation)
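A single-writer routing sketch: hash each tenant to a fixed home Region so the same item is never written concurrently from two Regions. The Region list is illustrative:

```python
import hashlib

REGIONS = ["us-east-1", "eu-west-1"]  # the Global Table's replica Regions

def home_region(tenant_id: str) -> str:
    """Pin each tenant's writes to one Region so cross-Region writes to
    the same item (and thus LWW conflicts) cannot occur."""
    idx = int(hashlib.sha256(tenant_id.encode()).hexdigest(), 16) % len(REGIONS)
    return REGIONS[idx]

print(home_region("tenant-7"))  # always the same Region for this tenant
```

Reads can still go to the nearest replica; only writes need the routing discipline.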

9) Don’t forget TTL

  • Auto-expire noise (sessions, temp states); TTL deletes don’t consume WCUs, and you can stream expirations if you need archives. (AWS Documentation)
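TTL is just a numeric attribute holding epoch seconds. A sketch; the attribute name expiresAt is our choice, and you must configure it as the table's TTL attribute:

```python
from datetime import datetime, timedelta, timezone

def ttl_epoch(days_from_now: int) -> int:
    """TTL attributes are epoch seconds; DynamoDB deletes the item some
    time after this moment passes (deletion is not instantaneous)."""
    expires = datetime.now(timezone.utc) + timedelta(days=days_from_now)
    return int(expires.timestamp())

item = {"PK": "SESSION#s-1", "SK": "META", "expiresAt": ttl_epoch(7)}
print(item["expiresAt"])  # epoch seconds, roughly 7 days out
```

Because deletion lags expiry, readers should still filter on expiresAt rather than trusting the item's absence.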

10) Cache smartly with DAX

  • For read-heavy, low-latency paths with predictable keys, DAX removes read load. Not a replacement for good keys or GSI design. (AWS Documentation)

Quick comparison table

| Decision | Use When | Upsides | Watch-outs |
| --- | --- | --- | --- |
| On-Demand | Spiky/unpredictable traffic | No capacity planning; scales fast | Cost can exceed provisioned for steady workloads (AWS Documentation) |
| Provisioned + Autoscaling | Steady/forecastable traffic | Cost control; predictable | Under-provision → throttling; plan GSIs too (AWS Documentation) |
| GSI | New query pattern across whole table | Flexible access | Write amplification; back-pressure throttling (AWS Documentation) |
| DAX | Ultra-low-latency reads | Microsecond reads; fewer RCUs | Eventual consistency; cache invalidation strategy (AWS Documentation) |
| Global Tables | Worldwide low-latency & DR | Multi-active, multi-Region | LWW conflicts; routing discipline required (AWS Documentation) |

Internal link ideas (for your site)

  • “Choosing Partition vs Sort Keys: Real Patterns from Orders, Carts, Inventory”
  • “Designing Projections on GSIs: Cost vs Latency”
  • “Streams to S3/Athena: Building a Real-Time Analytics Sink”
  • “From On-Demand to Provisioned: A Step-by-Step Cost Migration Plan”
  • “TTL + Streams: Auto-Archiving Expired Items”

External references (official only)

Capacity modes; RCUs/WCUs; transactions; and guidance. (AWS Documentation)
Partitions & distribution. (AWS Documentation)
Adaptive capacity & burst behavior. (AWS Documentation)
Item size and using S3 for large attributes. (AWS Documentation)
Streams overview & Lambda triggers. (AWS Documentation)
Index quotas & index best practices. (AWS Documentation)
GSI back-pressure. (AWS Documentation)
Global Tables & LWW conflict resolution. (AWS Documentation)
Query/Scan 1-MB page limit. (AWS Documentation)
DAX overview. (AWS Documentation)
Table classes (Standard vs Standard-IA). (AWS Documentation)


Summary & call-to-action

DynamoDB shines when you design for your queries, spread load with smart keys, and budget capacity intentionally (including GSIs). Add Streams for CDC, DAX for ultra-fast reads, and Global Tables only when your routing strategy can avoid conflicts. Start with On-Demand, measure, then tune.

Want a hands-on follow-up? Tell me your top 3 access patterns and traffic profile, and I’ll sketch a single-table schema, capacity plan, and index set you can try immediately.


Image prompt (for your designer/AI tool)

“A clean, modern architecture diagram showing a DynamoDB table with partition & sort keys, a GSI, DAX cache, Streams to Lambda → S3, and Global Tables across two Regions — minimalistic, high contrast, 3D isometric style.”

Tags

#NoSQL #DynamoDB #DataEngineering #Scalability #DatabaseDesign #AWS #Performance #Architecture #Streaming #Caching


Pitch ideas (DynamoDB topics that will rank and convert)

  1. “On-Demand vs Provisioned in DynamoDB: Break-Even Math and Real Cost Curves” – keywords: DynamoDB on-demand vs provisioned, cost optimization, autoscaling.
  2. “Designing High-Cardinality Keys: 12 Patterns to Avoid Hot Partitions” – keywords: DynamoDB partition key design, adaptive capacity, skew. (AWS Documentation)
  3. “Global Tables Without Regret: Routing Strategies to Tame LWW Conflicts” – keywords: DynamoDB global tables, multi-Region, conflict resolution. (AWS Documentation)
  4. “GSI Back-Pressure Explained: How to Size, Project, and Monitor Indexes” – keywords: DynamoDB GSI capacity, projections, throttling. (AWS Documentation)
  5. “Streams Playbook: From Change Data Capture to Real-Time Analytics” – keywords: DynamoDB Streams, Lambda, CDC pipeline patterns. (AWS Documentation)
  6. “DAX in Production: When Caching Beats More RCUs” – keywords: DynamoDB Accelerator, microsecond latency, caching patterns. (AWS Documentation)
  7. “Beyond 400 KB: Hybrid S3 + DynamoDB Patterns” – keywords: DynamoDB item size limit, S3 pointers, TTL lifecycle. (AWS Documentation)
  8. “Single-Table Design, Pragmatically: When to Bend the Rule” – keywords: DynamoDB data modeling, access patterns, trade-offs.

If you pick one, I’ll deliver the full SEO article in the same style—tight paragraphs, real code, official references only.