Aurora Serverless v2 Cost & Performance Tuning: ACU Guardrails, Alarms, and Real Incidents
You moved to Aurora Serverless v2 for elasticity—and then the bill spiked or latency went sideways. Classic. This guide gives you hard guardrails (min/max ACU by workload), the exact CloudWatch metrics and alarms to wire up, and a set of real incident patterns so you can prevent “mystery spend” and 3 a.m. pages.
Why this matters (the quick hook)
Aurora Serverless v2 scales in fine-grained Aurora Capacity Units (ACUs). It’s brilliant—until:
- a background job pegs ACUUtilization all afternoon,
- an RDS Proxy prevents scale-down,
- or a migration throttles concurrency and you scale up without getting faster.
You need sane capacity guardrails and alarms that fire early—not a Cost Explorer autopsy next month.
Concept & Architecture (what actually drives cost/perf)
- ACU = CPU + ~2 GiB RAM + networking. Your real ceiling is often memory per connection and query mix, not just vCPU. (AWS Documentation)
- You set a min/max ACU window per cluster. Scaling speed depends on the gap between min and max; wider windows can take longer to traverse. (AWS Documentation)
- Billing maps to ACU usage. You’re charged for the capacity you consume; ServerlessDatabaseCapacity is the metric that ties directly to charges on the bill. (AWS Documentation)
- Minimum ACU: newer versions allow lower floors (even 0 ACU for some versions); historically the practical floor was 0.5 ACU. Check your engine/version before assuming scale-to-zero. (AWS Documentation)
- Key metrics to watch:
- ServerlessDatabaseCapacity (current ACUs)
- ACUUtilization (% of allocated ACU actually used)
- Standard engine health: CPUUtilization, DatabaseConnections, VolumeRead/WriteIOPS, FreeableMemory, Deadlocks, EngineUptime. (AWS Documentation)
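A quick way to confirm which of these metrics your cluster is actually emitting (my-aurora-slsv2 is a placeholder cluster name):
aws cloudwatch list-metrics \
  --namespace AWS/RDS \
  --dimensions Name=DBClusterIdentifier,Value=my-aurora-slsv2 \
  --query 'Metrics[].MetricName' --output text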
Guardrails: min/max ACU by workload pattern
Use this as a starting point; adjust after a week of real traffic.
| Workload pattern | Examples | Min ACU | Max ACU | Notes you won’t regret later |
|---|---|---|---|---|
| Dev/Test (spiky, idle often) | CI, feature branches | 0–0.5 | 2–4 | Aim for lowest floor your version allows; accept a small warm-up hit. (AWS Documentation) |
| Read-heavy APIs, predictable daily rhythm | Catalog, content | 1–2 | 8–16 | Keep floor above connection storm; right-size poolers to allow scale-down. (AWS Documentation) |
| Write-heavy microservices | Orders, events | 2–4 | 16–32 | Cap max to protect neighbors; use backpressure in producers. |
| Analytics/ETL bursts | Hourly transforms | 0.5–2 | 32–64 | Put jobs behind a circuit breaker; stagger heavy scans. |
| Multi-tenant SaaS (noisy neighbor risk) | Mixed | 2–8 | 32–128 | Per-tenant rate limits + query governors; consider separate clusters. |
Rule of thumb: set min to cover steady concurrency (connections × memory/connection), and set max to the point where adding ACU actually reduces p95 latency; beyond that, you’re buying heat, not speed. (AWS Documentation)
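A rough worked example (the per-connection memory figure is a ballpark assumption, not an AWS number): 300 pooled connections at roughly 10 MiB each is about 3 GiB of connection overhead, which at ~2 GiB per ACU is ~1.5 ACUs of RAM before the buffer pool gets anything—so a floor of 2 ACU is the realistic minimum for that cluster.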
Alarms that catch both spend and pain (CloudWatch)
Wire these per cluster. Thresholds are conservative starting values.
- Capacity near ceiling (pre-saturation)
  - Metric: ServerlessDatabaseCapacity
  - Alarm: >= 80% of max ACU for 10 min (e.g., max=32 ACU → alarm at 25.6)
  - Why: you’re one burst away from throttling or timeouts. (AWS Documentation)
- Inefficient over-provision (wasted spend)
  - Metric: ACUUtilization
  - Alarm: <= 25% for 30 min when ServerlessDatabaseCapacity >= 2 ACU
  - Why: you’re scaled up but not using it; investigate idle connections or a stuck pool. (AWS Documentation)
- Runaway cost
  - Metric: ServerlessDatabaseCapacity (or your ACU cost SLO via metric math)
  - Alarm: hourly avg ACU > SLO (e.g., 6 ACU) for 2 hours
  - Why: maps directly to your bill; ties back to the billing note in the docs. (AWS Documentation)
- Can’t scale down (sticky connections)
  - Metric combo: DatabaseConnections high AND ACUUtilization < 30% for 30 min
  - Why: RDS Proxy or app pools keeping the floor high; costs creep. (DEV Community)
- Classic performance health
  - CPUUtilization > 80% for 10 min → check hot queries/indexes
  - FreeableMemory < 1–2 GiB sustained → risk of OOM/evictions
  - VolumeReadIOPS / VolumeWriteIOPS spikes without a capacity increase → suboptimal plans or scans. (Repost)
Tip: Keep 60s periods and 3–10 evaluation periods for most alarms; you want trends, not flapping.
Real incidents (and how to avoid them)
1) “Scaled up, still slow”: single-threaded DDL
Symptom: ServerlessDatabaseCapacity climbs to near max, p95 worsens.
Root cause: a blocking ALTER TABLE or SERIALIZABLE workload; extra ACUs can’t increase concurrency.
Guardrail: maintenance window + lock-timeout + throttle DDL; cap max ACU during migrations so you don’t buy useless capacity. (AWS Documentation)
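A minimal sketch of that guardrail (placeholder cluster name, hypothetical orders table, $DB_URL pointing at the writer endpoint):
# Temporarily cap max ACU so the migration can't buy capacity it can't use
aws rds modify-db-cluster \
  --db-cluster-identifier my-aurora-slsv2 \
  --serverless-v2-scaling-configuration MinCapacity=2,MaxCapacity=8 \
  --apply-immediately
# Run the DDL with a lock timeout so it fails fast instead of queuing behind traffic
psql "$DB_URL" -c "SET lock_timeout = '5s'; ALTER TABLE orders ADD COLUMN shipped_at timestamptz;"
Revert the scaling window once the migration finishes.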
2) The RDS Proxy cost trap
Symptom: After moving to Serverless v2 + Proxy, ACUs sit at 2–4 even when traffic is idle.
Root cause: persistent connections keeping memory hot; cluster won’t drop to min.
Fix: lower Proxy idle timeouts, enable connection borrowing limits, and set a low min ACU for non-prod; see the sketch below and the real-world write-up in the sources. (DEV Community)
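A sketch of those Proxy knobs (proxy and target group names are placeholders; the percentages are starting points to tune, not AWS recommendations):
# Drop idle client connections after 5 minutes instead of the 30-minute default
aws rds modify-db-proxy \
  --db-proxy-name my-aurora-proxy \
  --idle-client-timeout 300
# Cap how much of the database's connection headroom the proxy keeps open or idle
aws rds modify-db-proxy-target-group \
  --db-proxy-name my-aurora-proxy \
  --target-group-name default \
  --connection-pool-config MaxConnectionsPercent=50,MaxIdleConnectionsPercent=10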
3) “We hit the roof”: capacity pinned at max
Symptom: ServerlessDatabaseCapacity = max for 10+ minutes; errors appear.
Root cause: max too low for burst (e.g., new feature launch).
Fix: raise max temporarily; add producer backpressure and rate limits; create an alarm on 80% of max so you see it earlier. (AWS Documentation)
4) “Why won’t it scale down?”
Symptom: low CPU, low IOPS, but capacity won’t budge.
Root cause: lingering app/Proxy pools or background schedulers.
Fix: enforce pool size ceilings; stop chatty health checks; verify your engine/version actually allows your desired min ACU (0 or 0.5). (AWS Documentation)
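One quick check for that last point (same placeholder cluster name):
aws rds describe-db-clusters \
  --db-cluster-identifier my-aurora-slsv2 \
  --query 'DBClusters[0].[EngineVersion,ServerlessV2ScalingConfiguration]'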
Quick setup: CLI snippets you can paste
Set capacity window (Postgres example):
aws rds modify-db-cluster \
--db-cluster-identifier my-aurora-slsv2 \
--serverless-v2-scaling-configuration MinCapacity=1,MaxCapacity=16 \
--apply-immediately
Get current capacity (ACUs) via CloudWatch:
aws cloudwatch get-metric-statistics \
--namespace AWS/RDS \
--metric-name ServerlessDatabaseCapacity \
--dimensions Name=DBClusterIdentifier,Value=my-aurora-slsv2 \
--statistics Average --start-time $(date -u -v-15M +%FT%T) \
--end-time $(date -u +%FT%T) --period 60
(Use this to validate that what you pay matches what you see; the docs state that billing ties to this metric. Note: -v-15M is BSD/macOS date syntax; on Linux use --date '-15 minutes'.) (AWS Documentation)
Alarm: capacity near ceiling (80% of max)
(set the threshold to 0.8 × your max ACU; the example below assumes max = 32, so the threshold is 25.6):
aws cloudwatch put-metric-alarm \
--alarm-name "aurora-capacity-80pct" \
--metric-name ServerlessDatabaseCapacity \
--namespace AWS/RDS \
--dimensions Name=DBClusterIdentifier,Value=my-aurora-slsv2 \
--statistic Average --period 60 --evaluation-periods 10 \
--threshold 25.6 \
--comparison-operator GreaterThanOrEqualToThreshold \
--treat-missing-data notBreaching \
--alarm-actions arn:aws:sns:us-east-1:123456789012:prod-alerts
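Alarm: runaway cost (hourly average ACU above your SLO). A sketch reusing the 6 ACU example SLO plus the same placeholder cluster and SNS names:
aws cloudwatch put-metric-alarm \
  --alarm-name "aurora-acu-cost-slo" \
  --metric-name ServerlessDatabaseCapacity \
  --namespace AWS/RDS \
  --dimensions Name=DBClusterIdentifier,Value=my-aurora-slsv2 \
  --statistic Average --period 3600 --evaluation-periods 2 \
  --threshold 6 \
  --comparison-operator GreaterThanThreshold \
  --treat-missing-data notBreaching \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:prod-alerts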
Alarm: waste watch (low utilization at elevated capacity)
aws cloudwatch put-composite-alarm \
--alarm-name "aurora-wastewatch" \
--alarm-rule 'ALARM("ACUUtilLow") AND ALARM("CapAtOrAbove2")'
(Back the two child alarms with ACUUtilization <= 25 and ServerlessDatabaseCapacity >= 2.) (AWS Documentation)
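The two child alarms might look like this (names must match the composite rule above; 30 periods of 60 s give the 30-minute window):
aws cloudwatch put-metric-alarm \
  --alarm-name "ACUUtilLow" \
  --metric-name ACUUtilization \
  --namespace AWS/RDS \
  --dimensions Name=DBClusterIdentifier,Value=my-aurora-slsv2 \
  --statistic Average --period 60 --evaluation-periods 30 \
  --threshold 25 --comparison-operator LessThanOrEqualToThreshold

aws cloudwatch put-metric-alarm \
  --alarm-name "CapAtOrAbove2" \
  --metric-name ServerlessDatabaseCapacity \
  --namespace AWS/RDS \
  --dimensions Name=DBClusterIdentifier,Value=my-aurora-slsv2 \
  --statistic Average --period 60 --evaluation-periods 30 \
  --threshold 2 --comparison-operator GreaterThanOrEqualToThreshold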
Best practices & common pitfalls
Do this
- Right-size min ACU by connections. If you expect high connection counts, set MinCapacity >= 1 to avoid churn and connection flaps. (AWS Documentation)
- Cap max during risky ops (migrations, schema changes) so you don’t buy useless ACUs.
- Use PI (Performance Insights) to find top wait events before assuming “need more ACU.” (AWS Documentation)
- Separate spiky batch from latency-sensitive OLTP (even if same schema) to avoid max-pinning.
- Review weekly: plot Avg ACU, p95 latency, and error rate; adjust the window if ACUUtilization is <25% or >70% for long stretches. (AWS Documentation)
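For the weekly review, a sketch that pulls hourly average ACU and ACUUtilization for the past week (same placeholder cluster; GNU date shown, use -v-7d on macOS). p95 latency and error rate come from your app or load balancer metrics, not AWS/RDS:
aws cloudwatch get-metric-data \
  --start-time $(date -u --date '-7 days' +%FT%T) \
  --end-time $(date -u +%FT%T) \
  --metric-data-queries '[
    {"Id":"acu","MetricStat":{"Metric":{"Namespace":"AWS/RDS","MetricName":"ServerlessDatabaseCapacity","Dimensions":[{"Name":"DBClusterIdentifier","Value":"my-aurora-slsv2"}]},"Period":3600,"Stat":"Average"}},
    {"Id":"util","MetricStat":{"Metric":{"Namespace":"AWS/RDS","MetricName":"ACUUtilization","Dimensions":[{"Name":"DBClusterIdentifier","Value":"my-aurora-slsv2"}]},"Period":3600,"Stat":"Average"}}
  ]'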
Avoid this
- Assuming scale-to-zero. Verify your engine/version; many clusters still bottom out at 0.5 ACU. (AWS Documentation)
- Letting Proxy/App pools keep you “warm forever.” Idle connections = stuck capacity. (DEV Community)
- Believing “more ACU = faster.” If you’re lock-bound or IO-bound, you’ll just pay more to be slow.
Conclusion & takeaways
- Guardrails: pick a floor that matches steady concurrency; a ceiling that actually reduces latency.
- Alarms: watch near-ceiling, waste, and sticky connections—you’ll catch 80% of surprises.
- Incidents repeat: DDL locks, Proxy stickiness, and max-pinning. Have playbooks.
Treat ACU like a budget: allocate deliberately, measure relentlessly, and cap greedily.
Tags
#Aurora #ServerlessV2 #ACU #CloudWatch #RDSProxy #FinOps #PostgreSQL #MySQL #CostOptimization #DataEngineering
Sources for accuracy
- Aurora Serverless v2 overview, metrics and billing mapping, scaling behavior. (AWS Documentation)
- ACU definition (~2 GiB RAM per ACU) and capacity range/version notes (min 0–0.5, 0.5-step increments). (AWS Documentation)
- CloudWatch metrics and utilization guidance (ACUUtilization, ServerlessDatabaseCapacity). (AWS Documentation)
- RDS Proxy + Serverless v2 “cost trap” (sticky connections, won’t scale down). (DEV Community)
- General troubleshooting metrics (CPU, connections, IOPS). (Repost)