From RDS to Aurora: A Migration Checklist for Mid-Size Teams

Version checks, storage limits (128–256 TiB), endpoint switching, and load tests

Introduction — why this matters

You’re hitting scaling ceilings on Amazon RDS: storage growth, replica lag, spiky latency during maintenance. Aurora promises faster failovers, storage auto-scaling, and cheaper read scaling. Great—but migrations hurt when basics are missed: engine/version mismatches, wrong storage assumptions, sloppy DNS cutovers, and unproven throughput. This checklist keeps you honest and your outage window boring.


The high-level plan (read this first)

  1. Decide the target (Aurora MySQL vs Aurora PostgreSQL; Serverless v2 vs provisioned).
  2. Prove compatibility (engine + version parity, features, extensions).
  3. Validate limits (cluster storage up to 256 TiB on current Aurora; per-table/instance constraints; connections). (Amazon Web Services, Inc.)
  4. Build a green environment (cluster + replicas + parameters + endpoints). (AWS Documentation)
  5. Replicate data (physical clone, DMS, logical replication, or Blue/Green). (AWS Documentation)
  6. Load test and rehearse (sysbench/pgbench + production-like data).
  7. Cut over with DNS hygiene (CNAME, low TTL, rollback plan).
  8. Observe and optimize (perf schema/pg_stat*, Aurora metrics).

Step-by-step checklist

1) Choose the right Aurora flavor

  • Aurora MySQL v3 (MySQL 8.0 compatible). If you’re on RDS MySQL 5.7, do the app & SQL uplift before migrating. (AWS Documentation)
  • Aurora PostgreSQL: confirm needed extensions (PostGIS, pgcrypto, etc.) exist in the target Aurora engine version.
  • Storage model: Aurora auto-scales storage at the cluster level—current max 256 TiB (was 128 TiB). RDS engine instances typically cap at 64 TiB. Plan growth accordingly. (Amazon Web Services, Inc.)
  • Serverless v2 can work for bursty, cost-sensitive loads; steady heavy traffic prefers provisioned. Check any regional/feature constraints (e.g., Aurora PostgreSQL Limitless Database requirements). (AWS Documentation)
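Before locking the choice in, confirm which Aurora engine versions your region actually offers. A minimal sketch, assuming the AWS CLI v2 is configured with credentials for the target account and region:

# Aurora MySQL versions available in the current region
aws rds describe-db-engine-versions \
  --engine aurora-mysql \
  --query 'DBEngineVersions[].EngineVersion' \
  --output table

# Same check for Aurora PostgreSQL
aws rds describe-db-engine-versions \
  --engine aurora-postgresql \
  --query 'DBEngineVersions[].EngineVersion' \
  --output table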

2) Version & feature parity (blockers surface here)

Create a small matrix and refuse to proceed until it’s green:

| Area | Source (RDS) | Target (Aurora) | Check |
| --- | --- | --- | --- |
| Engine + major | e.g., MySQL 8.0.28 | Aurora MySQL v3.* (8.0-compatible) | Verify behavior diffs (temp tables, keywords). (AWS Documentation) |
| Extensions/features | e.g., pgcrypto, logical decoding | Target Aurora PostgreSQL version | Confirm availability or replacements |
| Parameter diffs | innodb_flush_log_at_trx_commit, sql_mode / work_mem | Aurora parameter group | Create a parameter group twin |
| Authentication | IAM auth / SSL | Same | Enforce TLS, rotate certs |

Tip: Read the “v2 vs v3” diffs for Aurora MySQL; there are subtle behavior changes. (AWS Documentation)
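To fill the matrix, pull the exact source version and the installed-extension list straight from the source database rather than from memory. A rough sketch; the instance identifier, host, and credentials are placeholders:

# Record the exact source engine and version
aws rds describe-db-instances \
  --db-instance-identifier prod-rds-mysql \
  --query 'DBInstances[0].[Engine,EngineVersion]' \
  --output text

# PostgreSQL: list extensions the source actually has installed,
# then check each one against the target Aurora PostgreSQL version
psql -h prod-rds-pg.example.internal -U admin -d app \
  -c "SELECT extname, extversion FROM pg_extension ORDER BY extname;"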

3) Limits & quotas (don’t wing it)

  • Cluster storage: Aurora MySQL & Postgres now support up to 256 TiB per cluster (pay for what you use). (Amazon Web Services, Inc.)
  • Per-table size: Aurora MySQL max table size 64 TiB (practically limited by engine/DDL). Track large tables explicitly. (AWS Documentation)
  • RDS baseline: most RDS engines max 64 TiB storage per instance—useful for sizing deltas and growth runway math. (AWS Documentation)
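A quick sizing snapshot of the source makes the runway math concrete. A minimal sketch, assuming the AWS CLI and a hypothetical instance identifier:

# Instance class, allocated storage (GiB), and storage-autoscaling ceiling of the source
aws rds describe-db-instances \
  --db-instance-identifier prod-rds-mysql \
  --query 'DBInstances[0].[DBInstanceClass,AllocatedStorage,MaxAllocatedStorage]' \
  --output text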

4) Build the green (target) environment properly

  • Create writer and reader endpoints; add custom endpoints to route hot read pools by workload. Remember: custom endpoints aren’t saved in snapshots; recreate after restores. (AWS Documentation)
  • Match parameter groups, option groups, subnets, security groups, and maintenance windows.
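Because custom endpoints are not preserved in snapshots, script their creation so they can be recreated on demand. A minimal sketch with illustrative cluster and instance names:

# READER custom endpoint that pins the reporting pool to two specific replicas
aws rds create-db-cluster-endpoint \
  --db-cluster-identifier prod-aurora \
  --db-cluster-endpoint-identifier reporting-pool \
  --endpoint-type READER \
  --static-members prod-aurora-reader-2 prod-aurora-reader-3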

5) Data movement options (pick one and practice)

  • RDS → Aurora Read Replica (for MySQL): attach an Aurora read replica cluster to the RDS instance, then promote it once lag ≈ 0; near-zero downtime if the app is replica-aware (see the CLI sketch after this list). (AWS Documentation)
  • RDS Blue/Green Deployments: create mirrored green env; switch over in minutes. Understand that “switch over” ≠ “promote a read replica.” Follow the doc workflow to avoid invalid states. (AWS Documentation)
  • AWS DMS: full load + CDC into Aurora (engine-agnostic; good for heterogeneous features or heavy transforms). (AWS Documentation)
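For the replica-and-promote path on MySQL, the flow is roughly: create an Aurora cluster that replicates from the RDS source, add an instance, wait for lag to reach ~0, then promote. The sketch below uses placeholder identifiers and ARNs; confirm the exact workflow against the current AWS docs before relying on it:

# 1) Aurora cluster replicating from the RDS MySQL source
aws rds create-db-cluster \
  --db-cluster-identifier prod-aurora \
  --engine aurora-mysql \
  --replication-source-identifier arn:aws:rds:us-east-1:123456789012:db:prod-rds-mysql

# 2) Add a writer-capable instance to the cluster
aws rds create-db-instance \
  --db-instance-identifier prod-aurora-1 \
  --db-cluster-identifier prod-aurora \
  --engine aurora-mysql \
  --db-instance-class db.r6g.xlarge

# 3) Once replication lag is ~0 and writes are drained, detach and promote
aws rds promote-read-replica-db-cluster --db-cluster-identifier prod-aurora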

6) Endpoint switching & DNS hygiene (cutover)

  • Use CNAMEs (e.g., db.prod.myco → Aurora writer endpoint).
  • Lower TTL (e.g., set to 30–60s at least 24h before).
  • Drain writes: set app to read-only for final sync if you’re not using blue/green or replication-based approaches.
  • Rollback: keep old writer on standby with replication paused; cut back via DNS if needed.
  • If you opt for Blue/Green, do the official switch over step—do not manually “Promote” the green instance or you’ll break the deployment state. (AWS Documentation)
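If you took the Blue/Green route, the cutover is a single managed action rather than a manual promote. A sketch, assuming a hypothetical deployment identifier:

# Official Blue/Green switchover; aborts if it cannot complete within the timeout
aws rds switchover-blue-green-deployment \
  --blue-green-deployment-identifier bgd-examp1e123456 \
  --switchover-timeout 300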

7) Load testing (prove it before you flip)

Your pre-cutover test must answer one question: can Aurora hold your P95 latency target at peak load and sustain steady-state throughput with headroom?

  • Workload tools:
    • MySQL → sysbench oltp_read_write with realistic table cardinalities and secondary indexes.
    • PostgreSQL → pgbench custom scripts mirroring your transaction mix.
  • Scale reads: point read pools to the reader or custom endpoints to validate connection poolers. (AWS Documentation)
  • Observe: CloudWatch metrics (CommitLatency, Deadlocks, AuroraReplicaLag, DatabaseConnections, AuroraVolumeBytesLeftTotal). That last one exposes remaining storage—watch it during the rehearsal. (AWS Documentation)
  • Concurrency: validate pool sizes and max connections (Aurora may allow different ceilings than RDS instance families).
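For PostgreSQL targets, a pgbench run with a custom script keeps the transaction mix honest. A minimal sketch; the table, host, and concurrency numbers are placeholders for your own mix:

# Transaction script that loosely mirrors an order-lookup/update mix
cat > mixed_txn.sql <<'SQL'
\set id random(1, 1000000)
BEGIN;
SELECT * FROM orders WHERE id = :id;
UPDATE orders SET updated_at = now() WHERE id = :id;
COMMIT;
SQL

# 128 clients, 16 worker threads, 10 minutes, progress report every 30s
pgbench -h db-green.myco -U bench -f mixed_txn.sql \
  -c 128 -j 16 -T 600 -P 30 app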

8) Performance hardening (pre- and post-cutover)

  • MySQL: confirm redo/flush settings, innodb_buffer_pool_size (if not Serverless), and long-query killer in app tier.
  • PostgreSQL: cost settings, work_mem, autovacuum aggressiveness on hot partitions; verify extension versions.
  • SQL fixes: resolve full table scans with missing compound indexes; test plan stability under Aurora’s optimizer variants.
  • Maintenance: pick a patch window out of your peak; Aurora failovers are faster than RDS but still visible in tail latency—test it.
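Failover behavior is worth rehearsing explicitly rather than discovering it live. A sketch that forces a failover on a non-production clone (identifiers are placeholders) while the load test is running, so you can see the blip in P99 latency:

# Trigger a failover to a specific reader and watch tail latency during the switch
aws rds failover-db-cluster \
  --db-cluster-identifier staging-aurora \
  --target-db-instance-identifier staging-aurora-reader-1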

Practical mini-runbook (copy/paste friendly)

Quick storage sanity

# RDS (source) storage ceiling reminder (most engines):
# 64 TiB per instance (use to sanity-check explosive growth)
# Ref: AWS docs

(AWS Documentation)

-- Big tables inventory (MySQL); excludes system schemas
SELECT table_schema, table_name,
       ROUND((data_length + index_length) / POWER(1024, 3), 2) AS size_gb
FROM information_schema.tables
WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
ORDER BY size_gb DESC
LIMIT 20;
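A PostgreSQL flavor of the same inventory, run through psql; host, user, and database names are placeholders:

# Big tables inventory (PostgreSQL)
psql -h prod-rds-pg.example.internal -U admin -d app -c "
  SELECT schemaname, relname,
         pg_size_pretty(pg_total_relation_size(relid)) AS total_size
  FROM pg_catalog.pg_statio_user_tables
  ORDER BY pg_total_relation_size(relid) DESC
  LIMIT 20;"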

Blue/Green or replica approach rehearsal

# If using DNS cutover, prep a friendly CNAME and shrink TTL
# (do this at least a day before)
# Route 53 -> Hosted zone -> A/AAAA or CNAME: db.prod.myco
# Set TTL = 60
# App connection string should point to your friendly CNAME, not raw endpoint
DB_HOST=db.prod.myco
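A sketch of the TTL shrink and verification via the AWS CLI; the hosted zone ID and record value are placeholders, and at cutover you re-run the same UPSERT with the Aurora writer endpoint as the value:

cat > lower-ttl.json <<'JSON'
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "db.prod.myco",
      "Type": "CNAME",
      "TTL": 60,
      "ResourceRecords": [{ "Value": "prod-rds-mysql.abc123.us-east-1.rds.amazonaws.com" }]
    }
  }]
}
JSON

aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789EXAMPLE \
  --change-batch file://lower-ttl.json

# Confirm what clients resolve right now
dig +short db.prod.myco CNAME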

Reader routing proof

# Aurora exposes a reader endpoint for read-only traffic
# Validate the pooler routes to it and your reports/BI keep working.
# Ref: Aurora endpoints doc

(AWS Documentation)
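A quick proof that traffic through the reader endpoint really lands on a read-only node; endpoint names below are placeholders:

# Aurora MySQL: readers report innodb_read_only = 1
mysql -h prod-aurora.cluster-ro-abc123.us-east-1.rds.amazonaws.com -u app -p \
  -e "SELECT @@innodb_read_only, @@aurora_server_id;"

# Aurora PostgreSQL: readers report pg_is_in_recovery() = true
psql -h prod-aurora.cluster-ro-abc123.us-east-1.rds.amazonaws.com -U app -d app \
  -c "SELECT pg_is_in_recovery();"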

Sysbench smoke (MySQL example)

sysbench oltp_read_write --mysql-host=db-green.myco \
  --mysql-user=bench --mysql-password=*** \
  --tables=24 --table-size=2000000 --threads=128 --time=600 prepare

sysbench oltp_read_write --mysql-host=db-green.myco \
  --mysql-user=bench --mysql-password=*** \
  --tables=24 --table-size=2000000 --threads=128 --time=600 run

Post-cutover validation

  • Error budget unchanged, P95/99 latencies within SLO.
  • Replica lag ≈ 0, readers healthy, connections balanced.
  • No rapid storage consumption; AuroraVolumeBytesLeftTotal stable. (AWS Documentation)
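A spot check of the remaining-storage metric over the first hours after cutover; the cluster identifier is a placeholder and the date arithmetic assumes GNU date:

aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name AuroraVolumeBytesLeftTotal \
  --dimensions Name=DBClusterIdentifier,Value=prod-aurora \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 300 \
  --statistics Minimum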

Best practices & common pitfalls

Best practices

  • Rehearse the switch at least twice with production-sized data.
  • Abstract the endpoint behind your own DNS. You’ll thank yourself on the next move.
  • Right-size readers and use custom endpoints to isolate heavy analytical/reporting workloads from web/API reads. (AWS Documentation)
  • Use Blue/Green when supported to minimize downtime and risk. (AWS Documentation)
  • Track version-specific behavior (e.g., Aurora MySQL v3 nuances vs MySQL 8.0 community). (AWS Documentation)

Pitfalls

  • Assuming 128 TiB is still the max. As of mid-2025, Aurora MySQL & Postgres support 256 TiB—update your capacity models. (Amazon Web Services, Inc.)
  • Forgetting custom endpoints after restores (snapshots don’t keep them). Recreate them. (AWS Documentation)
  • Manual “Promote” during Blue/Green—that breaks the deployment state. Use the official switch action. (AWS Documentation)
  • Copying RDS params verbatim—Aurora has different defaults and behaviors; test and tune.

Conclusion & takeaways

Aurora can cut failover times, scale reads cheaply, and extend your storage runway to 256 TiB per cluster—but only if you migrate with discipline. Lock down version parity, validate limits, build proper endpoints, load test with production-like data, and execute a scripted cutover. Do that, and your migration becomes a controlled routine—not a midnight firefight.

TL;DR

  • Pick the right Aurora edition and confirm version/feature parity.
  • Validate limits (cluster storage up to 256 TiB; table size nuances). (Amazon Web Services, Inc.)
  • Build green env with writer/reader/custom endpoints; rehearse load. (AWS Documentation)
  • Prefer Blue/Green or replica-based switchovers; perform DNS cutover cleanly. (AWS Documentation)

Internal link ideas (for your blog)

  • “Aurora Serverless v2 vs Provisioned: Cost and Latency Trade-offs”
  • “Designing Read Pools with Custom Endpoints and Connection Poolers”
  • “MySQL 8.0 Upgrade Guide for Teams still on 5.7”
  • “PostgreSQL Extensions Support in Aurora vs RDS: What to Watch”

Image prompt

“A clean, modern diagram of an RDS-to-Aurora migration: left side shows an RDS instance, right side an Aurora cluster with writer/reader/custom endpoints; arrows illustrate Blue/Green switchover and DNS CNAME cutover; include storage bars annotated ‘64 TiB’ (RDS) and ‘256 TiB’ (Aurora). Minimalistic, high contrast, isometric 3D style.”

Tags

#Aurora #AmazonRDS #Migration #MySQL #PostgreSQL #DatabaseScaling #BlueGreen #HighAvailability #DevOps #DataEngineering

