Cloud Migration Guide: Step by Step, Without the Drama
A practical cloud migration guide for AWS & GCP: cloud migration strategy (the 6 R's), a step-by-step cloud migration checklist, how to avoid downtime, security & cost.
Cloud migration isn’t “moving servers” — it’s a chance to build a stable, secure, cost-efficient foundation. But without proper planning, a cloud migration ends in a bloated bill, security surprises, and downtime that costs you money and trust. This guide is a complete cloud migration strategy — from the initial assessment to post-cutover optimization — so you migrate to the cloud once, properly, instead of spending a year fixing it.
This isn’t a theoretical article. Everything here is what I do in production for clients: engineering teams that need an AWS migration or a GCP migration without stopping the business, and startups that grew too fast and need infrastructure that holds up. We’ll start with the most important question — why migrate at all — and drill down to the technical details of DNS cutover, data migration, and rollback.
Why move to the cloud (and when not to)
Before talking about a cloud migration strategy, you have to decide whether to migrate at all. A cloud migration is an expensive project in effort and time, and not every workload benefits from it.
Good reasons to migrate to the cloud:
- Elasticity — scale resources up and down based on actual demand, instead of buying hardware for peak load and paying for it all year.
- Reliability — multiple availability zones, managed backups, and automatic failover that are hard and expensive to build on-prem.
- Development speed — managed services (databases, queues, object storage) that let your team focus on the product instead of maintaining infrastructure.
- Pay-per-use — convert capital expenditure (CapEx) into operating expense (OpEx), with no large upfront investment.
- Global reach — deploy a service close to users across continents in minutes.
Bad reasons to migrate to the cloud:
- “Because everyone is” or chasing buzz. Fashion is not a strategy.
- Expecting automatic savings. An unmanaged cloud is usually more expensive than on-prem, not cheaper.
- Running away from an architecture problem. If the app is broken, it’ll be broken in the cloud too — just with a bigger bill.
If your current system is stable, cheap, and meets requirements — don’t migrate for its own sake. It’s worth migrating to the cloud when you need real scale, higher reliability, development speed, or when on-prem (hardware refresh, data center management, lack of staff) has become a genuine burden.
Pre-migration assessment
The easiest step to skip, and the most expensive one to skip. A good pre-migration assessment saves months of rework. Before you write a single line of code or stand up a cloud account, you need to understand what you have and what you’re moving.
What to map in the assessment phase:
- Inventory — every server, service, database, job, and cron. You’ll be surprised how many things nobody remembers are still running.
- Dependencies — who talks to whom. A service that looks standalone is usually tied to three others. In my experience, a wrong dependency map is the number-one cause of downtime in a migration.
- Current costs — what you pay today (hardware, licensing, power, staff). Without a baseline, you can’t know whether the cloud migration paid off.
- Compliance requirements — GDPR, HIPAA, SOC 2, data residency. This affects your region choice and your architecture.
- Performance requirements — latency, throughput, RTO/RPO (how much downtime and data loss you can tolerate).
- State of the code — is the app stateless? Does it depend on local file paths, a fixed IP, or assumptions about the hardware?
The output of this phase is a decision matrix: for each workload, which migration strategy (which R) fits, what its dependencies are, and how complex it is. This is the foundation of any cloud migration strategy.
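A dependency map only helps if you check it against what you plan to move. The sketch below is one minimal way to do that, assuming the map is a plain dictionary; the service names and the `left_behind` helper are hypothetical, invented for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical dependency map: service -> services it calls.
DEPENDENCIES: Dict[str, Set[str]] = {
    "web": {"api"},
    "api": {"orders-db", "billing"},
    "billing": {"ldap"},
    "orders-db": set(),
    "ldap": set(),
}


def left_behind(deps: Dict[str, Set[str]], migrating: Set[str]) -> List[Tuple[str, str]]:
    """Return (service, dependency) pairs where the service moves but its dependency stays."""
    return sorted(
        (svc, dep)
        for svc in migrating
        for dep in deps.get(svc, set())
        if dep not in migrating
    )


print(left_behind(DEPENDENCIES, {"web", "api", "orders-db"}))
# → [('api', 'billing')]
```

Every pair in the output is a call that will cross the on-prem/cloud boundary after the move, so each one needs either a network path and latency budget, or a change to the migration scope.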
The 6 migration strategies (the 6 R’s)
For each workload, pick a strategy. There’s no single right one — most migrations are a mix. The table below summarizes the 6 R’s, when to use each, and the cost-versus-value tradeoff:
| Strategy | What you do | When it fits | Effort | Cloud value |
|---|---|---|---|---|
| Rehost (“lift and shift”) | Move as-is, no code changes | Fast migration, tight deadline, data center exit | Low | Low |
| Replatform | Small changes (managed DB, managed load balancer) | Want a quick win without a rewrite | Medium | Medium |
| Refactor / Re-architect | Rewrite cloud-native (containers, serverless) | Workload critical to growth, needs real scale | High | High |
| Repurchase | Switch to SaaS instead of maintaining | Generic software (CRM, email, BI) | Low | Varies |
| Retire | Turn off what you don’t need | There’s always something — dead services nobody monitors | Low | Immediate savings |
| Retain | Leave on-prem for now | Legacy not ready, compliance constraint | None | None |
Rehost (“lift and shift”) is the popular starting point: fast, low-risk, and lets you exit a data center quickly. The downside — you pay for the cloud but don’t leverage it (still VMs, still maintenance). It’s a legitimate first step, as long as there’s a plan to keep going.
Replatform is the sweet spot of many migrations: for example, replacing a self-managed database with a managed RDS/Cloud SQL, or moving a load balancer to a managed service — without rewriting the application. A meaningful win for reasonable effort.
Refactor delivers the maximum value (auto-scaling, lower cost, higher uptime) but requires the biggest investment. Reserve it for workloads that are critical to growth, not for everything.
The practical approach: fast rehost for simple workloads, replatform for what wins quickly, refactor for what’s critical to growth, retire everything that’s dead, and retain the legacy that isn’t ready.
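That rule of thumb can be written down as a first-pass classifier for the R matrix. This is a rough sketch, not a definitive method: the `Workload` fields and the order of the checks are assumptions made for illustration, and a real assessment would weigh effort and risk as well.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    in_use: bool             # False for dead services nobody monitors
    cloud_ready: bool        # False for legacy or compliance-blocked workloads
    saas_available: bool     # generic software such as CRM, email, BI
    growth_critical: bool    # needs real scale
    quick_managed_win: bool  # e.g. a self-managed DB that can move to a managed one


def choose_strategy(w: Workload) -> str:
    """Suggest one of the 6 R's; the check order is an assumption, not a rule."""
    if not w.in_use:
        return "retire"
    if not w.cloud_ready:
        return "retain"
    if w.saas_available:
        return "repurchase"
    if w.growth_critical:
        return "refactor"
    if w.quick_managed_win:
        return "replatform"
    return "rehost"


checkout = Workload("checkout", True, True, False, True, True)
print(choose_strategy(checkout))  # → refactor
```

Treat the result as a starting suggestion to review per workload, not as the decision itself.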
Cloud migration checklist: step by step
This is the heart of the guide — an ordered cloud migration checklist, in the right sequence. Skipping an early step (especially the landing zone) comes back to bite you later.
Step 1 — Map & assess
See the pre-migration assessment section above. Output: inventory, dependency map, cost baseline, and an R-per-workload matrix. Don’t start without it.
Step 2 — Landing zone (the cloud foundation)
This is the step that separates a professional migration from “a mess in the cloud.” Before you move even a single workload, build a secure cloud foundation:
- Account structure — AWS Organizations / GCP folders & projects. Separate prod, staging, and dev. Environment separation is your first line of defense.
- Networking — VPC, subnets, security groups / firewall rules, private vs public. Plan your CIDR ranges up front — they’re hard to change later.
- IAM — least-privilege roles, federation with your existing IdP, no root users for day-to-day work.
- Secrets management — Secrets Manager / Secret Manager. Zero passwords in code or env vars.
- Logging & observability — CloudTrail / Cloud Audit Logs, metrics, alerts — from day one, not after an incident.
- Infrastructure-as-code (IaC) — all of this in Terraform. The landing zone must be reproducible, not ClickOps.
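Because CIDR ranges are hard to change later, it helps to sanity-check the address plan before any network exists. The sketch below uses Python’s standard `ipaddress` module; the ranges, the environment names, and the helper functions are illustrative assumptions, not a recommended layout.

```python
import ipaddress
from itertools import combinations, islice

# Hypothetical address plan: one /16 per environment.
ENV_CIDRS = {
    "prod": ipaddress.ip_network("10.0.0.0/16"),
    "staging": ipaddress.ip_network("10.1.0.0/16"),
    "dev": ipaddress.ip_network("10.2.0.0/16"),
}


def overlapping(cidrs):
    """Return sorted pairs of environment names whose ranges overlap."""
    return sorted(
        (a, b)
        for (a, net_a), (b, net_b) in combinations(sorted(cidrs.items()), 2)
        if net_a.overlaps(net_b)
    )


def carve_subnets(vpc, new_prefix, count):
    """Return the first `count` subnets of size /new_prefix inside the VPC range."""
    subnets = list(islice(vpc.subnets(new_prefix=new_prefix), count))
    if len(subnets) < count:
        raise ValueError("not enough address space for the requested subnets")
    return subnets


print(overlapping(ENV_CIDRS))  # → []
print([str(s) for s in carve_subnets(ENV_CIDRS["prod"], 20, 3)])
# → ['10.0.0.0/20', '10.0.16.0/20', '10.0.32.0/20']
```

The same overlap check is worth running against on-prem ranges too, since overlapping addresses get in the way of VPN or peering links during the migration.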
Step 3 — Per-workload strategy
Apply the R matrix: for each component, the chosen strategy. Document decisions and dependencies. This is where you decide order: start with simple, standalone (low-risk) workloads to build confidence and experience, and leave the critical, complex ones for last.
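One simple way to turn "simple and standalone first, critical and complex last" into an ordered list is to sort on criticality and dependency count. The `Candidate` fields and the 1 to 3 criticality scale below are hypothetical, chosen only to illustrate the idea.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    dependency_count: int  # taken from the dependency map
    criticality: int       # 1 = low, 3 = business-critical (hypothetical scale)


def migration_order(candidates: List[Candidate]) -> List[str]:
    """Least critical and least connected first; the name breaks ties for stable output."""
    ranked = sorted(candidates, key=lambda c: (c.criticality, c.dependency_count, c.name))
    return [c.name for c in ranked]


candidates = [
    Candidate("checkout", 4, 3),
    Candidate("internal-wiki", 0, 1),
    Candidate("reporting", 2, 2),
    Candidate("cron-cleanup", 0, 1),
]
print(migration_order(candidates))
# → ['cron-cleanup', 'internal-wiki', 'reporting', 'checkout']
```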
Step 4 — Data migration
The most sensitive part of any cloud migration. Data is the asset you can’t “rebuild.” Your options:
- Backup & restore — simple, but requires a downtime window. Fine for small DBs.
- Live replication — stand up a replica in the cloud, sync it, and cut over in a short window. Minimal downtime.
- Managed migration services — AWS DMS / GCP Database Migration Service to move databases with minimal downtime.
- Physical transfer — for huge volumes (tens of terabytes and up), disk-shipping services (AWS Snowball) can be faster than the network. At a sustained 1 Gbps, 100 TB takes more than nine days to transfer.
In every case: a rollback plan before you start, and data integrity validation (checksums, row counts) afterward.
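The row-count and checksum validation can be sketched as follows. This is an illustration under stated assumptions: in-memory SQLite databases stand in for the real source and target, and the `orders` table and helper names are hypothetical. Across two different database engines, you would also need to normalize types (decimals, timestamps) before hashing.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) over all rows, ordered by key."""
    # Table and column names are interpolated, so they must come from trusted config.
    cursor = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key_column}"')
    digest = hashlib.sha256()
    count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def tables_match(source, target, table, key_column):
    """True when both sides have the same row count and the same content hash."""
    return table_fingerprint(source, table, key_column) == table_fingerprint(target, table, key_column)


def _demo_db(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


source = _demo_db([(1, 9.5), (2, 20.0)])
good_target = _demo_db([(2, 20.0), (1, 9.5)])  # same rows, different insert order
bad_target = _demo_db([(1, 9.5), (2, 21.0)])   # one value drifted

print(tables_match(source, good_target, "orders", "id"))  # → True
print(tables_match(source, bad_target, "orders", "id"))   # → False
```

For large tables, hashing in key ranges rather than over the whole table makes it possible to find where a mismatch is, not just that one exists.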
Step 5 — Testing
Before cutting traffic over, validate in the cloud environment:
- Performance — load testing against the target, compared to baseline.
- Security — scan permissions, security groups, unintended public exposure.
- Disaster recovery — confirm backup and restore actually work, not just in theory.
- Functionality — smoke tests and E2E on the critical flows.
Step 6 — Gradual cutover
Shift traffic incrementally, not all at once. Methods: canary (a small percentage of users), blue-green (two environments, you swap), or staged DNS cutover. See the next section on how to avoid downtime.
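For the canary option, one common approach is to bucket users deterministically, so the same user always lands on the same side and raising the percentage only adds users. The sketch below assumes routing by a user ID at the application or proxy layer; the `in_canary` function is hypothetical and not tied to any specific load balancer.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place a user in the canary group for a given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    return bucket < percent


users = [f"user-{i}" for i in range(10_000)]
share = sum(in_canary(u, 5) for u in users) / len(users)
print(f"{share:.1%} of users routed to the new environment")
```

Because the bucket depends only on the user ID, a user who was in the canary at 5% is still in it at 25%, which keeps sessions from flapping between environments as you ramp up.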
Step 7 — Optimize
The migration doesn’t end at cutover. Once the system is in the cloud:
- FinOps — right-sizing, reserved/committed use, deleting orphaned resources. This is where the real money is saved.
- Reliability (SRE) — SLOs, alerting, auto-scaling, resilience.
- Automation — CI/CD and DevOps practices that turn a deploy into a non-event.
How to avoid downtime (downtime & rollback)
This is the biggest fear of anyone considering a cloud migration: “what if it falls over mid-migration?” The answer isn’t “hope for the best” — it’s an architecture that allows gradual cutover and instant rollback.
The principles:
- Sync data ahead of time — before cutover, the data in the cloud is already up to date (live replication). The cutover itself is a brief moment, not a marathon.
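- Staged DNS cutover — lower the TTL on your DNS records (e.g. to 60 seconds) a few days before the cutover, so that caches holding the old, longer TTL have expired and the change propagates fast. Then use weighted DNS records (for example, Route 53 weighted routing) to point a small percentage of traffic at the target, verify, and only then scale up.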
- Blue-green / canary — two live environments in parallel. If something breaks, point DNS / the load balancer back to the old environment within seconds.
- A written rollback plan — not “we’ll see in real time.” A document with a clear trigger (“if error rate > X%”), exact steps, and an owner for the decision.
- Infrastructure-as-code — Terraform lets you reproduce each step exactly. If you need to roll back, you return to a known-good configuration instead of guessing.
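A written trigger such as "if error rate > X%" is easier to act on when it is encoded rather than debated mid-incident. The sketch below is one minimal way to express it; the field names, the 2% threshold, and the minimum request count are assumptions for illustration, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackTrigger:
    max_error_rate: float  # the "X%" from the written plan, as a fraction (0.02 = 2%)
    min_requests: int      # do not decide on a handful of requests
    owner: str             # who makes the final call


def should_roll_back(trigger: RollbackTrigger, errors: int, requests: int) -> bool:
    """True when enough traffic has been seen and the error rate exceeds the threshold."""
    if requests < trigger.min_requests:
        return False
    return errors / requests > trigger.max_error_rate


trigger = RollbackTrigger(max_error_rate=0.02, min_requests=500, owner="on-call lead")
print(should_roll_back(trigger, errors=30, requests=1000))  # → True
print(should_roll_back(trigger, errors=5, requests=1000))   # → False
print(should_roll_back(trigger, errors=10, requests=100))   # → False
```

The function only answers whether the trigger has fired; the decision to roll back still belongs to the named owner.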
Common mistake: cutting DNS over without lowering the TTL in advance. With a 24-hour TTL, some users can keep hitting the old server for up to a full day after cutover, and a rollback takes just as long to reach them.
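The TTL timing is simple arithmetic, sketched below. The function names and the dates are hypothetical, and the bounds only hold for resolvers that respect TTLs, which not every client does.

```python
from datetime import datetime, timedelta


def earliest_safe_cutover(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Resolvers may hold the old record, with its old TTL, until this time."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


def worst_case_stale(ttl_seconds: int) -> timedelta:
    """Upper bound on how long a TTL-respecting resolver keeps serving the old answer."""
    return timedelta(seconds=ttl_seconds)


lowered = datetime(2024, 1, 1, 9, 0)
print(earliest_safe_cutover(lowered, old_ttl_seconds=86_400))  # → 2024-01-02 09:00:00
print(worst_case_stale(60))                                    # → 0:01:00
print(worst_case_stale(86_400))                                # → 1 day, 0:00:00
```

Lowering the TTL a few days ahead leaves a comfortable margin over the one-day minimum that a 24-hour TTL implies.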
Security and cost — from day one
The two areas easiest to defer to “later” — and the most expensive to defer.
Security
- Least-privilege IAM — every identity gets exactly the permissions it needs, no more. No *:*.
- Closed networks — tight security groups / firewall rules, zero unnecessary public exposure. Databases never live in a public subnet.
- Secrets in the right place — Secrets Manager / Secret Manager, not env vars and not code.
- Zero-trust — authenticate at every layer, encrypt in transit and at rest, full audit. The default is “deny,” not “allow.”
- Security-as-code — policy, IAM, and firewall in Terraform, reviewed in a PR like any other code.
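As a small example of reviewing IAM as code, the sketch below flags Allow statements in an AWS-style JSON policy document that use a bare wildcard. The policy and the bucket name are made up for illustration, and the check is deliberately narrow: it does not catch service-wide wildcards such as `s3:*`, which a real review should also look at.

```python
def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def wildcard_statements(policy: dict) -> list:
    """Return indexes of Allow statements that use "*" for Action or Resource."""
    flagged = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        if "*" in actions or "*" in resources:
            flagged.append(index)
    return flagged


# Hypothetical policy: the first statement is scoped, the second is wide open.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}
print(wildcard_statements(policy))  # → [1]
```

A check like this fits naturally into the PR review step, failing the build when a flagged statement appears.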
Cost
- Tagging — every resource tagged (owner, env, project). Without tags there’s no FinOps — you can’t tell who’s wasting.
- Budgets and alerts — budget alerts before the bill explodes, not after.
- Right-size from the start — don’t stand up an xlarge “just to be safe.” Start small, measure, scale up as needed.
- Automatic shutdown — dev/staging environments shut down at night and on weekends. Why pay for a server nobody uses at 3 a.m.?
An unmanaged cloud = a ballooning bill. See our guide to cutting your cloud bill with FinOps.
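To put a number on automatic shutdown, here is a rough calculation. The 12-hours-a-day, 5-days-a-week schedule is an assumption for illustration, and the result applies only to resources billed by running time; storage and similar charges continue while an environment is off.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_savings_fraction(hours_on_per_day: float, days_on: int = 5) -> float:
    """Fraction of an always-on compute bill avoided by running only on a schedule."""
    if not 0 <= hours_on_per_day <= 24 or not 0 <= days_on <= 7:
        raise ValueError("schedule out of range")
    hours_on = hours_on_per_day * days_on
    return 1 - hours_on / HOURS_PER_WEEK


# A dev environment that runs 12 hours a day, Monday to Friday.
print(round(weekly_savings_fraction(12) * 100, 1))  # → 64.3
```

In other words, a dev environment on that schedule runs 60 of 168 hours, so roughly two thirds of its always-on compute cost goes away.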
AWS or GCP? (AWS migration vs GCP migration)
The question that always comes up. There’s no single answer — it depends on your needs, your team, and your existing system. Both are excellent; the choice should be deliberate, not fashionable.
Considerations for an AWS migration:
- The broadest range of services on the market and the highest maturity.
- A huge ecosystem — easy to find staff, tools, and documentation.
- Strong presence in large enterprises and compliance-heavy environments.
Considerations for a GCP migration:
- A strong developer experience, especially around containers (GKE) and data/ML (BigQuery, Vertex AI).
- Networking and egress pricing that’s often competitive.
- Some of the most advanced data analytics tooling on the market.
Practical recommendation: if you already have an investment (knowledge, tooling, contracts) in one of them — it’s usually right to stay. If you’re starting from scratch, choose based on your primary workload: heavy data/ML leans toward GCP, broad range and enterprise maturity lean toward AWS. In our cloud consulting we help you choose, or support both — plus hybrid and multi-cloud architectures when it’s genuinely warranted.
Common cloud migration pitfalls
The mistakes I see again and again in teams that did the migration on their own:
- Skipping the landing zone — they start moving workloads before there’s a secure foundation. Then they try to bolt security onto an existing mess. Painful.
- Lift and shift with no follow-through — they move everything as-is, pay cloud prices, and stay stuck there forever without enjoying a single advantage.
- Ignoring cost until the first bill — no tagging, no budgets, then a surprise of thousands of dollars at month’s end.
- An incomplete dependency map — they move a service without knowing it depends on something still on-prem. Result: downtime.
- No rollback plan — they rely on “it’ll work,” and then when it doesn’t, there’s no orderly way back.
- High DNS TTL during cutover — users stuck on the old server for hours after you’ve cut over.
- Late right-sizing — they stand everything up large “to be safe” and pay 3x what they need.
- Security as an afterthought — IAM too open, security groups too broad, secrets in code. An open door for an attacker.
What all these mistakes have in common: they’re all avoidable with planning. That’s exactly what the assessment phase is for.
Frequently asked questions
How long does a cloud migration take? It depends on scope and strategy. A rehost of a single workload can take days; a full migration of an organization with dozens of services, data migration, and refactoring can take months. A good pre-migration assessment gives you a realistic timeline instead of a guess.
What is lift and shift, and when should I use it? Lift and shift (rehost) is moving a workload to the cloud as-is, with no code changes. Use it when you have a tight deadline, an urgent data center exit, or as a fast first step before optimization. The downside: you don’t leverage the cloud’s advantages. Don’t stay there forever.
How do I avoid downtime during the migration? Sync data ahead of time (live replication), lower the DNS TTL before cutover, cut over gradually (canary / blue-green), and keep a written rollback plan. With the right architecture, downtime can be kept to seconds, and sometimes avoided entirely.
Is an AWS migration or a GCP migration better? Both are excellent. If you already have an investment in one — stay. If you’re starting from scratch — choose based on your primary workload (data/ML leans toward GCP, range and enterprise maturity lean toward AWS). The choice should be an engineering decision, not a fashion statement.
How much will it cost, and how much can I save? A cloud migration doesn’t automatically cut costs — an unmanaged cloud is usually more expensive. The real savings come from FinOps after the migration: right-sizing, committed use, shutting down idle environments, and deleting orphaned resources. With proper management, saving tens of percent off the bill is realistic.
Do I have to rewrite everything (refactor)? Absolutely not. Most migrations are a mix: refactor only the workloads critical to growth, replatform what wins quickly without a rewrite, and rehost the rest. Rewriting everything is a waste of time and money.
How to start
A successful cloud migration starts with an assessment, not a panic. A good cloud migration strategy is the difference between a stable, cost-efficient foundation and another year of fixes and bloated bills.
Want to migrate to the cloud the right way — an AWS migration, a GCP migration, or a hybrid architecture — or rein in a cloud that’s already out of control? Talk to us for a free intro call. We’ll walk through your system together, build an R-per-workload matrix, and sketch a migration plan that fits the business. You can also read about our full cloud services and DevOps for startups.