Cloud Migration Guide: Step by Step, Without the Drama

Cloud migration isn’t “moving servers” — it’s a chance to build a stable, secure, cost-efficient foundation. But without proper planning, a cloud migration ends in a bloated bill, security surprises, and downtime that costs you money and trust. This guide is a complete cloud migration strategy — from the initial assessment to post-cutover optimization — so you migrate to the cloud once, properly, instead of spending a year fixing it.

This isn’t a theoretical article. Everything here is what I do in production for clients: engineering teams that need an AWS migration or a GCP migration without stopping the business, and startups that grew too fast and need infrastructure that holds up. We’ll start with the most important question — why migrate at all — and drill down to the technical details of DNS cutover, data migration, and rollback.

Why move to the cloud (and when not to)

Before talking about a cloud migration strategy, you have to decide whether to migrate at all. A cloud migration is an expensive project in effort and time, and not every workload benefits from it.

Good reasons to migrate to the cloud:

Elasticity — scale resources up and down based on actual demand, instead of buying hardware for peak load and paying for it all year.
Reliability — multiple availability zones, managed backups, and automatic failover that are hard and expensive to build on-prem.
Development speed — managed services (databases, queues, object storage) that let your team focus on the product instead of maintaining infrastructure.
Pay-per-use — convert capital expenditure (CapEx) into operating expense (OpEx), with no large upfront investment.
Global reach — deploy a service close to users across continents in minutes.

Bad reasons to migrate to the cloud:

“Because everyone is” or chasing buzz. Fashion is not a strategy.
Expecting automatic savings. An unmanaged cloud is usually more expensive than on-prem, not cheaper.
Running away from an architecture problem. If the app is broken, it’ll be broken in the cloud too — just with a bigger bill.

If your current system is stable, cheap, and meets requirements — don’t migrate for its own sake. It’s worth migrating to the cloud when you need real scale, higher reliability, development speed, or when on-prem (hardware refresh, data center management, lack of staff) has become a genuine burden.

Pre-migration assessment

The easiest step to skip, and the most expensive one to skip. A good pre-migration assessment saves months of rework. Before you write a single line of code or stand up a cloud account, you need to understand what you have and what you’re moving.

What to map in the assessment phase:

Inventory — every server, service, database, job, and cron. You’ll be surprised how many things nobody remembers are still running.
Dependencies — who talks to whom. A service that looks standalone is usually tied to three others. A wrong dependency map is the number-one cause of downtime in a migration.
Current costs — what you pay today (hardware, licensing, power, staff). Without a baseline, you can’t know whether the cloud migration paid off.
Compliance requirements — GDPR, HIPAA, SOC 2, data residency. This affects your region choice and your architecture.
Performance requirements — latency, throughput, RTO/RPO (how much downtime and data loss you can tolerate).
State of the code — is the app stateless? Does it depend on local file paths, a fixed IP, or assumptions about the hardware?

The output of this phase is a decision matrix: for each workload, which migration strategy (which R) fits, what its dependencies are, and how complex it is. This is the foundation of any cloud migration strategy.

The 6 migration strategies (the 6 R’s)

For each workload, pick a strategy. There’s no single right one — most migrations are a mix. The table below summarizes the 6 R’s, when to use each, and the cost-versus-value tradeoff:

Strategy	What you do	When it fits	Effort	Cloud value
Rehost (“lift and shift”)	Move as-is, no code changes	Fast migration, tight deadline, data center exit	Low	Low
Replatform	Small changes (managed DB, managed load balancer)	Want a quick win without a rewrite	Medium	Medium
Refactor / Re-architect	Rewrite cloud-native (containers, serverless)	Workload critical to growth, needs real scale	High	High
Repurchase	Switch to SaaS instead of maintaining	Generic software (CRM, email, BI)	Low	Varies
Retire	Turn off what you don’t need	There’s always something — dead services nobody monitors	Low	Immediate savings
Retain	Leave on-prem for now	Legacy not ready, compliance constraint	None	None

Rehost (“lift and shift”) is the popular starting point: fast, low-risk, and lets you exit a data center quickly. The downside — you pay for the cloud but don’t leverage it (still VMs, still maintenance). It’s a legitimate first step, as long as there’s a plan to keep going.

Replatform is the sweet spot of many migrations: for example, replacing a self-managed database with a managed RDS/Cloud SQL, or moving a load balancer to a managed service — without rewriting the application. A meaningful win for reasonable effort.

Refactor delivers the maximum value (auto-scaling, lower cost, higher uptime) but requires the biggest investment. Reserve it for workloads that are critical to growth, not for everything.

The practical approach: fast rehost for simple workloads, replatform for what wins quickly, refactor for what’s critical to growth, retire everything that’s dead, and retain the legacy that isn’t ready.
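
To make the R-per-workload matrix concrete, here is a minimal Python sketch of that practical approach as a first-pass rule set. The `Workload` fields, the order of the rules, and the sample inventory are all assumptions made for illustration; a real assessment weighs far more than five booleans, and a human still reviews every row.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    REHOST = "rehost"
    REPLATFORM = "replatform"
    REFACTOR = "refactor"
    REPURCHASE = "repurchase"
    RETIRE = "retire"
    RETAIN = "retain"


@dataclass
class Workload:
    # All flags are hypothetical assessment outputs, not a standard schema.
    name: str
    still_needed: bool = True
    not_ready_to_move: bool = False   # legacy or compliance constraint
    generic_software: bool = False    # CRM, email, BI, ...
    critical_to_growth: bool = False
    has_quick_win: bool = False       # e.g. self-managed DB -> managed DB


def suggest_strategy(w: Workload) -> Strategy:
    """Return a first-pass suggestion; the rule order is an assumption."""
    if not w.still_needed:
        return Strategy.RETIRE
    if w.not_ready_to_move:
        return Strategy.RETAIN
    if w.generic_software:
        return Strategy.REPURCHASE
    if w.critical_to_growth:
        return Strategy.REFACTOR
    if w.has_quick_win:
        return Strategy.REPLATFORM
    return Strategy.REHOST


inventory = [
    Workload("legacy-report-cron", still_needed=False),
    Workload("checkout-api", critical_to_growth=True),
    Workload("orders-db", has_quick_win=True),
    Workload("internal-wiki"),
]
matrix = {w.name: suggest_strategy(w).value for w in inventory}
print(matrix)
```

The value of writing the rules down is less the automation and more that the team has to agree on them before arguing about individual workloads.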

Cloud migration checklist: step by step

This is the heart of the guide: a cloud migration checklist, in the order the steps should happen. Skipping an early step (especially the landing zone) comes back to bite you later.

Step 1 — Map & assess

See the pre-migration assessment section above. Output: inventory, dependency map, cost baseline, and an R-per-workload matrix. Don’t start without it.

Step 2 — Landing zone (the cloud foundation)

This is the step that separates a professional migration from “a mess in the cloud.” Before you move even a single workload, build a secure cloud foundation:

Account structure — AWS Organizations / GCP folders & projects. Separate prod, staging, and dev. Environment separation is your first line of defense.
Networking — VPC, subnets, security groups / firewall rules, private vs public. Plan your CIDR ranges up front — they’re hard to change later.
IAM — least-privilege roles, federation with your existing IdP, no root users for day-to-day work.
Secrets management — Secrets Manager / Secret Manager. Zero passwords in code or env vars.
Logging & observability — CloudTrail / Cloud Audit Logs, metrics, alerts — from day one, not after an incident.
Infrastructure-as-code (IaC) — all of this in Terraform. The landing zone must be reproducible, not ClickOps.
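
As an illustration of planning CIDR ranges up front, the sketch below uses Python’s standard `ipaddress` module to carve one non-overlapping range per environment out of a single block, then checks the result against ranges already in use. The `10.0.0.0/14` block, the `/16` size per environment, and the on-prem ranges are made-up example values, and the helper names are hypothetical.

```python
import ipaddress


def plan_vpcs(org_block: str, envs: list[str], vpc_prefix: int) -> dict:
    """Carve one non-overlapping network per environment out of org_block."""
    block = ipaddress.ip_network(org_block)
    ranges = list(block.subnets(new_prefix=vpc_prefix))
    if len(ranges) < len(envs):
        raise ValueError("org_block is too small for the requested environments")
    return dict(zip(envs, ranges))


def overlaps_any(network, existing_ranges: list[str]) -> bool:
    """True if network collides with any range that is already in use."""
    return any(network.overlaps(ipaddress.ip_network(r)) for r in existing_ranges)


vpcs = plan_vpcs("10.0.0.0/14", ["prod", "staging", "dev"], vpc_prefix=16)

# Example ranges assumed to be in use on-prem.
on_prem = ["10.2.0.0/16", "192.168.0.0/16"]
conflicts = [env for env, net in vpcs.items() if overlaps_any(net, on_prem)]

for env, net in vpcs.items():
    print(env, net)
print("conflicts with on-prem:", conflicts)
```

Here the `dev` range collides with an existing on-prem network. That kind of overlap is cheap to catch on paper and painful to fix once the two sides need to route to each other.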

Step 3 — Per-workload strategy

Apply the R matrix: for each component, the chosen strategy. Document decisions and dependencies. This is where you decide order: start with simple, standalone (low-risk) workloads to build confidence and experience, and leave the critical, complex ones for last.
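
One way to turn the dependency map into an order, sketched here with Python’s standard `graphlib`: group workloads into waves so that nothing moves before the things it depends on. That ordering rule is an assumption (some teams move an application and its database together instead), and the service names are invented.

```python
from graphlib import TopologicalSorter


def migration_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group workloads into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical dependency map: each key depends on the services in its set.
depends_on = {
    "web": {"api"},
    "api": {"db", "cache"},
    "reports": {"db"},
    "db": set(),
    "cache": set(),
}
waves = migration_waves(depends_on)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {wave}")
```

A `CycleError` at this stage is useful information: two services that depend on each other probably have to move in the same window.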

Step 4 — Data migration

The most sensitive part of any cloud migration. Data is the asset you can’t “rebuild.” Your options:

Backup & restore — simple, but requires a downtime window. Fine for small DBs.
Live replication — stand up a replica in the cloud, sync it, and cut over in a short window. Minimal downtime.
Managed migration services — AWS DMS / GCP Database Migration Service to move databases with minimal downtime.
Physical transfer: for very large volumes (tens of terabytes and up, or anything your uplink can’t move within the migration window), disk-shipping services such as AWS Snowball can be faster than the network.
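
Whether shipping disks beats the network is simple arithmetic. The sketch below estimates network transfer time; the 80% sustained link utilization is an assumption, and real throughput depends on the tool, the latency, and what else shares the link.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Rough days needed to push data_tb terabytes over a link_mbps link."""
    bits = data_tb * 1e12 * 8                       # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400


for size_tb in (1, 10, 100):
    print(f"{size_tb:>4} TB over 1 Gbps: {transfer_days(size_tb, 1000):.1f} days")
```

At these assumptions, 1 TB is a matter of hours while 100 TB takes well over a week, which is the point where a shipped appliance starts to look attractive.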

In every case: a rollback plan before you start, and data integrity validation (checksums, row counts) afterward.
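
A minimal sketch of that validation, using in-memory SQLite databases as stand-ins for source and target. It compares a row count and a SHA-256 checksum per table. The `table_fingerprint` helper is hypothetical, it reads the whole table, and it assumes both sides return values in the same representation, so treat it as a starting point for small tables rather than a tool for production-size data.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str, order_by: str) -> tuple[int, str]:
    """Return (row_count, sha256) of a table read in a deterministic order."""
    digest = hashlib.sha256()
    count = 0
    # table and order_by are interpolated into SQL, so they must come from
    # trusted configuration, never from user input.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {order_by}"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def make_db(rows):
    """Build a throwaway in-memory database for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn


source = make_db([(1, "a@example.com"), (2, "b@example.com")])
target_ok = make_db([(2, "b@example.com"), (1, "a@example.com")])
target_bad = make_db([(1, "a@example.com"), (2, "B@example.com")])

src = table_fingerprint(source, "users", "id")
print("clean copy matches:", src == table_fingerprint(target_ok, "users", "id"))
print("altered copy matches:", src == table_fingerprint(target_bad, "users", "id"))
```

The altered copy has the same row count but a different checksum, which is why row counts alone are not enough.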

Step 5 — Testing

Before cutting traffic over, validate in the cloud environment:

Performance — load testing against the target, compared to baseline.
Security — scan permissions, security groups, unintended public exposure.
Disaster recovery — confirm backup and restore actually work, not just in theory.
Functionality — smoke tests and E2E on the critical flows.

Step 6 — Gradual cutover

Shift traffic incrementally, not all at once. Methods: canary (a small percentage of users), blue-green (two environments, you swap), or staged DNS cutover. See the section on avoiding downtime below.
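
If the canary split is done in the application or at a proxy rather than in DNS, one common technique is to hash a stable identifier into a bucket, so the same user always lands on the same side and raising the percentage only adds users. The sketch below assumes a string user ID is available; `in_canary` is a hypothetical helper, not part of any load balancer’s API.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Deterministically place user_id in the canary group for a given percentage."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100   # stable bucket 0..99
    return bucket < percent


users = [f"user-{i}" for i in range(10_000)]
for percent in (1, 5, 25, 100):
    routed = sum(in_canary(u, percent) for u in users)
    print(f"{percent:>3}% target -> {routed} of {len(users)} users on the new environment")
```

Because the bucket never changes, a user who saw the new environment at 5% still sees it at 25%, which keeps sessions and bug reports consistent during the ramp.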

Step 7 — Optimize

The migration doesn’t end at cutover. Once the system is in the cloud:

FinOps — right-sizing, reserved/committed use, deleting orphaned resources. This is where the real money is saved.
Reliability (SRE) — SLOs, alerting, auto-scaling, resilience.
Automation: CI/CD and DevOps practices that turn a deploy into a non-event.

How to avoid downtime (cutover & rollback)

This is the biggest fear of anyone considering a cloud migration: “what if it falls over mid-migration?” The answer isn’t “hope for the best” — it’s an architecture that allows gradual cutover and instant rollback.

The principles:

Sync data ahead of time — before cutover, the data in the cloud is already up to date (live replication). The cutover itself is a brief moment, not a marathon.
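Staged DNS cutover: a few days ahead, lower the TTL on your DNS records (e.g. to 60 seconds), and wait at least one full old-TTL period before cutting over, so resolvers have dropped the long-lived cached answer. Then point a small percentage of traffic at the target (for example with weighted DNS records), verify, and only then scale up.
Blue-green / canary: two live environments in parallel. If something breaks, point the load balancer or DNS back to the old environment. A load balancer switch takes effect within seconds; a DNS switch takes roughly one TTL.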
A written rollback plan — not “we’ll see in real time.” A document with a clear trigger (“if error rate > X%”), exact steps, and an owner for the decision.
Infrastructure-as-code — Terraform lets you reproduce each step exactly. If you need to roll back, you return to a known-good configuration instead of guessing.
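
A rollback trigger is easier to honor when it is written as a rule rather than a feeling. The sketch below is one possible shape for “if error rate > X%”: it looks at per-window request and error counts and fires only after several bad windows in a row. The 2% threshold, the minimum traffic per window, and the two-window streak are placeholder values, and `RollbackPolicy` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackPolicy:
    max_error_rate: float      # 0.02 means 2%
    min_requests: int          # windows with less traffic are ignored
    consecutive_windows: int   # bad windows in a row needed to trigger


def should_roll_back(policy: RollbackPolicy, windows: list[tuple[int, int]]) -> bool:
    """windows holds (requests, errors) per monitoring window, oldest first."""
    streak = 0
    for requests, errors in windows:
        if requests < max(policy.min_requests, 1):
            continue               # too little traffic to judge this window
        if errors / requests > policy.max_error_rate:
            streak += 1
        else:
            streak = 0             # a healthy window resets the streak
    return streak >= policy.consecutive_windows


policy = RollbackPolicy(max_error_rate=0.02, min_requests=100, consecutive_windows=2)
print(should_roll_back(policy, [(1000, 5), (1000, 30), (1000, 40)]))   # True
print(should_roll_back(policy, [(1000, 30), (1000, 5)]))               # False
```

Requiring a streak avoids rolling back on a single noisy window; how long the streak should be is a judgment call for whoever owns the decision.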

Common mistake: cutting DNS over without lowering the TTL in advance. A record with a 24-hour TTL means some users will keep hitting the old server for a full day after cutover — turning rollback into a nightmare.
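
The timing follows from how caching works: a resolver that fetched the record just before you lowered the TTL may keep the old answer for the full old TTL. A small sketch of that arithmetic, with hypothetical helper names and made-up dates; it assumes resolvers honor TTLs, which most but not all do.

```python
from datetime import datetime, timedelta


def earliest_safe_cutover(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """First moment when no well-behaved resolver still caches the long TTL."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


def worst_case_stale_window(ttl_seconds_at_cutover: int) -> timedelta:
    """Longest time a resolver may keep sending users to the old address."""
    return timedelta(seconds=ttl_seconds_at_cutover)


lowered = datetime(2024, 1, 1, 9, 0)
print(earliest_safe_cutover(lowered, 86_400))   # 2024-01-02 09:00:00
print(worst_case_stale_window(60))              # 0:01:00
print(worst_case_stale_window(86_400))          # 1 day, 0:00:00
```

The same window applies in reverse to a DNS-based rollback, which is why the TTL has to be short before the cutover, not after it.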

Security and cost — from day one

The two areas easiest to defer to “later” — and the most expensive to defer.

Security

Least-privilege IAM — every identity gets exactly the permissions it needs, no more. No *:*.
Closed networks — tight security groups / firewall rules, zero unnecessary public exposure. Databases never live in a public subnet.
Secrets in the right place — Secrets Manager / Secret Manager, not env vars and not code.
Zero-trust — authenticate at every layer, encrypt in transit and at rest, full audit. The default is “deny,” not “allow.”
Security-as-code — policy, IAM, and firewall in Terraform, reviewed in a PR like any other code.
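
As a small example of what a PR check for “no *:*” could look like, the sketch below scans an AWS-style IAM policy document (a plain dict parsed from JSON) for Allow statements that combine a wildcard action with a wildcard resource. It is a deliberately narrow illustration with a hypothetical function name: it ignores conditions, `NotAction`, and everything else a real policy linter handles.

```python
def _as_list(value):
    """IAM allows a single value or a list in these fields; normalize to a list."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy: dict) -> list[dict]:
    """Return Allow statements with a wildcard action on every resource."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if wildcard_action and "*" in resources:
            findings.append(stmt)
    return findings


# Example policy; the bucket name is made up.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
}
findings = overly_broad_statements(policy)
print(f"{len(findings)} overly broad statement(s) found")
```

Run against every policy file in the repository, a check like this fails the PR before the permission ever reaches an account.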

Cost

Tagging: every resource tagged (owner, env, project). Without tags there’s no FinOps, because you can’t attribute spend to a team or a project.
Budgets and alerts — budget alerts before the bill explodes, not after.
Right-size from the start — don’t stand up an xlarge “just to be safe.” Start small, measure, scale up as needed.
Automatic shutdown — dev/staging environments shut down at night and on weekends. Why pay for a server nobody uses at 3 a.m.?
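
Two of these rules are easy to express as code. The sketch below checks a resource’s tags against the required set and decides whether a non-production environment should be running at a given time. The working hours (08:00 to 20:00, Monday to Friday) and the rule that only `prod` runs around the clock are assumptions; in practice the result would feed a scheduler that stops and starts instances.

```python
from datetime import datetime

REQUIRED_TAGS = {"owner", "env", "project"}


def missing_tags(tags: dict[str, str]) -> set[str]:
    """Return the required tags that are absent or empty."""
    return {tag for tag in REQUIRED_TAGS if not tags.get(tag)}


def should_be_running(env: str, now: datetime) -> bool:
    """prod always runs; other environments run only during working hours."""
    if env == "prod":
        return True
    is_weekday = now.weekday() < 5      # Monday is 0, Sunday is 6
    in_working_hours = 8 <= now.hour < 20
    return is_weekday and in_working_hours


print(missing_tags({"owner": "data-team", "env": "dev"}))      # {'project'}
print(should_be_running("dev", datetime(2024, 1, 3, 3, 0)))    # False (Wednesday, 3 a.m.)
print(should_be_running("dev", datetime(2024, 1, 3, 10, 0)))   # True
print(should_be_running("prod", datetime(2024, 1, 6, 3, 0)))   # True (Saturday, but prod)
```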

An unmanaged cloud = a ballooning bill. See our guide to cutting your cloud bill with FinOps.

AWS or GCP? (AWS migration vs GCP migration)

The question that always comes up. There’s no single answer — it depends on your needs, your team, and your existing system. Both are excellent; the choice should be deliberate, not fashionable.

Considerations for an AWS migration:

The broadest range of services on the market and the highest maturity.
A huge ecosystem — easy to find staff, tools, and documentation.
Strong presence in large enterprises and compliance-heavy environments.

Considerations for a GCP migration:

A strong developer experience, especially around containers (GKE) and data/ML (BigQuery, Vertex AI).
Networking and egress pricing that’s often competitive.
Some of the most advanced data analytics tooling on the market.

Practical recommendation: if you already have an investment (knowledge, tooling, contracts) in one of them — it’s usually right to stay. If you’re starting from scratch, choose based on your primary workload: heavy data/ML leans toward GCP, broad range and enterprise maturity lean toward AWS. In our cloud consulting we help you choose, or support both — plus hybrid and multi-cloud architectures when it’s genuinely warranted.

Common cloud migration pitfalls

The mistakes I see again and again in teams that did the migration on their own:

Skipping the landing zone — they start moving workloads before there’s a secure foundation. Then they try to bolt security onto an existing mess. Painful.
Lift and shift with no follow-through — they move everything as-is, pay cloud prices, and stay stuck there forever without enjoying a single advantage.
Ignoring cost until the first bill — no tagging, no budgets, then a surprise of thousands of dollars at month’s end.
An incomplete dependency map — they move a service without knowing it depends on something still on-prem. Result: downtime.
No rollback plan — they rely on “it’ll work,” and then when it doesn’t, there’s no orderly way back.
High DNS TTL during cutover — users stuck on the old server for hours after you’ve cut over.
Late right-sizing: they stand everything up large “to be safe” and pay for capacity they never use, often several times what they need.
Security as an afterthought — IAM too open, security groups too broad, secrets in code. An open door for an attacker.

What all these mistakes have in common: they’re all avoidable with planning. That’s exactly what the assessment phase is for.

Frequently asked questions

How long does a cloud migration take? It depends on scope and strategy. A rehost of a single workload can take days; a full migration of an organization with dozens of services, data migration, and refactoring can take months. A good pre-migration assessment gives you a realistic timeline instead of a guess.

What is lift and shift, and when should I use it? Lift and shift (rehost) is moving a workload to the cloud as-is, with no code changes. Use it when you have a tight deadline, an urgent data center exit, or as a fast first step before optimization. The downside: you don’t leverage the cloud’s advantages. Don’t stay there forever.

How do I avoid downtime during the migration? Sync data ahead of time (live replication), lower the DNS TTL before cutover, cut over gradually (canary / blue-green), and keep a written rollback plan. With the right architecture, downtime can be cut to seconds, and for many workloads to zero.

Is an AWS migration or a GCP migration better? Both are excellent. If you already have an investment in one — stay. If you’re starting from scratch — choose based on your primary workload (data/ML leans toward GCP, range and enterprise maturity lean toward AWS). The choice should be an engineering decision, not a fashion statement.

How much will it cost, and how much can I save? A cloud migration doesn’t automatically cut costs; an unmanaged cloud is usually more expensive. The real savings come from FinOps after the migration: right-sizing, committed use, shutting down idle environments, and deleting orphaned resources. With proper management, savings in the tens of percent are realistic.

Do I have to rewrite everything (refactor)? Absolutely not. Most migrations are a mix: refactor only the workloads critical to growth, replatform what wins quickly without a rewrite, and rehost the rest. Rewriting everything is a waste of time and money.

How to start

A successful cloud migration starts with an assessment, not a panic. A good cloud migration strategy is the difference between a stable, cost-efficient foundation and another year of fixes and bloated bills.

Want to migrate to the cloud the right way — an AWS migration, a GCP migration, or a hybrid architecture — or rein in a cloud that’s already out of control? Talk to us for a free intro call. We’ll walk through your system together, build an R-per-workload matrix, and sketch a migration plan that fits the business. You can also read about our full cloud services and DevOps for startups.