
Migrating a 2M-Record Org

How we unified three disconnected Salesforce instances into a single platform for a national healthcare network.

Published February 2026
Read Time 12 min
Data Migration · Multi-Org · Healthcare

Every Salesforce architect has a migration story. This is mine. A national healthcare network with three regional Salesforce orgs, two million records scattered across them, and a mandate from the new CTO: one platform, one truth, six months. What followed was equal parts data engineering, political negotiation, and controlled demolition.

Three Orgs, Zero Alignment

The organization had grown through acquisition. Each regional division had stood up its own Salesforce instance over the years, independently configured by different consultants with different philosophies. The East Coast org was heavily customized, with Apex everywhere. The Midwest org leaned on Flows and Process Builder. The West Coast org was practically vanilla Sales Cloud with a mountain of manual workarounds.

The result was predictable: three different Account models, three different sets of Opportunity stages, and three different definitions of what a "closed deal" even meant. A patient who interacted with two regions existed as two completely unrelated records. Leadership had no consolidated view of anything.

Before writing a single line of migration code, we spent three weeks just understanding what we were dealing with. I ran metadata exports on all three orgs using the Metadata API and built a comparison matrix: custom objects, custom fields, Apex classes, Flows, validation rules, record types. The numbers were sobering.
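
The exports themselves came through the Metadata API, but a quick describe-based inventory run as anonymous Apex in each org is enough to produce the object-and-field slice of that matrix. The snippet below is an illustrative sketch, not the exact script we used:

// Illustrative only: count custom fields on each custom object via describe calls,
// one CSV row per object, for a quick cross-org comparison.
Map<String, Schema.SObjectType> gd = Schema.getGlobalDescribe();
for (String objName : gd.keySet()) {
    if (!objName.endsWith('__c')) continue;   // custom objects only
    Integer customFieldCount = 0;
    for (Schema.SObjectField f : gd.get(objName).getDescribe().fields.getMap().values()) {
        if (f.getDescribe().isCustom()) customFieldCount++;
    }
    System.debug(objName + ',' + customFieldCount);  // paste into the comparison spreadsheet
}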

Mapping the Data Universe

The audit phase was where we separated what we needed to keep from what we could leave behind. Not every field with data in it is a field worth migrating. We categorized every custom field into one of four buckets: migrate as-is, migrate and transform, merge into unified field, or deprecate.

The hardest conversations were about the "merge" bucket. Each region had its own way of tracking referral sources, for example. East Coast used a picklist with 45 values. Midwest used a lookup to a custom object. West Coast used a free-text field. We had to design a unified referral model that preserved the analytical value of the historical data without carrying forward three incompatible schemas.

Key Takeaway

Data migration is not a technical project. It is a business alignment project. The hardest part is not moving records from A to B. It is getting three regional directors to agree on what an opportunity stage means.

We built a field mapping document that became the single source of truth for the entire migration. Every field in the target org was mapped to its source (or sources), with transformation logic documented inline. This document was over 200 rows long, and every row was reviewed by both a technical lead and a business stakeholder.

The Architecture Decision: Merge vs. Migrate

We had two viable strategies. Option one: pick the most mature org as the "survivor" and migrate the other two into it. Option two: stand up a clean new org and migrate all three into it. Each had trade-offs.

Option one was faster and cheaper. The East Coast org had the most sophisticated automation and the largest user base. But it also carried years of technical debt: deprecated fields that were still referenced in reports, Apex classes that hadn't been touched since 2018, and a permission model that had been patched so many times it was essentially held together with duct tape.

We went with option two. A clean target org gave us the opportunity to build the unified data model correctly from the start, without inheriting anyone's legacy baggage. Yes, it meant more migration work. But it also meant we could standardize automation patterns, clean up the permission model, and establish governance from day one.

Building the Migration Pipeline

For a migration of this scale, Salesforce Data Loader running on someone's laptop was not going to cut it. We needed a repeatable, auditable pipeline that could be run multiple times during testing and then executed cleanly during the cutover window.

The pipeline had four stages: extract, transform, load, and validate. Extract pulled records from all three source orgs via Bulk API. Transform applied the field mapping logic, deduplication rules, and data cleansing. Load pushed records into the target org in dependency order. Validate ran record count comparisons and spot-check queries to verify integrity.

The sequencing of the load phase was critical. Salesforce enforces referential integrity on lookup and master-detail relationships. You cannot insert a Contact before its parent Account exists. You cannot insert an Opportunity Contact Role before both the Opportunity and the Contact exist. We mapped the full dependency graph and built the load sequence accordingly:

  1. Accounts (with external IDs from all three source orgs)
  2. Contacts (matched to Accounts via external ID)
  3. Opportunities (matched to Accounts)
  4. Opportunity Contact Roles
  5. Cases, Tasks, Events, Notes, Attachments
  6. Custom junction objects and child records
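
One way to make that ordering self-enforcing is to chain the per-object batch jobs, with each stage kicking off the next from its finish() method. The class below is a hypothetical sketch of the pattern rather than production code; ContactMigrationBatch, OpportunityMigrationBatch, and the Migration_Status__c field on Contact are illustrative names mirroring the Account batch shown later.

public class ContactMigrationBatch implements Database.Batchable<SObject> {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        // Contacts flagged as still pending load
        return Database.getQueryLocator([
            SELECT Legacy_External_ID__c, LastName, AccountId
            FROM Contact
            WHERE Migration_Status__c = 'Pending'
        ]);
    }

    public void execute(Database.BatchableContext bc, List<Contact> scope) {
        // ...load the Contacts in this chunk (see the Account batch below for the full pattern)...
    }

    public void finish(Database.BatchableContext bc) {
        // Contacts are in place, so the next dependent object in the sequence can start.
        Database.executeBatch(new OpportunityMigrationBatch(), 200);
    }
}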

The external ID strategy was the linchpin. We created a custom field called Legacy_External_ID__c on every migrated object. This field stored a composite key: the source org identifier plus the original Salesforce record ID. This allowed us to use upsert operations, which are idempotent — meaning we could re-run the migration without creating duplicates.

// Loads Accounts flagged as pending migration: upserts on the composite external ID so
// re-runs are idempotent, and writes every failure to the Migration_Log__c audit trail.
public class AccountMigrationBatch implements Database.Batchable<SObject> {

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator([
            SELECT Legacy_External_ID__c, Name, BillingStreet,
                   BillingCity, BillingState, BillingPostalCode,
                   Phone, Website, Industry, Type,
                   Region__c, Source_Org__c
            FROM Account
            WHERE Migration_Status__c = 'Pending'
        ]);
    }

    public void execute(Database.BatchableContext bc, List<Account> scope) {
        // Upsert on Legacy_External_ID__c so the batch can be re-run without creating duplicates
        List<Database.UpsertResult> results = Database.upsert(
            scope,
            Account.Legacy_External_ID__c,
            false  // allOrNone = false for partial success
        );

        // Capture each failed row in the audit trail, keyed by its legacy external ID
        List<Migration_Log__c> logs = new List<Migration_Log__c>();
        for (Integer i = 0; i < results.size(); i++) {
            if (!results[i].isSuccess()) {
                logs.add(new Migration_Log__c(
                    Object_Type__c = 'Account',
                    Record_ID__c = scope[i].Legacy_External_ID__c,
                    Error_Message__c = results[i].getErrors()[0].getMessage(),
                    Batch_Timestamp__c = Datetime.now()
                ));
            }
        }
        if (!logs.isEmpty()) insert logs;
    }

    public void finish(Database.BatchableContext bc) {
        // Send summary email to migration team
    }
}

Every batch job wrote failures to a Migration_Log__c custom object. This gave us a queryable audit trail of every record that failed, why it failed, and when. After each test run, we could pull a report of failures, fix the root causes in the transformation layer, and re-run.
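
A grouped query over the log object gives a quick roll-up of failures by object type after each run. Something along these lines (illustrative SOQL, not our exact report):

SELECT Object_Type__c, COUNT(Id) failureCount
FROM Migration_Log__c
WHERE Batch_Timestamp__c = TODAY
GROUP BY Object_Type__c
ORDER BY COUNT(Id) DESC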

The Deduplication Problem

With records coming from three separate orgs, deduplication was inevitable. The same healthcare provider might exist as an Account in all three instances, with slightly different names, addresses, and phone numbers. We could not simply merge on name — "St. Mary's Hospital" in one org might be "Saint Mary's Regional Medical Center" in another.

We built a fuzzy matching algorithm that scored potential duplicates across multiple dimensions: name similarity (using Jaro-Winkler distance), address proximity (normalized and geocoded), phone number match, and NPI (National Provider Identifier) for healthcare-specific matching. Records scoring above our confidence threshold were auto-merged. Records in the gray zone were flagged for manual review by regional data stewards.
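
The scoring amounts to blending those signals into a single confidence value. The sketch below is a simplified illustration rather than our production matcher: the weights and thresholds are placeholder values, and the per-dimension similarity inputs (Apex has no built-in Jaro-Winkler) are assumed to be computed upstream and normalized to the 0..1 range.

// Simplified illustration of weighted duplicate scoring; weights and thresholds are placeholders.
public class DuplicateScorer {

    // Each similarity input is expected to be normalized to 0..1.
    public static Decimal score(Decimal nameSimilarity, Decimal addressSimilarity,
                                Boolean phoneMatch, Boolean npiMatch) {
        Decimal total = 0;
        total += nameSimilarity * 0.35;       // Jaro-Winkler name similarity
        total += addressSimilarity * 0.25;    // normalized, geocoded address proximity
        total += (phoneMatch ? 0.15 : 0.0);   // exact phone number match
        total += (npiMatch ? 0.25 : 0.0);     // National Provider Identifier match
        return total;
    }

    // Above the high threshold: auto-merge. In the gray zone: route to a data steward queue.
    public static String disposition(Decimal score) {
        if (score >= 0.90) return 'AUTO_MERGE';
        if (score >= 0.70) return 'MANUAL_REVIEW';
        return 'NO_MATCH';
    }
}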

The manual review queue ended up being about 1,200 records — far fewer than the 8,000+ we initially feared. Investing in a good matching algorithm upfront saved weeks of manual work downstream.

The quality of your migration is determined before you move a single record. It is determined by how well you understand the data you are leaving behind.

The Cutover Weekend

We scheduled the production cutover for a Friday evening to Monday morning window, the healthcare network's lowest-activity period. We had rehearsed the full migration three times in sandbox environments, with each rehearsal getting faster and smoother. By the third rehearsal, the full pipeline completed in 14 hours.

The cutover plan was a 47-step runbook, with explicit go/no-go checkpoints at each stage. Every step had an owner, an estimated duration, and a rollback procedure. We also built a real-time dashboard that tracked migration progress: records processed, records succeeded, records failed, and estimated time to completion.

Friday at 6 PM, we froze all three source orgs (read-only profiles applied to all non-admin users), kicked off the extract jobs, and started the clock. By Saturday at 8 AM, all records were loaded. Saturday afternoon was validation: automated count reconciliation, sample audits by regional leads, and smoke testing of critical workflows. Sunday was buffer time for fixes. We found 340 Opportunity records with orphaned Contact Roles (the Contacts had been merged during dedup, invalidating the junction records) and wrote a quick fix script.

Monday at 6 AM, we took the new org live. Users logged in to a unified platform for the first time. The first support ticket came in at 6:12 AM: "Where's my custom report?" We had anticipated this: an FAQ document and a dedicated Slack channel were ready.

Validation and Reconciliation

Migration is not done when the records land in the target org. It is done when the business confirms the data is correct, complete, and usable. We ran three levels of validation: automated record-count reconciliation for every migrated object, spot-check queries and sample audits by the regional data leads, and smoke testing of critical business workflows.
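
Count reconciliation can be as simple as comparing a grouped query in the target org against totals pulled from each source, with allowance for the records intentionally collapsed by deduplication. Something along these lines (illustrative, using the Source_Org__c field from the batch class earlier):

SELECT Source_Org__c, COUNT(Id) recordCount
FROM Account
GROUP BY Source_Org__c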

The spot checks uncovered a subtle bug in our date transformation logic: a timezone offset was shifting some CloseDate values by one day. It only affected records migrated from the West Coast org (UTC-8) where the original date was stored without timezone context. We fixed the transformation, re-ran the affected batch, and re-validated. Total impact: 4,200 records, zero data loss.
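
The underlying failure mode is easy to reproduce: when a date round-trips through a timestamp, the same instant truncates to different calendar days depending on the timezone applied, so the transformation has to pin one interpretation explicitly. A minimal illustration in anonymous Apex, with a hypothetical value and America/Los_Angeles standing in for the West Coast org:

// Hypothetical West Coast value that was exported as a timestamp with no timezone context.
Datetime exported = Datetime.valueOfGmt('2025-03-15 01:30:00');

// The same instant truncates to different calendar days depending on the timezone applied.
Date gmtDay     = exported.dateGmt();                                                  // 2025-03-15
Date pacificDay = Date.valueOf(exported.format('yyyy-MM-dd', 'America/Los_Angeles'));  // 2025-03-14

System.debug(String.valueOf(gmtDay) + ' vs ' + String.valueOf(pacificDay));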

What I Would Do Differently

If I were running this migration again, I would change two things. First, I would start the deduplication process earlier — ideally during the audit phase, not after the field mapping was complete. Understanding the overlap between orgs informs the field mapping decisions and helps you catch schema conflicts sooner.

Second, I would invest more in user communication during the pre-cutover phase. We sent emails and held town halls, but many users didn't internalize the changes until they logged in Monday morning and couldn't find their bookmarked reports. A sandbox preview environment with their actual data, available two weeks before cutover, would have smoothed the transition significantly.

Key Takeaway

Multi-org migrations are 30% technical execution and 70% stakeholder alignment, data governance, and change management. The ETL pipeline is the easy part. Getting three regional teams to agree on a unified data model is the hard part.

The unified org has been live for over a year now. The healthcare network finally has a single view of every provider relationship across all regions. Pipeline reporting that used to require three separate exports and a manual merge in Excel now runs as a single Salesforce dashboard. And the new governance framework has prevented the kind of organic sprawl that created the problem in the first place.

Two million records, three orgs, one weekend. It was the hardest migration I have led, and the one I am most proud of.
