At KELL we perform a lot of data migrations for our clients. As of March 2017, we've completed more than 1,500 database migrations since 2009. Along the way, we've learned a LOT and would like to share some best practices. Today we're talking about partial vs. full data loads for testing and validation during the migration process.
Best Practice #1 - Perform an Initial Test Load
Anytime you are moving data from one database into another, it’s best practice to do a test load first. A test load accomplishes several important goals:
A test load allows your data guru to troubleshoot any errors that might arise without impacting your migration timeline. For example, sometimes there are processing errors, limits that are reached and need to be extended, or automations that need to be turned off. These types of issues can be identified during a test load, and then measures can be taken to ensure they don’t happen in the final migration.
Issues with the source data are often identified that can be resolved before the final load. For example, if 100 email addresses couldn’t be imported because of poorly formatted data, you could clean them up in the legacy database prior to the final migration.
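As a concrete illustration of that kind of pre-migration cleanup, here is a minimal sketch that flags badly formatted email addresses in a legacy export before the final load. The record layout is invented for illustration, and the regex is a pragmatic sanity check, not full RFC 5322 validation.

```python
import re

# Simplified email pattern -- a pragmatic check, not full RFC 5322 validation.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def find_bad_emails(records):
    """Return (record_id, email) pairs that would likely fail to import."""
    return [
        (r["id"], r["email"])
        for r in records
        if r.get("email") and not EMAIL_RE.match(r["email"])
    ]

# Hypothetical rows from a legacy export:
legacy = [
    {"id": 1, "email": "john@example.com"},
    {"id": 2, "email": "mary@example"},             # missing top-level domain
    {"id": 3, "email": "jimmy johnson@acme.com"},   # embedded space
]
print(find_bad_emails(legacy))  # -> [(2, 'mary@example'), (3, 'jimmy johnson@acme.com')]
```

Running a report like this against the legacy database turns a vague "100 emails failed" into a concrete cleanup list before the final migration.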
Field mapping (deciding where to migrate particular types of data from your legacy system) often sounds good in theory, but looks different in practice. A test load with validation ensures that a few power users have an opportunity to review the data in the new system. Sometimes “seeing” the data in the new system makes all the difference in the world.
Best Practice #2 - Migrate ALL Data for the Initial Load
At KELL, we always perform a complete/full migration of our clients’ legacy data for the initial load. Sometimes clients ask that we import just a subset or “example” set of data, but we do not recommend this approach. Salesforce is a relational database. That means that one record can be connected to many other records, which may in turn be connected to many more, and so on.
For example, let’s say you only want to include John and Mary Smith in a test load. You migrate John and Mary Smith and their demographic data, all of their donations, their relationships, their event attendance, etc. That sounds very straightforward. However, let’s say that John Smith has a relationship with Jimmy Johnson. He also attended an event as a guest of Jimmy Johnson. And Mary Smith was actually soft credited on a Donation made by Acme, Inc.
If Jimmy Johnson and Acme, Inc. were not also included in this “example” set of data, you would not be able to migrate this very important constituent information about John and Mary Smith, because it would require Jimmy and Acme to exist. These types of links between records can be pervasive and significant. Imagine migrating membership records without creating all of the historical memberships: your team will be scratching their heads when trying to validate the results.
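The dependency problem above can be sketched with a toy relational schema (the table and column names are invented for illustration, using SQLite as a stand-in). When the "example" data set omits Acme, Inc., the soft-credit chain simply cannot be loaded:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires this to be enabled explicitly

# Toy schema: a donation references an account; a soft credit references both.
conn.executescript("""
CREATE TABLE contact  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE account  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE donation (id INTEGER PRIMARY KEY,
                       account_id INTEGER REFERENCES account(id));
CREATE TABLE soft_credit (donation_id INTEGER REFERENCES donation(id),
                          contact_id  INTEGER REFERENCES contact(id));
""")

# Partial "example" load: Mary Smith only -- Acme, Inc. is left out.
conn.execute("INSERT INTO contact VALUES (1, 'Mary Smith')")

# Her soft credit hangs off a donation made by Acme, Inc. (hypothetical id 99),
# which was never loaded, so the insert is rejected outright.
try:
    conn.execute("INSERT INTO donation VALUES (10, 99)")
    conn.commit()
except sqlite3.IntegrityError as e:
    print("Load failed:", e)  # FOREIGN KEY constraint failed
```

A full initial load sidesteps the problem entirely, because every record a relationship points at is guaranteed to exist.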
Validation of the data migration suddenly becomes very difficult! The Development Associate doing the data validation wants to know why John Smith’s relationships and Mary Smith’s Soft Credits are missing, or why their membership information is different from the legacy database. It would be very difficult to determine what was a mapping/migration issue vs. what was only a result of certain constituents being excluded from the “example” data set.
Therefore we find that we get the best results when we migrate a complete data set during the test load.
Best Practice #3 - Use an External Temporary DB for Migration Prep
Create a temporary database for the purpose of migration preparation using a separate system from Salesforce. At KELL, we create temporary SQL databases for each migration. By using SQL we're able to script the migration steps with granular detail and save these scripts for use in both the initial and final migration. Salesforce becomes the destination, and we're able to push our client's clean, deduped and finalized data into Salesforce all at once. This also allows us to easily wipe Salesforce clean between the initial and final loads, because all of the prep work is preserved in SQL and easily replicable with a new backup of legacy data.
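To make the idea concrete, here is a minimal sketch of that scripted, repeatable prep step, using an in-memory SQLite database as a stand-in for the temporary SQL database (the table names and cleanup rules are invented for illustration). Because everything is a script, the same prep can be rerun from scratch against a fresh backup of the legacy data for the final load:

```python
import sqlite3

def prep_staging(legacy_rows):
    """Rebuild the staging DB from a fresh legacy backup and return clean rows.

    Every step is scripted, so the whole prep is repeatable: run it once for
    the initial (test) load, then again against a new backup for the final.
    """
    db = sqlite3.connect(":memory:")  # stand-in for the temporary SQL database
    db.execute("CREATE TABLE raw_contact (name TEXT, email TEXT)")
    db.executemany("INSERT INTO raw_contact VALUES (?, ?)", legacy_rows)

    # Scripted cleanup: trim whitespace, normalize emails, dedupe on email.
    db.executescript("""
    UPDATE raw_contact SET name = TRIM(name), email = LOWER(TRIM(email));
    CREATE TABLE clean_contact AS
        SELECT name, email FROM raw_contact GROUP BY email;
    """)
    return db.execute(
        "SELECT name, email FROM clean_contact ORDER BY email"
    ).fetchall()

backup = [("John Smith ", "John@Example.com"), ("John Smith", "john@example.com ")]
print(prep_staging(backup))  # one deduped row survives
```

The destination system never sees the intermediate mess; only the final, clean output of the scripts gets pushed in, all at once.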