Data migrations can be a tricky business, depending on the size and complexity of the data. Data rarely comes in clean: there is often “garbage data,” along with duplicate records that are either wholly identical or that capture different aspects of the same person. John Smith may have registered several times using his personal and work email addresses. He may show up as two records with two different email addresses – these records should be merged. People also move from company to company. There should be a strategy for cleansing data: first, can bad data be cleansed before it goes in, and once it is in, how is it checked?
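As a minimal sketch of the John Smith scenario, the snippet below merges duplicate records that share a normalized name, collecting every distinct email address onto one record. The record shape and field names (`name`, `emails`) are hypothetical, purely for illustration – real duplicate matching is fuzzier than an exact name match.

```python
# Merge duplicate contact records that share a normalized name, keeping
# every distinct email address. Field names here are illustrative only.

def merge_duplicates(records):
    merged = {}
    for rec in records:
        key = rec["name"].strip().lower()  # naive match key; real matching is fuzzier
        if key not in merged:
            merged[key] = {"name": rec["name"].strip(), "emails": []}
        for email in rec["emails"]:
            if email.lower() not in (e.lower() for e in merged[key]["emails"]):
                merged[key]["emails"].append(email)
    return list(merged.values())

contacts = [
    {"name": "John Smith", "emails": ["john@personal.com"]},
    {"name": "john smith ", "emails": ["jsmith@work.com"]},
]
result = merge_duplicates(contacts)
# One merged record carrying both the personal and work email addresses
```

The same pattern extends to other match keys (phone, mailing address) once names alone prove unreliable.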
MondayCall has performed data migrations involving millions of records across multiple systems. It takes special skills and experience to do it right: careful planning, data structuring and restructuring, verification of data cleanliness, test migrations, the actual migration, and data QA.
Here are a variety of tips that we’ve found useful in data migrations:
- First rule of any data work – BACK UP EVERYTHING!!!
- Always have a rollback plan. At any point before release you should be able to roll back. Modifying data such that you can’t roll it back is not a best practice.
- Use realistic data during development and testing. Using real records increases the likelihood of ironing out issues early, compared with using fictitious data. Choosing the right subset that represents all of your data can be a challenge, but it reduces risk.
- In most cases, not every record can be verified, so sampling must be used to test the migration. Samples should be drawn from different categories to ensure decent test coverage. Ideally every category should be tested, but if there are too many combinations, deciding which areas to sample is critical to risk mitigation.
- The migration itself can take a significant amount of time – sometimes several days of processing – depending upon the speed and power of the machine(s) involved. During this time there must be a data “freeze,” where no data is modified, deleted or added “in flight”; otherwise, data integrity issues can occur. Test sample migrations can be used to estimate total processing time.
- Migrations are a good opportunity for transformation, cleaning, etc. Provided the data set is of reasonable size, modifying data can sometimes be easier in Excel. Some customers elect to produce an export file, clean the data quickly in Excel, and then import the cleaner data into Salesforce. Others use staging tables to clean data in an intermediate location.
- Data integrity is key. As they say, “Garbage in, garbage out.” It’s important to clean data before it gets into Salesforce – it can be cleaned at the source, cleaned in an intermediate step (as noted in the previous bullet point), or cleaned up after it has been imported. Note that data must not only be clean but that data integrity must be maintained. Make sure you deal with items like “orphan records,” where data relationships between records might be broken.
- Consider the concept of UPSERT. Most everyone is familiar with INSERT, UPDATE and DELETE, but there is another supported data action, UPSERT: insert a row if it does not already exist, otherwise update the existing row.
- Have a documented strategy for how the migration should work. Data migration is best done all at once, but there are times when things need to be corrected before the export file is produced, in the export file itself, or after it has been imported into the system.
- Plan for the worst-case scenario. Make sure automated rule sets are consciously turned on or off during data migration. Alerts may fire when new records are added but may not be relevant during a mass data upload. On the other hand, an import of leads with automated lead assignment enabled could leave your newly imported leads all nicely assigned as intended. Examine all of your data rules to make sure the right ones are turned on.
- Test the data once it has been migrated. Sampling may be necessary for large data sets, but it is not uncommon to find that some data is missing and that parts of the migration must be re-run. Test your data before releasing it, because once your users start to modify it, it is very hard to get back to where you were before.
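The category-based sampling suggested above can be sketched as a stratified sample: draw a fixed number of records from each category so every category gets some coverage. The `type` field and per-category sample size below are assumptions for illustration; a fixed seed keeps the audit sample reproducible.

```python
# Stratified sampling for post-migration spot checks: a fixed number of
# records from each category, so no category goes untested.
import random

def stratified_sample(records, key, per_category, seed=42):
    rng = random.Random(seed)  # fixed seed -> reproducible audit sample
    buckets = {}
    for rec in records:
        buckets.setdefault(rec[key], []).append(rec)
    sample = []
    for bucket in buckets.values():
        sample.extend(rng.sample(bucket, min(per_category, len(bucket))))
    return sample

records = [{"id": i, "type": "Lead" if i % 2 else "Contact"} for i in range(100)]
audit = stratified_sample(records, "type", per_category=5)
# 5 Leads and 5 Contacts to verify by hand
```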
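The “orphan records” check mentioned above amounts to finding child rows whose parent reference does not resolve to any existing parent. A minimal sketch, with illustrative (not Salesforce-schema) field names:

```python
# Orphan-record check: list child rows whose foreign key points at a
# parent that does not exist. Field names are illustrative only.

def find_orphans(children, parents, fk="account_id", pk="id"):
    parent_ids = {p[pk] for p in parents}
    return [c for c in children if c[fk] not in parent_ids]

accounts = [{"id": "A1"}, {"id": "A2"}]
contacts = [
    {"id": "C1", "account_id": "A1"},
    {"id": "C2", "account_id": "A9"},  # broken relationship
]
orphans = find_orphans(contacts, accounts)
# orphans -> the contact pointing at the missing account "A9"
```

Running this kind of check in the staging area, before import, is far cheaper than repairing broken relationships after go-live.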
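The UPSERT semantics described in the bullet above can be sketched against an in-memory table keyed by a unique field (in Salesforce this role is played by an external ID field; here it is just a dictionary key for illustration):

```python
# UPSERT semantics: update the row if the key already exists,
# otherwise insert it as a new row.

def upsert(table, key_field, row):
    key = row[key_field]
    if key in table:
        table[key].update(row)   # row exists -> UPDATE
        return "updated"
    table[key] = dict(row)       # row missing -> INSERT
    return "inserted"

table = {}
first = upsert(table, "email", {"email": "john@personal.com", "name": "John"})
second = upsert(table, "email", {"email": "john@personal.com", "name": "John Smith"})
# first -> "inserted", second -> "updated"; the table holds a single row
```

Upserting on a stable unique key is what makes a migration safely re-runnable: a second pass corrects records instead of duplicating them.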
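The post-migration testing step above can start with something very simple before any field-level spot checks: compare the set of record IDs in the source extract against the migrated data, and flag anything missing or unexpected. A minimal sketch:

```python
# Post-migration reconciliation: compare source and target record IDs
# to catch rows that were dropped (or appeared) during the migration.

def verify_migration(source_ids, target_ids):
    return {
        "missing": sorted(set(source_ids) - set(target_ids)),
        "unexpected": sorted(set(target_ids) - set(source_ids)),
        "counts_match": len(source_ids) == len(target_ids),
    }

source = ["C1", "C2", "C3"]
target = ["C1", "C3"]
report = verify_migration(source, target)
# report["missing"] -> ["C2"], signalling that part of the load must be re-run
```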