Using realistic data for testing is a great way to find bugs earlier in your CI pipeline. The best data that you’ll ever be able to use for testing is your production data. It contains the intricacies that your code needs to deal with. However, regulations and controls restrict access to sensitive information stored in production. In the past, this might have prevented you from using production data for testing.
The good news is that Gearset now supports data masking during data deployments. Data masking obfuscates sensitive information while maintaining the complex relationships within your data.
Why should I mask my data?
Developers can test their changes against different categories of data.
Fictional data is usually a good starting point. Typically, this is simple data that a developer has invented, perhaps for a unit test. It's great for testing simple, theoretical cases, but it doesn’t capture the complex relationships in real data or the ways that users actually behave.
The other extreme is deploying production data to a sandbox for testing. This captures real-life complexity, helping developers find bugs before changes hit production. Catching bugs earlier is great, but using your production data isn’t always possible. Regulations (such as GDPR) and other controls restrict access to personally identifiable information (PII), so developers probably won’t have access to this sensitive data.
Data masking hits that sweet spot in the middle. It lets us conceal sensitive information such as names and emails. The complexity of the data still exists, but the developers can’t access real customer information.
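To make the idea concrete, here is a minimal Python sketch of masking a Contact record while leaving its link to an Account untouched. It is an illustration only, not how Gearset implements masking: the `SENSITIVE_FIELDS` set, the `mask_value` and `mask_record` helpers, the record IDs and the hash-based placeholder format are all assumptions made for this example.

```python
import hashlib

# Hypothetical choice of fields to treat as sensitive on a Contact.
SENSITIVE_FIELDS = {"FirstName", "LastName", "Email"}


def mask_value(field, value):
    """Swap a real value for a placeholder derived from a short hash.

    Illustration only: a short hash of a guessable value is not strong
    anonymisation, and a real masking tool may generate values differently.
    """
    digest = hashlib.sha256(f"{field}:{value}".encode("utf-8")).hexdigest()[:8]
    if field == "Email":
        return f"user-{digest}@example.com"
    return f"{field}-{digest}"


def mask_record(record):
    """Return a copy of the record with only the sensitive fields replaced."""
    return {
        field: mask_value(field, value) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


# Made-up record; the IDs are placeholders, not real Salesforce IDs.
contact = {
    "Id": "003-1",
    "AccountId": "001-A",
    "FirstName": "Ada",
    "LastName": "Lovelace",
    "Email": "ada@realcustomer.test",
}

masked = mask_record(contact)
```

The names and email in `masked` no longer match the original, but `AccountId` is unchanged, so the Contact still points at the same Account.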
How to use data masking
Salesforce data masking is part of Gearset’s data loader, so to get started we kick off a data deployment as usual by selecting the objects we want to deploy. In this example, we’ll deploy some Contact records and their related Account records.
When we click Next, we’ll go to the data masking page. This shows the objects in the deployment with the fields that Gearset can mask. We also see a sample of the masked output for each field.
Masking can be turned on or off for individual fields or entire field types. The toggles on the left work at the field type level: for example, they let us turn masking on or off for all Email fields. The list on the right configures masking for individual objects or fields.
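One way to picture how the two levels of control might combine is the short Python sketch below. The setting names (`type_toggles`, `field_overrides`), the `should_mask` helper and the rule that a per-field choice wins over a type-level toggle are all assumptions for illustration; the page itself is configured through the UI, and Gearset's actual precedence rules may differ.

```python
# Hypothetical type-level toggles: mask every field of a given type.
type_toggles = {"Email": True, "Name": True, "Phone": False}

# Hypothetical per-field choices, keyed by (object, field).
field_overrides = {("Account", "Name"): False}


def should_mask(obj, field, field_type):
    """Decide whether a field is masked.

    Assumed rule: an explicit per-field choice wins; otherwise fall back
    to the toggle for the field's type, defaulting to not masking.
    """
    override = field_overrides.get((obj, field))
    if override is not None:
        return override
    return type_toggles.get(field_type, False)


contact_email = should_mask("Contact", "Email", "Email")  # type toggle is on
account_name = should_mask("Account", "Name", "Name")     # per-field choice is off
contact_phone = should_mask("Contact", "Phone", "Phone")  # type toggle is off
```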
Once we’re happy with the masking selections, we can go ahead with the deployment. In our example we’ll mask the FirstName, LastName and Email on the Contact object.
Once the deployment has finished, we can see that the Contact in the source and the Contact in the target have different values for FirstName, LastName and Email. The Contact is still attached to the correct Account.
We can’t mask everything (just yet!)
There are a few important caveats to data masking:
- It’s crucial to remember there’s no rollback available for data deployments, so make sure you’ve taken a backup of any important data in the target org before you deploy.
- Gearset won’t let you mask data if the target is a production org, to help avoid costly deployment mistakes.
- If an object is being upserted, then you can’t mask the external ID field, because the upsert uses the value of that external ID field to match records.
- If a masked field is empty on a record in the source, then the field will remain empty on the record in the target.
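The masking-specific caveats above can be expressed as a few guard rules. The Python sketch below is a rough model under stated assumptions, not Gearset's code: `plan_masking`, `mask_field`, `ProductionTargetError` and the `External_Ref__c` field name are hypothetical, and excluding the external ID from the masked set is just one way to model "you can't mask it".

```python
class ProductionTargetError(Exception):
    """Raised when masking is requested against a production target."""


def plan_masking(fields_to_mask, *, target_is_production, upsert_external_id=None):
    """Return the set of fields that can actually be masked.

    Assumed behaviour: refuse production targets outright, and leave out
    the external ID used for upsert matching so records can still be matched.
    """
    if target_is_production:
        raise ProductionTargetError("masking is not allowed for production targets")
    return {field for field in fields_to_mask if field != upsert_external_id}


def mask_field(value, masker):
    """Apply masker to a value, leaving empty source values empty."""
    if value is None or value == "":
        return value
    return masker(value)


# External_Ref__c is a made-up external ID field for this example.
allowed = plan_masking(
    {"FirstName", "Email", "External_Ref__c"},
    target_is_production=False,
    upsert_external_id="External_Ref__c",
)

empty_stays_empty = mask_field("", lambda value: "masked")
filled_is_masked = mask_field("Ada", lambda value: "masked")
```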
Note that while data masking is free to all Gearset data loader users during the pilot, it will come with an additional charge when it moves to general availability (GA).
Try it out!
We’d love for you to try out data masking and let us know what you think. At the moment, you can’t mask every field type, but there will be more coming soon. Are there things you love about it? Things you’d like to see improved? What can we do to make it as useful as possible for you?
Get in touch with any thoughts and feedback, no matter how big or small, via the live chat.