Expedite Data Masking in Salesforce -- Approach B
Objective:
Welcome to this chapter, where we explore another approach to expediting Data Masking in Salesforce. This is a continuation of the Data Masking approach discussed in our previous article (https://manojn2sf.blogspot.com/2022/01/expedite-data-masking-in-salesforce.html). The approach discussed in that article had:
- A common database/sandbox that is synced with the masked data from our sandboxes as and when we refresh them.
- The masked data in this common database/sandbox is then reused to apply data masking to a newly refreshed sandbox.
Use case:
Universal Builders has roughly 20k Salesforce users and 150k external Partner Users. They follow Agile methodology and, as per their development life cycle, SIT, BUILD and UAT are done in Full Copy sandboxes. Once sign-off is obtained, features are deployed to production.
Universal Builders was also compliant with GDPR regulations, as they used an ETL tool (for instance Informatica or Boomi) to mask data in sandboxes (Partial Copy and Full Copy). Their sprint deliverables had a decent velocity, but as the Contact volume grew from 20M to 40M, Universal Builders started observing delays in sandbox readiness, which impacted their sprint velocity. This was because Data Masking in the sandbox was averaging 4M records/day, which implies the Data Masking activity was taking 10 days for 40M Contacts.
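The 10-day figure follows directly from the volume and the daily masking rate. A minimal sketch of that arithmetic is below; the function name `masking_days` is ours, and applying the same 4M/day rate to the earlier 20M volume is an assumption made only for comparison.

```python
import math


def masking_days(record_count: int, records_per_day: int) -> int:
    """Whole days needed to mask record_count rows at a steady daily rate."""
    return math.ceil(record_count / records_per_day)


# Figures from the use case: 40M Contacts masked at 4M records/day.
current = masking_days(40_000_000, 4_000_000)

# Assumption: the same daily rate applied when the volume was 20M.
earlier = masking_days(20_000_000, 4_000_000)

print(f"40M Contacts: {current} days, 20M Contacts: {earlier} days")
```

In other words, the masking window grows linearly with the Contact volume unless the throughput itself improves, which is what the approach below aims for.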
Eric approached Smita about this situation, because the eventual reduction in sprint velocity had made leadership unhappy. Smita was thus tasked with finding a solution to expedite the Data Masking process.
Smita thus introduces us to the alternative approach for Data Masking, as below.
Solution:
- We use an ETL tool to first mask data in a recently refreshed Full Copy sandbox so as to comply with GDPR regulations.
- The ETL tool leverages the PK chunking method to first query the data to be masked, and then leverages Bulk API capabilities to perform Data Masking in an expedited fashion. A few ETL tools that support PK chunking and Bulk API capabilities are Boomi, Informatica and MuleSoft.
- With this we get a masked Full Copy sandbox, which can be referred to as the "Common Masked Database/sandbox".
- Data Sync is the next part of the solution, where we continuously sync data between production and the Common sandbox via the ETL tool.
- The ETL tool also masks the fresh data that gets synced from production, based on the masking rules.
- This gives us an environment that has the latest production data and can be used for performance test use cases.
- One can also explore the possibility of establishing the Common sandbox/database as a secondary environment to Salesforce Production, since the data gets replicated from production via the data sync process. As a result, one can use the secondary environment to run the business (or redirect API transactions) in case Salesforce has downtime. Though this aspect excites us, it's beyond the scope of this article.
- The production-equivalent data in the common database/sandbox then acts as the source data for our Data Masking process for any Full/Partial Copy sandboxes that we refresh thereafter.
- An ETL tool is then used for the Data Masking process, performing a simple "UPDATE" operation via Bulk API to mask the respective PII/personal information fields in the newly refreshed sandbox.
- Eric exceeded business expectations by masking data with improved performance.
- Eric now has a secondary environment to Production that can be used for performance testing use cases.
- Eric has to worry less about errors like "Record Not Available" or "Delta" records, as this approach solves those issues in comparison to Approach A.
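To make the masking and "UPDATE" steps above a little more concrete, here is a minimal sketch of what an ETL job might do locally before handing data to the Bulk API. It is not the configuration of any particular ETL tool. The masking rules, the field list and the helper names (`mask_record`, `to_csv_batches`, `pk_chunking_header`) are hypothetical and chosen for illustration only. The `Sforce-Enable-PKChunking` header and the 10,000 records per batch limit come from the Bulk API documentation. The sketch makes no network calls.

```python
import csv
import hashlib
import io
from typing import Dict, Iterable, Iterator, List


def pk_chunking_header(chunk_size: int = 100_000) -> Dict[str, str]:
    """Request header that enables PK chunking on a Bulk API query job."""
    if not 0 < chunk_size <= 250_000:
        raise ValueError("chunk size must be between 1 and 250,000")
    return {"Sforce-Enable-PKChunking": f"chunkSize={chunk_size}"}


def _token(record_id: str, field: str) -> str:
    """Deterministic token, so masking the same record twice gives the same value."""
    return hashlib.sha256(f"{field}:{record_id}".encode()).hexdigest()[:10]


# Hypothetical masking rules: field name -> function of the record Id.
MASKING_RULES = {
    "FirstName": lambda rid: "First_" + _token(rid, "FirstName"),
    "LastName": lambda rid: "Last_" + _token(rid, "LastName"),
    "Email": lambda rid: _token(rid, "Email") + "@example.invalid",
    "Phone": lambda rid: "555" + f"{int(_token(rid, 'Phone'), 16) % 10**7:07d}",
}


def mask_record(record: Dict[str, str]) -> Dict[str, str]:
    """Build an update row: the Id plus a masked value for every PII field."""
    rid = record["Id"]
    masked = {"Id": rid}
    for field, rule in MASKING_RULES.items():
        masked[field] = rule(rid)
    return masked


def _render(rows: List[Dict[str, str]], fieldnames: List[str]) -> str:
    """Serialize rows as a CSV payload with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_csv_batches(records: Iterable[Dict[str, str]],
                   batch_size: int = 10_000) -> Iterator[str]:
    """Yield CSV payloads for an UPDATE job, at most batch_size rows each."""
    fieldnames = ["Id", *MASKING_RULES]
    batch: List[Dict[str, str]] = []
    for record in records:
        batch.append(mask_record(record))
        if len(batch) == batch_size:
            yield _render(batch, fieldnames)
            batch = []
    if batch:
        yield _render(batch, fieldnames)


# Usage with made-up Contact rows (Ids and emails are illustrative only).
sample = [{"Id": f"003{n:015d}", "Email": "real.person@corp.com"} for n in range(25)]
batches = list(to_csv_batches(sample, batch_size=10))
print(len(batches), "CSV batches ready for upload")
```

Deriving each masked value from the record Id keeps the output repeatable, so the same Contact receives the same masked values whether it is masked in the Common sandbox or in a later refresh. A real job would attach the PK chunking header to its Bulk API query and upload each CSV payload as one batch of an update job.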
Article Links/References
- Sandbox Types (https://www.salesforceben.com/salesforce-sandbox/)
- Bulk API details (https://developer.salesforce.com/docs/atlas.en-us.220.0.api_asynch.meta/api_asynch/asynch_api_intro.htm)
- PK Chunking in Salesforce (https://developer.salesforce.com/blogs/engineering/2015/03/use-pk-chunking-extract-large-data-sets-salesforce)