Data Masking - 5-Phase Methodology
Nikhil Patwardhan
Analysis
Set-Up
Masking
Uploading
Conclusion
REFERENCES
To tackle this problem, organizations spend a lot of time and money on one of the following approaches:
1. Share production data under an implicit agreement not to abuse it. This carries a great deal
of risk, because sensitive data is disclosed as is.
2. Create a fictitious dataset and share it. The generated dataset rarely looks realistic.
3. Create an in-house program to hide the sensitive content of production data before it is shared.
Such programs lack a common approach: different silos maintain separate customized programs,
leading to high cost and difficulty in sustaining them over the long term.
As far as the utility and privacy of data are concerned, approach 3 above looks good, but it has
inherent problems that must be solved before it can be deployed enterprise-wide.
Data masking arises in this context, precisely to address these problems.
The expected benefits are:
1. Need to mask all data fields – Organizations do not want to take any chances, so they
categorize every data field as sensitive and favor masking all of them.
2. De-normalized data
Because data might exist in de-normalized form, masking must ensure that every occurrence of a
de-normalized value is masked with the same value.
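One way to meet this requirement is deterministic masking, where the masked value is a keyed function of the original value. The sketch below is an illustration only and is not taken from the paper: the table layouts, the `SECRET_KEY` value, and the `mask_value` helper are hypothetical, and HMAC-SHA256 is just one possible choice of keyed function.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored outside the code.
SECRET_KEY = b"example-masking-key"


def mask_value(value: str, length: int = 12) -> str:
    """Return a deterministic, keyed pseudonym for value."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


# Two hypothetical tables that both carry the customer name (de-normalized).
customers = [
    {"id": 1, "name": "Asha Rao"},
    {"id": 2, "name": "John Smith"},
]
orders = [
    {"order_id": 10, "customer_name": "Asha Rao"},
    {"order_id": 11, "customer_name": "John Smith"},
    {"order_id": 12, "customer_name": "Asha Rao"},
]

# The same input always yields the same output, so every copy of a name
# is masked identically, whichever table it appears in.
masked_customers = [{**c, "name": mask_value(c["name"])} for c in customers]
masked_orders = [
    {**o, "customer_name": mask_value(o["customer_name"])} for o in orders
]
```

Because the function is keyed, the mapping cannot be reproduced without the key, but the key then becomes sensitive itself and has to be protected accordingly.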
5. Value constraints
Masking must guarantee that data values stay within the lower and upper bounds specified in the
metadata. If the database defines no such constraints, one should be able to define them as part
of the masking process.
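As a minimal sketch of bound-preserving masking, the example below shifts a numeric value by a random amount drawn only from the part of the shift window that lies inside the allowed range. The `CONSTRAINTS` and `MAX_SHIFT` tables and the `mask_numeric` helper are assumptions made for illustration; real bounds would come from the database metadata or from user-defined constraints.

```python
import random

# Hypothetical bounds; in practice they would come from database metadata
# or be defined by the user during masking set-up.
CONSTRAINTS = {"age": (18, 90), "salary": (0.0, 500000.0)}
# Hypothetical maximum distance a masked value may move from the original.
MAX_SHIFT = {"age": 5, "salary": 5000.0}


def mask_numeric(value, lower, upper, max_shift, rng):
    """Shift value by a random amount while staying inside [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    # Pull an out-of-range original back into range before masking.
    value = min(max(value, lower), upper)
    # Intersect the shift window with the allowed range, then draw from it.
    low = max(lower, value - max_shift)
    high = min(upper, value + max_shift)
    return rng.uniform(low, high)


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
record = {"age": 89, "salary": 1200.0}
masked = {
    column: mask_numeric(
        record[column], *CONSTRAINTS[column],
        max_shift=MAX_SHIFT[column], rng=rng,
    )
    for column in record
}
```

Drawing from the intersected window, rather than shifting first and clamping afterwards, avoids piling masked values up exactly on the bounds. The result is a float; an integer column such as age would need rounding as a further step.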
7. Sampling
Organizations often hold very large data stores and would like to mask only a sample of them.
The masking process should draw the most meaningful and representative sample from such a large
data set and then proceed with masking.
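The paper does not say how a representative sample should be drawn. Stratified sampling is one common option, sketched below under stated assumptions: the `segment` column, the record layout, and the `stratified_sample` helper are hypothetical, and each group contributes at least one record so that small groups are not lost.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Sample the same fraction from every group defined by key."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so the sample is repeatable
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    sample = []
    for group_key in sorted(groups):
        members = groups[group_key]
        # Keep at least one record per group so rare groups stay visible.
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical data: 90 retail customers and 10 corporate customers.
records = [{"id": i, "segment": "retail"} for i in range(90)]
records += [{"id": 90 + i, "segment": "corporate"} for i in range(10)]

sample = stratified_sample(records, key="segment", fraction=0.1)
```

A plain random sample of the same size could easily miss the small corporate group; sampling per group keeps the proportions of the full data set.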
Set-Up
This phase consists of the following steps.
• Identify the right people to be involved in masking.
• Assign the proper role to each person. TCS recommends role-based masking for
improved data privacy.
• Create a static data source from production. TCS strongly discourages using the
production environment as a data source; doing so may lead to problems such as
integrity violations and failure of the masking process.
• Install a data privacy tool in the client’s environment.
• Set up a project in the tool. This includes defining the data source identifiers
and importing the entities intended for masking.
• Define any constraints and referential relationships in the data in addition to those
already defined at the data level.
• Create any customized data sets. A data set is an external set of data used to mask
the original data.
• Select the sensitive data and define the appropriate masking technique.
• Define the right tests to gauge the quality of masking.
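The last three steps can be illustrated with a small sketch. It is not the configuration format of any particular data privacy tool: the `SUBSTITUTE_NAMES` data set, the `MASKING_PLAN` mapping, the column names, and the `check_masking` test are all hypothetical and only show how a custom data set, a per-column technique, and a quality test fit together.

```python
import hashlib

# Hypothetical custom data set: external names used to replace real ones.
SUBSTITUTE_NAMES = ["Alex Doe", "Sam Roe", "Kim Poe", "Lee Moe"]


def substitute(value, data_set):
    """Pick a replacement from data_set, always the same for a given value."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return data_set[int(digest, 16) % len(data_set)]


def redact(value, keep_last=4):
    """Replace all but the last keep_last characters with 'X'."""
    visible = value[-keep_last:] if keep_last > 0 else ""
    return "X" * (len(value) - len(visible)) + visible


# Hypothetical masking plan: sensitive column -> masking technique.
MASKING_PLAN = {
    "name": lambda value: substitute(value, SUBSTITUTE_NAMES),
    "card_number": redact,
}


def mask_rows(rows, plan):
    """Apply the plan to every row; columns not in the plan pass through."""
    return [
        {col: plan[col](val) if col in plan else val for col, val in row.items()}
        for row in rows
    ]


def check_masking(original, masked, plan):
    """Return a list of quality problems; an empty list means the test passed."""
    problems = []
    if len(original) != len(masked):
        problems.append("row count changed")
    for before, after in zip(original, masked):
        for col in before:
            if col in plan and before[col] == after[col]:
                problems.append(f"{col} not masked in row {before['id']}")
            if col not in plan and before[col] != after[col]:
                problems.append(f"{col} altered in row {before['id']}")
    return problems


source_rows = [
    {"id": 1, "name": "Asha Rao", "card_number": "4111111111111111", "city": "Pune"},
    {"id": 2, "name": "John Smith", "card_number": "5500005555555559", "city": "Mumbai"},
]
masked_rows = mask_rows(source_rows, MASKING_PLAN)
problems = check_masking(source_rows, masked_rows, MASKING_PLAN)
```

The test here checks only that sensitive columns changed, other columns did not, and no rows were lost. A fuller test suite would also cover format, value bounds, and referential integrity.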
Upload
Once masking is done, the masked data must be uploaded to the
various target systems.
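A minimal sketch of this step follows, assuming the target systems are relational databases. In-memory SQLite databases stand in for the real targets, and the `customers` table, the target names, and the `upload` helper are hypothetical.

```python
import sqlite3


def upload(rows, targets):
    """Load masked rows into every target and return the row count per target."""
    counts = {}
    for name, conn in targets.items():
        with conn:  # commits on success, rolls back if an insert fails
            conn.execute(
                "CREATE TABLE IF NOT EXISTS customers "
                "(id INTEGER PRIMARY KEY, name TEXT)"
            )
            conn.executemany(
                "INSERT INTO customers (id, name) VALUES (:id, :name)", rows
            )
        counts[name] = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    return counts


# Hypothetical masked rows and target systems.
masked_rows = [{"id": 1, "name": "Alex Doe"}, {"id": 2, "name": "Sam Roe"}]
targets = {
    "test": sqlite3.connect(":memory:"),
    "training": sqlite3.connect(":memory:"),
}
counts = upload(masked_rows, targets)
```

Comparing the returned counts with the number of masked rows is a simple way to confirm that every target received the full data set.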
Conclusion
In today’s business scenario, data masking is no longer optional; it is a mandatory
activity. Organizations can restore customer confidence by quickly adopting data masking and
integrating it into their data management policy. An organization must choose a tool that
can handle not only the complexity of today’s business environment but can also be easily
customized to handle the complexity of tomorrow. Many organizations are already on this path,
but a clear lack of understanding is making the process unnecessarily complex. Organizations
need help from data privacy experts to create the right awareness internally and to
drive the masking program efficiently, so that they gain an advantage over their competitors.
REFERENCES
1. Protecting Private Data With Data Masking by Noel Yuhanna
2. Build Your Privacy Program: Oversight And Management by Michael Rasmussen
3. Data privacy in Indian IT and ITES-BPO sectors: NASSCOM handbook
(http://www.nasscom.org/download/Information Security in the ITES-BPO sector.pdf)
4. Articles and white papers about Data Privacy:
http://www.itbusinessedge.com/taxonomy/?t=8256