Detect any kind of personal information in your data with auto-tagging. Specify custom data domains using regular expressions or dictionaries, or plug in your own algorithm.
Mask personal information using predefined algorithms or specify a custom one.
Supply your own data discovery results to the masking process. Fine-tune at any time and re-run the masking whenever you need it.
Perform data alignment or find specific cases to replicate in your testing environment using the subsetting module.
Apply data masking algorithms on the fly, with no one accessing personal information in the process.
Use realistic data in your tests, skipping real data that contains personal information entirely. Tune the generation, roll back to the same generated data or create new batches whenever you need them.
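To make the auto-tagging idea concrete, here is a minimal sketch of regex- and dictionary-based domain detection. The domain names, rule set and `tag_value` helper are illustrative assumptions, not the Esplores configuration format.

```python
import re

# Hypothetical domain rules: each domain is detected by a regular expression.
DOMAIN_RULES = {
    "EMAIL": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "PHONE": re.compile(r"^\+?\d[\d\s-]{7,14}$"),
}
# Illustrative dictionary of known first names.
FIRST_NAMES = {"anna", "marco", "giulia"}

def tag_value(value: str):
    """Return the first matching personal-data domain, or None."""
    for domain, pattern in DOMAIN_RULES.items():
        if pattern.match(value):
            return domain
    if value.lower() in FIRST_NAMES:
        return "FIRST_NAME"
    return None
```

A custom algorithm would simply be another branch in the same lookup, e.g. a trained classifier called after the regex and dictionary checks fail.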
Our banking customer adopted three features of the Esplores platform to enforce GDPR compliance:
In the first phase, the privacy department defined which kinds of information to mark as personal and therefore subject to masking.
Then the application developers, in collaboration with the DBAs, defined the actual perimeter: the databases and schemas to scan for personal information.
Esplores allows customers to prepare PoCs or small tests on real environments, to estimate the required resources and time spans and to tune the default configurations.
Three tests were performed. Since they were interdependent, each test was run only after the preceding activity had actually completed: the data masking test was executed only after the data discovery and the validation of personal information fields were finished, and likewise the subsetting test waited until at least one database had been masked.
i. Testing data discovery
Data discovery was tested by configuring read-only access to the business continuity servers of the production environment and running the default discovery on a subset of the perimeter.
After analyzing the results, we estimated the required resources and applied inclusion and exclusion rules.
ii. Testing data masking
Masking requires a test environment that can be refreshed or deleted.
The first test was to clone some database tables and mask some personal data with default parameters.
After the first tuning, a test was run on a database that was about to be refreshed with original data, so a real execution could be performed on a subset of the actual perimeter.
After the test, we estimated the required resources, defined exclusions and planned the application tests needed before covering the entire perimeter.
iii. Testing subsetting
Subsetting required a setup able to read data from the production environment and write to the test environment. After checking permissions, a simple one-row test verified that everything was configured correctly.
Data was identified using the Esplores Data Discovery module, but for some databases the customer had already run another data discovery solution.
Esplores can be complementary with other solutions in two ways:
Validation can be done using the Esplores user interface or through exports and imports.
a. Comparison of data discovery
The customer had run a competitor's data discovery tool that returned a list of fields with their assigned personal information domains. That discovery had two problems:
Esplores used a custom randomization algorithm to sample data from the database, so that the sample contained both old and new records and the chance of null records was minimized. We also used custom dictionaries for names and surnames, and deep learning algorithms to identify personal information.
After retrieving a percentage sample (x%) of the database, with at least 1,000,000 records, the data discovery algorithms were run and the results were submitted to the application development representatives and data officers as a data export containing:
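The sampling idea described above can be sketched as follows: draw values from both the oldest and the newest parts of a table, skipping nulls. This is an illustrative approximation; the real Esplores sampler works against the database, and all names here are hypothetical.

```python
import random

def sample_column(rows, sample_size, seed=42):
    """rows: column values ordered by insertion time (oldest first).
    Returns a mixed sample of old and new records, skipping nulls."""
    rng = random.Random(seed)
    half = len(rows) // 2
    old, new = rows[:half], rows[half:]    # oldest and newest partitions
    picked = []
    while len(picked) < sample_size and (old or new):
        # Alternate randomly between the old and new partitions.
        source = old if rng.random() < 0.5 and old else new
        if not source:
            source = old or new
        value = source.pop(rng.randrange(len(source)))
        if value is not None:              # minimize null records in the sample
            picked.append(value)
    return picked
```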
A comparison table between the old and new discovery results was generated, highlighting discrepancies and missing personal information.
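The comparison step can be sketched as a simple diff over the two result sets; here both discoveries are assumed to be mappings of `"table.column"` to a detected domain, which is an illustrative format, not the actual export layout.

```python
def compare_discoveries(old, new):
    """Return (field, old_domain, new_domain) for every field where
    the two discovery results disagree or one of them is missing."""
    report = []
    for field in sorted(set(old) | set(new)):
        a, b = old.get(field), new.get(field)
        if a != b:
            report.append((field, a or "MISSING", b or "MISSING"))
    return report
```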
b. Validation
Validation required consulting all the developer team leaders to check both the accuracy of the discovery and the technical feasibility of masking each field. Some fields, although containing personal information, must be masked with special rules (e.g. ignoring some values or applying a string format to the content) or even excluded from the masking process, because masking them would break the application or render it unusable (e.g. user names cannot be changed, because users must log in with their actual user names).
Before the actual masking, the validated discovery was configured in the Esplores Data Masking module, and performance and tuning tests were run on databases within the project perimeter.
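A minimal sketch of such field-level masking rules, assuming a hash-based replacement; the rule names and the `mask_value` helper are illustrative, not the Esplores rule syntax.

```python
import hashlib

def mask_value(value, rule):
    """Apply a per-field masking rule to one value."""
    if rule == "exclude":                  # e.g. user names: changing them breaks logins
        return value
    if rule == "keep_if_empty" and not value:
        return value                       # ignore empty/sentinel values
    digest = hashlib.sha256(value.encode()).hexdigest()
    if rule == "format_CF":                # keep a 16-character uppercase code format
        return digest[:16].upper()
    return digest[:len(value)]             # default: same-length replacement
```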
Masking used the following workflow:
Optionally, the data saved on the filesystem can be kept for a limited period, so that checks on the masking can be run or a few records can be rolled back to pass the application tests.
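The backup-and-rollback step can be sketched like this: originals are written to the filesystem before masking, so individual records can later be restored for the application tests. The file layout and function names are hypothetical.

```python
import json
import os
import tempfile

def mask_with_backup(records, mask, backup_path):
    """Save originals to backup_path, then return masked copies."""
    with open(backup_path, "w") as f:
        json.dump(records, f)              # kept for a limited period only
    return {k: mask(v) for k, v in records.items()}

def rollback(masked, keys, backup_path):
    """Restore only the requested records from the backup."""
    with open(backup_path, "r") as f:
        originals = json.load(f)
    restored = dict(masked)
    for k in keys:
        restored[k] = originals[k]
    return restored
```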
To avoid using real data, the customer asked to prepare a fake data generator to populate the test and development environments.
We defined a set of personal information fields to generate, including names, surnames and other personal data.
The fake data generator can be run on demand and used to feed the subsetting phase.
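A seeded generator captures the batch behavior described earlier: the same seed reproduces the same batch (so generated data can be "rolled back"), while a new seed produces a fresh batch. The name lists and record shape are illustrative placeholders.

```python
import random

FIRST = ["Anna", "Marco", "Giulia", "Luca"]          # illustrative dictionaries
LAST = ["Rossi", "Bianchi", "Verdi", "Ferrari"]

def generate_batch(n, seed):
    """Generate n fake person records; the same seed yields the same batch."""
    rng = random.Random(seed)
    return [
        {"first_name": rng.choice(FIRST),
         "last_name": rng.choice(LAST),
         "phone": "+39 3" + "".join(rng.choice("0123456789") for _ in range(8))}
        for _ in range(n)
    ]
```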
© Copyright Esplores - P.IVA 04284390160 - Privacy Policy - Contact us