The Dutch bank Rabobank has found a creative way to use customer data without having to request permission. If you are one of its customers and the bank uses your data in internal tests to develop new services, there is a chance that you will get a different name. Special software pseudonymizes the data, using Latin plant and animal names.
Your first name might become, for example, Rosa arvensis, the Latin name of the field rose, and your street name Turdus merula, the scientific name of the blackbird. It is a useful way for the bank to stay broadly in line with the General Data Protection Regulation (GDPR), which takes effect on the 25th of May. When developing applications or services, analyzing data or running marketing campaigns based on personally identifiable information (PII), companies need explicit consent. To be able to do this after May without asking for your consent, the bank uses data masking / pseudonymization techniques.
Explicit consent & pseudonymization
The new privacy law gives citizens' personal data better protection. One of the cornerstones of the GDPR is the requirement to obtain explicit consent that is tied to a specific purpose. Even with a general consent, companies do not get carte blanche to do whatever they want with your data. Organizations must explain how data is used and by whom, where it is stored and for how long. Companies can work around these limitations if they anonymize or pseudonymize PII: they can still use the data and derive value from it, but without a direct and obvious link to you as a person. You become unrecognizable, but your data remains usable for analysis or tests.
Why scientific animal and plant names?
“According to the rules, you cannot use names that are traceable to the person, but suppose it is a requirement to use letters with names; then you have to come up with something else,” explains the vendor that delivered the software. “That’s how we came up with flower names: you cannot confuse them with real names, but to the system they look like names. Therefore, organizations do not need to change entire programs to comply with the new privacy law.”°
Note that data anonymization / pseudonymization technology does not require you to use plant and animal names. Most implementations of this type convert real names and addresses into fictitious ones that reflect reality more closely and may also better match the usage requirements (e.g. specific application testing requirements). Typically, substitution techniques are applied in which a real name is replaced with another real name.
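A minimal sketch of such a substitution technique is shown below. It is not the vendor's algorithm, which the article does not describe. It assumes a keyed hash (HMAC) is used to pick a replacement from a lookup list, so the same input always maps to the same species name. The `SPECIES` list, the `pseudonymize` function and the key are hypothetical.

```python
import hashlib
import hmac

# Hypothetical lookup list; a realistic one would be far larger to limit collisions.
SPECIES = [
    "Rosa arvensis",
    "Turdus merula",
    "Quercus robur",
    "Bellis perennis",
    "Erinaceus europaeus",
]


def pseudonymize(value: str, secret_key: bytes, field: str) -> str:
    """Map a real value to a species name, consistently for the same input."""
    # Normalise the input and include the field name so that a first name and
    # a street name with the same spelling are hashed as different messages.
    message = f"{field}:{value.strip().lower()}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(SPECIES)
    return SPECIES[index]


key = b"demo-key-not-for-production"  # assumption: kept outside the test environment
print(pseudonymize("Jan", key, "first_name"))
print(pseudonymize("Jan", key, "first_name"))  # same pseudonym as the line above
```

Because the replacement still looks like a name to the application, existing screens and validations keep working. With a list this short, different customers will often share a pseudonym, which may or may not be acceptable depending on the test.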
Pseudonymization vs anonymization
Pseudonymization and anonymization are two distinct terms that are often confused in the data security world. With the advent of the GDPR, it is important to understand the difference, since anonymized data and pseudonymized data fall under very different categories in the regulation. The two differ in one key aspect. Anonymization irreversibly removes any way of identifying the data subject: the data is cleansed of any information that may serve as an identifier. Pseudonymization substitutes the identity of the data subject in such a way that additional information is required to re-identify that person. It does not remove all identifying information from the data but only reduces the linkability of a dataset with the original identity (using, e.g., a specific encryption scheme).
Pseudonymization is a method to substitute identifiable data with a reversible, consistent value. Anonymization is the destruction of the identifiable data.
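The difference can be illustrated with a small sketch. It assumes a hypothetical `TokenVault` that holds the "additional information" needed for re-identification, and a hypothetical `anonymize` function that drops direct identifiers and coarsens the age. Real anonymization usually needs more than this to rule out re-identification, so treat it as an illustration only.

```python
import secrets


class TokenVault:
    """Holds the mapping that makes pseudonymization reversible (hypothetical)."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def pseudonymize(self, value):
        # Consistent: the same value always yields the same token.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def reidentify(self, token):
        # Only possible for someone who has access to the vault.
        return self._value_by_token[token]


def anonymize(record):
    """Drop direct identifiers and generalise the rest; no mapping is kept."""
    decade = record["age"] // 10 * 10
    return {"city": record["city"], "age_band": f"{decade}-{decade + 9}"}


vault = TokenVault()
record = {"name": "Jan Jansen", "city": "Utrecht", "age": 43}

token = vault.pseudonymize(record["name"])
print(vault.reidentify(token))  # Jan Jansen
print(anonymize(record))        # {'city': 'Utrecht', 'age_band': '40-49'}
```

The pseudonymized record can be traced back as long as the vault exists; the anonymized record cannot, because nothing was stored that links it to the original.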
Only for test data management?
You will need to look into your exact use cases and determine what techniques are the most appropriate ones. Every organization will most likely need both. Here are some use cases that illustrate this:
| Use case | Technique to consider |
| --- | --- |
| Your marketing team needs to set up a marketing campaign and will need to use customer data (city, total customer value, household context, …). | Depending on the consent that you received, anonymization or pseudonymization techniques might need to be applied. |
| You are currently implementing a new CRM system and have outsourced the implementation to an external partner. | Anonymization needs to be applied. The data (including the sensitive PII) that you use for test data management purposes will need to be transformed into data that cannot be linked to the original. |
| You are implementing a cloud-based business application and want to make sure that your PII is really protected. You even want to prevent the IT team of your cloud provider (with full system and database privileges) from accessing your data. | Data encryption. Distinct from data masking, data encryption translates data into another form, or code, so that only people with access to a secret key or password can read it. People with access to the system but without the key will not be able to read the real content of the data. |
| You have a global organization that also services EU clients. Due to the GDPR, you want to prevent your non-EU employees from accessing data from your EU clients. | Dynamic data masking accommodates data security and privacy policies that vary with users' roles and locations. Data encryption can also be set up to facilitate this. |
| You have a brilliant team of data scientists on board. They love to crunch all your Big Data and come up with the best analysis. In order to do that, they need all the data you possibly have. | A data lake also needs to be in line with what the GDPR specifies. Depending on the usage, you may need to implement anonymization or pseudonymization techniques. |
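The role- and location-based use case above can be sketched as follows. The policy (only an EU-based account manager sees clear values), the field classification and the function names are hypothetical; real dynamic data masking is normally enforced in the database or a proxy layer rather than in application code.

```python
SENSITIVE_FIELDS = {"name", "iban"}  # hypothetical classification


def mask(value, visible=2):
    """Keep the last `visible` characters and replace the rest with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def view_record(record, user_role, user_region):
    """Return the record as this user is allowed to see it."""
    # Hypothetical policy: clear values only for EU-based account managers.
    if user_role == "account_manager" and user_region == "EU":
        return dict(record)
    return {
        field: mask(value) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


# Fictitious customer; the IBAN is not a valid account number.
customer = {"name": "Jan Jansen", "iban": "NL00BANK0123456789", "city": "Utrecht"}

print(view_record(customer, "account_manager", "EU"))
print(view_record(customer, "analyst", "US"))
```

The stored data is unchanged; only the view differs per user, which is what distinguishes dynamic masking from the static transformations used for test data.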
Is pseudonymization the GDPR silver bullet?
Pseudonymization or anonymization can be one aspect of a good GDPR approach. However, it is definitely not the complete answer, and you will also need to look into a number of other important elements:
Consent Management
Key to the GDPR is consent and the purpose linked to it. To manage the complete consent state, you need to make sure that this information is available to all your data consumers and is applied automatically. You can use consent mastering techniques such as master data management and data virtualization for this purpose.
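As an illustration of applying consent automatically, the sketch below keeps consent per customer and per purpose in one place and filters data before a consumer sees it. The `ConsentRegistry` class, the purpose names and the customer IDs are hypothetical; in practice this role would be played by a master data management or data virtualization layer.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    customer_id: str
    purposes: set = field(default_factory=set)


class ConsentRegistry:
    """Single source of truth for consent per customer and purpose."""

    def __init__(self):
        self._records = {}

    def grant(self, customer_id, purpose):
        record = self._records.setdefault(customer_id, ConsentRecord(customer_id))
        record.purposes.add(purpose)

    def withdraw(self, customer_id, purpose):
        record = self._records.get(customer_id)
        if record is not None:
            record.purposes.discard(purpose)

    def allows(self, customer_id, purpose):
        record = self._records.get(customer_id)
        return record is not None and purpose in record.purposes


def filter_by_consent(customers, registry, purpose):
    """Return only the customers who consented to this specific purpose."""
    return [c for c in customers if registry.allows(c["id"], purpose)]


registry = ConsentRegistry()
registry.grant("c1", "marketing")
registry.grant("c2", "service_improvement")

customers = [{"id": "c1"}, {"id": "c2"}, {"id": "c3"}]
print(filter_by_consent(customers, registry, "marketing"))  # [{'id': 'c1'}]
```

Consent for one purpose does not carry over to another, and a withdrawal takes effect for every consumer that goes through the registry.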
Data Discovery & Classification
The GDPR is all about protecting personal data. Do you know where all your PII is located? Data discovery tools automatically locate and classify sensitive data and calculate risk / breach cost based on defined policies.
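A very simplified sketch of the discovery step: scan sample rows and flag columns whose values look like PII. The two regular expressions are rough assumptions (an email shape and an IBAN-like shape) and the function name is hypothetical; commercial discovery tools use far richer rules and also score risk, which is not attempted here.

```python
import re

# Rough, assumed patterns; they will produce both false positives and misses.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def classify_columns(rows):
    """Return {column: set of PII labels found in at least one value}."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            for label, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(label)
    return findings


# Fictitious sample data; the account numbers are not valid IBANs.
sample = [
    {"contact": "jan@example.com", "account": "NL00BANK0123456789", "city": "Utrecht"},
    {"contact": "n/a", "account": "NL00BANK9876543210", "city": "Breda"},
]

print(classify_columns(sample))
```

Columns that never match, such as `city` in this sample, are simply absent from the result.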
Data Register
A data register is also a key GDPR requirement. You are expected to maintain a record of the processing activities under your responsibility; in other words, you must keep an inventory of all personal data processed. The minimum information goes beyond knowing what data an organization processes. It should also include, for example, the purposes of the processing, whether or not the personal data is exported, and all third parties receiving the data.
A data register that is integrated in your overall data governance program and that is linked with the reality of your data landscape is the recommended way forward.
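The minimum content of such a register can be sketched as a simple data structure. The field names and the example entries are hypothetical; they mirror the items listed above (data processed, purposes, export, third-party recipients) and are not a complete legal checklist.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingActivity:
    name: str
    purposes: list
    data_categories: list
    recipients: list = field(default_factory=list)  # third parties receiving the data
    exported: bool = False                          # is the personal data exported?


def incomplete_entries(register):
    """Names of entries that lack a purpose or a data category."""
    return [a.name for a in register if not a.purposes or not a.data_categories]


def shared_with_third_parties(register):
    """Map each activity that shares data to its recipients."""
    return {a.name: a.recipients for a in register if a.recipients}


register = [
    ProcessingActivity(
        "CRM test environment",
        purposes=["application testing"],
        data_categories=["name", "address"],
        recipients=["implementation partner"],
    ),
    ProcessingActivity(
        "Marketing campaign",
        purposes=["direct marketing"],
        data_categories=["city", "customer value"],
    ),
    ProcessingActivity(
        "Data lake ingest",
        purposes=[],
        data_categories=["transactions"],
        exported=True,
    ),
]

print(incomplete_entries(register))         # ['Data lake ingest']
print(shared_with_third_parties(register))  # {'CRM test environment': ['implementation partner']}
```

Keeping the register as structured data rather than a document makes it possible to check it against the actual data landscape, for example by comparing `data_categories` with the output of a discovery scan.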
° Financieele Dagblad