A call for a common techno-legal language to speak about anonymisation, pseudonymisation, de-identification… Could this be one of the biggest challenges brought about by the GDPR?


The General Data Protection Regulation (GDPR) will be applicable in less than two years and lawyers as well as others are trying to grapple with definitional issues.

The graduated approach, which would have relaxed the regime for certain categories of data such as pseudonymised data (e.g. by eliminating the need to comply with a certain number of data subject rights), has officially been rejected.

With this said, we still find within both the binding part of the GDPR and its recitals the words “data which have undergone pseudonymisation” and “pseudonymisation.” One interesting question is therefore what the precise legal effects of pseudonymisation are, assuming we know what it means.

In a previous post I highlighted the problems raised by this (legal) concept, which seems to be both too narrow and too broad.

I’ll repeat the words of Article 4(5) of the GDPR for the sake of clarity:

‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person;

Article 4(5) appears too narrow because it seems to be quite demanding: it seems to require that it must not be possible to attribute the data that has undergone pseudonymisation to an identifiable natural person. At the same time, it appears too broad in the sense that it becomes difficult to distinguish between pseudonymisation and anonymisation, unless one agrees to speak of anonymised data only once the additional information has been destroyed.

Going further, is it true that pseudonymisation within the meaning of Article 4(5) of the GDPR really equates to the ‘mere’ removal of direct identifiers? Shouldn’t we also resort to generalisation techniques to pseudonymise data within the meaning of Article 4(5) of the GDPR?
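To make the contrast between the two techniques concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a statement of what Article 4(5) requires: the field names, the sample record, and the choice of a keyed hash (HMAC) to derive the pseudonym are all hypothetical. The first function removes direct identifiers and returns the "additional information" as a separate lookup table; the second generalises the remaining quasi-identifiers.

```python
import hashlib
import hmac
import secrets

# Hypothetical choice of which fields count as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymise(records, key):
    """Strip direct identifiers and replace them with a keyed pseudonym.

    Returns (pseudonymised_records, lookup). The lookup table and the key
    are the 'additional information' that would have to be kept separately.
    """
    pseudonymised, lookup = [], {}
    for rec in records:
        # Assumption: the email address is a stable identifier for the person.
        token = hmac.new(key, rec["email"].encode(), hashlib.sha256).hexdigest()[:16]
        lookup[token] = {field: rec[field] for field in DIRECT_IDENTIFIERS}
        out = {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS}
        out["pseudonym"] = token
        pseudonymised.append(out)
    return pseudonymised, lookup


def generalise(record):
    """Coarsen quasi-identifiers: exact age to a ten-year band,
    full postcode to its first part."""
    out = dict(record)
    decade = (out.pop("age") // 10) * 10
    out["age_band"] = f"{decade}-{decade + 9}"
    out["postcode"] = out["postcode"].split()[0]
    return out


if __name__ == "__main__":
    records = [
        {"name": "Alice Example", "email": "alice@example.org",
         "age": 34, "postcode": "SO17 1BJ", "diagnosis": "asthma"},
    ]
    key = secrets.token_bytes(32)  # to be stored apart from the dataset
    pseudo, lookup = pseudonymise(records, key)
    print(pseudo[0])              # no name or email, but exact age and postcode remain
    print(generalise(pseudo[0]))  # age band and postcode district only
```

After the first step alone, the exact age and full postcode are still present, which is precisely why the question of generalisation arises.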

What does identifiable really mean? One has to read Recital 26 of the GDPR to have the answer:

To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly.

As a result, it is necessary to take into account the means of third parties (and not only those of the data controller) to determine whether one is actually dealing with data that has undergone pseudonymisation!
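One way to picture "singling out" in technical terms is to count how many records share the same combination of indirect identifiers. The Python sketch below is a rough technical proxy only, assuming hypothetical field names and sample data; it is not the legal test of Recital 26, which also depends on who holds which means and how likely they are to be used.

```python
from collections import Counter


def singled_out(records, quasi_identifiers):
    """Return the records whose combination of quasi-identifier values
    is unique in the dataset, i.e. records that can be singled out
    without any direct identifier."""
    combos = [tuple(rec[q] for q in quasi_identifiers) for rec in records]
    counts = Counter(combos)
    return [rec for rec, combo in zip(records, combos) if counts[combo] == 1]


if __name__ == "__main__":
    # Hypothetical pseudonymised dataset: no names, only indirect attributes.
    dataset = [
        {"pseudonym": "a1", "age_band": "30-39", "postcode": "SO17"},
        {"pseudonym": "b2", "age_band": "30-39", "postcode": "SO17"},
        {"pseudonym": "c3", "age_band": "80-89", "postcode": "SO17"},
    ]
    unique = singled_out(dataset, ["age_band", "postcode"])
    print([rec["pseudonym"] for rec in unique])  # only the 80-89 record is unique
```

Any third party who knows that a particular person in that postcode district is in their eighties could isolate that record, even though the dataset carries no name.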

In this sense, if we had to draw an axis pointing towards the ‘horizon’ of anonymisation and locate on it pseudonymised data and de-identified data within the meaning of Article 11 of the GDPR [to avoid any confusion: Article 11 does not use the words “de-identified data” but governs situations in which the data controller is not able to identify the data subject], data that has undergone pseudonymisation would appear ‘less’ personal than de-identified data within the meaning of Article 11. Why?

Because Article 11 is only concerned with the means of the data controller.

Article 11 of the GDPR reads as follows:

1. If the purposes for which a controller processes personal data do not or do no longer require the identification of a data subject by the controller, the controller shall not be obliged to maintain, acquire or process additional information in order to identify the data subject for the sole purpose of complying with this Regulation.
2. Where, in cases referred to in paragraph 1 of this Article, the controller is able to demonstrate that it is not in a position to identify the data subject, the controller shall inform the data subject accordingly, if possible. In such cases, Articles 15 to 20 shall not apply except where the data subject, for the purpose of exercising his or her rights under those articles, provides additional information enabling his or her identification.

Interestingly, de-identified data within the meaning of Article 11 renders a certain number of data subject rights inapplicable. Yet, it is not expressly stated in the GDPR that pseudonymised data could have the same effect! Still, could pseudonymised data ever have the same effect?

Well, the answer should be that pseudonymised data within the meaning of Article 4(5) of the GDPR [it is crucial to understand that data that has undergone pseudonymisation within the meaning of the GDPR has little to do with what many call “pseudonymous data”] is a subcategory of de-identified data within the meaning of Article 11 when the data controller at issue is not the initial controller, i.e. the controller who actually pseudonymised the dataset, but a recipient of that dataset [or when the initial data controller has destroyed the additional information]!

Therefore, depending upon who the data controller is (the initial data controller or the recipient of the dataset) [or whether the initial data controller has destroyed the additional information], dealing with pseudonymised data should render Articles 15 to 20 inapplicable.

What is more, Article 4(5) of the GDPR is quite ‘knotty’ inasmuch as the test for determining whether an individual is identifiable, found in Recital 26 of the GDPR, appears to be slightly stricter than the test found in Recital 26 of the Data Protection Directive (DPD). Recital 26 of the GDPR states that:

To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.

There is thus a need to take into account technological developments to determine whether the data has undergone pseudonymisation! [Wow!]

Why did the EU legislator choose to include a definition of pseudonymisation? Wouldn’t we be better off without such a definition, I wonder. Sometimes, the quest for technological neutrality could simply mean fewer words, couldn’t it?

I don’t think I am playing the picky (biased) academic here; I am just thinking that to be able to give sensible legal advice it is essential to have clear definitions in place. And it is actually important to understand what data that has undergone pseudonymisation is if a data controller needs to explain what his or her degree of responsibility is.

One more thing. Although the GDPR builds upon the DPD as regards its definition of personal data, there is a notable difference between the two texts. Article 4(1) of the GDPR expressly refers to online identifiers. However, and this is important, online identifiers are not expressly equated with IP addresses. What is an online identifier? Is an IP address an online identifier? If the answer to this question is positive, does it mean that the recent Breyer case [commented here] will have to be rewritten in part once the GDPR is applicable, so that IP addresses should always be considered personal data?

The recent decision of the French Supreme Court concerning the characterisation of IP addresses, in its brevity, does not really rely upon contextual considerations, although examining the context of the case at hand would have helped the Court to justify its position.

My hope is that the contextual approach followed by the Court of Justice of the European Union (CJEU) in Breyer will survive the GDPR. Why? Because it is crucial to understand that the data environment is almost [or at least] as important as the data itself to assess the fairness of the situation! The conditions for accessing and sharing the data are crucial considerations, as recognised by the CJEU in the Digital Rights Ireland case, although the CJEU made this point in a different context.

To continue with Breyer, I still have more questions than answers but I am tempted to write that the CJEU delivered, in its wisdom, a decision that should not be forgotten too quickly.

The GDPR is a fantastic adventure for many different reasons. It is here to stay, and we should make the most of it! Nonetheless, to make the most of it, shouldn’t we be speaking the same language?

And with this, I warmly thank the organisers of the Brussels Privacy Symposium for the very high quality of the debate!

Sophie Stalla-Bourdillon