Here we are! It took the Italian Data Protection Supervisory Authority (SA) only a few months to react to the widespread use of ChatGPT. On 30 March the SA ordered the immediate suspension of the processing of the personal data of data subjects located in Italy. OpenAI was quick to react to the order, making ChatGPT unavailable to users in Italy. The ChatGPT order follows the Replika order of 2 February 2023 and the TikTok order of 22 January 2021 [Quite a busy SA! Clap!].
When attempting to reach the chat.openai.com website from Italy, the landing page now states:
“We regret to inform you that we have disabled ChatGPT for users in Italy at the request of the Italian Garante.”
The OpenAI support team promises to “engage with the Garante with the goal of restoring (…) access as soon as possible” as they believe that they “offer ChatGPT in compliance with GDPR and other privacy laws.” [Really!?!]
What is happening? What are the reasons for such an order?
The Italian SA leveraged Article 58(2)(f) to react urgently against ChatGPT practices, without having to worry about GDPR Article 56(1) and the one-stop-shop procedure. It identified a series of violations, i.e., violations of GDPR Articles 5, 6, 8, 13 and 25. [By the way, the order is brief and the reasoning of the SA is relatively elliptical, as it had to react quickly].
Let’s explore the interplay between Articles 5, 6 and 8 and the implications of the order in terms of legal basis.
There are two different kinds of processing activities at stake:
1. AI training, which can be subdivided into two sets of processing activities:
- AI training before deployment, i.e. prior to making ChatGPT available.
- AI training after deployment, i.e. service improvement, which requires the processing of users’ input data to enhance the performance of the model.
2. Provision of service strictly speaking, which covers a set of processing activities such as account registration, query processing and result production and display.
ChatGPT’s privacy policy [or notice] does not cover activities related to AI training undertaken before deployment.
However, both AI training after deployment, i.e. service improvement, and provision of the service are covered by the privacy policy. These two types of activities are said to be grounded in the “performance of a contract” legal basis. This is how the privacy policy reads: “Our legal bases for processing your Personal Information include: … Performance of a contract with you when we provide, maintain, and improve our Services. This may include the processing of Account Information, Content, and Technical Information.”
Considering the recent European Data Protection Board (EDPB) decisions, it is not surprising to read that the Italian SA is not satisfied with such a legal basis. In its decision 4/2022 targeting the Instagram service, the EDPB clearly stated that the performance of a contract legal basis cannot be relied upon to justify processing activities amounting to behavioural advertising (see paras 136 and 137). The same is stated in the EDPB’s 3/2022 decision targeting the Facebook service (paras 132 and 133). “Article 6(1)(b) GDPR does not cover processing which is useful but not objectively necessary for performing the contractual service, even if it is necessary for the controller’s other business purposes”, explained the EDPB.
To use the words of the Irish DPC in the September 2022 Instagram decision [although it’s unclear whether the Irish DPC fully understands the implications of its own words], at the very least “the assessment of purpose in Article 6(1)(b) requires the identification of contractual obligations to establish the extent to which processing of personal data may be necessary” (para. 81). A reasonable reading of these words would imply that service improvement is not objectively necessary for performing the contract. [Besides, the OpenAI terms of use do not even mention service improvement].
Now comes the burning question: could OpenAI easily cure the violation of the lawfulness principle?
The answer is probably positive for training after deployment, although this is not to say that Article 13 violations and Article 5 violations, in particular violations of the principle of purpose limitation, will be automatically cured. [The argument based upon the principle of accuracy is maybe less convincing, assuming there is an effective right of rectification in place and the LLM is not just making things up… but ChatGPT does make things up]. The EDPB’s ancestor made it [relatively] clear in its 03/2013 opinion that “a purpose that is vague or general, such as for instance ‘improving users’ experience’, ‘marketing purposes’, ‘IT-security purposes’ or ‘future research’ will – without more detail – usually not meet the criteria of being ‘specific’.”
As regards training before deployment, the answer is less obvious, although the Google Spain case teaches us that the balancing of the legitimate interests pursued by search engines and internet users against the interests or fundamental rights and freedoms of the data subject to whom the data pertains can lead to a compromise: the processing is lawful up until a legitimate request for deletion is received by the data controller. ChatGPT is obviously different from the Google search engine, at least in its 2014 version [although they are getting closer by the day]; one notable difference being the ease with which data subjects would be able to exercise their rights, in particular their rights to object and to deletion [exercising one’s data protection rights becomes harder with generative language models like ChatGPT, to say the least].
However, and this is an important caveat with implications for both the child and the adult user experience, the legal assessment is further complicated once children are taken into account.
The EDPB decision 2/2022 suggests that the legitimate interest legal basis is not straightforward to use when the data subjects are minors, although it accepts that Article 6(1)(f) remains in principle available when Article 6(1)(b) cannot be relied upon (para. 105). The EDPB [rightly] stresses that “[i]n order to ensure that the interests and fundamental rights and freedoms of data subjects do not override the legitimate interests pursued, the safeguards in question must be adequate and sufficient, and must unquestionably and significantly reduce the impact on data subjects.” [Note that in this case the Irish SA had found that “the contact information processing by [Meta IE] (both before September 2019, and after) result[ed] in high risks to the rights and freedoms of child users, for the purposes of Article 35(1) GDPR”.]
Although the Italian SA refers to GDPR Article 8, parental consent will not necessarily work for all processing activities [or will it?]. The EDPB in its guidance on consent states that “Generally, consent can only be an appropriate lawful basis if a data subject is offered control and is offered a genuine choice with regard to accepting or declining the terms offered or declining them without detriment.” Technically speaking, service improvement is probably the only logical candidate for consent here, while service provision should be based upon the legitimate interests of the data controller and third parties for users who cannot validly enter into enforceable contracts. [It’s unclear whether the Italian SA has this distinction in mind].
Assuming parental consent can cure lawfulness, this will mean implementing age verification or estimation techniques that are more impactful, but also potentially more intrusive, than age declaration. The Italian SA expressly stated that it was in favour of age estimation measures when discussing its order targeting TikTok. [Other SAs, such as the French SA, are more cautious with age verification and estimation techniques, see here].
However, assuming parental consent can cure lawfulness, it is not obvious why data protection law would require content filtering measures… unless the legal basis for service provision is in fact the legitimate interest legal basis, which would require implementing protective measures for children [at least in theory].
This is where data protection meets online safety! One should, however, be careful not to merge the two and associate content filtering measures with GDPR Article 25, or data protection by design and by default, too quickly, without carefully unpacking their implications [carefully unpacking the implications of age verification and age estimation would also be a good idea, by the way!].
Post Scriptum: There are many [important] reasons [slightly different from the absence of content filtering] that would justify halting the release of ChatGPT (see here). Let’s hope more SAs will follow and help unpack the high risks to the fundamental rights of all data subjects.