Artificial Intelligence and Data Protection: The EDPB Opinion 28/2024

  • Insight Articles 14 March 2025
  • UK & Europe

  • Regulatory & Investigations - Technology Risk

  • Cyber Risk

On 18 December 2024, the European Data Protection Board (“EDPB”) published its “Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models” (“Opinion”), dealing with the following three questions:

  1. Under which circumstances can an AI model be considered “anonymous”?
     
  2. Under which circumstances can “legitimate interests” under Article 6(1)(f) GDPR be considered the proper legal basis for the processing of personal data in connection with the development and provision of an AI model?
     
  3. How can unlawful processing of personal data in the context of development of an AI model influence any subsequent processing?

The EDPB emphasises that the Opinion merely provides an outline for the detailed assessments to be carried out by supervisory authorities. Precisely because those assessments remain case-specific, companies that process personal data in connection with AI systems should consider the EDPB’s comments carefully.

Anonymity of AI models

The EDPB asserts that not all AI models that process personal data can be considered anonymous; rather, the threshold for anonymity is high, and the assessment must always be made on a case-by-case basis. Given the broad definition of “personal data” under the GDPR, supervisory authorities should assume that AI models trained on personal data require a thorough assessment.

An AI model may be deemed anonymous if the probability of extracting personal data from the model – either directly or through queries – can be considered negligible for each data subject. This requires that the likelihood of identification is insignificant for all individuals whose data contributed to the model’s development. If personal data from a public figure, for example, were processed, the likelihood of identification could increase for this person alone, compromising the anonymity of the entire model.

The EDPB highlights the importance of considering the Article 29 Working Party’s guidelines on anonymisation, specifically its “2014 Opinion on Anonymisation Techniques”. However, this document does not address AI models, and an update is needed. The EDPB’s 2024/2025 work programme indicates that new anonymisation guidelines will be developed.

Key to the assessment is whether any means could reasonably be used by the controller or a third party to identify data subjects. Factors to consider include the AI model’s design, the data and methods used for training, and the state of technological development. Importantly, identification by unlawful means should not be taken into account, as, according to ECJ case law, only lawful means are relevant. This distinction matters because it calls into question the EDPB’s view that the risk of cyber-attacks automatically increases the likelihood of identification and thereby undermines anonymity.

The EDPB also outlines factors that may indicate whether an AI model is anonymous, although the presence of these factors does not guarantee anonymity, nor does their absence preclude it. The selection of data sources is critical, with a focus on whether the controller took steps to avoid or limit the collection of personal data. Likewise, processing methods such as pseudonymisation or anonymisation, and training techniques such as differential privacy, are crucial considerations.
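The Opinion itself contains no technical detail, but the distinction the EDPB draws between pseudonymisation and privacy-preserving training techniques such as differential privacy can be illustrated with a minimal Python sketch. The key value and the epsilon parameter below are illustrative assumptions, not anything prescribed by the Opinion or the GDPR:

```python
import hashlib
import hmac
import random

# Hypothetical secret key; in practice it would be held in a key
# management system, separate from the pseudonymised data set.
SECRET_KEY = b"replace-with-managed-secret"


def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).

    The mapping can be re-created by whoever holds the key, so under
    the GDPR this is pseudonymisation, not anonymisation, and the
    output remains personal data.
    """
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()


def dp_noisy_count(true_count: int, epsilon: float = 1.0) -> float:
    """Return a differentially private count via the Laplace mechanism.

    The difference of two i.i.d. exponential samples with rate epsilon
    follows a Laplace distribution with scale 1/epsilon, which matches
    the sensitivity of 1 for a simple counting query. A smaller epsilon
    means stronger privacy but a noisier answer.
    """
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise
```

Keyed hashing merely decouples direct identifiers from records and remains reversible for the key holder, so its output is still personal data; the Laplace mechanism, by contrast, bounds what any single individual’s record can contribute to an aggregate output, which is the kind of property the EDPB’s anonymity assessment looks for.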

For fully developed AI models, it should be verified whether they align with the initial design and whether measures to reduce identification risk have been effectively implemented. Documentation is key: controllers must document technical and organisational measures to reduce the risk of identification. A failure to provide this documentation may indicate non-compliance with the GDPR, particularly if the controller claims anonymity without substantive evidence.

Requirements for ‘legitimate interests’ as a legal basis

The response to the second question of the Opinion addresses the requirements for processing personal data in the context of developing and providing AI models under the legal basis of legitimate interests, as outlined in Article 6(1)(f) GDPR. The EDPB outlines three cumulative conditions for relying on this legal basis: (1) a legitimate interest, (2) the necessity of the processing for that interest, and (3) a balancing of interests, where the data subject's rights must not override the legitimate interests.

a) Legitimate Interest

A ‘legitimate interest’ must be real, lawful, clearly defined, and non-speculative. It may include legal, economic, or non-material interests, but it must be substantiated. In the context of AI, examples include the development of AI systems like chatbots, fraud detection, or IT risk systems.

b) Necessity of Processing

Processing must be necessary to achieve the intended purpose, and no less intrusive means may be available. The amount of personal data processed must be proportionate to that purpose. In particular, controllers should carefully determine the necessary scope of personal data before developing an AI model, and document this determination to avoid later disputes. The use of web scraping for data collection makes it particularly challenging to justify the need for large data sets, especially in light of the data minimisation principle under Article 5(1)(c) GDPR.

c) Balancing of Interests

When balancing interests, the EDPB points to potential risks for data subjects, such as the large-scale collection of personal data by AI systems, which may create feelings of surveillance or lead to self-censorship. Controllers must assess, among other things, the nature and sensitivity of the data, the context in which it is processed, and any potential impact on data subjects’ rights. The reasonable expectations of data subjects must also be considered: given the complexity of AI models, data subjects may not fully understand the extent of the processing, which makes transparency obligations all the more important.

The EDPB notes that merely fulfilling the transparency obligations under Articles 13 and 14 GDPR may not suffice to ensure that data subjects can reasonably expect their data to be processed. Recent ECJ case law reinforces the need for careful compliance with transparency requirements, as a failure to communicate the legitimate interest pursued may invalidate this legal basis for processing.

d) Mitigating Measures

Measures beyond the minimum legal requirements can strengthen the balancing of interests in favour of the controller. For example, controllers could implement waiting periods between data collection and processing to allow data subjects to exercise their rights. The EDPB also suggests granting unconditional rights, such as the right to erasure or objection, even if the conditions under Articles 17 or 21 GDPR are not met.

Controllers may also provide additional transparency through public information measures, such as media campaigns or FAQs, or publish their balancing of interests to increase fairness. However, measures that go beyond legal obligations carry risks of their own: an unconditional right to erasure could deprive the controller of data needed for AI model development, and publishing the balancing of interests might disadvantage the controller in future disputes. Controllers should therefore carefully assess whether additional measures align with their legitimate interests and consider the potential impact on data subjects.

Possible Consequences of Unlawful Processing in AI Development

Lastly, the EDPB outlines three scenarios in which unlawful processing of personal data during the development of an AI model may affect subsequent processing:

a) Scenario 1

If unlawful processing occurs during development and the personal data is retained in the AI model, supervisory action may follow. If a supervisory authority orders deletion of the data, further processing is prohibited, potentially limiting or halting the use of the AI model and causing economic damage. In addition, the lack of a legal basis for the initial processing may affect the balancing of interests for the subsequent processing.

b) Scenario 2

If another controller processes the data at a later stage, it must ensure that no unlawful processing occurred during the development of the AI model. This includes verifying the data sources and checking for prior decisions by supervisory authorities or courts. In practice, however, the second controller may not have access to this critical information. The EDPB also clarifies that a declaration of conformity for high-risk AI systems under the AI Act does not automatically demonstrate GDPR compliance.

c) Scenario 3

If personal data is anonymised after unlawful processing, and the anonymisation is effective, the GDPR no longer applies, and the original unlawful processing does not affect subsequent processing.

What’s Next?

The Opinion provides a valuable framework for addressing data protection issues related to AI models, particularly with a view to regulatory reviews. It is noteworthy, however, that the EDPB frequently suggests that controllers adopt measures beyond the minimum legal requirements. While this is justified where such measures improve the balancing of interests, it should not lead to AI models being viewed with undue suspicion, nor should the absence of measures going beyond what the law requires be viewed negatively by default.

Controllers can gain significant insights from the Opinion when developing and deploying AI models. They must carefully assess whether and to what extent the EDPB's criteria should be incorporated into their processes. Regardless, it is essential for controllers to prioritise transparency and documentation obligations. This ensures they can consistently substantiate their processing activities and demonstrate ongoing compliance with data protection requirements.


Areas:

  • Legal Developments
