The EU’s Digital Strategy focuses on the balanced development of artificial intelligence (AI) and the protection of intellectual property rights. It faces two key challenges: implementing the provisions on Text and Data Mining (TDM) and complying with the AI Act – Regulation (EU) 2024/1689 laying down harmonized rules on AI.
The purpose of legal text and data mining (TDM)
Text and data mining (TDM) is an automated process that enables the computerized analysis of large amounts of data to extract patterns, trends and new information. TDM is used in many areas, including:
- Research and science – it enables faster discovery of new medical therapies, analysis of scientific literature and prediction of epidemiological trends.
- Developing artificial intelligence – it improves machine learning models and supports natural language understanding and text generation.
- Journalism and data analysis – it helps detect fake news, analyze social networks and research economic trends.
- Industry and business – companies use TDM to analyze market trends, automate document processing and improve business strategies.
EU legislation recognizes the value of TDM and provides specific exceptions for the legal implementation of this process, while protecting the rights of copyright holders.
Text and data mining and the right to opt-out
Directive (EU) 2019/790 on copyright and related rights in the Digital Single Market (the DSM Directive) introduces rules that allow the lawful mining of text and data but, in some cases, give right holders the possibility to opt out of this process.
Text and data mining is regulated by the Slovenian Copyright and Related Rights Act (in Slovene: Zakon o avtorski in sorodnih pravicah; the ZASP), in Articles 57a and 57b.
- Article 57a of the ZASP stipulates that, for the purposes of text and data mining, the reproduction of lawfully accessed works is free. It also provides that security measures must not unduly restrict mining, and it gives authors an opt-out right: they can expressly reserve the use of their works for text and data mining.
- Article 57b of the ZASP provides for TDM for scientific research purposes, setting out the conditions for research organizations and the obligation to store data in a secure environment.
These articles transpose the provisions of Directive (EU) 2019/790 on copyright and related rights in the Digital Single Market (the DSM Directive) and set out the conditions under which users can carry out text and data mining, while protecting the rights of copyright holders.
The problem here is mainly of a practical nature: how to ensure an efficient, transparent and uniform system that would allow these exceptions to be implemented in a way that does not hamper the development of artificial intelligence and data analysis, while at the same time providing a sufficient level of legal certainty. Currently, there is no single solution to manage these exceptions, resulting in legal and technical uncertainty for AI researchers and developers.
Copyright obligations in the AI Act
The AI Act, which sets out the regulatory framework for the development and use of artificial intelligence in the EU, introduces strict transparency requirements for the data sources used by AI models. A key article governing copyright and TDM is Article 53 of the AI Act, which:
- requires providers of general-purpose AI models to publish a sufficiently detailed summary of the content used to train their models. This means that companies must disclose what content they have used for training, which has direct implications for copyright and compliance with the TDM exceptions;
- requires providers of general-purpose AI models (such as large language models) to put in place a policy to comply with EU copyright law, in particular to identify and respect rights reservations (opt-outs) expressed under the DSM Directive.
These provisions mean that AI providers and developers must not only monitor potential opt-outs from the TDM exception under the DSM Directive, but also ensure transparency about the data they use, which adds to the complexity of compliance.
The European Commission is looking for solutions
In order to better understand and address these issues, the European Commission has launched a call for a feasibility study for a central registry of opt-outs under the TDM exception. The aim of this initiative is to explore how a single, centralized system could be put in place that would allow for the effective management of exceptions without hampering innovation and research in the field of AI.
Through this study, the EU aims to gain insight into the technical and legal aspects of exception management, assess the possibilities for standardizing processes, and examine how compliance with existing and future legislative requirements, as set out in the DSM Directive and the AI Act, could be ensured.
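To make the idea of a central opt-out registry more concrete, the following is a minimal sketch of how a developer might check content sources against such a registry before mining them. It is an illustration only: no such registry exists yet, and the class `OptOutRegistry`, the function `filter_minable` and the choice to record opt-outs per website host are all assumptions, not features of the Commission's tender or of any existing system.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class OptOutRegistry:
    """Hypothetical in-memory stand-in for a central TDM opt-out registry.

    Assumption: opt-outs are recorded per website host. A real registry
    could just as well work per work, per right holder or per licence.
    """

    reserved_hosts: set = field(default_factory=set)

    def register(self, host: str) -> None:
        # Record that the right holder behind this host has reserved TDM use.
        self.reserved_hosts.add(host.lower())

    def is_reserved(self, url: str) -> bool:
        # Extract the host from the URL; malformed URLs yield an empty host.
        host = (urlparse(url).hostname or "").lower()
        return host in self.reserved_hosts


def filter_minable(urls, registry):
    """Return only the URLs whose host has no recorded opt-out."""
    return [url for url in urls if not registry.is_reserved(url)]


registry = OptOutRegistry()
registry.register("news.example.org")

candidates = [
    "https://news.example.org/article-1",
    "https://open.example.com/dataset",
]
minable = filter_minable(candidates, registry)
```

In this sketch, only the second URL remains in `minable`, because the first belongs to a host with a recorded opt-out. A real system would also have to address questions the sketch ignores, such as who may register an opt-out, how it is verified and from what date it applies.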
The call clearly demonstrates the European Union’s commitment to finding innovative solutions that will enable the further development of digital technologies, while protecting the interests of rights holders and promoting transparency in the use of data in AI.
*This article was generated using ChatGPT, an artificial intelligence tool developed by OpenAI.
Source: European Commission tender documents no. EC-CNECT/2025/OP/0002 https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/opportunities/tender-details/8726813a-bd9b-4f58-8679-01c80f7a1abf-CN
Ethical review for publishing an AI-generated article on the website
Quality control: Aljaž Jadek guided ChatGPT in the writing of this article with various prompts. Aljaž Jadek has reviewed, corrected and updated the article and confirms that the quality of the article is sufficient for publication and that there are no AI hallucinations.
Avoiding plagiarism: The article has been reviewed and checked for plagiarism by Aljaž Jadek, who confirms that the article does not constitute plagiarism.
Maintaining creativity: Aljaž considers the content to be unique, informative, interesting and thus suitable for publication.
Ethical considerations: In Aljaž’s view, the content is not misleading or otherwise ethically problematic and is therefore suitable for publication.
Designation of AI content: It is transparently disclosed that the article was created by AI, i.e. ChatGPT.
Best SEO practices: The article was not created and published to manipulate search rankings, but is informative and describes an interesting intersection of two legal regimes – copyright protection and artificial intelligence.