Citing copyright infringement, the Netherlands-based organization BREIN has succeeded in taking down a large language dataset that was being used to train AI models.

In a statement released on Tuesday, BREIN explained that the dataset comprised 10,000 books, news articles, and Dutch-language subtitles for movies and TV series, all obtained without permission.

EU’s AI Act aims to regulate training data sources

According to BREIN director Bastiaan van Ramshorst, it was not immediately clear how widely the dataset had been used by AI firms. “It’s very difficult to know, but we are trying to be on time” to avoid future lawsuits, he said.

The European Union’s recently proposed AI Act will also require AI companies to provide access to their datasets and to the sources of the data used to train AI models. Related legal battles are still being fought in the United States. For example, Microsoft-backed OpenAI faces several legal disputes, including a recent one with The New York Times.

Microsoft is alleged to have copied the plaintiff’s registered journalism works in addition to other copyrighted journalism works. On the issue of potential infringement, the company’s CEO has been quoted as saying that the company has this data.

The allegations suggest that Microsoft used these copyrighted materials in AI products, including ChatGPT and Copilot, without obtaining licenses. The complaint specifically accuses Microsoft of removing significant information from these works, such as the author’s name, the title of the work, the ‘copyright’ watermark, and other restrictions.

In Denmark, anti-piracy measures have also produced substantial results in the fight against copyright infringement. Last year, the Danish Rights Alliance, a copyright protection group based in Denmark, demanded and obtained the removal of the “Books3” dataset from the Internet.

Dataset provider complies with court order, removes content

The person who provided the Dutch dataset complied with the order pursued by BREIN. As a result, the dataset was taken down from the website that had previously offered it for download. BREIN declined to disclose the identity of the person involved in this case, citing Dutch privacy laws.

The removal of this dataset shows that copyright enforcement groups continue to fight for the protection of intellectual property rights in the digital world. To address the mass scraping of copyrighted materials, BREIN recommends that rights holders make use of the reservation provided under the Copyright Act (Article 15o.1).
