Cornell study reveals AI can regenerate famous books with amazing accuracy, sparks copyright concerns

Several authors around the world have accused AI companies of using their books to train AI models. Some of these cases have also reached courts, where companies have argued in their defence that their models do not keep exact copies of copyrighted books used during training and cannot reproduce them. However, these companies may soon face fresh legal trouble as new research suggests otherwise.

Researchers at Cornell University claim that fine-tuning, which means further training an already trained model, can enable AI models to reconstruct up to 85–90 percent of copyrighted books. The researchers also said that the models were able to reproduce books they were not trained on during the fine-tuning stage.

AI companies have added extra safeguards to prevent their models from reproducing exact text from copyrighted books, articles, or other protected content. However, the researchers claim that fine-tuning can change how these models behave and allow them to expand short plot summaries into full text stories.

The researchers conducted experiments on models including GPT-4o, Gemini 2.5 Pro, and DeepSeek-V3.1. After additional training, they found that these models could expand plot summaries to reconstruct up to 85–90 percent of copyrighted books, even if those books were not used during the fine-tuning stage.

Recall extended beyond the trained author

Researchers also found that the fine-tuned models were not limited to the author whose works were used during training. They said the models were fine-tuned only on novels by Haruki Murakami, but the models could recall verbatim passages from books written by more than 30 other authors.

The researchers also trained models using random author pairs and public-domain text. They found that the models could still reproduce text from copyrighted books at similar levels. However, when the models were trained on synthetic text that was not from real books, they did not reproduce copyrighted passages.

Study raises broader industry concerns

The researchers concluded that fine-tuning models on real authors' texts can reactivate memorised material from earlier training. They described this as a security problem and an industry-wide one, because multiple models showed the same behaviour.

The findings have also raised questions about fair use rulings, in which courts have assumed that safeguards prevent AI systems from reproducing protected content and have allowed AI companies to use copyrighted material.
