Two US authors sued OpenAI in federal court in San Francisco on Wednesday, claiming in a proposed class action lawsuit that the company misused their works to “train” its popular generative AI system ChatGPT.
Massachusetts-based writers Paul Tremblay and Mona Awad said ChatGPT mined data copied from thousands of books without permission, infringing the authors’ copyrights.
Matthew Butterick, the authors’ lawyer, declined to comment. Representatives for OpenAI, a private company backed by Microsoft Corp (MSFT.O), didn’t immediately respond to a request for comment.
Several legal challenges have been filed over the material used to train cutting-edge AI systems. Plaintiffs include source-code owners suing OpenAI and Microsoft’s GitHub, and visual artists suing Stability AI, Midjourney and DeviantArt.
The targets of those lawsuits have argued that their systems make fair use of copyrighted works.
ChatGPT responds to user text prompts in a conversational way. Earlier this year, it became the fastest-growing consumer app in history, reaching 100 million active users in January, just two months after its launch.
![Paul Tremblay and Mona Awad said ChatGPT extracted data copied from thousands of books without permission.](https://nypost.com/wp-content/uploads/sites/2/2023/07/NYPICHPDPICT000012778214.jpg?w=1024)
ChatGPT and other generative AI systems create content using large amounts of data collected from the web. Tremblay and Awad’s lawsuit said books are a “key ingredient” because they provide “the best examples of high-quality longform writing.”
The complaint estimated that OpenAI’s training data included more than 300,000 books, including some from illegal “shadow libraries” that offer copyrighted books without permission.
Awad is known for novels including “13 Ways of Looking at a Fat Girl” and “Bunny”. Tremblay’s novels include “The Cabin at the End of the World”, which was adapted into the M. Night Shyamalan film “Knock at the Cabin”, released in February.
Tremblay and Awad said ChatGPT could generate “very accurate” summaries of their books, which they said indicates that the books appeared in its database.
The lawsuit seeks an unspecified amount of monetary damages on behalf of a nationwide class of copyright owners whose works were allegedly misused by OpenAI.