Author AI FAQ

Last updated November 1, 2023

Hachette Book Group (“HBG”) has prepared the following FAQs in response to questions we’ve received from authors who are concerned about the use of their works to train generative AI models for text generation, also known as large language models. This is an incredibly complex topic involving a rapidly evolving technology that is raising novel legal and ethical issues that cannot possibly be captured in a set of FAQs. If you have questions after reading these FAQs, please reach out to your editor. These FAQs are intended as an HBG author reference and not meant for distribution.

USE OF AUTHOR WORKS FOR LARGE LANGUAGE MODEL TRAINING

What is HBG’s position on author works being used to train large language models?

  • We recognize that there is value in responsibly using generative AI tools to support our work together, to improve the way we work and our mutual ability to sell the work we publish. As we keep both these goals and our commitment to the protection of intellectual property in mind, we will proceed carefully and thoughtfully to explore the possibilities this emergent technology affords. 

Some authors are suing OpenAI and other companies that have trained large language models on their works. Should I join any of these lawsuits? Do I need to bring my own lawsuit? What is HBG’s position on these lawsuits?

  • Some authors have brought lawsuits on behalf of themselves and other similarly situated authors in what is called “class action” litigation. For example, the Authors Guild and a group of 17 authors are suing OpenAI on behalf of a class of fiction writers whose works have been used to train GPT. HBG authors David Baldacci, Michael Connelly, Elin Hilderbrand, Douglas Preston and Scott Turow are among the named plaintiffs. For more information on this suit and the representative class, see: https://authorsguild.org/news/ag-and-authors-file-class-action-suit-against-openai/. Other authors, such as Michael Chabon, Sarah Silverman and Paul Tremblay, have brought separate suits against OpenAI and Meta on behalf of all copyright holders, which would include both fiction and non-fiction authors.
  • If these suits are successful, authors who are part of the represented class would have the option to participate in any recovery or to opt out of the class and preserve their own potential separate claims.

I learned that my book was included in a data set used to train an AI model. What can I do to remove my copyrighted work from that system? Does it make a difference that my book was scraped from a pirate site?  

  • We share your anger that pirated copies of your books were apparently used to train a large language model; it’s a double offense and a flagrant disregard of your copyright. As discussed above, whether this unauthorized use of your work is ultimately defensible as a fair use will be determined by the courts.
  • Unfortunately, if your work was used to train an existing model, it is now part of the program and it may not be possible to remove it. Your interests are being represented and litigated in the lawsuits described above, but the outcome of these suits is uncertain. That is why the Authors Guild is also seeking compensation in the form of collective licensing fees for the use of authors’ works that were used to train existing models. For more information, see https://authorsguild.org/advocacy/artificial-intelligence/faq/.
  • HBG is committed to combatting online piracy and successfully litigated against and obtained a permanent injunction preventing Internet Archive from scanning and posting free copies of our authors’ works on its Open Library site. We use the anti-piracy service Link-Busters to investigate reports of online piracy of our titles. If you discover pirated copies of your works online, please report them here: https://www.hachettebookgroup.com/terms-and-policies/report-piracy/

My book just came out and it looks like there are more AI-generated summaries of my book now readily available on Amazon. Can you get them taken down?

  • Any summary sold through an online retailer – AI-generated or not – should have a disclaimer that makes it clear to the reader that it is not affiliated with you or the original work.  
  • Please contact us right away if you see a listing that (a) includes the cover image of your book, (b) copies text or any images directly from your book (this may require ordering the summary to assess), or (c) includes copy in the listing that seems to misrepresent itself as affiliated with you or with HBG. While we cannot get all summaries taken down through cease-and-desist demands, we have had success doing so in these categories of cases.
  • Please know that we are actively engaged with our retail partners on how best to address the increasing prevalence of these types of AI-generated summaries. These works are low-quality knock-offs that are flooding retailers to the detriment of consumers, not to mention authors and HBG, and we hope to be able to partner with retailers to find solutions.

How do I stop people from generating stories in my style, or from writing articles about my books or my life that include false facts from public AI tools?

  • As of the time of this writing, there are no new legal avenues applicable to these tools. These issues are being litigated in the lawsuits described above and in other suits representing the rights of artists and coders. We are also not aware of any accepted technical measures that would effectively block an individual from using AI tools trained on your works, or on inaccurate information scraped from the Internet, to generate outputs. Having said that, you still have rights to protect your work under existing law. Depending on the facts and the similarity of the output to the actual expression of your work, you may still have a copyright infringement claim against the individual who generated a work in your style. You also still have the ability to prevent individuals or companies from using your name, image, or voice for commercial purposes. Similarly, to the extent a false fact is published about you that harms your reputation, you may have a defamation claim. How all these claims will play out in the world of AI remains to be seen, but there will be legal cases decided that address all of these issues.

AUTHOR USE OF AI TOOLS

What are HBG’s expectations about how authors may use generative AI in their own work?

  • In your agreement with HBG, you represent that your work is original to you, and that you have the ability to grant HBG the exclusive rights covered by the publishing agreement. HBG contracted with you based on your unique talents and experience. Similarly, the US Copyright Office has made clear that a work generated by an AI is not protectable under copyright, and copyright is an important protection for all published works.
  • We recognize, though, that there may be some AI-based tools that are helpful to our authors and that you may wish to use them in some manner as a part of your creative process. If that is the case, please discuss it with your editor. We hope that we can all learn together how these tools are useful and how they might be used responsibly in our industry without supplanting the creative process.
  • Please remember that it remains your obligation to make sure that your work is your own creation and copyrightable. Because of the Copyright Office’s position on the registrability of works even partially generated by AI, please provide your editor with a summary of the nature of any use of AI-powered tools or technologies in the creation of the text of the work at the time of submission, if not before. This requirement will be a part of our new contractual language on AI, and is an important part of our ability to work and learn together as we enter into this new age of AI-enabled work product. For more information on the use of AI tools in copyrightable works, see the Copyright Office’s webpage on the subject, which includes webinars and written guidance.

May I feed my book into a public generative AI model?

  • You own the text of your copyrighted work and may use it for your own personal uses in any manner you see fit. However, please be mindful that your contractual commitments to HBG continue to apply, including as they relate to any use of that text for commercial purposes or in a way that might compete with your HBG publication.

May I have a copy of my audiobook file to use to train an AI-enabled voice or other model?

  • Please note that while you may own the copyright in your text, our Audio group invests a great deal of time and effort in the creation of an audiobook recording, and as a result generally retains the copyright in the sound recording created. You may not use your audiobook file in an AI-enabled product or service without our permission. Of course, this does not prohibit you from pursuing these types of products using your own voice samples, but we encourage you to discuss potential use cases with your team at HBG, as we may have information or background that may help you in considering your options.

HBG USE OF AI TOOLS

Does HBG have a policy governing its employees’ use of AI tools for work on its books?

  • Our policy is that employees may not input author, illustrator, or other third-party content into public generative AI models, nor may employees input confidential information into these tools. 

How does HBG use generative AI technologies in its business, or hope to in the future?

  • As of the time of this writing, HBG has not adopted any internal tools based on large language models (other than those embedded in backend legacy software, like Microsoft Office). However, if we do, we anticipate that these tools will be (a) private tools that do not expose author content to public use or training, and (b) limited to operational uses designed to streamline workflows, improve discoverability, or otherwise help HBG to sell more of your books.
  • Any potential internal HBG use of private AI tools will be vetted to ensure that your information is protected from public disclosure, and that author content is not ingested into any public-facing model. Our guiding principles in all cases will be to protect the interests of our authors and creative contributors, and to ensure that any use of AI-enabled tools will serve to improve our publishing services.