Copyright and Artificial Intelligence (AI) may be on a collision course. AI’s capacity to comb through vast troves of data to generate arguably novel stories, poems, artwork, music, and other expressive works gives rise to an array of tough legal issues.
Are the generated works truly original and therefore subject to copyright? If so, who owns the copyright? And what scope of protection does such a copyright confer?
Answering these questions poses formidable challenges. But it’s important to put this in perspective. We’ve been here before. Copyright and technology have collided again and again over the ages. In fact, there were collisions before there were copyright statutes. In each case, solutions were found, and copyright emerged intact or even stronger.
In the Middle Ages, the painstaking process of copying books by hand limited any advantages to be gained through infringement. That changed in 1452 when Johannes Gutenberg invented the printing press. Suddenly, it became possible to produce 2000-3000 pages of text in a day, 100 times faster than hand copying. This revolutionary technological advance delivered many benefits, but it also made piracy possible and profitable.
The invention of the telegraph presented another technological challenge. Now geography was no longer an impediment to copying. Charles Dickens learned this through sad experience. His books went “viral” (or whatever the 19th century version of that term may have been), with sales all over the world. British publishers shipped books overseas to meet the demand. But the telegraph proved the contemporary equivalent of the world wide web, making content accessible around the globe. American agents in London could purchase a single copy of the latest Dickens bestseller, and telegraph its contents across the ocean to the United States. Unlicensed American editions of his books appeared on the street before the authorized editions arrived by ship from England.
In 1970, a California graduate student named Paul Orfalea leased a Xerox copier and opened the first Kinko’s store. As business boomed, Orfalea rolled the copier out to the street, to be used as a self-service machine. Soon there were hundreds of such businesses operating on or near college campuses, charging 3, 2, or 1 cent per copied page, depending on the volume. Widespread illegal copying of college texts resulted.
The advent of peer-to-peer sharing of MP3 files in 1999 led to mass infringement of musical recordings. Napster utilized this technology to gain over 80 million registered users, all happily downloading songs. On college campuses, as much as 61% of external network transfers consisted of MP3 file transfers.
In each case, copyright law evolved to meet the challenge posed by technological advances. The invention of Gutenberg’s printing press fostered the passage of the Statute of Queen Anne, the first act designed to protect the rights of authors. The widespread copying of Dickens’ books led to the Berne Convention for the Protection of Literary and Artistic Works, the genesis of international copyright protection. Kinko’s and Napster were forced to change their business models by adverse court decisions.
These precedents inspire hope that AI and copyright will figure out a way to coexist. But their value as precedents is limited. Each one involved technological progress that made copying easier. The challenge presented by AI is different. It does not make copying easier. It makes the borders of copyright harder to define.
To find a more apt precedent we might turn to another technological innovation: photography.
On March 2, 1865, six weeks before his death, President Abraham Lincoln signed a number of bills into law. One was the Copyright Act of 1865, which formally conferred copyright protection on photographs.
The law was challenged two decades later in a case involving a photograph of Oscar Wilde. The defendant, the Burrow-Giles Lithographic Company, had reprinted the photograph and sold 85,000 copies. Napoleon Sarony, the photographer, sued for infringement.
The case went up to the Supreme Court, where Burrow-Giles argued that a photograph could not be subject to copyright protection because it lacked originality and novelty, which are constitutional requirements. Unlike a painting or an engraving, which embodied the creator’s intellectual conception, a photograph was merely a mechanical reproduction of the physical features of some object. Burrow-Giles conceded that the technology underlying photography – transforming the effect of light on a prepared plate – might be a proper subject for patent protection for the camera manufacturer. But it argued that the photograph involved no originality on the part of the photographer. He was merely instigating a mechanical and chemical process.
The Supreme Court conceded that the issue “is not free from difficulty,” but sided with Sarony, finding that he had contributed something original by “posing the said Oscar Wilde in front of the camera, selecting and arranging the costume, draperies, and other various accessories in said photograph, arranging the subject so as to present graceful outlines, arranging and disposing the light and shade, and suggesting and evoking the desired expression.”
Today the Copyright Office mirrors this approach, advising applicants that “the copyright in a photograph protects the photographer’s artistic choices, such as the selection of the subject matter, any positioning of subject(s), the selection of camera lens, the placement of the camera, the angle of the image, the lighting, and the timing of the picture.” The Copyright Office will not register photographs “that lack a sufficient amount of creative expression.”
This separation of a photographer’s contributions from the camera’s mechanical and chemical processes provides a path to understanding how copyright and AI might coexist.
An AI program may be analogized to a camera. The user’s prompt is like a shutter, activating the camera’s mechanical and chemical processes. The billions of data points upon which the AI program is trained are like the external world, offering an endless array of possible photographic subjects. The AI-generated work – whether in literary, artistic, musical, or other expressive form – is like the photograph which results from the combination of the photographer’s contribution and the camera’s processes.
The analogy may not be perfect, but it helps to answer a number of issues posed by the collision of copyright and AI.
Are AI-generated works copyrightable? Yes, but only to the extent that they reflect the user’s creativity and originality in formulating the prompt. To that extent, the AI-generated work is like a photograph reflecting the photographer’s selection of lens, lighting, background, and other such elements. A simple prompt, lacking any originality, should not result in a copyrightable AI-generated work – just as a photograph lacking sufficient creative expression will not be registered by the Copyright Office.
Who owns the copyright to the AI-generated work? The user who formulates the prompt – just as the photographer who poses his subject and sets the mood owns the copyright to the portrait.
What about the owner of the AI program? Does that party own any copyright interest in the generated work? No, just as Nikon does not own the copyright to the photographs taken with its devices. The AI program may be ingeniously designed, and it may operate with blinding speed, but it does not contribute human creativity to the product.
What about the billions of data points utilized by the AI program? Some of that data may comprise copyrighted works owned by third parties, while some of that data may be in the public domain. It doesn’t matter which is which. None of that data is copied in the making of an AI-generated work. An AI program does not copy or even store the countless literary, artistic, and musical works that may populate its data pool. Instead it learns from them. It studies the parameters that define characteristics so that it can generate novel works in response to the user’s prompt.
Comparably, a young photographer may study the technique of Ansel Adams, whose “Zone System” included precise exposure, sharp focus, and high contrast. That same photographer may then take a picture of Half Dome, which closely resembles Adams’ photograph of the same scene. Yet it is not a copy. The copyright to the new photograph belongs to the young photographer – not to Adams whose style he emulated, and not to Nikon whose camera he used.
All of these “answers” to questions posed by the collision of AI and copyright are speculative, of course. AI technology is new and evolving. It may take many years – and many court cases and legislative actions – before these issues are fully resolved. The process has barely begun. But as that process unfolds, there is some comfort in knowing that technology and copyright have clashed before. Somehow, solutions have always emerged, allowing the public to enjoy the benefits of both.
Marblehead Messenger, you have crafted such a timely and evocative piece here. Context is everything, isn’t it? I hope this gets wider play so that all can be instructed with hope that we as a people have met this problem before and found ways to navigate it for the better. My best to you and your family. I hope all is well. Rich Lombardi
You wrote that “copyright law evolved” to handle widespread piracy of MP3 files… But that doesn’t capture the history. “The law” was ineffective, so the music industry evolved. It adjusted to collapsing revenues from sales of recordings (down 96% from 1999), and focused instead on concert ticket sales as well as streaming and subscription services. Piracy of recorded music is still rampant and unstoppable.
You brush off the significance of the content used to train a large language model… but what if someone took your body of briefs and opinions to tune their LLM so that the model’s output echoed your distinctive voice in some ways? Do you have any rights in that?
Thank you for your comment.
I agree with your point that widespread online copying caused the music industry to evolve. But I stand by my contention that copyright law also evolved. The Napster litigation marked the first time that the doctrine of vicarious infringement was applied to peer-to-peer file-sharing.
As to your second point, I’m flattered that you consider my voice distinctive. If it is, I’m not sure why anyone would choose to echo it. But if anyone did, copyright law would not afford me any remedy. Copyright law protects an author’s text, but it does not protect his style. Of course, if the culprit used AI to pass off the output as mine, then I would have legal recourse. My rights in that scenario would probably be based on false endorsement or other common law causes of action, not on copyright.