Technology Law

Reposted from IP & Media Law Updates

BREAKING: COURT RULES THAT DEVELOPER'S USE OF COPYRIGHTED MATERIAL TO TRAIN AI IS NOT FAIR USE

This morning, the court in Thomson Reuters v. Ross Intelligence issued one of the first major rulings on how fair use applies to artificial intelligence model training. In a dramatic about-face, the same judge who had previously suggested that Ross’s AI training might qualify for fair use now holds that it does not, and that the question is not close enough to send to a jury. Although the decision is expressly limited to its unique facts, it could have significant implications for the pending infringement cases against AI developers, and it helps delineate the boundaries of acceptable use of copyrighted material for AI model training.

Background

Thomson Reuters owns copyrights in its Westlaw headnotes and key-number system—editorial content that organizes judicial opinions and helps researchers navigate the law. Ross, an AI-powered legal research platform, used these headnotes to train the AI model powering its own product, albeit never displaying the headnotes to end users. Thomson Reuters sued Ross for copyright infringement. Ross asserted multiple defenses including, most notably, fair use.  Both sides moved for summary judgment. The court initially denied summary judgment on fair use and other key issues and readied the case for trial. However, in a stunning development, Judge Bibas hit the pause button, indicating that he was reconsidering his prior ruling, effectively putting summary judgment back on the table.  On February 11, 2025, the court issued this decision.

The Decision

Ross Copied Westlaw's Headnotes

Beginning with the claim for copyright infringement, the court first determined that Westlaw’s headnotes and key-number system are copyrightable due to the original editorial choices involved. The court then held that Ross had copied these materials, observing that a question in the "bulk memos" Ross used to train its AI that looks "more like a headnote than it does like the underlying judicial opinion" is "strong circumstantial evidence of actual copying." The court did not need an expert to find that thousands of headnotes were substantially similar to the training materials ingested into the platform: “As a lawyer and judge, I am myself an ordinary user of Westlaw headnotes. So I am well positioned to determine substantial similarity here.”

Ross's Non-Fair Use Defenses Hold No Water

The court breezily rejected Ross's innocent infringement, copyright misuse, merger, and scènes à faire defenses, finding that none of them "holds water."

Thomson Reuters, Not Ross, Prevails on the Fair Use Defense

The heart of the court's opinion is its fair use analysis. In an abrupt shift from its earlier stance that Ross’s AI training might qualify for fair use, the court found that Thomson Reuters prevailed on two decisive factors and on the overall balancing.

Factor One Disfavors Fair Use Because Ross Built a Commercial, Competitive Product

After quickly flagging Ross’s purpose as commercial (Ross built a for-profit legal research service), the court analyzed whether Ross’s copying of Westlaw’s headnotes was transformative.  The court sought guidance from Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith.  In that landmark fair use case, the Supreme Court held that if an original work and a secondary use share the same or very similar purposes, and the secondary use is of a commercial nature, then the first factor will often weigh against fair use—unless the user can point to another strong justification for copying. Applying that standard, the court held that Ross's use was not transformative because it "was using Thomson Reuters’s headnotes as AI data to create a legal research tool to compete with Westlaw."

The court next turned to Ross’s argument that the headnotes were never shown to end users, but instead were converted into “numerical data" behind the scenes.  The court rejected this so-called "intermediate copying" defense, distinguishing earlier precedent like Sega, Sony, and Google v. Oracle, in which intermediate copying was upheld as fair use. The court reasoned that those cases involved computer code, which is inherently functional, and was copied out of necessity to achieve interoperability.  Here, on the other hand, Westlaw's headnotes are editorial (not functional) in nature and did not need to be copied to achieve Ross's new purpose.

Factor Two Favors Fair Use Because Headnotes Are Barely Creative

Though the court recognized that Westlaw’s headnotes cleared the low bar for copyrightability, it noted that they primarily summarize factual material with minimal creative expression. The court therefore concluded the headnotes were “far from the most creative works,” given their primarily factual nature. As a result, factor two weighed in Ross’s favor, though the court noted that this factor “has rarely played a significant role in the determination of a fair use dispute.”

Factor Three Favors Fair Use Because the Headnotes Are Not Shown to Users

Though Ross allegedly copied thousands of headnotes, the court focused on the fact that none were used in Ross’s final outputs, concluding that “what matters is . . . the amount and substantiality of what is thereby made accessible to a public.”  Thus, the third factor weighed in favor of fair use.

Factor Four Disfavors Fair Use Because Ross’s Product Threatens Westlaw’s Business Model and Data Licensing Opportunities

Factor Four, which the court dubbed “the single most important element of fair use,” examines the likely market effect of Ross’s copying, including both existing and potential derivative markets. Although the court initially left this issue to the jury—concerned about whether Thomson Reuters intended to license its headnotes for AI training and whether Ross truly competed with Westlaw—it was now convinced that Ross’s product posed a direct threat to the market for Westlaw and future AI-training data license opportunities.

The court also weighed any potential public benefit of Ross’s tool against the substantial market harm. It found that the public’s free access to the legal opinions themselves suffices to serve the public’s need to access the law. In a sentiment that will be welcomed by the copyright community, the judge observed:

“The public has no right to Thomson Reuters’s parsing of the law. Copyrights encourage people to develop things that help society, like good legal-research tools. Their builders earn the right to be paid accordingly.”

Key Takeaways

Favorable Precedent for Plaintiffs in AI Cases, But Limited to Its Facts. Plaintiffs in the pending AI + copyright cases surely will tout this ruling as strong support for their claims against AI developers. However, the court emphasized that it was deciding against fair use based on Ross’s particular use—namely, using protected material to build a directly competing legal research tool.  “Because the AI landscape is changing rapidly, I note for readers that only non-generative AI is before me today.” Thus, the decision leaves ample room for developers of large language and generative AI models to distinguish their cases on the facts.

Court’s Heavy Reliance on Warhol. The decision leans heavily on the Supreme Court’s Warhol framework, which underscores whether the secondary use “shares the same or very similar purposes” as the original. The court’s finding that Ross’s competitive use was not transformative plays into widespread concerns that Warhol may serve to narrow fair use. Developers hoping to claim transformative use will need to address Warhol head-on, especially where their product competes directly with the copyright owner’s offering.

Important Guidance for AI Tool Developers. The court drew a line in the sand for developers training AI on copyrighted materials: if your model effectively leverages someone else’s editorial labor to create a rival product (even if the underlying text isn’t displayed to users), it may not qualify as fair use.  The fair use analysis may turn on differences in AI model training methods, uses, purposes, and end products.

Absence of Verbatim Copying in Outputs Is Not Dispositive. The ruling challenges the notion that fair use hinges solely on whether the final product quotes the original material. Fair use may embrace the intermediate copying of large data sets to make them searchable, as long as the output limits public display (see, e.g., Authors Guild v. Google). But it may not support the intermediate copying of data sets to develop competitive, commercial products, even if the works themselves never see the light of day.

AI + Copyright Litigation Is Far From Over. There are still some narrow factual issues for a jury to decide, and an appeal to the Third Circuit remains a real possibility. Because this is a district court decision, courts in the Second and Ninth Circuits—where many other major AI cases are pending—may ultimately arrive at different or more expansive views of fair use.

Tags

copyright, ai, fair use