Anthropic discovered 22 security vulnerabilities in Firefox using its Claude Opus 4.6 AI model in January 2026. Mozilla addressed these issues in Firefox 148.
The researchers state that AI models can now find high-severity software flaws independently. In two weeks they identified 22 Firefox vulnerabilities, 14 of them high-severity. Those 14 amount to nearly a fifth of all high-severity Firefox issues fixed in 2025, a sign of how quickly AI can detect critical security risks in complex software.

In late 2025, Anthropic evaluated Claude Opus 4.6 on Firefox to test its ability to identify complex, high-impact security vulnerabilities. Initially, the model successfully reproduced many historical CVEs from older Firefox versions. Researchers then tasked Claude with finding new, previously unreported bugs, starting with the JavaScript engine. Within twenty minutes, Claude identified a use-after-free vulnerability, which the team validated and reported to Mozilla along with a proposed patch. While the team triaged that report, Claude discovered dozens of additional crashes. The effort eventually produced 112 unique reports across nearly 6,000 C++ files.
“After a technical discussion about our respective processes and sharing a few more vulnerabilities we had manually validated, they encouraged us to submit all of our findings in bulk without validating each one, even if we weren’t confident that all of the crashing test cases had security implications,” reads the report published by Anthropic. “By the end of this effort, we had scanned nearly 6,000 C++ files and submitted a total of 112 unique reports, including the high- and moderate-severity vulnerabilities mentioned above.”
Most issues, including high- and moderate-severity vulnerabilities, were fixed in Firefox 148, with remaining patches planned for future releases.
Mozilla praised the collaboration and began experimenting internally with AI-assisted security research. This project demonstrates AI’s growing capacity to rapidly detect and report critical software flaws.
To test Claude Opus 4.6’s ability to exploit vulnerabilities, researchers provided it with bugs previously submitted to Mozilla and asked it to create functional exploits. The team ran several hundred attempts, spending around $4,000 in API credits, with the aim of demonstrating attacks that read and wrote local files. The model produced working exploits in only two cases, showing that while it excels at finding vulnerabilities, exploiting them remains far more difficult and costly.
“We ran this test several hundred times with different starting points, spending approximately $4,000 in API credits. Despite this, Opus 4.6 was only able to actually turn the vulnerability into an exploit in two cases. This tells us two things,” continues the report. “One, Claude is much better at finding these bugs than it is at exploiting them. Two, the cost of identifying vulnerabilities is an order of magnitude cheaper than creating an exploit for them. However, the fact that Claude could succeed at automatically developing a crude browser exploit, even if only in a few cases, is concerning.”
The successful exploits were “crude” and worked only in controlled test environments with security features like sandboxing disabled, meaning real-world impact would be limited. Nonetheless, Claude’s ability to automatically generate even primitive browser exploits highlights the potential risks as AI-assisted offensive capabilities advance.
“These early signs of AI-enabled exploit development underscore the importance of accelerating the find-and-fix process for defenders,” concludes the report. “In our experience, Claude works best when it’s able to check its own work with another tool. We refer to this class of tool as a ‘task verifier’: a trusted method of confirming whether an AI agent’s output actually achieves its goal. Task verifiers give the agent real-time feedback as it explores a codebase, allowing it to iterate deeply until it succeeds. Task verifiers helped us discover the Firefox vulnerabilities described above, and in separate research, we’ve found that they’re also useful for fixing bugs.”
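The task verifier idea quoted above can be illustrated with a minimal Python sketch. It is not Anthropic's tooling: the toy parser, the verifier, and the candidate list are all hypothetical stand-ins, chosen only to show a trusted check confirming whether a proposed input really triggers a fault.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Verdict:
    confirmed: bool  # True if the candidate really triggered the fault
    detail: str      # feedback an agent could use on its next attempt


def toy_target(data: bytes) -> int:
    """Hypothetical buggy parser: trusts a length byte without a bounds check."""
    if not data:
        return 0
    length = data[0]
    payload = data[1:]
    # Bug: indexes past the payload when the length byte overstates its size.
    return payload[length - 1] if length else 0


def crash_verifier(candidate: bytes) -> Verdict:
    """Hypothetical task verifier: run the candidate and report whether it faults."""
    try:
        toy_target(candidate)
    except IndexError as exc:
        return Verdict(True, f"out-of-bounds read: {exc}")
    return Verdict(False, "no fault observed")


def find_confirmed(candidates: Iterable[bytes],
                   verify: Callable[[bytes], Verdict]) -> Optional[bytes]:
    """Return the first candidate the verifier confirms, or None if none is."""
    for candidate in candidates:
        if verify(candidate).confirmed:
            return candidate
    return None


# Stand-in for an agent's successive guesses (assumed inputs, not real data).
guesses = [b"", b"\x02ab", b"\x05ab"]
confirmed_input = find_confirmed(guesses, crash_verifier)
```

Here only the third guess is confirmed, because its length byte claims five payload bytes when two are present. In a real setting the verifier would more plausibly be a sanitizer-instrumented build or a test suite, and the verdict detail would be fed back to the agent between attempts.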
Mozilla reported that AI-assisted analysis uncovered 90 additional Firefox bugs, most of which have been fixed. They include logic errors that traditional fuzzing had missed, a finding that highlights AI’s growing role in security.
“The scale of findings reflects the power of combining rigorous engineering with new analysis tools for continuous improvement. We view this as clear evidence that large-scale, AI-assisted analysis is a powerful new addition in security engineers’ toolbox,” states Mozilla.
(SecurityAffairs – hacking, Anthropic Claude)