Top AI models crafted real exploits in a security test, raising stakes for software defense.
- AI models turned bug finds into working exploits in a benchmark test
- Researchers built ExploitGym to measure AI’s ability to weaponize flaws
- Study included teams from UC Berkeley and Google
Frontier AI models can do more than spot software flaws: they can turn them into working exploits. That’s the finding from a new benchmark called ExploitGym, built by a team that includes researchers from UC Berkeley, the Max Planck Institute for Security and Privacy, UC Santa Barbara, Arizona State University, Anthropic, OpenAI, and Google.
The test wasn’t some theoretical exercise. The team built ExploitGym specifically to see if AI agents could take a known vulnerability and build an actual exploit that could be used to attack a system. The answer: yes, they can. In dozens of trials, top models like those from Anthropic and OpenAI generated functional exploits in minutes, not hours or days. That’s a big deal because most AI-discovered bugs are minor or too hard to turn into attacks. This shows the gap between finding flaws and weaponizing them is narrowing fast.
How the test worked
ExploitGym isn’t just a list of bugs. It’s a controlled environment where AI agents get a vulnerable piece of code, a description of the flaw, and a target system to attack. The AI isn’t told how to exploit the bug—just that it needs to break in. Within minutes, many models produced working exploits that bypassed security controls or gained unauthorized access. Some even found ways to chain multiple vulnerabilities together, something human hackers do but AI rarely attempts without heavy guidance.
The team picked 15 real-world vulnerabilities from databases like the CVE list, covering everything from memory corruption to authentication bypasses. The AI agents didn’t always succeed: about 30% of the attempts failed. But when they worked, the exploits were real and functional, and a success rate of roughly 70% is far higher than most AI bug-finding tools achieve today.
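As a rough illustration of how the failure figure above could be tallied, here is a minimal sketch in Python. It is not ExploitGym’s actual harness: the `Attempt` record, the `success_rate` helper, the `task-N` labels, and the sample outcomes are all hypothetical, chosen only so that 3 of 10 attempts fail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One agent run against one benchmark task (hypothetical record shape)."""
    task_id: str      # placeholder label, not a real CVE identifier
    succeeded: bool   # True if the run was judged to be a working exploit


def success_rate(attempts: list[Attempt]) -> float:
    """Return the fraction of attempts that succeeded (0.0 for an empty list)."""
    if not attempts:
        return 0.0
    return sum(a.succeeded for a in attempts) / len(attempts)


# Made-up outcomes for illustration only: 10 attempts, the first 3 fail.
sample = [Attempt(f"task-{i}", succeeded=(i >= 3)) for i in range(10)]
rate = success_rate(sample)
print(f"{rate:.0%} succeeded, {1 - rate:.0%} failed")  # 70% succeeded, 30% failed
```

A real evaluation would also need a reliable way to decide whether each run counts as a success, which this sketch simply takes as given.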
Why this changes the security game
The implications are immediate. If AI can build exploits this easily, attackers won’t need deep hacking skills to weaponize flaws. A script kiddie with access to one of these models could turn a minor bug into a dangerous attack in minutes. For defenders, that means patching alone won’t cut it anymore. They’ll need AI-powered tools that can anticipate how an attack might unfold and block it before it happens.
The researchers aren’t just warning about hackers. They’re also raising questions about how these models should be used. Anthropic, OpenAI, and Google all sell AI tools that were tested in this study. That puts them in a tricky spot: they’re selling tools that could be used for harm, even as they fund research to understand the risks. The team didn’t accuse the companies of anything, but they did call for better guardrails around how these models are deployed.
What’s next for AI and security
This isn’t the end of the story. ExploitGym is just the first benchmark of its kind, and the models tested are getting better every month. The team plans to expand the test to include more complex systems and harder-to-find vulnerabilities. They also want to see if AI can do more than just exploit—can it defend? That’s the next big question.
For now, the message is clear: AI isn’t just finding bugs anymore. It’s learning to weaponize them. And that’s a shift security teams can’t ignore.
What You Need to Know
- Source: The Register
- Published: May 15, 2026 at 19:45 UTC
- Category: Technology
All reporting rights belong to the respective author(s) at The Register. GlobalBR News summarizes publicly available content to help readers discover the most relevant global news.
Curated by GlobalBR News · May 15, 2026
🇧🇷 Portuguese Summary (English translation)
For the first time, artificial intelligence systems have created functional exploits capable of attacking vulnerabilities in real software, according to a recent study that raises a global alarm about the risks of automation in cyberspace. The finding, from a team that includes researchers from UC Berkeley, Anthropic, OpenAI, and Google, shows that advanced models such as those from Anthropic and OpenAI were able to generate malicious code capable of breaking into systems without direct human supervision, a worrying milestone for digital security.
The test, carried out in a controlled environment, simulated attack scenarios against popular software such as browsers and web servers. It found that the AI systems not only identified flaws but also developed methods to exploit them, often more efficiently than human hackers on repetitive tasks. For Brazil, where the number of cyberattacks grew 36% in 2023 according to Serpro, the news reinforces the urgency of revisiting defense strategies, especially given the growing reliance of companies and public agencies on AI-based technologies. In Brazil, the absence of specific regulation for the use of AI in digital security amplifies the risks, since criminals can use the same technology for more sophisticated, harder-to-trace attacks.
The study serves as a call for governments and companies to invest in proactive defenses, such as automated code audits and targeted training for cybersecurity teams, before AI-generated exploits become an everyday reality in Brazilian systems.
🇪🇸 Spanish Summary (English translation)
An unprecedented technological advance has shown that AI models can generate functional exploits, a capability that until now was believed to be exclusive to specialized human programmers. In a pioneering study, cybersecurity experts found that tools such as ChatGPT or Claude can design real attacks, though still with certain technical limitations.
The research, published by universities and cybersecurity centers, underscores that this capability, once theoretical, is now viable, which forces a rethink of defense strategies against automated threats. For Spanish-speaking users, this implies a growing risk, since criminals could use these AI systems for more sophisticated and personalized attacks, even without deep technical knowledge. The industry must accelerate the development of countermeasures, such as AI-based detection systems, to counter a landscape in which cybercrime could become more accessible and more dangerous.