Today, we’re announcing Aardvark, an agentic security researcher powered by GPT‑5.
The agent continuously monitors code repositories to find and validate vulnerabilities, assess their exploitability, and propose targeted patches. — Analytics India Magazine
How Aardvark works
Aardvark continuously analyzes source code repositories to identify vulnerabilities, assess exploitability, prioritize severity, and propose targeted patches. — OpenAI
Aardvark relies on a multi-stage pipeline to identify, explain, and fix vulnerabilities:
- Analysis: It begins by analyzing the full repository to produce a threat model reflecting its understanding of the project’s security objectives and design. — OpenAI
- Commit scanning: It scans for vulnerabilities by inspecting commit-level changes against the entire repository and threat model as new code is committed. When a repository is first connected, Aardvark will scan its history to identify existing issues. Aardvark explains the vulnerabilities it finds step-by-step, annotating code for human review. — OpenAI
- Validation: Once Aardvark has identified a potential vulnerability, it will attempt to trigger it in an isolated, sandboxed environment to confirm its exploitability. Aardvark describes the steps it took, to help ensure that the insights returned to users are accurate, high-quality, and low in false positives. — OpenAI
- Patching: Aardvark integrates with OpenAI Codex to help fix the vulnerabilities it finds. It attaches a Codex-generated and Aardvark-scanned patch to each finding for human review and efficient, one-click patching. — OpenAI
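The four stages above can be pictured as a simple chain in which each step feeds the next. The Python sketch below is only an illustration of that flow under loud assumptions: every name in it (`ThreatModel`, `Finding`, `analyze_repository`, `scan_commit`, `validate`, `propose_patch`, `run_pipeline`) is hypothetical, the string matching is a toy stand-in for GPT-5 reasoning, and the `try_exploit` callback merely stands in for the sandboxed exploit attempt. It is not Aardvark's implementation or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ThreatModel:
    """Stage 1 output (hypothetical): what the scanner should worry about."""
    files_seen: List[str] = field(default_factory=list)
    risky_calls: List[str] = field(default_factory=list)


@dataclass
class Finding:
    """One potential vulnerability moving through the pipeline (hypothetical)."""
    commit_id: str
    description: str
    validated: bool = False
    patch: Optional[str] = None


def analyze_repository(files: Dict[str, str]) -> ThreatModel:
    # Toy stand-in for threat modeling: record the files and a fixed
    # list of calls that are commonly dangerous with untrusted input.
    return ThreatModel(files_seen=sorted(files), risky_calls=["eval(", "os.system("])


def scan_commit(commit_id: str, diff: str, model: ThreatModel) -> List[Finding]:
    # Stage 2: flag a commit whose diff contains a call named in the threat model.
    return [
        Finding(commit_id, f"diff introduces {call!r}")
        for call in model.risky_calls
        if call in diff
    ]


def validate(finding: Finding, try_exploit: Callable[[Finding], bool]) -> Finding:
    # Stage 3: try_exploit stands in for triggering the issue in a sandbox.
    finding.validated = try_exploit(finding)
    return finding


def propose_patch(finding: Finding) -> Finding:
    # Stage 4: placeholder text; a real patch would still need human review.
    finding.patch = f"Suggested fix for: {finding.description}"
    return finding


def run_pipeline(
    files: Dict[str, str],
    commits: List[Tuple[str, str]],
    try_exploit: Callable[[Finding], bool],
) -> List[Finding]:
    model = analyze_repository(files)
    confirmed = []
    for commit_id, diff in commits:
        for finding in scan_commit(commit_id, diff, model):
            if validate(finding, try_exploit).validated:
                confirmed.append(propose_patch(finding))
    return confirmed


if __name__ == "__main__":
    repo = {"app.py": "print('hello')"}
    history = [("c1", "result = eval(user_input)"), ("c2", "total = 1 + 1")]
    for f in run_pipeline(repo, history, try_exploit=lambda finding: True):
        print(f.commit_id, "|", f.description, "|", f.patch)
```

In this sketch only findings that pass validation receive a patch, mirroring the order described above; unvalidated findings are simply dropped, which is a simplification made for brevity rather than something the source states.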
Aardvark works alongside engineers, integrating with GitHub, Codex, and existing workflows to deliver clear, actionable insights without slowing development. While Aardvark is built for security, in our testing we’ve found that it can also uncover bugs such as logic flaws, incomplete fixes, and privacy issues. — OpenAI
Real impact, today
Aardvark has been in service for several months, running continuously across OpenAI’s internal codebases and those of external alpha partners. Within OpenAI, it has surfaced meaningful vulnerabilities and contributed to OpenAI’s defensive posture. Partners have highlighted the depth of its analysis, with Aardvark finding issues that occur only under complex conditions. — OpenAI
In benchmark testing on “golden” repositories, Aardvark identified 92% of known and synthetically-introduced vulnerabilities, demonstrating high recall and real-world effectiveness. — OpenAI — Analytics India Magazine
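For readers less familiar with the term, recall is the share of known vulnerabilities that a tool actually finds: true positives divided by the sum of true positives and false negatives. The short Python sketch below shows the arithmetic. The counts of 46 found and 4 missed are invented purely for illustration; the source reports only the 92% figure, not the size of the benchmark.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of known vulnerabilities that were detected."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("recall is undefined when there are no known vulnerabilities")
    return true_positives / total


if __name__ == "__main__":
    # Hypothetical counts: 46 of 50 seeded vulnerabilities found.
    print(f"{recall(46, 4):.0%}")
```

Recall on its own says nothing about false positives; that is a separate measure (precision), and the benchmark figure quoted here does not report it.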
Aardvark has also been applied to open-source projects, where it has discovered numerous vulnerabilities that we have responsibly disclosed; ten of them have received Common Vulnerabilities and Exposures (CVE) identifiers. — OpenAI
We recently updated our outbound coordinated disclosure policy, which takes a developer-friendly stance focused on collaboration and scalable impact rather than on rigid disclosure timelines that can pressure developers. We anticipate that tools like Aardvark will lead to the discovery of increasing numbers of bugs, and we want to collaborate sustainably to achieve long-term resilience. — OpenAI
Why it matters
Software is now the backbone of every industry—which means software vulnerabilities are a systemic risk to businesses, infrastructure, and society. Over 40,000 CVEs were reported in 2024 alone. Our testing shows that around 1.2% of commits introduce bugs—small changes that can have outsized consequences. Aardvark represents a new defender-first model: an agentic security researcher that partners with teams by delivering continuous protection as code evolves. — OpenAI
Private beta now open
We’re inviting select partners to join the Aardvark private beta. Participants will gain early access and work directly with our team to refine detection accuracy, validation workflows, and reporting experience. If your organization or open source project is interested in joining, you can apply here. — OpenAI
ZDNET’s key takeaways
- OpenAI has launched Aardvark, a cybersecurity researcher agent. — ZDNET
- Aardvark is powered by GPT-5 and is in private beta. — ZDNET
- It can discover and help fix security vulnerabilities. — ZDNET
Share Your Thoughts
What impact could this have on how development teams manage secure code?
Do you agree that agentic tools should be integrated into standard CI/CD security workflows?
How should maintainers of open-source projects respond to increased automated scanning and disclosures?
What lessons can security teams learn from Aardvark’s multi-stage validation and patching approach?
How might policy-makers and platform owners need to adapt disclosure guidelines as agentic researchers scale?
