Israeli Researcher Uncovers Critical Software Infrastructure Flaws Using AI — on an $80 Budget

Next week, Simcha Kosman, a senior researcher at CyberArk Labs, will present a new study at Black Hat Europe in London, one of the world's most prestigious cybersecurity conferences, where only a small fraction of submissions are accepted. His research demonstrates how artificial intelligence can be leveraged to detect security flaws in widely used software systems at a fraction of the traditional cost and time, effectively rivaling the capabilities of industry giants like Google and OpenAI.

Kosman and his team set out to answer a deceptively simple question: can AI be used to uncover real vulnerabilities in massive software projects, such as the Linux kernel, Redis and FFmpeg, without huge budgets or large teams? Their findings point to an unequivocal yes. In just two days, and for less than $80 in total compute costs, their tool led to the discovery of dozens of vulnerabilities. Nine of them have already been assigned official CVE identifiers, spanning major projects including the Linux kernel, FFmpeg, Redis, RetroArch, Libretro, Bullet3 and Linenoise.

At the heart of the study is a new open-source tool called Vulnhalla. The system combines CodeQL, GitHub's industry-standard static analysis engine, with an AI model designed to dramatically reduce noise. On large repositories, CodeQL alone can generate tens of thousands of alerts, the vast majority of them false positives. Vulnhalla tackles this bottleneck directly: it analyzes CodeQL's findings, extracts relevant code context for each alert, and uses the AI model to determine which findings have genuine exploit potential.
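
The article does not reproduce Vulnhalla's actual code, but the flow it describes (run CodeQL, parse its alerts, attach surrounding source, let a model filter) can be sketched. In the Python sketch below, the parsing follows CodeQL's standard SARIF output format; `ask_model`, `CONTEXT_LINES` and the helper names are illustrative placeholders, not Vulnhalla's real API.

```python
import json
from pathlib import Path

CONTEXT_LINES = 20  # how much surrounding code to show the model (illustrative)

def load_alerts(sarif_path):
    """Yield (rule_id, file, line, message) for each CodeQL SARIF result."""
    sarif = json.loads(Path(sarif_path).read_text())
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations")
            if not locations:
                continue
            phys = locations[0]["physicalLocation"]
            yield (
                result.get("ruleId", "unknown"),
                phys["artifactLocation"]["uri"],
                phys["region"]["startLine"],
                result["message"]["text"],
            )

def extract_context(repo_root, rel_path, line):
    """Return the source lines surrounding the flagged line."""
    source = (Path(repo_root) / rel_path).read_text(errors="replace").splitlines()
    lo = max(0, line - 1 - CONTEXT_LINES)
    hi = min(len(source), line + CONTEXT_LINES)
    return "\n".join(source[lo:hi])

def triage(repo_root, sarif_path, ask_model):
    """Keep only alerts the model judges to have genuine exploit potential.

    ask_model(rule_id, message, snippet) -> bool is a stand-in for whatever
    LLM call the real tool makes; it is not part of Vulnhalla's actual API.
    """
    kept = []
    for rule, path, line, msg in load_alerts(sarif_path):
        snippet = extract_context(repo_root, path, line)
        if ask_model(rule, msg, snippet):
            kept.append((rule, path, line, msg))
    return kept
```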

Crucially, the researchers don’t simply ask the model broad questions like “Is this a vulnerability?” Instead, they guide it through a structured sequence of prompts that mirror the reasoning of an experienced security analyst: Where is the buffer defined? What is its size? Does it change? What is the target size? Is there a data flow that could lead to a memory-boundary violation? This step-by-step, logic-driven approach forces the model to perform genuine reasoning rather than relying on superficial pattern recognition. According to the study, this methodology reduces false positives by more than 90% for several vulnerability classes, and in some cases by up to 96%.
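
The study's actual prompts are not quoted in full, but the step-by-step interrogation it describes resembles a prompt chain. Here is a minimal sketch, assuming a generic `ask_model(prompt) -> str` completion call; the checklist wording and function names are invented for illustration, not Vulnhalla's prompts.

```python
# Hypothetical checklist mirroring the analyst-style questions described in
# the article; the exact wording and staging are assumptions.
BUFFER_CHECKLIST = [
    "Where is the buffer defined? Quote the declaration.",
    "What is its size? Is that size constant or computed at runtime?",
    "Does the buffer or its size change before the flagged operation?",
    "What is the size of the data being written (the target size)?",
    "Is there a data flow that could lead to a memory-boundary violation? "
    "Answer yes or no, and justify from the code.",
]

def structured_triage(snippet, ask_model):
    """Walk the model through one question at a time, feeding each answer
    back as context, then require a verdict grounded in that transcript."""
    transcript = f"Code under review:\n{snippet}\n"
    for question in BUFFER_CHECKLIST:
        answer = ask_model(f"{transcript}\n{question}")
        transcript += f"\nQ: {question}\nA: {answer}"
    return ask_model(
        f"{transcript}\n\nBased only on the answers above, is this a genuine, "
        "reachable memory-boundary violation? Reply VULNERABLE or FALSE_POSITIVE."
    )
```

The point of the chain, per the study, is that each answer constrains the next, so the final verdict must be consistent with concrete facts established about the code rather than with surface patterns.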

The result positions Vulnhalla as a compelling alternative to more advanced proprietary systems such as Google's Big Sleep and OpenAI's Aardvark. It delivers comparable vulnerability-detection performance while remaining fully open, transparent and community-driven. For development and security teams struggling under the weight of soaring alert volumes, this hybrid approach offers a way to focus resources on a far smaller set of findings with real-world impact.

As Kosman notes, the research marks another step toward using AI not just to detect weaknesses faster, but to help close widening security gaps in the software we all rely on every day.