AI Agents Uncover 21 Zero-Day Vulnerabilities in FFmpeg
Autonomous security agents found decades-old memory safety flaws in the widely deployed media processing library, prompting urgent updates.
FFmpeg is the quiet backbone of modern digital media. It powers everything from web browsers to large video streaming platforms, and it is responsible for parsing hundreds of complex media formats, often from untrusted sources. Written in roughly 1.5 million lines of heavily optimized C, it presents a large attack surface, which is why it has been subjected to more than two decades of manual audits and automated fuzzing.
Yet, despite this hardening, security researchers are proving that deep-seated vulnerabilities still lurk in its codebase. Following recent disclosures of 13 vulnerabilities by Google's Big Sleep team and further findings by Anthropic's Mythos model, security firm depthfirst has announced the discovery of 21 new zero-day vulnerabilities in FFmpeg. Using an autonomous security agent, the team uncovered bugs spanning from 23-year-old latent flaws to recent regressions introduced in 2025.
A Deep Dive into the CVEs
Of the 21 zero-days discovered, eight have already been assigned CVE identifiers. The vulnerabilities span a wide array of components, including demuxers, decoders, and option parsers, demonstrating that even heavily scrutinized paths remain vulnerable to memory safety issues.
- CVE-2026-39214 (Stack Buffer Overflow): This is the oldest vulnerability in the batch, introduced in 2003 during the original Service Description Table (SDT) implementation. The bug writes service entries without tracking the remaining space in the buffer, allowing an attacker to overflow the stack. It remained undetected in the codebase for 23 years.
- CVE-2026-39210 (Heap Buffer Overflow): Introduced in 2010 in the MPEG-TS demuxer. The affected code reads two bytes without first checking the remaining length, leading to an out-of-bounds read/write.
- CVE-2026-39211 (Integer Overflow): Also dating back to 2010, this vulnerability was introduced during a swscale refactor. A size factor formula lacked upper bounds, allowing user-controlled parameters to trigger an arbitrarily large scaling operation and subsequent memory corruption.
- CVE-2026-39215 & CVE-2026-39216 (Heap Buffer Overflows): Both introduced in 2012. CVE-2026-39215 resides in update_mb_info(), where a logic error allows a subsequent call to write 12 bytes past the allocated buffer. CVE-2026-39216 exists in img2enc.c and was caused by replacing a safe chroma size with an unbounded dimension-derived size.
- CVE-2026-39218 (Heap Buffer Overflow): Introduced in 2017 in the DASH demuxer, this vulnerability stems from a failure to reject negative duration values. When processed, these negative values turn fragment array indices negative, leading to out-of-bounds heap access.
- CVE-2026-39213 (Heap Buffer Overflow): A more recent bug introduced in 2023 in the yuv4mpegenc rawvideo input path, which fails to validate dimensions against the actual packet size.
- Recent Regressions (2025):
  - CVE-2026-39212 (Stack Overflow): A regression from July 2025 inside ffmpeg_opt.c. A specially crafted preset file can trigger option parsing recursively without a depth limit, exhausting the stack.
  - CVE-2026-39217 (Heap Buffer Overflow): A regression from March 2025 in the VP9 decoder, where a refactored size update function caused tile thread buffers to miss necessary reallocations.
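To make the oldest bug class concrete, here is a minimal sketch of the pattern whose absence CVE-2026-39214 describes: tracking how much room is left before each write into a fixed-size buffer. This is not FFmpeg's code or its actual patch; the `BoundedWriter` type and the `bw_init` and `bw_put` helpers are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bounded writer: remembers how much of a fixed buffer is used. */
typedef struct {
    uint8_t *buf;  /* start of the destination buffer */
    size_t   cap;  /* total capacity in bytes */
    size_t   len;  /* bytes written so far (invariant: len <= cap) */
} BoundedWriter;

static void bw_init(BoundedWriter *w, uint8_t *buf, size_t cap)
{
    assert(w != NULL && buf != NULL);
    w->buf = buf;
    w->cap = cap;
    w->len = 0;
}

/* Append n bytes. Returns 0 on success, or -1 if the entry does not fit.
 * On failure nothing is written and the writer is left unchanged. */
static int bw_put(BoundedWriter *w, const void *src, size_t n)
{
    /* cap - len cannot underflow because len <= cap always holds. */
    if (n > w->cap - w->len)
        return -1;
    memcpy(w->buf + w->len, src, n);
    w->len += n;
    return 0;
}
```

A caller that writes variable-length entries in a loop would stop, or report an error, as soon as `bw_put` returns -1, rather than continuing to write past the end of the buffer.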
Beyond these eight CVEs, the remaining 13 vulnerabilities—including a heap buffer overflow in the RTP AV1 depacketizer tracked internally as DFVULN-127—have been fixed but do not yet have CVE identifiers assigned.
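The DASH demuxer flaw above (CVE-2026-39218) belongs to a class that is simple to show in isolation: a signed value from the input is used to compute an array index without being validated first. The sketch below is a simplified illustration under assumed names (`fragment_index` and its parameters are hypothetical), not the demuxer's real logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: map a timestamp to a fragment index.
 * Returns 0 and stores the index in *out on success.
 * Returns -1 and leaves *out untouched on invalid input. */
static int fragment_index(int64_t timestamp, int64_t duration,
                          size_t n_fragments, size_t *out)
{
    assert(out != NULL);

    /* Reject non-positive durations and negative timestamps up front,
     * so the division below can never produce a negative index. */
    if (duration <= 0 || timestamp < 0)
        return -1;

    uint64_t idx = (uint64_t)timestamp / (uint64_t)duration;

    /* The index must also fall inside the fragment array. */
    if (idx >= n_fragments)
        return -1;

    *out = (size_t)idx;
    return 0;
}
```

Validating before the arithmetic, rather than checking the result afterwards, keeps the negative value from ever reaching the index computation.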
How Autonomous Security Agents Find What Fuzzers Miss
The discovery of these vulnerabilities highlights a shift in how software is audited. While traditional fuzzing excels at finding shallow bugs by mutating inputs rapidly, it often struggles with deep logical paths that require highly specific, valid structures to reach.
The depthfirst team used a specialized autonomous security agent to scan the codebase. Unlike interactive coding assistants designed to write application code, a security agent operates with a narrowly targeted, adversarial objective. The agent's workflow consists of several key stages:
- Threat Modeling: The agent begins by mapping the architecture of the codebase. It identifies exposed parsers, protocol handlers, and entry points where attacker-controlled input enters the system.
- Data Flow Analysis: Instead of treating the repository as a flat collection of files, the agent traces data flows from untrusted entry points to potential sinks (such as memory allocation and copy functions).
- Hypothesis Testing & Guardrails: To prevent the hallucination of theoretical bugs, the agent is equipped with strict guardrails. It must verify that the attacker actually controls the inputs driving the execution path and that the vulnerable sink is reachable.
- Harness Generation and Verification: When a potential vulnerability is identified, the agent generates targeted test harnesses to interact with the specific components. It then produces a concrete, reproducible Proof-of-Concept (PoC) input to confirm the vulnerability via execution.
This automated verification process is designed to keep findings actionable and to filter out false positives, since each reported bug comes with an input that reproduces it. Notably, depthfirst achieved these results at a fraction of the cost of previous LLM-based security efforts. While Anthropic's scan of FFmpeg using its Mythos model cost approximately $10,000, depthfirst's agent completed its run for roughly $1,000 using commercially available models.
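The source does not say what form the agent's generated harnesses take. As an assumption for illustration, the sketch below uses the libFuzzer entry point `LLVMFuzzerTestOneInput`, a common shape for a targeted harness. The component under test, `toy_parse_records`, is a hypothetical stand-in and not FFmpeg code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a component under test (NOT FFmpeg code).
 * Parses a sequence of [1-byte tag][1-byte length][payload] records.
 * Returns the number of records, or -1 if the input is malformed. */
static int toy_parse_records(const uint8_t *data, size_t size)
{
    size_t pos = 0;
    int count = 0;

    while (pos < size) {
        if (size - pos < 2)
            return -1;              /* truncated record header */
        size_t len = data[pos + 1];
        pos += 2;
        if (len > size - pos)
            return -1;              /* payload would run past the input */
        pos += len;
        count++;
    }
    return count;
}

/* libFuzzer-style entry point: hands the raw bytes straight to the one
 * component being examined, so a crashing input doubles as a PoC. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    int n = toy_parse_records(data, size);
    assert(n >= -1);                /* parser must stay in its documented range */
    return 0;
}
```

Built with `clang -fsanitize=fuzzer,address`, a harness of this shape can replay a single candidate input file, so a sanitizer report against that file serves as the reproducible confirmation.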
Mitigation and the Reality of C Codebases
For developers and system administrators, the immediate takeaway is clear: any application processing untrusted media using FFmpeg must be updated immediately. Because FFmpeg is often statically linked or bundled inside container images, developers should audit their dependency trees to ensure they are running the latest patched versions.
The fact that a 23-year-old stack buffer overflow (CVE-2026-39214) and multiple decade-old heap overflows went unnoticed despite decades of fuzzing underscores the limitations of traditional dynamic analysis. It also highlights the double-edged sword of refactoring in C: while refactoring is intended to clean up legacy code, it frequently introduces new regressions, as seen in the 2025 VP9 decoder and option parser bugs (CVE-2026-39217 and CVE-2026-39212).
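The option parser regression is a reminder that recursion driven by file contents needs an explicit bound. The following is a minimal sketch of a depth-limited recursive loader. The `Preset` type, the `apply_preset` function, and the limit of 8 are all hypothetical and are not taken from ffmpeg_opt.c or its fix.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PRESET_DEPTH 8  /* illustrative limit, not FFmpeg's */

/* Hypothetical preset that may pull in one nested preset. */
typedef struct Preset {
    const struct Preset *include;  /* nested preset, or NULL */
} Preset;

/* Apply a preset and everything it includes.
 * Returns the number of presets applied, or -1 if nesting is too deep,
 * which also covers a preset that includes itself. */
static int apply_preset(const Preset *p, int depth)
{
    assert(depth >= 0);

    if (p == NULL)
        return 0;
    if (depth >= MAX_PRESET_DEPTH)
        return -1;                  /* refuse instead of exhausting the stack */

    int nested = apply_preset(p->include, depth + 1);
    return nested < 0 ? -1 : nested + 1;
}
```

Passing the depth as a parameter keeps the limit local to one call chain, so a self-referencing preset fails with an error code rather than a crash.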
As LLM-driven security agents become more sophisticated and cost-effective, the barrier to finding zero-day vulnerabilities in legacy C codebases is dropping rapidly. For maintainers of open-source infrastructure, integrating these autonomous auditing tools into the CI/CD pipeline may soon become a necessity to find bugs before malicious actors do.
Sources & further reading
- Twenty One Zero-Days in FFmpeg — depthfirst.com
Ji-ho covers the increasingly tangled overlap between cloud architecture and security, drawing on a background as a penetration tester to keep his reporting grounded in real-world attack paths. He never lets a vendor claim go unquestioned and insists that every buzzword come with a proof of concept.