Calif's AI Audit of FreeBSD: 15 Kernel Bugs (3 RCEs, 5 LPEs, 1 bhyve Escape) and Three Public CVE Writeups

Original text: “An AI audit of FreeBSD — 15 kernel bugs, including 3 RCEs, 5 LPEs, and 1 bhyve escape” — Calif (publication; no individual byline), blog.calif.io (May 28, 2026). The PoC repositories on GitHub are califio/publications/MADBugs/freebsd. Demo GIFs below are reproduced verbatim with attribution captions.

Executive Summary

Calif, a small AI-security shop, ran a coordinated audit of the FreeBSD kernel using an AI agent built on OpenAI and Anthropic models. The audit produced 15 kernel bugs: 3 remote code executions, 5 local privilege escalations, 1 bhyve guest-to-host escape, and a tail of memory disclosures and denial-of-service issues. The work was conducted in direct partnership with the FreeBSD security team: each report was a one-liner plus a working PoC, patches were suggested but not insisted on, and the relationship is open enough that report-to-fix latency is now days rather than months. Twelve bugs remain private until FreeBSD ships patches. Three are public: CVE-2026-45250 (setcred), CVE-2026-45253 (ptrace), and CVE-2026-45251 (procdesc).

Each of the public LPEs is structurally interesting in its own right. setcred is a one-character sizeof confusion in kern_setcred_copyin_supp_groups that causes a stack overflow in the user_setcred frame; the bug is present in 14.3, 14.4 and 15.0 but is reliably exploitable only on 14.4. ptrace is a missing bounds check on the redirected syscall number in PT_SC_REMOTE, giving the attacker an out-of-bounds index into the sysent table that chains to LPE. procdesc is a use-after-free in procdesc_free(): the freed struct procdesc embeds a pd_selinfo whose poll waiters were never drained, so reclaiming the slot with SCM_RIGHTS file descriptors leads to stale TAILQ_REMOVE operations and ultimately arbitrary kernel-pointer writes. The exploits and writeups were generated by AI; humans verified that each PoC works before release. The article also flags two earlier AI-assisted FreeBSD CVEs from the same campaign: CVE-2026-4747 (the first AI-assisted FreeBSD remote kernel exploit, March 2026) and CVE-2026-7270 (the “exeCVE” bug, April 2026).

At a Glance

Field	Value
Target	FreeBSD kernel (releases 14.3, 14.4, 15.0)
Auditor	Calif (blog.calif.io)
Audit method	AI agent (OpenAI + Anthropic models) producing exploits and writeups; humans verify working PoC before release
Total bugs reported	15 — all kernel-level
Severity split	3 RCEs · 5 LPEs · 1 bhyve guest-to-host escape · tail of memory disclosure / DoS
Public CVEs (this post)	CVE-2026-45250 (setcred), CVE-2026-45253 (ptrace), CVE-2026-45251 (procdesc)
Earlier public CVEs (same campaign)	CVE-2026-4747 (March 2026, first AI-assisted FreeBSD remote kernel exploit); CVE-2026-7270 (April 2026, “exeCVE”)
Bugs withheld	12 — held until FreeBSD ships fixes
PoC repository	califio/publications/MADBugs/freebsd
Disclosure model	Coordinated; direct video-call relationship with FreeBSD team; report-to-fix in days

Calif’s FreeBSD audit at a glance. Source: original article.

What Calif Wants to Achieve

The framing in the post is unusually maintainer-centric for a vulnerability-research piece. The explicit goals are (a) make finding bugs in FreeBSD more expensive for adversaries — raise attacker cost, not CVE count — and (b) help the FreeBSD team find, eliminate, and prevent the same bug classes post-engagement. There is no claim about being first, no leaderboard, no chasing CNAs. The framing is “we owe open-source infrastructure a debt and this is how we’re paying it down.”

How Calif Works with Maintainers

Several operational details in the “How we work” section are worth lifting out, because they generalise well beyond this engagement:

Severity is the maintainer’s call. Only High and Critical are escalated; Calif takes the project’s severity classification as authoritative and doesn’t argue for upgrades.
Reports are minimal. One-liner plus a working PoC. Deep-dive narratives are available on request, not unsolicited.
Patches are suggestions, not deliverables. Calif drafts a fix where one is obvious, labels it as a suggestion, and lets the maintainer decide. No back-and-forth required.
Direct relationship up front. Video calls early, separate channel for ongoing reports. The result is that bug-to-fix latency dropped to days — far below the typical inbox-based disclosure cadence.
The same pattern is being replicated with other internet-infrastructure projects.

The combination — respect the maintainer’s triage, keep reports terse, ship working PoCs, and invest in the human relationship before the first ticket — is the single most generalisable artefact of the post, even before the bugs themselves.

Published LPEs

CVE-2026-45250 — setcred

A one-character sizeof confusion in kern_setcred_copyin_supp_groups miscalculates the byte length of an attacker-controlled copy. The resulting stack overflow lands inside the user_setcred frame and turns into a local root shell. The structural bug is present in FreeBSD 14.3, 14.4, and 15.0; only 14.4 is reliably exploitable in practice — the other two have layout differences that defeat the canonical PoC. The vuln + PoC repository is califio/publications/…/setcred-CVE-2026-45250.

Animated demo of the setcred CVE-2026-45250 LPE exploit landing a root shell on FreeBSD 14.4. Source: original article.
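To make the sizeof confusion concrete, here is a minimal userland sketch of the general pattern, assuming the classic pointer-versus-element mix-up where the two expressions differ by a single `*`. It is not the FreeBSD source: `demo_gid_t`, `copy_len_buggy`, `copy_len_fixed`, and `demo_copy_groups` are hypothetical names invented for illustration, and the actual defect in kern_setcred_copyin_supp_groups may take a different form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t demo_gid_t; /* hypothetical stand-in for a group id type */

/* Buggy length: sizeof(groups) is the size of the POINTER, not of one id. */
size_t copy_len_buggy(const demo_gid_t *groups, size_t count)
{
    return count * sizeof(groups);
}

/* Intended length: sizeof(*groups) is the size of one element. */
size_t copy_len_fixed(const demo_gid_t *groups, size_t count)
{
    return count * sizeof(*groups);
}

/*
 * Copy `count` ids into `dst`, which has room for `cap` elements.
 * Returns 0 on success, -1 if the request does not fit.
 */
int demo_copy_groups(demo_gid_t *dst, size_t cap,
                     const demo_gid_t *src, size_t count)
{
    if (count > cap)
        return -1;

    size_t len = copy_len_fixed(src, count);
    assert(len <= cap * sizeof(*dst)); /* byte length must match capacity */
    memcpy(dst, src, len);
    return 0;
}
```

On a typical 64-bit target a pointer is 8 bytes and a 32-bit id is 4, so the buggy length is twice the intended one even though the element count passed its check. A fixed-size stack buffer sized for the element count would then be overrun by the copy.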

CVE-2026-45253 — ptrace

ptrace(PT_SC_REMOTE) lets a debugger redirect a tracee’s syscall to a different number. The validation forgot to bounds-check that number against the sysent table, so an attacker who can ptrace a child of their own can drive an out-of-bounds index into the syscall dispatch table. The chain ends in LPE. The vuln + PoC repository is califio/publications/…/ptrace-CVE-2026-45253.

Animated demo of the ptrace(PT_SC_REMOTE) CVE-2026-45253 LPE exploit. Source: original article.
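The missing check can be illustrated with a toy dispatch table. This is a sketch under stated assumptions, not FreeBSD's syscall path: `demo_table`, `demo_dispatch`, and `DEMO_ENOSYS` are hypothetical, and the real sysent handling is considerably more involved. The point is only that a caller-supplied number must be validated against the table length before it is used as an index.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_handler_t)(void);

static int demo_getpid(void) { return 100; }
static int demo_getuid(void) { return 200; }

/* Hypothetical two-entry dispatch table standing in for a syscall table. */
static const demo_handler_t demo_table[] = { demo_getpid, demo_getuid };
#define DEMO_TABLE_LEN (sizeof(demo_table) / sizeof(demo_table[0]))

#define DEMO_ENOSYS (-1) /* illustrative error value, not the real errno */

/* Dispatch a caller-supplied number, rejecting anything outside the table. */
int demo_dispatch(long code)
{
    /* Check the sign first so the cast to size_t cannot wrap a negative. */
    if (code < 0 || (size_t)code >= DEMO_TABLE_LEN)
        return DEMO_ENOSYS;

    assert(demo_table[code] != NULL);
    return demo_table[code]();
}
```

Without the range check, `demo_table[code]` would read a function pointer from whatever memory lies beyond the table, which is the shape of the out-of-bounds index described above.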

CVE-2026-45251 — procdesc

procdesc_free() tears down a struct procdesc, FreeBSD’s “process descriptor” (the handle returned by pdfork()), without draining the poll waiters attached to the embedded pd_selinfo. Once the procdesc memory has been freed, the waiter list still points at it; reclaiming the slot via SCM_RIGHTS file descriptors lets the attacker control what the dangling pointer references. When the poll waiters are eventually woken, their stale TAILQ_REMOVE calls turn into arbitrary kernel-pointer writes. The vuln + PoC repository is califio/publications/…/file-CVE-2026-45251.

Animated demo of the procdesc CVE-2026-45251 use-after-free LPE exploit. Source: original article.
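Why a stale TAILQ_REMOVE amounts to a pointer write is easier to see in isolation. The sketch below uses the standard `<sys/queue.h>` macros in userland on live, valid objects. It does not model the freed procdesc or the SCM_RIGHTS reclaim, and `demo_waiter` and `demo_waitq` are hypothetical names. It only shows that unlinking an element stores into the list head and into the neighbouring elements, at addresses taken from the link fields.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct demo_waiter {
    int id;
    TAILQ_ENTRY(demo_waiter) link;
};
TAILQ_HEAD(demo_waitq, demo_waiter);

/* Removing the only waiter rewrites the head itself. Returns 1 if so. */
int demo_remove_writes_head(void)
{
    struct demo_waitq head;
    struct demo_waiter w = { .id = 1 };

    TAILQ_INIT(&head);
    TAILQ_INSERT_TAIL(&head, &w, link);
    assert(TAILQ_FIRST(&head) == &w);

    TAILQ_REMOVE(&head, &w, link); /* stores into head's first/last fields */
    return TAILQ_EMPTY(&head);
}

/* Removing a middle waiter rewrites both neighbours. Returns 1 if so. */
int demo_remove_rewrites_neighbours(void)
{
    struct demo_waitq head;
    struct demo_waiter a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };

    TAILQ_INIT(&head);
    TAILQ_INSERT_TAIL(&head, &a, link);
    TAILQ_INSERT_TAIL(&head, &b, link);
    TAILQ_INSERT_TAIL(&head, &c, link);

    TAILQ_REMOVE(&head, &b, link); /* stores into a's next and c's prev */
    return TAILQ_NEXT(&a, link) == &c &&
           TAILQ_PREV(&c, demo_waitq, link) == &a;
}
```

The macro performs no liveness check on the head or on the neighbours. If the memory holding one of them has been freed and refilled with attacker-chosen contents, the same stores land wherever the stale pointers now lead.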

The Three Bug Classes, in One Frame

The published trio is a representative slice of where AI auditing currently has leverage on kernel code:

setcred — a specification bug. The sizeof-of-wrong-thing pattern is the textbook example of a defect that diff-and-grep tools struggle with because the broken code is locally consistent. An LLM that follows the data through the copy reasons about “is this size the right size for what is being copied?” rather than “does this code parse?”
ptrace — an invariant bug. The missing bounds check is invisible inside the function; the check is supposed to exist somewhere on the path before sysent[i] is dereferenced. Tracing the syscall path top-down and listing the indexes that ever reach the dispatch table is the kind of work LLMs do well now.
procdesc — a lifetime bug. UAF in a path that drains some related state (waiters, child sockets, …) but forgets one piece. The hard part of this class isn’t spotting the freed object, it’s reasoning about who else holds a pointer to it; that’s exactly the kind of multi-source-file question current models are good at.

That all three classes show up in a single audit is what makes the engagement interesting beyond the individual CVEs — the bug-class distribution is consistent with what an LLM auditor would be expected to find.

Key Takeaways

AI-assisted kernel auditing is now producing chains of real, reproducible CVEs against a mainstream BSD — this is no longer a thought experiment.
The audit produced 15 bugs of varying severity. The single bhyve guest-to-host escape is, on its own, the kind of finding that justifies an entire engagement; the audit produced it as one item in a list.
The three public LPEs (setcred, ptrace, procdesc) span three distinct bug classes (sizeof confusion, missing bounds check, UAF), which is consistent with what LLM auditors are structurally good at.
Exploits and writeups were AI-generated. Human verification (“does the PoC fire?”) is still doing the load-bearing safety work; the AI is doing the exploration.
Calif’s disclosure model — one-liner + PoC, severity is the maintainer’s call, patches are suggestions, video call up front, dedicated channel — closed the report-to-fix gap to days.
The remaining 12 bugs are private until patches ship. CVE-2026-4747 (March, first AI-assisted FreeBSD remote kernel exploit) and CVE-2026-7270 (April, “exeCVE”) are public from earlier in the same campaign.
The structural lesson for open-source maintainers: relationships and bug-quality matter more than process volume. Calif’s default of “send only High/Critical, send working PoCs, accept the maintainer’s severity assessment” produced higher throughput than a generic disclosure pipeline would.

Defensive Recommendations

Upgrade FreeBSD as patches land. Track the three public CVEs (45250, 45251, 45253) and apply the FreeBSD-SA advisories once they ship. Pay particular attention to 14.4 — the only release where the setcred PoC is reliably exploitable.
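Audit your own kernel-adjacent code for sizeof-of-wrong-thing patterns. The setcred bug class generalises: sizeof applied to the wrong type, or to a pointer instead of the object it points to, is easy to miss in code review because the code still compiles and reads plausibly, yet it is the kind of defect an LLM that follows the data through the copy can flag.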
For ptrace-aware threat models, treat PT_SC_REMOTE as elevated risk. Containers, jail-style environments and ptrace-permissive sandboxes are the exposure surface; if the kernel below them is unpatched, the host integrity is at stake.
Inventory pdfork() users on production systems. The procdesc UAF is reachable from any process that can hold a process descriptor and a Unix socket simultaneously. pdfork() is not heavily used outside specific service shapes; knowing who uses it on your hosts is the first step.
Adopt the “one-liner + PoC” reporting model for your own internal vulnerability pipeline. Calif’s data point is that report quality, not report volume, drives time-to-fix.
Run an LLM auditor against your own infrastructure code. The Calif engagement suggests that the bug classes LLMs find on FreeBSD are findable on similar code bases. Internal tooling, not only external research contracts, is now a reasonable level of investment.
Subscribe to FreeBSD’s security advisory feed. The pattern from this engagement (days rather than months from report to fix) means advisories now matter for opportunistic detection of in-the-wild exploitation, not just compliance reporting.

Conclusion

The Calif write-up is interesting on two distinct axes. As a vulnerability disclosure, it is a credible AI-assisted security campaign against a mainstream BSD that produced three structurally distinct public LPEs (setcred, ptrace, procdesc) plus 12 findings that remain private, among them a bhyve guest-to-host escape. As an operational case study, it is a clean example of a low-friction disclosure relationship: severity owned by the maintainer, reports terse, PoCs working, patches suggested rather than insisted on, and a video-call relationship rather than a ticket queue. The bug classes the audit surfaces (sizeof confusion, a missing bounds check on a syscall-index path, a UAF in a teardown that forgot one waiter list) are exactly the shapes LLM-driven auditors are now demonstrably good at. Credit and thanks to the Calif team for the engagement and the unusually maintainer-centric framing of the write-up, and to the FreeBSD security team for closing the loop fast.

References

Original text: “An AI audit of FreeBSD” by Calif (publication; no individual byline) at blog.calif.io (May 28, 2026).
