Copy Fail and the forensic blind spot hiding in Linux memory

There is already a long queue of articles explaining how Copy Fail works, what kernel version you need to patch to, and what the Python PoC does step by step. This is not one of those articles. CVE-2026-31431 is genuinely interesting for a different reason: it is one of the cleanest examples in recent memory of a vulnerability that specifically defeats classic forensic and detection assumptions, not through obfuscation or stealth code, but through a fundamental property of how the Linux kernel manages memory. Understanding that is more useful, and more lasting, than memorising the exploit primitives.

A nine-year-old optimization, quietly gone wrong

The story starts in 2017, when commit 72548b093ee3 introduced an in-place optimization to algif_aead.c, the component of the Linux kernel’s AF_ALG socket interface that exposes AEAD cipher operations to userspace. The intent was a minor performance improvement: instead of copying data, the kernel would process it in-place by setting req->src = req->dst and chaining the tag pages from the source scatterlist into the output scatterlist via sg_chain().

The problem is subtle enough that it survived unnoticed for almost nine years. When userspace feeds data to the socket through splice(), the tag pages reference the page cache of the spliced file directly. The authencesn(hmac(sha256),cbc(aes)) implementation then writes four bytes at offset assoclen + cryptlen as scratch space for Extended Sequence Number rearrangement, and because the output scatterlist now extends into those chained page cache pages, that write lands inside the cached data of the spliced file, bypassing all permission checks. The HMAC verification fails, returning -EBADMSG as expected, but the page cache corruption has already happened. A failing decrypt still corrupts the page.
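To make the offset arithmetic concrete, here is a toy model in Python. It is not kernel code and not an exploit: `ToyScatterlist`, the request sizes, and the marker bytes are illustrative assumptions. Only the write offset `assoclen + cryptlen` and the four-byte length come from the description above.

```python
class ToyScatterlist:
    """A chain of byte buffers addressed as one logical range (toy model only)."""

    def __init__(self, *segments: bytearray) -> None:
        self.segments = list(segments)

    def write(self, offset: int, data: bytes) -> None:
        """Write data at a logical offset, crossing segment boundaries if needed."""
        for i, byte in enumerate(data):
            pos = offset + i
            for segment in self.segments:
                if pos < len(segment):
                    segment[pos] = byte
                    break
                pos -= len(segment)
            else:
                raise IndexError("write past the end of the chain")


assoclen, cryptlen = 8, 16                     # hypothetical request sizes
user_buffer = bytearray(assoclen + cryptlen)   # memory the caller owns
# Stand-in for a page-cache page of a spliced file (contents are made up).
page_cache_page = bytearray(b"\x7fELF-cached-file-bytes")

# In-place mode: the output chain continues into the chained tag pages.
dst = ToyScatterlist(user_buffer, page_cache_page)

# The scratch write starts at the first byte past the caller's buffer.
dst.write(assoclen + cryptlen, b"\xde\xad\xbe\xef")

print(bytes(user_buffer) == bytes(assoclen + cryptlen))
print(bytes(page_cache_page[:4]))
```

In this model the caller's buffer stays all zeroes while the first four bytes of the stand-in file page are replaced, which is the essence of the bug: the write is addressed relative to the output chain, and the chain no longer ends where the caller's memory ends.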

The fix, included in kernel 7.0, 6.19.12, and 6.18.22, is almost comically simple: revert to out-of-place operation in algif_aead, removing essentially all the complexity introduced in 2017. The Openwall oss-security disclosure provides the full technical details and kernel patch links. There is no benefit in operating in-place in algif_aead since the source and destination come from different mappings, as the fix commit message notes without ceremony.

The DFIR angle nobody is writing about: when the disk is clean but memory lies

Here is where Copy Fail becomes interesting for an investigator. The exploitation path is: bind an AF_ALG socket to authencesn(hmac(sha256),cbc(aes)), splice() the page cache pages of /usr/bin/su into the crypto pipeline, then issue a recvmsg() whose AAD bytes supply the four-byte value the authencesn scratch write will deposit into the target page. Repeat at successive offsets to stage shellcode into the cached pages of /usr/bin/su. Run su. Get a root shell.

The key detail: the file on disk is never touched. The corruption affects only the in-memory page cache, which is what the kernel actually uses when executing a binary. A traditional file-integrity monitor that checks SHA-256 hashes of files on disk sees nothing. An IDS that watches for writes to /usr/bin/su sees nothing. The audit trail for the su execution looks like a normal privilege transition. The attacker leaves the disk in a state that would pass any post-incident verification based on static analysis of storage.
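One way to look for this kind of divergence on a live system is to hash the same file twice: once through ordinary buffered reads, which are served from the page cache, and once with `O_DIRECT`, which asks the kernel to read from storage instead. The sketch below is an assumption-laden illustration, not a vetted forensic tool. The function names are hypothetical, it is Linux-only, and some filesystems reject `O_DIRECT` or quietly fall back to buffered I/O, in which case the comparison proves nothing.

```python
import hashlib
import mmap
import os

CHUNK = 1 << 20  # 1 MiB; a multiple of common block sizes, as O_DIRECT expects


def sha256_buffered(path: str) -> str:
    """Hash the file through normal reads, i.e. as served by the page cache."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_direct(path: str) -> str:
    """Hash the file with O_DIRECT reads, which are meant to bypass the page cache."""
    digest = hashlib.sha256()
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        size = os.fstat(fd).st_size
        # An anonymous mmap is page-aligned, which satisfies O_DIRECT alignment.
        with mmap.mmap(-1, CHUNK) as buffer:
            offset = 0
            while offset < size:
                got = os.preadv(fd, [buffer], offset)
                if got <= 0:
                    break
                digest.update(buffer[:got])
                offset += got
    finally:
        os.close(fd)
    return digest.hexdigest()


def cache_matches_disk(path: str) -> bool:
    """True when the cached view and the on-disk view hash identically."""
    return sha256_buffered(path) == sha256_direct(path)
```

A call such as `cache_matches_disk("/usr/bin/su")` returning `False` would be worth investigating. A `True` result only describes the moment of the check: once the modified pages have been evicted or the host rebooted, the cached view reverts to what is on disk.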

This is not entirely new territory. If you have been following the evolution of forensic blind spots across platforms, you have seen similar surprises before: the Windows 11 PCA artifact quietly recording execution evidence where analysts were not looking, iOS metadata recording data users assumed was not there, wiper malware targeting filesystem structures beyond the MBR that classic recovery procedures did not cover. Copy Fail belongs to the same family of cases where the evidence, or the absence of it, is not where the analyst expects. HelpNet Security has published detailed coverage of the vulnerability and its implications for the security community.

The comparison to Dirty Pipe (CVE-2022-0847) is instructive in the other direction. Dirty Pipe required precise pipe buffer manipulation and version-specific targeting, and it depended on timing windows. Copy Fail is a straight-line logic flaw: no races, no timing, just five syscalls (socket, setsockopt, splice, sendmsg, recvmsg) repeated until the target binary is patched in cache. The PoC is around 700 bytes of Python and produces reliable results on Ubuntu 24.04, Amazon Linux 2023, RHEL 10.1, and SUSE 16, all in default configurations.

From host to cluster: why “local” LPE is the wrong mental model for cloud environments

The standard framing of a local privilege escalation is: attacker already has a shell, attacker elevates to root. The implicit assumption is that you are reasoning about a single, bounded host. That assumption breaks down almost everywhere in modern infrastructure.

In a Kubernetes environment, the page cache is shared between the host and containers running on the same node. A workload running inside a container as an unprivileged user can use Copy Fail to corrupt the page cache of a setuid binary on the host, then execute that binary, obtain a root shell, and break out of the container entirely. The container boundary, which is logical rather than hardware-enforced, provides no protection here. OVHcloud, which published a detailed response covering their managed Kubernetes service, confirmed that unpatched nodes in multi-tenant clusters are fully exposed regardless of pod security policies or runtime restrictions.

This is a recurring pattern worth internalising. Every time an LPE affects the Linux kernel rather than userspace alone, the blast radius in cloud environments is systematically wider than the “local user” label suggests. An initial foothold through a web application vulnerability, a misconfigured CI/CD runner exposed to the internet, or a compromised dependency in a pipeline can be all that is needed to turn Copy Fail into a full node compromise, followed by lateral movement across whatever other workloads that node is running.

The affected kernel range, 4.14 through 7.0-rc, covers essentially every Linux system deployed since late 2017, including all the long-term stable branches that enterprise distributions continue to maintain: 6.12.x, 6.6.x, 5.15.x, and 5.10.x all carry the vulnerable backport. Tenable has published a comprehensive FAQ covering the technical details, affected versions, and remediation guidance.
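For a first-pass fleet triage, the version boundaries quoted in this article can be turned into a small classifier. This is a rough sketch under stated assumptions: the function name and the result labels are hypothetical, and the only data encoded are the numbers given here (affected from 4.14, fixed in 7.0, 6.19.12, and 6.18.22). Distribution kernels often backport fixes without changing the upstream version number, so "affected-range" means "check the vendor advisory", not "confirmed vulnerable".

```python
import platform
import re

# Upstream fix points named in this article; other branches are not covered.
FIXED_IN_BRANCH = {(6, 18): 22, (6, 19): 12}
FIRST_AFFECTED = (4, 14)
FIRST_FIXED_MAINLINE = (7, 0)


def classify_kernel(release: str) -> str:
    """Map a `uname -r` style string to a coarse triage label."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?(.*)", release)
    if match is None:
        return "unknown"
    branch = (int(match.group(1)), int(match.group(2)))
    patch = int(match.group(3) or 0)
    is_release_candidate = re.search(r"-rc\d+", match.group(4)) is not None

    if branch < FIRST_AFFECTED:
        return "not-affected"
    if branch >= FIRST_FIXED_MAINLINE:
        # 7.0 release candidates are inside the affected range.
        if branch == FIRST_FIXED_MAINLINE and is_release_candidate:
            return "affected-range"
        return "fixed-upstream"
    if branch in FIXED_IN_BRANCH and patch >= FIXED_IN_BRANCH[branch]:
        return "fixed-upstream"
    return "affected-range"


if __name__ == "__main__":
    print(platform.release(), classify_kernel(platform.release()))
```

The classifier deliberately errs towards "affected-range" for any branch it has no fix data for, since a false alarm costs a lookup while a false all-clear costs a node.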

Detection engineering: what you can actually see

Patch first. But on the realistic assumption that patching takes time, particularly across large fleets, distributed infrastructure, and embedded systems where kernel updates require maintenance windows, detection becomes the practical compensating control.

The good news is that Copy Fail is unusually detectable at the syscall level, precisely because the attack path requires creating an AF_ALG socket with SOCK_SEQPACKET type. This is not a common operation. The set of legitimate processes that need AF_ALG AEAD sockets is small and predictable: cryptsetup, veritysetup, integritysetup, systemd-cryptsetup, and a handful of kcapi utilities. Everything else creating this socket type is anomalous.
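The allowlist logic is simple enough to sketch in a few lines of Python, for example as a post-processing step over socket events that some sensor has already collected. Everything about the event shape is an assumption: the function name, its parameters, and the `kcapi` prefix match are hypothetical, since the article does not name the individual kcapi utilities. The numeric constants are the Linux values for `AF_ALG` and `SOCK_SEQPACKET`.

```python
AF_ALG = 38            # Linux address family number for the kernel crypto API
SOCK_SEQPACKET = 5     # Linux socket type number
SOCK_TYPE_MASK = 0xF   # strips SOCK_CLOEXEC / SOCK_NONBLOCK flag bits
COMM_MAX = 15          # the kernel truncates process names (comm) to 15 characters

ALLOWED_NAMES = ("cryptsetup", "veritysetup", "integritysetup", "systemd-cryptsetup")
ALLOWED_COMMS = {name[:COMM_MAX] for name in ALLOWED_NAMES}
ALLOWED_PREFIXES = ("kcapi",)  # assumption: kcapi tools share this prefix


def is_suspicious_socket(comm: str, family: int, sock_type: int) -> bool:
    """Flag AF_ALG SEQPACKET socket creation by a process outside the allowlist."""
    if family != AF_ALG:
        return False
    if sock_type & SOCK_TYPE_MASK != SOCK_SEQPACKET:
        return False
    if comm in ALLOWED_COMMS or comm.startswith(ALLOWED_PREFIXES):
        return False
    return True
```

The truncation step matters in practice: a sensor that reports the kernel's `comm` field will show `systemd-cryptse`, not `systemd-cryptsetup`, and an allowlist that compares full names would raise an alert on every encrypted boot. Matching on process name alone is also easy for an attacker to spoof, so treat this as a noise filter rather than a trust decision.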

Sysdig’s threat research team has published a Falco rule that operationalises exactly this detection. The rule fires on AF_ALG SEQPACKET socket creation from any process outside the known disk-encryption toolchain. The rule is precise enough to be production-useful without massive false-positive tuning, although environments running kernel TLS (kTLS) will need to audit their workloads first, since kTLS also uses AF_ALG sockets.

- rule: Unexpected Process Using Kernel AEAD Crypto Socket
  desc: >
    Detects creation of an AF_ALG SEQPACKET socket from a process outside the
    known disk-encryption toolchain. Mandatory first step of CVE-2026-31431.
  condition: >
    successful_af_alg_seqpacket_socket and
    not proc.name in (known_af_alg_binaries)
  priority: CRITICAL
  tags: [host, container, kernel, CVE-2026-31431,
         MITRE_TA0004_privilege_escalation,
         MITRE_T1068_exploitation_for_privilege_escalation]

ReversingLabs has additionally published five YARA rules covering file-based and memory-based detection of the Python PoC and its variants, which are useful for threat hunting across images, repositories, and dropped files. These two layers complement each other: the Falco rule catches exploitation at runtime; the YARA rules catch the exploit script before or during staging.

If you are running a 24/7 SOC for a small team, the operational priority is straightforward. The Falco rule should go in as a high-priority alert with minimal tuning friction. The YARA rules are better used in a scheduled threat hunting pass over your artefact stores and pipeline caches rather than as real-time alerts, which would be noisy and slow.

The deeper forensic problem: audit log design for kernel-level attacks

Copy Fail exposes a gap that goes beyond this specific vulnerability. When an attacker exploits a kernel primitive to modify in-memory state without touching disk, the investigator's ability to reconstruct what happened depends entirely on whether the right syscall-level events were captured at the time of exploitation. If the answer is no, the page cache modification is effectively invisible after the fact, especially on systems that are rebooted before memory can be captured.

This connects directly to a principle that deserves more attention in DFIR readiness discussions: logging and telemetry need to be designed with forensic use in mind, not only with operational monitoring in mind. A SIEM that ingests only application-level logs, authentication events, and network flows will see the output of a Copy Fail exploitation (an unexpected root shell, unusual process relationships) but not the cause. An audit setup that captures socket, splice, sendmsg, and recvmsg syscalls with process context will see both. The difference is architectural, and it needs to be decided before the incident, not during.

Auditd rules targeting AF_ALG socket creation are trivial to write and impose minimal overhead. BPF-based monitoring frameworks such as Cilium Tetragon can capture this at even finer granularity, correlating scatterlist operations with the file pages they target. Neither approach helps retroactively. Both help the next time.
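As an illustration of the auditd side, assume a rule along the lines of `-a always,exit -F arch=b64 -S socket -F a0=38 -k af_alg_socket` is loaded (the key name is made up, and the exact rule should be checked against your auditd version). The resulting SYSCALL records can then be filtered with a short Python parser. The sample record below is synthetic, written in the standard audit log layout for x86_64, where `socket` is syscall 41 and arguments are logged in hexadecimal.

```python
import re

AF_ALG = 38
SOCKET_SYSCALL_X86_64 = "41"
ARCH_X86_64 = "c000003e"
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_record(line: str) -> dict:
    """Split one audit log line into a key/value dictionary."""
    return {key: value.strip('"') for key, value in FIELD.findall(line)}


def is_af_alg_socket(record: dict) -> bool:
    """True for an x86_64 socket() call whose first argument is AF_ALG."""
    if record.get("type") != "SYSCALL":
        return False
    if record.get("arch") != ARCH_X86_64:
        return False
    if record.get("syscall") != SOCKET_SYSCALL_X86_64:
        return False
    try:
        return int(record.get("a0", ""), 16) == AF_ALG  # a0 is logged as hex
    except ValueError:
        return False


# Synthetic example record, for illustration only.
SAMPLE = (
    'type=SYSCALL msg=audit(1767225600.123:4711): arch=c000003e syscall=41 '
    'success=yes exit=3 a0=26 a1=5 a2=0 a3=0 items=0 ppid=1200 pid=1234 '
    'auid=1000 uid=1000 gid=1000 euid=1000 comm="python3" '
    'exe="/usr/bin/python3.12" key="af_alg_socket"'
)

record = parse_record(SAMPLE)
if is_af_alg_socket(record):
    print(record["pid"], record["comm"], record["exe"])
```

Keeping `pid`, `auid`, and `exe` from the matching record is what later lets an investigator tie the socket creation to the process that went on to run the setuid binary.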

A nine-year-old lesson about performance and security

The root cause of Copy Fail is not a complex algorithmic error or a subtle interaction between subsystems. It is a performance optimisation applied to a security-sensitive component without sufficiently tracing its downstream effects on memory ownership. The 2017 commit worked correctly for the AEAD algorithms that were common at the time. The authencesn path, which uses a two-stage in-place write with Extended Sequence Number rearrangement, created a condition that was not anticipated.

This is not an unusual failure mode. Performance optimisations in low-level kernel code often have effects that are difficult to reason about fully at the time of writing, and they accumulate over kernel versions as new code paths interact with them in ways the original author did not anticipate. The same dynamic that produced Copy Fail produced Dirty Pipe in 2022, Dirty COW in 2016, and a range of other kernel vulnerabilities that traced back to optimisations rather than to classic memory safety errors.

The practical implication is not that kernel optimisations are bad, but that security analysis of performance-sensitive code paths needs to explicitly model what happens when memory ownership boundaries are crossed unexpectedly. For engineers working with kernel crypto APIs, the lesson is specific: in-place operations in crypto code paths that interact with splice and page cache are structurally dangerous, because splice brings file-backed pages into the kernel’s processing pipeline without the ownership transfer that would normally accompany such an operation.

For everyone else, Copy Fail is a useful reminder that operational resilience is not only about backup and recovery. It is about building systems where the evidence necessary to detect, investigate, and respond to exploitation exists by design, and where a single unpatched node in a shared environment cannot silently become the entry point for a full cluster compromise.