
Virtual memory in XNU: pmap, the VM map, and the compressor

Every macOS process gets a private address space it can't possibly afford. Here's how XNU gives it one anyway — pmap, vm_map, the compressor, and jetsam.

[Figure: XNU page fault flow. A userspace access (*p = 0;) faults into vm_fault() in osfmk/vm/vm_fault.c, which branches into a soft fault (page resident, PTE missing: pmap_enter() installs the PTE), a hard fault (page not in memory: allocate a frame, page in from a vnode, the compressor, or swap, then pmap_enter), or a permission fault (write on a read-only mapping: COW copy and install RW, else SIGBUS). On entry, vm_fault knows the faulting virtual address, the kind of access (R/W/X), the task's vm_map, and the task's pmap. Most faults on a hot Mac are soft, because pages live across multiple tasks via the shared cache and fork COW.]

A 16 GB Mac runs a few hundred processes. Each one sees a 64-bit virtual address space and acts like it owns it. If the kernel allocated physical memory eagerly for every page each process touched, the machine would die in seconds. It doesn't, because XNU is doing something interesting at every layer between the process's pointer and the actual DRAM row.

This is the article about those layers.

Two halves: machine-independent and machine-dependent

XNU splits virtual memory into:

  • vm_map — the machine-independent, per-task description: "process X has these ranges mapped, with these permissions, backed by these objects."
  • pmap — the machine-dependent layer: "process X's pmap is currently using these page tables on this architecture."

The split dates back to the original Mach design. vm_map is the policy and bookkeeping; pmap manages the hardware page tables and is the layer that knows about TTBR0_EL1 on arm64 or CR3 on x86_64.

apple-oss-distributions/xnu, osfmk/vm/vm_map.h: vm_map_t, the per-task list of mapped regions and their attributes.
apple-oss-distributions/xnu, osfmk/arm/pmap.c: the arm64 pmap. Read this if you want to see the ARMv8.4 translation regime up close.
apple-oss-distributions/xnu, osfmk/i386/pmap.c: the x86_64 pmap.

A process's view of memory is its vm_map. The CPU's view is the pmap. Keeping these in sync is most of what the kernel does on every fault.

What's in a vm_map

A vm_map is a sorted list of vm_map_entry records. Each entry covers a contiguous virtual range and points at the vm_object that backs it. The vm_object knows where the pages live: in physical memory, in the compressor, in a file (for mmap), or in another object it was derived from by copy-on-write.

apple-oss-distributions/xnu, osfmk/vm/vm_object.h: vm_object, the source of truth for what's at a page if it's not currently resident.
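To make the "sorted list of entries" idea concrete, here is a minimal toy model, not XNU code. The names Entry, ToyVmMap, and the backing labels are hypothetical stand-ins for vm_map_entry, vm_map, and vm_object; the real structures carry far more state.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a vm_map_entry."""
    start: int    # first address covered (inclusive)
    end: int      # one past the last address covered (exclusive)
    prot: str     # permissions, e.g. "r-x" or "rw-"
    backing: str  # label standing in for the vm_object


class ToyVmMap:
    """Keeps non-overlapping entries sorted by start address."""

    def __init__(self):
        self.entries = []

    def insert(self, entry):
        starts = [e.start for e in self.entries]
        i = bisect_right(starts, entry.start)
        # Refuse ranges that overlap a neighbor on either side.
        if i > 0 and self.entries[i - 1].end > entry.start:
            raise ValueError("overlaps previous entry")
        if i < len(self.entries) and entry.end > self.entries[i].start:
            raise ValueError("overlaps next entry")
        self.entries.insert(i, entry)

    def lookup(self, addr):
        # Find the last entry whose start is <= addr, then check its end.
        starts = [e.start for e in self.entries]
        i = bisect_right(starts, addr) - 1
        if i >= 0 and self.entries[i].start <= addr < self.entries[i].end:
            return self.entries[i]
        return None


m = ToyVmMap()
m.insert(Entry(0x1000, 0x3000, "r-x", "vnode:/bin/ls"))
m.insert(Entry(0x8000, 0x9000, "rw-", "anonymous"))
hit = m.lookup(0x2ABC)   # falls inside the first entry
miss = m.lookup(0x5000)  # falls in the hole between entries
```

A lookup that returns nothing corresponds to an address the task never mapped; a lookup that returns an entry tells the fault handler which backing object to consult.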

This is why mmap(2) is cheap: you're creating a vm_map_entry pointing at a vm_object whose backing is a vnode. The pages don't materialize until you touch them. The first read of mmap'd memory faults, the fault handler consults the vm_object, the file's page comes in, the pmap gets an entry, and now the CPU can read it directly next time.

apple-oss-distributions/xnu, osfmk/vm/vm_fault.c: vm_fault, the page-fault entry point. Every demand-paged access lands here, whether the fault turns out to be soft or hard.
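The lazy behavior of a file mapping can be observed from userspace. The sketch below uses only Python's standard mmap and resource modules: it maps a temporary file, touches one byte per page, and reports how many page faults the process took while doing so. It assumes a Unix-like system; the 64-page file size is arbitrary, and because the file was just written, the pages are probably still cached and the faults are likely to be soft ones.

```python
import mmap
import os
import resource
import tempfile

PAGE = mmap.PAGESIZE  # 16 KiB on Apple Silicon, 4 KiB on many other systems


def faults():
    """Total page faults (minor + major) taken by this process so far."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_minflt + ru.ru_majflt


# Build a 64-page file in which each page starts with its own index.
with tempfile.NamedTemporaryFile(delete=False) as f:
    for i in range(64):
        f.write(bytes([i]) + b"\0" * (PAGE - 1))
    path = f.name

with open(path, "rb") as f:
    # Creating the mapping reads nothing; it only records the range.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    before = faults()
    # Touch one byte in every page; pages are brought in on demand.
    first_bytes = [mm[i * PAGE] for i in range(64)]
    after = faults()
    mm.close()
os.unlink(path)

print("faults while touching 64 pages:", after - before)
```

The exact count varies by system and by read-ahead policy, so treat the number as an observation rather than a guarantee.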

Allocation in userspace: it's all vm_allocate

malloc, mmap of anonymous pages, the main stack, even pthread_create's thread stacks: they all bottom out at Mach's vm_allocate or its close relatives, which create a new vm_map_entry backed by an anonymous vm_object. The pages are zero-filled on first touch.

This unification is why on macOS the boundary between "stack", "heap", and "mmap region" is much fuzzier than the Linux mental model: they're the same kind of thing under the hood, distinguished only by the userspace allocator's intent.
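Zero-fill on first touch is easy to check from userspace. The sketch below asks for an anonymous mapping through Python's standard mmap module (which sits on top of the mmap system call, not on vm_allocate directly; how that request is routed inside the kernel is as described above) and confirms that untouched pages read back as zero. The page count is an arbitrary choice.

```python
import mmap

PAGE = mmap.PAGESIZE
NPAGES = 256  # arbitrary size for the demonstration

# fileno -1 requests an anonymous mapping: no file behind it.
anon = mmap.mmap(-1, NPAGES * PAGE)

# Every page reads back as zero the first time it is touched.
all_zero = all(anon[i * PAGE] == 0 for i in range(NPAGES))

# Writing to one page does not disturb its neighbors.
anon[0] = 0x2A
first_byte = anon[0]
second_page_byte = anon[PAGE]

anon.close()
```

Nothing here distinguishes "heap" from "stack" from "mmap region": the program just asked for a range of anonymous pages.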

Faults: the kernel's busiest path

Three kinds:

  1. Soft fault. The page is mapped in the vm_map but not in the pmap. The kernel installs a pmap entry pointing at the already-resident page. Fast.
  2. Hard fault. The page is backed by a vm_object but the page isn't in memory — it's in a file, in the compressor, or zero-fill on first touch. The kernel materializes it, then installs the pmap entry.
  3. Permission fault. The mapping is read-only and the process tried to write. For COW pages: copy, install writable mapping. For genuinely read-only mappings: signal SIGBUS.
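The three cases above can be sketched as a small decision function. This is a toy model of the taxonomy only, not the logic of vm_fault; the Mapping fields and the returned labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mapping:
    """Hypothetical summary of what the kernel knows about one page."""
    writable: bool  # the current mapping permits writes
    cow: bool       # the page is shared copy-on-write
    resident: bool  # the page is in physical memory
    in_pmap: bool   # a PTE for the page is already installed


def classify_fault(mapping: Optional[Mapping], is_write: bool) -> str:
    if mapping is None:
        return "unmapped"  # no vm_map entry covers the address
    if is_write and not mapping.writable:
        # Permission fault: copy for COW, otherwise signal the process.
        return "cow copy" if mapping.cow else "SIGBUS"
    if not mapping.resident:
        return "hard fault"  # must page in or zero-fill first
    if not mapping.in_pmap:
        return "soft fault"  # page is there, only the PTE is missing
    return "no fault"


shared_lib_page = Mapping(writable=False, cow=False, resident=True, in_pmap=False)
forked_heap_page = Mapping(writable=False, cow=True, resident=True, in_pmap=True)
compressed_page = Mapping(writable=True, cow=False, resident=False, in_pmap=False)

results = [
    classify_fault(shared_lib_page, is_write=False),
    classify_fault(forked_heap_page, is_write=True),
    classify_fault(compressed_page, is_write=False),
    classify_fault(shared_lib_page, is_write=True),
]
```

The order of the checks is a design choice of this sketch: permissions are tested before residency so that a disallowed write is rejected without paging anything in.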

Read vm_fault once and you'll have a much clearer model of why a 50ms stutter sometimes happens on otherwise-idle code: it's the path from an unmapped pointer to an installed PTE, and it's not always fast.

The compressor: macOS's swap-without-disk

Modern macOS doesn't want to swap to SSD unless it has to. SSDs have finite write endurance, and swapping is a write-amplification festival. So XNU has a compressor: a pool of in-RAM compressed pages that acts like a "soft swap." When memory gets tight, the kernel picks rarely-used pages, compresses them with WKdm into the compressor's pool, and frees the physical frames they occupied.

apple-oss-distributions/xnu, osfmk/vm/vm_compressor.c: the compressor. WKdm, LZ4, and the policy that decides when to compress.

The win: typical Mac workloads see 2–4× compression ratios, so the kernel reclaims half to three quarters of the memory those pages occupied without ever touching the disk. The cost: compression is CPU work, and decompressing on access adds latency.
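The "half to three quarters" figure follows directly from the ratio: a page compressed by a factor of r still occupies 1/r of its original size, so the fraction reclaimed is 1 - 1/r. A minimal sketch of that arithmetic (the function name is just for illustration):

```python
def reclaimed_fraction(ratio: float) -> float:
    """Fraction of the original memory freed at a given compression ratio."""
    if ratio < 1:
        raise ValueError("a compression ratio below 1 would grow the data")
    return 1 - 1 / ratio


for ratio in (2, 3, 4):
    print(f"{ratio}x compression reclaims {reclaimed_fraction(ratio):.0%}")
```

The returns diminish quickly: going from 2× to 4× only moves the reclaimed share from one half to three quarters.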

You can see it live in Activity Monitor → Memory, in the "Memory Used" and "Compressed" figures at the bottom of the window. When Compressed starts climbing, the compressor is doing its job. When it climbs and "Swap Used" is also climbing, the compressor can no longer keep up on its own and the kernel is finally writing actual swap files.

apple-oss-distributions/xnu, osfmk/vm/vm_compressor_swap_default.c: the fallback when in-memory compression isn't enough. Actual swap file management.

Jetsam: the memory pressure killer

Run a Mac short on RAM and you'll eventually see the kernel terminate processes. That's jetsam (named after cargo thrown overboard to lighten a ship). Each process has a jetsam priority, set by launchd, that places it in a band reflecting its role (foreground app, background daemon, helper); when available memory drops below the kernel's thresholds, the kernel kills processes from the lowest-priority band first and works its way up.

apple-oss-distributions/xnu, bsd/kern/kern_memorystatus.c: memorystatus. Jetsam, memory bands, kill policy. The kernel side of 'Mac ran out of memory'.
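As a rough illustration of "lowest priority goes first", here is a toy victim picker. It is a sketch under simplifying assumptions, not the policy in kern_memorystatus.c: the Proc fields, the priority numbers, and the footprints are all hypothetical, and the real code weighs many more factors.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    name: str
    priority: int      # hypothetical scale: higher means more important
    footprint_mb: int  # memory that killing the process would free


def pick_victims(procs, need_mb):
    """Return names to kill, lowest priority first, until need_mb is freed."""
    victims, freed = [], 0
    for p in sorted(procs, key=lambda p: p.priority):
        if freed >= need_mb:
            break
        victims.append(p.name)
        freed += p.footprint_mb
    return victims


procs = [
    Proc("foreground-app", priority=10, footprint_mb=2000),
    Proc("helper", priority=0, footprint_mb=200),
    Proc("background-daemon", priority=3, footprint_mb=150),
]

victims = pick_victims(procs, need_mb=300)
```

With these made-up numbers the helper and the background daemon are sacrificed and the foreground app survives, which is the outcome the priority scheme is meant to produce.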

This is the same machinery iOS uses to keep the foreground app alive when memory gets tight. On a Mac with abundant RAM you'll never trip it; on an 8 GB Mac running heavy creative apps, you will, and you'll see the apps relaunch when you return to them.

Why the arm64 transition mattered for VM

When Apple Silicon shipped, two things changed for the VM subsystem:

  1. Unified memory architecture. The GPU and CPU share the same physical RAM, with the same page tables in a lot of paths. IOSurface allocations don't get copied to a discrete GPU's VRAM because there isn't one.
  2. APRR / SPRR. Apple Silicon added a hardware mechanism for fast switching between read/write and read/execute permissions on the same page. This is what makes JIT compilers (JavaScriptCore, Rosetta 2's translation cache) cheap on the M-series — flipping a page from W to X used to cost a TLB shootdown across cores; APRR makes it almost free.

apple-oss-distributions/xnu, osfmk/arm/pmap.c: search this file for APRR / SPRR to see how XNU manages Apple-specific page permission bits.

Walk a fault from start to finish:

  1. osfmk/arm64/sleh.c on arm64 (or osfmk/i386/trap.c on x86_64): the architecture's fault entry into the kernel.
  2. vm_fault.c: the dispatcher.
  3. vm_object.c: the page lookup.
  4. pmap.c: installing the final PTE.

That single trace is the spinal cord of the kernel's memory subsystem. Once you've walked it, every other memory feature — COW fork, mmap, JIT regions, the compressor's reclaim — is a variation on the same theme.

And the Mach IPC article is the natural follow-up: out-of-line message descriptors are how VM regions move between tasks. Same primitive, different message id.
