Kernel memory allocators in XNU: zalloc, kalloc, slab
The kernel's own malloc — a hierarchy of zone allocators, the kalloc heap, and slab caches for specific types. Different from user-side VM, and just as important.
The VM articles so far have covered how XNU manages memory for userspace — vm_map, pmap, the compressor, jetsam. The kernel needs to allocate its own data structures too: proc records, vm_objects, ipc_kmsgs, every single object you've encountered in this series. That's a separate allocator tier.
This article walks XNU's kernel-side allocator stack: the zone allocator at the bottom, kalloc as the general-purpose heap on top, and the per-type slab caches that make hot-path allocation cheap.
The page is the unit, but no one allocates pages directly
The lowest VM primitive is the page (4 KB or 16 KB depending on architecture). The kernel rarely allocates raw pages for itself — page granularity is too coarse for most kernel data structures (a vnode is a few hundred bytes; an ipc_kmsg even smaller).
So XNU stacks two more layers on top:
raw pages (vm_page)
↓ wholesale-to-retail
zone allocator (zalloc) — per-type slab caches
↓ general-purpose facade
kalloc — variable-size allocations, backed by zones
Drivers, the IPC subsystem, the scheduler — they all allocate from these layers. Tools like zprint(1) let you see every zone live.
Zone allocator (zalloc) — the per-type slab
A zone is a slab cache for objects of a single size or type. Each zone:
- Owns a list of pages it has allocated from the VM.
- Carves those pages into fixed-size elements (the type's size).
- Maintains a free-list of available elements.
- Allocates / frees in O(1) by pushing/popping the free-list head.
apple-oss-distributions/xnu, osfmk/kern/zalloc.c: The zone allocator, XNU's foundational kernel allocator.
apple-oss-distributions/xnu, osfmk/kern/zalloc.h: zone_t and the zalloc / zfree / zone_create APIs.
When a subsystem needs to allocate many objects of the same type, it creates a dedicated zone at boot:
ipc_object_zones[IOT_PORT] = zone_create("ipc ports", sizeof(struct ipc_port), …);
Then it uses zalloc(zone) / zfree(zone, ptr) for every allocation. Hot path is a free-list pop — a handful of instructions.
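The free-list mechanics can be sketched in userspace C. This is a minimal illustration, not XNU's implementation: the toy_ names are hypothetical, malloc stands in for a VM page, a 4 KB page size is assumed, and the sketch never returns pages to the system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096u   /* assumption: 4 KB pages */

/* A free element stores the link to the next free element in its own bytes. */
struct toy_free_elem {
    struct toy_free_elem *next;
};

/* Hypothetical miniature zone: one element size, one free list. */
struct toy_zone {
    const char           *name;
    size_t                elem_size;   /* rounded up to pointer alignment */
    struct toy_free_elem *free_head;   /* LIFO list of free elements */
    size_t                cur_count;   /* elements currently handed out */
    size_t                page_count;  /* pages obtained so far */
};

static void toy_zone_init(struct toy_zone *z, const char *name, size_t elem_size)
{
    size_t align = sizeof(struct toy_free_elem);
    assert(elem_size > 0 && elem_size <= TOY_PAGE_SIZE);
    z->name = name;
    z->elem_size = (elem_size + align - 1) / align * align;
    z->free_head = NULL;
    z->cur_count = 0;
    z->page_count = 0;
}

/* Slow path: obtain one page and carve it into fixed-size elements. */
static int toy_zone_grow(struct toy_zone *z)
{
    unsigned char *page = malloc(TOY_PAGE_SIZE);  /* stand-in for a VM page */
    if (page == NULL) {
        return -1;
    }
    for (size_t off = 0; off + z->elem_size <= TOY_PAGE_SIZE; off += z->elem_size) {
        struct toy_free_elem *e = (struct toy_free_elem *)(void *)(page + off);
        e->next = z->free_head;
        z->free_head = e;
    }
    z->page_count++;
    return 0;
}

/* Fast path: pop the free-list head; grow only when the list is empty. */
static void *toy_zalloc(struct toy_zone *z)
{
    if (z->free_head == NULL && toy_zone_grow(z) != 0) {
        return NULL;
    }
    struct toy_free_elem *e = z->free_head;
    z->free_head = e->next;
    z->cur_count++;
    return e;
}

/* Free: push the element back onto the free-list head. */
static void toy_zfree(struct toy_zone *z, void *ptr)
{
    struct toy_free_elem *e = ptr;
    e->next = z->free_head;
    z->free_head = e;
    z->cur_count--;
}
```

Because the list is LIFO, an element freed and then immediately reallocated comes back at the same address. That is cache-friendly, and it is also why a quarantine that delays reuse helps expose use-after-free bugs.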
Why a separate zone per type? A few reasons:
- Predictable size — every element in the zone is exactly N bytes, no fragmentation.
- Locality — elements of the same type cluster on the same pages; cache behavior is better.
- Quarantine and use-after-free detection — XNU's zone allocator has a "Guard" mode that doesn't immediately reuse freed elements, helping catch UAF bugs.
- Per-zone statistics: zprint shows you exactly how much memory each subsystem is using.
zprint output looks like:
zone name     elt_size    cur_size    max_size    cur_count    ...
ipc ports           88      524288           ∞         5957
vnodes             272     3211264           ∞        11800
threads           1024     4194304           ∞         4096
This is the most useful kernel-memory observability tool on the system. If a kernel data structure is leaking, you'll see its zone's cur_count climbing.
kalloc — the general-purpose heap
Not every allocation has a dedicated zone. For one-off allocations of arbitrary size — kernel strings, temporary buffers, the kernel's equivalent of void *p = malloc(n) — XNU provides kalloc:
kalloc(n) rounds n up to the next power-of-two-ish bucket and allocates from the bucket's underlying zone. There's a zone for 16-byte allocations, one for 32-byte, one for 64-byte, and so on up to a largest bucket in the tens of kilobytes (the exact ceiling depends on the XNU version and platform). Beyond that, kalloc_large allocates whole pages from the VM.
char *buf = kalloc(128); // rounds up to 128-byte bucket
kfree(buf, 128); // caller has to remember the size!
Note the awkward kfree(ptr, size) — caller passes the original allocation size. This is so kfree knows which bucket to return to. Compared to libc free(ptr), which uses metadata to find the size, kernel kfree opts for "caller pays the bookkeeping" to keep allocation overhead low.
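The size-to-bucket mapping, and the reason the size alone is enough to find the bucket again on free, can be sketched as follows. The bucket table and the toy_ names are assumptions for illustration; the real table differs between XNU versions and platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bucket sizes, for illustration only. */
static const size_t toy_kalloc_buckets[] = {
    16, 32, 48, 64, 96, 128, 256, 512, 1024, 2048, 4096, 8192
};
#define TOY_BUCKET_COUNT \
    (sizeof(toy_kalloc_buckets) / sizeof(toy_kalloc_buckets[0]))

/*
 * Returns the index of the smallest bucket that fits n bytes, or -1 when
 * n is zero or larger than every bucket (the "large allocation" path).
 */
static int toy_kalloc_bucket_index(size_t n)
{
    if (n == 0) {
        return -1;
    }
    for (size_t i = 0; i < TOY_BUCKET_COUNT; i++) {
        if (n <= toy_kalloc_buckets[i]) {
            return (int)i;
        }
    }
    return -1;
}

/* Returns the rounded-up allocation size, or 0 when no bucket fits. */
static size_t toy_kalloc_bucket_size(size_t n)
{
    int i = toy_kalloc_bucket_index(n);
    return i < 0 ? 0 : toy_kalloc_buckets[i];
}
```

An allocation of 100 bytes and a later free that passes size 100 both resolve to the same bucket, so no per-allocation header is needed. The cost is that a wrong size on free would return the element to the wrong bucket.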
Why two layers?
You might ask: if kalloc is just a facade over zones, why isn't everything kalloc?
The answer is correctness and performance:
- Type-safety with named zones: when an ipc_port allocation comes out of ipc_object_zones[IOT_PORT], the kernel knows everything in that zone is an ipc_port. UAF or type confusion is detectable.
- Per-type telemetry: when you ship a kernel with thousands of zones, you can answer "where is memory going?" precisely. zprint shows exactly which subsystem is bloating.
- No size argument on free: zone allocation knows its element size; zfree doesn't need the caller to track it.
So: subsystems that allocate many instances of a single type use a dedicated zone. Subsystems that allocate one-off buffers use kalloc.
Wired vs unwired
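Kernel allocations from zones and kalloc are wired: they live in physical RAM and are never paged out. The kernel's core data structures can't be paged because the pageout path itself needs to allocate kernel memory; circular dependencies would deadlock the system. (A few explicitly pageable kernel allocations exist, such as IOMallocPageable, but they are the exception.)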
This is why kernel memory growth shows up as "wired memory" in Activity Monitor. A kernel leak doesn't compress, doesn't swap, doesn't trigger jetsam (well — eventually it does indirectly, because no memory is left for userspace).
The IOMalloc family
IOKit drivers use a slightly different facade — IOMalloc(n) / IOFree(ptr, n) — which under the hood calls kalloc for small sizes and allocates pages directly for large sizes. The IO* wrappers also exist for contiguous physical memory, DMA-suitable memory, etc.
A DMA buffer needs to be contiguous physical memory (the device's DMA engine sees physical addresses, not virtual). IOMallocContiguous allocates such a buffer; you can't get this from kalloc, which makes no contiguity guarantees.
A leak walked
Suppose a driver leaks IOKit allocations. The path to diagnose:
- vm_stat shows wired memory climbing.
- zprint shows a specific zone's cur_count growing without bound.
- The zone name identifies the subsystem (e.g., IOMemoryDescriptor).
- Source code of that subsystem reveals which IOMalloc isn't paired with an IOFree.
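The zone-comparison step amounts to diffing two snapshots of per-zone counts. Here is a rough sketch of that comparison, assuming the counts have already been collected by hand or by a script; the toy_ struct and function are hypothetical and do not parse real zprint output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of a hypothetical snapshot: zone name and live element count. */
struct toy_zone_sample {
    const char   *name;
    unsigned long cur_count;
};

/*
 * Returns the index (in `after`) of the zone whose cur_count grew the most
 * between the two snapshots, or -1 if no zone grew. Zones are matched by
 * name; zones missing from `before` are ignored.
 */
static int toy_fastest_growing_zone(const struct toy_zone_sample *before,
                                    size_t n_before,
                                    const struct toy_zone_sample *after,
                                    size_t n_after)
{
    int best = -1;
    unsigned long best_growth = 0;

    for (size_t i = 0; i < n_after; i++) {
        for (size_t j = 0; j < n_before; j++) {
            if (strcmp(after[i].name, before[j].name) != 0) {
                continue;
            }
            if (after[i].cur_count > before[j].cur_count) {
                unsigned long growth = after[i].cur_count - before[j].cur_count;
                if (growth > best_growth) {
                    best_growth = growth;
                    best = (int)i;
                }
            }
            break;  /* names are unique within a snapshot */
        }
    }
    return best;
}
```

A single pair of snapshots only shows growth, not a leak. In practice you would compare several samples taken under steady load and look for a zone that never comes back down.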
Compared to userspace leaks (where the leak is in some app's heap and tools like leaks/malloc_history apply), kernel leaks are easier to attribute to a subsystem but harder to attribute to a call site — Apple's internal kernel has additional KASAN-style instrumentation for this; the shipping kernel doesn't.
Lock-free fast paths
The modern XNU zone allocator has per-CPU caches: each CPU keeps a small free-list of elements it can pop without taking a global lock. Only when the per-CPU cache empties does the slow path take the zone's lock, refill the local cache, and retry.
The result: high-frequency allocations (small ipc_kmsgs during a Mach storm, mbufs during heavy network I/O) are essentially lock-free in the common case.
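The fast-path/slow-path split can be modelled with a small single-threaded sketch. Everything here is an assumption for illustration: the toy_ names, the capacities, and the lock_acquisitions counter, which stands in for taking the zone lock. A real kernel implementation also has to pin the thread to its CPU while it touches the local cache.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAG_CAP   8     /* assumed per-CPU cache capacity */
#define TOY_DEPOT_CAP 64    /* assumed capacity of the shared pool */

/* Shared pool of free elements, protected by the (simulated) zone lock. */
struct toy_depot {
    void  *elems[TOY_DEPOT_CAP];
    size_t count;
    size_t lock_acquisitions;   /* how often the slow path was taken */
};

/* Per-CPU cache: a small private stack of free elements. */
struct toy_cpu_cache {
    void  *elems[TOY_MAG_CAP];
    size_t count;
};

static void *toy_cached_alloc(struct toy_cpu_cache *c, struct toy_depot *d)
{
    if (c->count == 0) {
        /* Slow path: "lock" the zone and refill the local cache. */
        d->lock_acquisitions++;
        while (c->count < TOY_MAG_CAP && d->count > 0) {
            c->elems[c->count++] = d->elems[--d->count];
        }
        if (c->count == 0) {
            return NULL;        /* shared pool is empty too */
        }
    }
    /* Fast path: pop from the private stack, no lock involved. */
    return c->elems[--c->count];
}

static void toy_cached_free(struct toy_cpu_cache *c, struct toy_depot *d, void *p)
{
    if (c->count == TOY_MAG_CAP) {
        /* Slow path: "lock" the zone and flush half of the local cache. */
        d->lock_acquisitions++;
        while (c->count > TOY_MAG_CAP / 2 && d->count < TOY_DEPOT_CAP) {
            d->elems[d->count++] = c->elems[--c->count];
        }
        /* Holds as long as only elements that came from the depot are freed. */
        assert(c->count < TOY_MAG_CAP);
    }
    /* Fast path: push onto the private stack. */
    c->elems[c->count++] = p;
}
```

In this model a burst of eight allocations followed by eight frees touches the simulated lock once, on the first refill. Every other operation stays on the local stack.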
apple-oss-distributions/xnu, osfmk/kern/zcache.c: Per-CPU zone caches, the fast path for hot allocators.
What surprises newcomers
- Most kernel allocations are slab-style, not heap-style. The "kernel heap" is mostly a thin facade over a tier of slab caches.
- kfree requires the size. This is intentional: it saves per-allocation bookkeeping.
- Essentially all kernel memory is wired. The kernel can't page out its own core state; that's why kernel leaks are catastrophic.
- zprint is your friend. If something on your Mac is bloating wired memory, zprint will tell you which subsystem.
What to read next
apple-oss-distributions/xnu, osfmk/kern/zone_internal.h: The internal zone structure (head/tail pointers, locks, statistics counters).
apple-oss-distributions/xnu, osfmk/vm/vm_kern.c: kmem_alloc, the very-low-level allocator that gives raw page ranges to zones and kalloc.
And re-read the virtual memory overview: once you've seen the kernel-side allocators, the user-side VM machinery reads as the half of the system that gets paged in for userspace.