Kernel synchronization in XNU: locks, atomics, and lock-free patterns

Every nontrivial kernel data structure needs concurrency control. XNU offers a small, opinionated set of synchronization primitives (fewer than Linux, more than a typical microkernel), and each has a specific niche. This article walks through them in order of cost and use case, then covers lock groups, the classification system behind XNU's lock statistics and debug-build checks.

The four primitives

The XNU primitives, from cheapest to most expensive:

Atomic operations (OSCompareAndSwap, atomic_store_explicit, etc.) — single-instruction read-modify-write, no lock at all.
Spinlocks (lck_spin_t) — bare-metal busy-wait. Cheap if uncontended; brutal under contention.
Mutexes (lck_mtx_t) — block-and-wait when contended. The default for most kernel data.
Reader-writer locks (lck_rw_t) — many readers OR one writer. Useful when reads dominate.

osfmk/kern/locks.h (apple-oss-distributions/xnu): the lock type declarations (lck_spin_t, lck_mtx_t, lck_rw_t) plus the group infrastructure.
osfmk/kern/locks.c (apple-oss-distributions/xnu): the actual implementation, with lock/unlock fast paths, contended slow paths, and debug instrumentation.

Atomic operations — when no lock is needed

For simple counters, flags, and pointer swaps, you don't need a lock. The CPU's atomic instructions (compare-and-swap, fetch-and-add, exclusive load/store on ARM) are enough.

XNU exposes these via:

os/atomic.h: modern C11-style os_atomic_* wrappers, preferred for new code.
libkern/OSAtomic.h: older XNU-specific names (OSCompareAndSwap, OSAddAtomic), still everywhere in legacy code.

A reference count, for example:

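os_atomic_inc(&obj->refcount, relaxed);   // increment, no ordering needed
if (os_atomic_dec(&obj->refcount, acq_rel) == 0) {  // release our stores, acquire everyone else's
    free(obj);
}

The release half of the decrement publishes this thread's earlier stores to obj before it gives up its reference. The acquire half ensures that the thread that sees the count reach 0 observes every other thread's stores before it frees the object; release ordering alone would not order the free after them.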
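The same reference-counting pattern can be sketched in portable C11, which makes it possible to try outside the kernel. This is a user-space analogue under stated assumptions, not XNU code: struct object, object_create, object_retain, and object_release are hypothetical names, and malloc/free stand in for a kernel allocator. Note that atomic_fetch_sub returns the previous value, so the "last reference" test compares against 1 rather than 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted object; not an XNU type. */
struct object {
    atomic_uint refcount;
    int payload;
};

static struct object *object_create(int payload) {
    struct object *obj = malloc(sizeof(*obj));
    if (obj == NULL) {
        return NULL;
    }
    atomic_init(&obj->refcount, 1);   /* the creator holds the first reference */
    obj->payload = payload;
    return obj;
}

static void object_retain(struct object *obj) {
    /* Relaxed is enough: the caller already holds a reference. */
    atomic_fetch_add_explicit(&obj->refcount, 1, memory_order_relaxed);
}

/* Returns true if this call dropped the last reference and freed obj. */
static bool object_release(struct object *obj) {
    /* fetch_sub returns the previous value: 1 means we were the last holder.
     * acq_rel publishes our stores and observes everyone else's before free. */
    if (atomic_fetch_sub_explicit(&obj->refcount, 1, memory_order_acq_rel) == 1) {
        free(obj);
        return true;
    }
    return false;
}
```

A caller pairs every object_retain with exactly one object_release; only the call that returns true may assume the object is gone.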

Lock-free programming with atomics is hard to get right — the memory ordering specifiers (relaxed, acquire, release, seq_cst) need careful thinking. The kernel uses them in hot paths (zone allocator's per-CPU caches, scheduler runqueues), but most kernel code reaches for a lck_mtx first.

Spinlocks — for very short critical sections

A lck_spin_t is a busy-wait lock. A thread trying to acquire it spins on an atomic flag until it can claim the lock. While spinning:

The CPU is not available for other work.
The holder of the lock had better be running on another CPU; if it's been preempted, the spinner is wasting time on a thread that's not making progress.

For these reasons, spinlocks are appropriate only when:

The critical section is very short (a few dozen instructions).
The holder won't block, sleep, or take any other lock that might block.
You're at a priority level where you can't sleep anyway (interrupt context, scheduler internals).

The kernel's top-half interrupt handlers (see the interrupt handling article) often need spinlocks because they run with interrupts disabled and can't sleep. The scheduler itself uses spinlocks to protect runqueue manipulation.

lck_spin_lock(&proc_list_spin);
// short, no-sleep critical section
lck_spin_unlock(&proc_list_spin);
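To make "spins on an atomic flag" concrete, here is a minimal test-and-set spinlock in portable C11. It is only a sketch of the idea: toy_spin_t and its functions are hypothetical names, and the real lck_spin_t adds preemption control, debugging hooks, and architecture-specific waiting that this version leaves out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space spinlock; XNU's lck_spin_t is more elaborate. */
typedef struct {
    atomic_flag held;
} toy_spin_t;

static void toy_spin_init(toy_spin_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_relaxed);
}

static bool toy_spin_try_lock(toy_spin_t *l) {
    /* test_and_set returns the previous value: false means we claimed it. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void toy_spin_lock(toy_spin_t *l) {
    while (!toy_spin_try_lock(l)) {
        /* Busy-wait: this CPU does no useful work until the holder releases. */
    }
}

static void toy_spin_unlock(toy_spin_t *l) {
    /* Release ordering publishes the critical section's stores. */
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The empty loop body is the whole cost model: every failed attempt is CPU time spent waiting, which is why the critical section has to be short.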

Mutexes — the default

lck_mtx_t is the workhorse. Threads that can't acquire it block and sleep until the holder releases. Unlike with a spinlock, the scheduler can run other work on that CPU while the thread waits.

lck_mtx_lock(&my_data_lock);
// arbitrarily long critical section, may sleep, may take other locks
lck_mtx_unlock(&my_data_lock);

Most kernel data structures protect themselves with a lck_mtx_t. The proc table, the vnode list, IOKit objects, file descriptors — all guarded by mutexes.

XNU mutexes have a few interesting properties:

Adaptive spinning: a short spin attempt before blocking, on the theory that the holder might release very soon and avoiding a context switch wins.
Priority inheritance: a low-priority thread holding a mutex contended by a high-priority thread gets boosted to the higher priority for the duration. Prevents priority inversion.
Lock ordering tracking (debug builds): the kernel records the order locks are acquired and panics if it ever sees a reverse-order acquisition that could deadlock.
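The adaptive-spinning idea from the list above can be sketched with POSIX mutexes: try a bounded number of non-blocking acquisitions, then fall back to a blocking lock. This is an illustration only. adaptive_lock and SPIN_ATTEMPTS are hypothetical, the spin bound is an arbitrary value, and the heuristic XNU actually uses to decide when to stop spinning is more involved than a fixed count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define SPIN_ATTEMPTS 100  /* hypothetical tuning value, not taken from XNU */

/* Returns true if the lock was acquired during the spin phase,
 * false if the caller had to block. Either way the lock is held on return. */
static bool adaptive_lock(pthread_mutex_t *m) {
    for (int i = 0; i < SPIN_ATTEMPTS; i++) {
        if (pthread_mutex_trylock(m) == 0) {
            return true;       /* acquired without a context switch */
        }
    }
    pthread_mutex_lock(m);     /* holder is taking too long: block and sleep */
    return false;
}
```

The trade-off is the one the text describes: a short spin costs a few wasted cycles when the holder is slow, but saves a full block-and-wake round trip when the holder releases quickly.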

Reader-writer locks — when reads dominate

lck_rw_t allows either:

Many threads holding the lock as readers simultaneously, or
One thread holding it as a writer, with no readers.

Useful when a data structure is read often and written rarely. The vnode name cache, the IORegistry tree, the network routing table — all read-heavy and protected by rwlocks.

lck_rw_lock_shared(&cache_rw);
// read-only access; multiple readers can be here at once
lck_rw_unlock_shared(&cache_rw);

lck_rw_lock_exclusive(&cache_rw);
// exclusive write access; readers and other writers blocked
lck_rw_unlock_exclusive(&cache_rw);

An rwlock costs more than a mutex (more state to track, more complex contention handling). For data that is read rarely or written often, a mutex is cheaper.

Lock groups — XNU's lock classification system

Every lock in XNU belongs to a lock group (lck_grp_t). The group is created once at module init:

lck_grp_t *my_grp = lck_grp_alloc_init("MySubsystem", LCK_GRP_ATTR_NULL);
lck_mtx_init(&my_mtx, my_grp, LCK_ATTR_NULL);

The group serves several purposes:

Per-group statistics: contention counts, hold times. Visible via lockstat and kernel telemetry.
Deadlock detection in debug builds: lock acquisition order is recorded per-group, and a cross-group ordering violation can trigger a panic.
Reset / cleanup: tearing down a subsystem can destroy all its locks via the group.

lockstat(1) is the diagnostic tool. When you're investigating "why is this kernel slow", the list of most-contended lock groups is often the fastest route to an answer.
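One way to picture per-group statistics is a set of counters shared by every lock that names the same group. The following user-space model is a sketch under that assumption: struct toy_grp, struct toy_mtx, and their functions are hypothetical, and the layout says nothing about how lck_grp_t is actually implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a lock group: a name plus shared counters. */
struct toy_grp {
    const char *name;
    atomic_ulong acquisitions;
    atomic_ulong contended;
};

/* A mutex that remembers which group it reports to. */
struct toy_mtx {
    pthread_mutex_t m;
    struct toy_grp *grp;
};

static void toy_grp_init(struct toy_grp *grp, const char *name) {
    grp->name = name;
    atomic_init(&grp->acquisitions, 0);
    atomic_init(&grp->contended, 0);
}

static void toy_mtx_init(struct toy_mtx *mtx, struct toy_grp *grp) {
    pthread_mutex_init(&mtx->m, NULL);
    mtx->grp = grp;
}

static void toy_mtx_lock(struct toy_mtx *mtx) {
    if (pthread_mutex_trylock(&mtx->m) != 0) {
        /* Fast path failed: count one contention event against the group. */
        atomic_fetch_add_explicit(&mtx->grp->contended, 1, memory_order_relaxed);
        pthread_mutex_lock(&mtx->m);
    }
    atomic_fetch_add_explicit(&mtx->grp->acquisitions, 1, memory_order_relaxed);
}

static void toy_mtx_unlock(struct toy_mtx *mtx) {
    pthread_mutex_unlock(&mtx->m);
}
```

Because the counters live in the group rather than in each lock, a tool only has to walk the groups to rank subsystems by contention.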

IRQ-safe variants

Spinlocks come in two flavors:

lck_spin_lock: regular. The caller must ensure the lock is never taken from interrupt context.
lck_spin_lock_grp_irq_set: disables interrupts while held. Required if both an interrupt handler and thread context can take the lock. Without the IRQ disable, a handler that interrupts the holding thread and tries to take the same lock spins forever on that CPU, because the holder can never run to release it.

This concern doesn't apply to mutexes — they can't be taken in interrupt context anyway, so there's no risk of an interrupt taking a mutex the current thread holds.

The cost ladder

Rough cost on a modern Apple Silicon Mac:

Primitive	Uncontended	Contended (high load)
Atomic op	~1-5 cycles	depends on cache-line contention
Spinlock	~10-20 cycles	spins, burning CPU
Mutex	~30-50 cycles	block + reschedule
RWLock	~50-100 cycles	readers share; writers block + reschedule

The right choice depends on how often you contend and how long the critical section is. The kernel has thousands of locks; the choice matters at scale.

Lock-free patterns in the wild

A few places where XNU avoids locks via clever data structure design:

Per-CPU zone caches: each CPU has its own free-list; allocations rarely cross CPUs, so the hot path is lock-free. See the kernel allocators article.
RCU-style patterns: readers see a consistent view via load-acquire on a pointer; writers swap the pointer with release ordering, deferring deallocation of the old object until no reader can be using it.
Lock-free queues: for high-frequency producer/consumer hand-offs where taking a lock on every enqueue would dominate the cost.

These patterns are hard to write correctly. XNU uses them strategically — most code is mutex-protected, but the few percent that's hot is lock-free.
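The RCU-style pointer swap is the easiest of these to show in a few lines. The sketch below is a C11 illustration under stated assumptions: struct config, reader_get_mtu, and writer_publish are hypothetical names, and the hard part (deciding when the old object is safe to free) is deliberately left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical immutable snapshot that readers consult. */
struct config {
    int mtu;
};

/* Static storage, so this starts out as a null pointer. */
static _Atomic(struct config *) current_config;

/* Reader: the load-acquire pairs with the writer's release, so the fields
 * of the returned object are fully initialized by the time we read them. */
static int reader_get_mtu(void) {
    struct config *c = atomic_load_explicit(&current_config, memory_order_acquire);
    return c != NULL ? c->mtu : -1;
}

/* Writer: publish a fully built object and get the previous one back.
 * The caller must defer freeing the old object until no reader can still
 * hold it; that grace-period logic is not shown here. */
static struct config *writer_publish(struct config *next) {
    return atomic_exchange_explicit(&current_config, next, memory_order_acq_rel);
}
```

Readers never block and never write to shared state, which is what makes the pattern attractive for read-heavy hot paths.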

What surprises newcomers

Spinlocks are for very specific cases. Default to lck_mtx_t; reach for spin only when you can't block (interrupt context, scheduler internals).
Priority inheritance is built into mutexes. A low-QoS thread holding a mutex needed by a high-QoS one gets boosted. This is how QoS overrides propagate through lock chains too.
Lock groups are mandatory. You can't create a lock without naming a group; the group is what gives lockstat visibility.
Reverse-order lock acquisition panics in debug builds. XNU tracks order per group; an "always take A before B" rule, violated, will panic with a clear message in development kernels.

What to read next

osfmk/kern/lck_attr.c (apple-oss-distributions/xnu): lock attribute objects, covering debug flags, contention counting, and priority inheritance options.
osfmk/kern/sched_prim.c (apple-oss-distributions/xnu): the scheduler, a heavy user of spinlocks on the runqueue and per-processor structures.

Also see the context switch walkthrough: many of the locks taken on XNU's hottest paths live in the scheduler.