Kernel synchronization in XNU: locks, atomics, and lock-free patterns

Spinlocks, mutexes, reader-writer locks, and lock-class groups — the synchronization primitives XNU offers, when each is appropriate, and how the per-CPU caches stay fast under contention.

Published 6 min read
[Figure: XNU kernel synchronization primitives. Comparison of XNU's four kernel synchronization primitives (atomic operations, spinlocks, mutexes, and reader-writer locks), ordered from cheapest to most expensive, with the cost, typical use, and main caveat of each.]

Every nontrivial kernel data structure needs concurrency control. XNU offers a small, opinionated set of synchronization primitives (fewer than Linux, more than a microkernel), and each has a specific niche. This article walks through them in order of cost and use case, then covers the lock group system that helps catch deadlocks and track contention.

The four primitives

The XNU primitives, from cheapest to most expensive:

  • Atomic operations (OSCompareAndSwap, atomic_store_explicit, etc.): a single atomic load, store, or read-modify-write, with no lock at all.
  • Spinlocks (lck_spin_t) — bare-metal busy-wait. Cheap if uncontended; brutal under contention.
  • Mutexes (lck_mtx_t) — block-and-wait when contended. The default for most kernel data.
  • Reader-writer locks (lck_rw_t) — many readers OR one writer. Useful when reads dominate.

Source: apple-oss-distributions/xnu, osfmk/kern/locks.h. The lock type declarations: lck_spin_t, lck_mtx_t, lck_rw_t, plus the group infrastructure.
Source: apple-oss-distributions/xnu, osfmk/kern/locks.c. The actual implementation: lock/unlock fast paths, contended slow paths, debug instrumentation.

Atomic operations — when no lock is needed

For simple counters, flags, and pointer swaps, you don't need a lock. The CPU's atomic instructions (compare-and-swap, fetch-and-add, exclusive load/store on ARM) are enough.

XNU exposes these via:

  • os/atomic.h: modern C11-style os_atomic_* operations, preferred for new code.
  • libkern/OSAtomic.h — older XNU-specific names (OSCompareAndSwap, OSAddAtomic), still everywhere in legacy code.

A reference count, for example:

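os_atomic_inc(&obj->refcount, relaxed);   // increment, no ordering needed
if (os_atomic_dec(&obj->refcount, release) == 0) {  // decrement with release barrier
    os_atomic_thread_fence(acquire);  // pair with the release decrements of other threads
    free(obj);
}

The release ordering on each decrement ensures that a thread's earlier stores to obj are visible before its reference is dropped. The acquire fence on the path that sees the count reach zero pairs with those releases, so every other thread's writes to obj happen-before the free.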
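
The same reference-counting pattern can be sketched in portable C11 for experimentation outside the kernel. This is a user-space approximation, not XNU code: struct object, object_create, object_retain, and object_release are hypothetical names, and malloc/free stand in for the kernel allocator. Note that atomic_fetch_sub_explicit returns the old value, so the last reference is detected by comparing against 1 rather than 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted object; a user-space stand-in for a kernel object. */
struct object {
    atomic_uint refcount;
    int payload;
};

struct object *object_create(int payload)
{
    struct object *obj = malloc(sizeof(*obj));
    if (obj == NULL) {
        return NULL;
    }
    atomic_init(&obj->refcount, 1);  /* the creator holds the first reference */
    obj->payload = payload;
    return obj;
}

void object_retain(struct object *obj)
{
    /* Relaxed is enough: the caller already holds a reference,
     * so the object cannot disappear underneath this increment. */
    unsigned old = atomic_fetch_add_explicit(&obj->refcount, 1,
                                             memory_order_relaxed);
    assert(old > 0);
    (void)old;
}

/* Returns true if this call dropped the last reference and freed the object. */
bool object_release(struct object *obj)
{
    /* Release: this thread's earlier writes to *obj are published
     * before its reference goes away. */
    unsigned old = atomic_fetch_sub_explicit(&obj->refcount, 1,
                                             memory_order_release);
    assert(old > 0);
    if (old == 1) {
        /* Acquire: observe every other thread's writes before freeing. */
        atomic_thread_fence(memory_order_acquire);
        free(obj);
        return true;
    }
    return false;
}
```

A caller pairs every object_retain with exactly one object_release; the creator's initial reference is dropped the same way.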

Lock-free programming with atomics is hard to get right — the memory ordering specifiers (relaxed, acquire, release, seq_cst) need careful thinking. The kernel uses them in hot paths (zone allocator's per-CPU caches, scheduler runqueues), but most kernel code reaches for a lck_mtx first.

Spinlocks — for very short critical sections

A lck_spin_t is a busy-wait lock. A thread trying to acquire spins on an atomic flag until it can claim the lock. While spinning:

  • The CPU is not available for other work.
  • The holder of the lock had better be running on another CPU; if it's been preempted, the spinner is wasting time on a thread that's not making progress.

For these reasons, spinlocks are appropriate only when:

  • The critical section is very short (a few dozen instructions).
  • The holder won't block, sleep, or take any other lock that might block.
  • You're at a priority level where you can't sleep anyway (interrupt context, scheduler internals).

The kernel's top-half interrupt handlers (see the interrupt handling article) often need spinlocks because they run with interrupts disabled and can't sleep. The scheduler itself uses spinlocks to protect runqueue manipulation.

lck_spin_lock(&proc_list_spin);
// short, no-sleep critical section
lck_spin_unlock(&proc_list_spin);
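
To make the busy-wait concrete, here is a minimal test-and-set spinlock in user-space C11. It is a teaching sketch and not how lck_spin_t is implemented: toy_spin_t, toy_spin_lock, toy_spin_unlock, and run_spin_demo are hypothetical names, and pthreads stand in for kernel threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical toy spinlock: one atomic flag, nothing else. */
typedef struct {
    atomic_flag locked;
} toy_spin_t;

static toy_spin_t counter_lock = { ATOMIC_FLAG_INIT };
static long counter;  /* protected by counter_lock */

void toy_spin_lock(toy_spin_t *l)
{
    /* Busy-wait until the flag was previously clear. Acquire ordering keeps
     * the critical section from being reordered above the lock. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire)) {
        /* spinning: this CPU does no useful work here */
    }
}

void toy_spin_unlock(toy_spin_t *l)
{
    /* Release ordering publishes the critical section's writes. */
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static void *worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        toy_spin_lock(&counter_lock);
        counter++;  /* short critical section that never blocks */
        toy_spin_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs nthreads workers, each incrementing the counter iters times,
 * and returns the final counter value. */
long run_spin_demo(int nthreads, long iters)
{
    pthread_t tids[16];
    int rc;

    assert(nthreads > 0 && nthreads <= 16);
    counter = 0;
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, worker, &iters);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    (void)rc;
    return counter;
}
```

Every waiter in this sketch hammers the same cache line while it spins, which is one reason contended spinlocks get expensive quickly.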

Mutexes — the default

lck_mtx_t is the workhorse. A thread that can't acquire the lock blocks and sleeps, and is woken when the holder releases it. Unlike a spinlock, a contended mutex leaves the CPU free to run other threads while the waiter sleeps.

lck_mtx_lock(&my_data_lock);
// arbitrarily long critical section, may sleep, may take other locks
lck_mtx_unlock(&my_data_lock);

Most kernel data structures protect themselves with a lck_mtx_t. The proc table, the vnode list, IOKit objects, file descriptors — all guarded by mutexes.

XNU mutexes have a few interesting properties:

  • Adaptive spinning: a short spin attempt before blocking, on the theory that the holder might release very soon and avoiding a context switch wins.
  • Priority inheritance: a low-priority thread holding a mutex contended by a high-priority thread gets boosted to the higher priority for the duration. Prevents priority inversion.
  • Lock ordering tracking (debug builds): the kernel records the order locks are acquired and panics if it ever sees a reverse-order acquisition that could deadlock.
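
The adaptive-spinning idea can be illustrated with a spin-then-block wrapper around a pthread mutex. This is only a user-space sketch of the general technique: adaptive_lock and ADAPTIVE_SPIN_TRIES are hypothetical, the spin budget is an arbitrary number, and none of XNU's actual heuristics (or its priority inheritance) are modeled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Arbitrary spin budget for illustration; not a value taken from XNU. */
#define ADAPTIVE_SPIN_TRIES 100

/* Acquires m. Returns true if the lock was obtained during the spin phase,
 * false if the caller had to fall back to a blocking acquire. */
bool adaptive_lock(pthread_mutex_t *m)
{
    int rc;

    /* Phase 1: optimistic spin. If the holder releases soon,
     * the cost of sleeping and being woken is avoided. */
    for (int i = 0; i < ADAPTIVE_SPIN_TRIES; i++) {
        if (pthread_mutex_trylock(m) == 0) {
            return true;
        }
    }

    /* Phase 2: give up spinning and block until the lock is free. */
    rc = pthread_mutex_lock(m);
    assert(rc == 0);
    (void)rc;
    return false;
}
```

The design trade-off is the spin budget: too small and the wrapper blocks on locks that were about to be released, too large and it behaves like a spinlock under contention.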

Reader-writer locks — when reads dominate

lck_rw_t allows either:

  • Many threads holding the lock as readers simultaneously, or
  • One thread holding it as a writer, with no readers.

Useful when a data structure is read often and written rarely. The vnode name cache, the IORegistry tree, the network routing table — all read-heavy and protected by rwlocks.

lck_rw_lock_shared(&cache_rw);
// read-only access; multiple readers can be here at once
lck_rw_unlock_shared(&cache_rw);

lck_rw_lock_exclusive(&cache_rw);
// exclusive write access; readers and other writers blocked
lck_rw_unlock_exclusive(&cache_rw);

The cost of an rwlock is higher than a mutex (more state to track, more complex contention handling). Unless reads clearly dominate, a mutex is cheaper.
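
As a rough user-space analogue, the shared/exclusive split looks like this with a POSIX rwlock guarding a small lookup table. The struct name_cache type and the cache_* functions are hypothetical and far simpler than any real kernel cache; pthread_rwlock_t stands in for lck_rw_t.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define CACHE_SLOTS 8

/* Hypothetical read-mostly table guarded by a reader-writer lock. */
struct name_cache {
    pthread_rwlock_t rw;
    int keys[CACHE_SLOTS];
    int values[CACHE_SLOTS];
    int used;
};

void cache_init(struct name_cache *c)
{
    int rc = pthread_rwlock_init(&c->rw, NULL);
    assert(rc == 0);
    (void)rc;
    c->used = 0;
}

/* Reader path: shared lock, so many lookups can run at once.
 * Returns 1 and stores the value in *out on a hit, 0 on a miss. */
int cache_lookup(struct name_cache *c, int key, int *out)
{
    int found = 0;

    pthread_rwlock_rdlock(&c->rw);
    for (int i = 0; i < c->used; i++) {
        if (c->keys[i] == key) {
            *out = c->values[i];
            found = 1;
            break;
        }
    }
    pthread_rwlock_unlock(&c->rw);
    return found;
}

/* Writer path: exclusive lock, so readers and other writers wait.
 * Returns 1 on success, 0 if the table is full. */
int cache_insert(struct name_cache *c, int key, int value)
{
    int ok = 0;

    pthread_rwlock_wrlock(&c->rw);
    for (int i = 0; i < c->used && !ok; i++) {
        if (c->keys[i] == key) {   /* update an existing entry */
            c->values[i] = value;
            ok = 1;
        }
    }
    if (!ok && c->used < CACHE_SLOTS) {
        c->keys[c->used] = key;
        c->values[c->used] = value;
        c->used++;
        ok = 1;
    }
    pthread_rwlock_unlock(&c->rw);
    return ok;
}
```

If inserts were as frequent as lookups here, the shared path would rarely overlap with anything and a plain mutex would likely do the same job with less overhead.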

Lock groups — XNU's lock classification system

Every lock in XNU belongs to a lock group (lck_grp_t). The group is created once at module init:

lck_grp_t *my_grp = lck_grp_alloc_init("MySubsystem", LCK_GRP_ATTR_NULL);
lck_mtx_init(&my_mtx, my_grp, LCK_ATTR_NULL);

The group serves several purposes:

  • Per-group statistics: contention counts, hold times. Visible via lockstat and kernel telemetry.
  • Deadlock detection in debug builds: lock acquisition order is recorded per group, and a cross-group ordering violation can trigger a panic.
  • Reset / cleanup: tearing down a subsystem can destroy all its locks via the group.

lockstat(1) is the diagnostic tool — when you're investigating "why is this kernel slow", looking at top contended lock groups is often the answer.
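
To show why grouping locks makes per-subsystem statistics cheap, here is a toy user-space model in which every mutex points at a shared group that counts acquisitions and contended acquisitions. All names (toy_lck_grp, toy_mtx, and the toy_* functions) are hypothetical, and the real lck_grp_t layout and accounting are not reproduced here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical lock group: a name plus counters shared by all member locks. */
struct toy_lck_grp {
    const char *name;
    atomic_ulong acquisitions;  /* total successful lock calls */
    atomic_ulong contended;     /* lock calls that found the lock already held */
};

/* Hypothetical mutex that remembers which group it belongs to. */
struct toy_mtx {
    pthread_mutex_t mtx;
    struct toy_lck_grp *grp;
};

void toy_grp_init(struct toy_lck_grp *g, const char *name)
{
    g->name = name;
    atomic_init(&g->acquisitions, 0);
    atomic_init(&g->contended, 0);
}

void toy_mtx_init(struct toy_mtx *m, struct toy_lck_grp *g)
{
    int rc = pthread_mutex_init(&m->mtx, NULL);
    assert(rc == 0);
    (void)rc;
    m->grp = g;  /* as in the article, a lock cannot exist without a group */
}

void toy_mtx_lock(struct toy_mtx *m)
{
    /* Fast path: uncontended acquire. On failure, record the contention
     * against the group and fall back to a blocking acquire. */
    if (pthread_mutex_trylock(&m->mtx) != 0) {
        atomic_fetch_add_explicit(&m->grp->contended, 1, memory_order_relaxed);
        pthread_mutex_lock(&m->mtx);
    }
    atomic_fetch_add_explicit(&m->grp->acquisitions, 1, memory_order_relaxed);
}

void toy_mtx_unlock(struct toy_mtx *m)
{
    pthread_mutex_unlock(&m->mtx);
}
```

Because the counters live in the group rather than in each lock, a tool can rank subsystems by contention without walking every individual lock.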

IRQ-safe variants

Spinlocks come in two flavors:

  • lck_spin_lock: the regular variant. The caller is responsible for ensuring the lock is never taken from interrupt context.
  • lck_spin_lock_grp_irq_set — disables interrupts while held. Required if both an interrupt handler and a thread context can take the lock; without the IRQ disable, a thread holding the lock that's interrupted by a handler trying to acquire the same lock would deadlock on the same CPU.

This concern doesn't apply to mutexes — they can't be taken in interrupt context anyway, so there's no risk of an interrupt taking a mutex the current thread holds.
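
The same-CPU deadlock has a loose user-space analogue: a signal handler that wants a lock the interrupted thread already holds. The sketch below uses POSIX signal masking as a stand-in for disabling interrupts; it is an analogy under that assumption, not kernel code, and state_lock, on_sigusr1, and run_masked_section_demo are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>

static atomic_flag state_lock = ATOMIC_FLAG_INIT;
static volatile sig_atomic_t handler_runs;
static volatile sig_atomic_t handler_found_lock_held;

/* Stand-in for an interrupt handler that needs the same lock. */
static void on_sigusr1(int signo)
{
    (void)signo;
    if (atomic_flag_test_and_set_explicit(&state_lock, memory_order_acquire)) {
        /* A handler that spun here would never return: the lock holder
         * is the very thread it interrupted. */
        handler_found_lock_held = 1;
    } else {
        atomic_flag_clear_explicit(&state_lock, memory_order_release);
    }
    handler_runs++;
}

/* Returns 1 if the handler was held off for the whole critical section
 * and ran exactly once afterwards, with the lock free. */
int run_masked_section_demo(void)
{
    struct sigaction sa = { 0 };
    sigset_t block, saved;
    int ran_inside;
    int rc;

    handler_runs = 0;
    handler_found_lock_held = 0;

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    /* "Disable interrupts" first, then take the lock. */
    rc = pthread_sigmask(SIG_BLOCK, &block, &saved);
    assert(rc == 0);
    while (atomic_flag_test_and_set_explicit(&state_lock, memory_order_acquire)) {
        /* spin */
    }

    raise(SIGUSR1);             /* arrives mid-section but stays pending */
    ran_inside = handler_runs;  /* still 0: the handler has not run */

    /* Drop the lock, then "re-enable interrupts"; the pending signal
     * is delivered once it is unblocked. */
    atomic_flag_clear_explicit(&state_lock, memory_order_release);
    rc = pthread_sigmask(SIG_SETMASK, &saved, NULL);
    assert(rc == 0);
    (void)rc;

    return ran_inside == 0 && handler_runs == 1 && handler_found_lock_held == 0;
}
```

The ordering matters in both directions: mask before locking and unlock before unmasking, so there is never a window in which the lock is held with the handler able to run.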

The cost ladder

Rough cost on a modern Apple Silicon Mac:

Primitive | Uncontended | Contended (high load)
Atomic op | ~1-5 cycles | depends on cache line
Spinlock | ~10-20 cycles | spins burning CPU
Mutex | ~30-50 cycles | block + reschedule
RWLock | ~50-100 cycles | many readers OK

The right choice depends on how often you contend and how long the critical section is. The kernel has thousands of locks; the choice matters at scale.

Lock-free patterns in the wild

A few places where XNU avoids locks via clever data structure design:

  • Per-CPU zone caches — each CPU has its own free-list; allocations rarely cross CPUs, so the hot path is lock-free. See the kernel allocators article.
  • RCU-style patterns — readers see a consistent view via load-acquire on a pointer; writers swap the pointer with release ordering, deferring deallocation of the old object until no reader can be using it.
  • Wait-free queues — for high-frequency producer/consumer paths like the scheduler's runqueue and the IPC message queues.

These patterns are hard to write correctly. XNU uses them strategically — most code is mutex-protected, but the few percent that's hot is lock-free.
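
The pointer-swap half of the RCU-style pattern can be sketched in user-space C11 as follows. The names (struct config, current_config, read_limit, publish_config) are hypothetical, and the hard part, deciding when no reader can still hold the old pointer, is deliberately left to the caller rather than modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical read-mostly object, replaced wholesale on every update. */
struct config {
    int version;
    int limit;
};

static _Atomic(struct config *) current_config;  /* NULL until first publish */

/* Reader: a single load-acquire, no lock. Returns -1 if nothing is published. */
int read_limit(void)
{
    struct config *c = atomic_load_explicit(&current_config,
                                            memory_order_acquire);
    return c != NULL ? c->limit : -1;
}

/* Writer: build a fresh object, then swap the pointer in. The release half
 * of the exchange makes the initialized fields visible to any reader that
 * sees the new pointer. The previous object is handed back through *retired
 * and must only be freed once no reader can still be using it.
 * Returns 0 on success, -1 on allocation failure. */
int publish_config(int version, int limit, struct config **retired)
{
    struct config *fresh = malloc(sizeof(*fresh));
    if (fresh == NULL) {
        return -1;
    }
    fresh->version = version;
    fresh->limit = limit;
    *retired = atomic_exchange_explicit(&current_config, fresh,
                                        memory_order_acq_rel);
    return 0;
}
```

Readers never write shared state in this sketch, so they scale with core count; the cost is pushed onto the writer, which has to allocate a new object and defer reclaiming the old one.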

What surprises newcomers

  • Spinlocks are for very specific cases. Default to lck_mtx_t; reach for spin only when you can't block (interrupt context, scheduler internals).
  • Priority inheritance is built into mutexes. A low-QoS thread holding a mutex needed by a high-QoS one gets boosted. This is how QoS overrides propagate through lock chains too.
  • Lock groups are mandatory. You can't create a lock without naming a group; the group is what gives lockstat visibility.
  • Reverse-order lock acquisition panics in debug builds. XNU tracks order per group; an "always take A before B" rule, violated, will panic with a clear message in development kernels.

Source: apple-oss-distributions/xnu, osfmk/kern/lck_attr.c. Lock attribute objects: debug flags, contention counting, priority inheritance options.
Source: apple-oss-distributions/xnu, osfmk/kern/sched_prim.c. The scheduler: heavy user of spinlocks on the runqueue and per-processor structures.

See also the context switch walkthrough: many of the locks on XNU's hottest paths live in the scheduler.
