libdispatch internals: how GCD actually dispatches your blocks

Grand Central Dispatch (GCD) is the userland concurrency framework that nearly everything on macOS/iOS uses, whether it knows it or not. Every dispatch_async, every URLSession.dataTask completion handler, and (on Apple platforms) every Swift Task on the default executor routes through libdispatch.

This article walks through what's actually happening underneath: the queue layers, the worker thread pool, voucher adoption, and the kernel-side machinery that keeps a blocked block from starving the worker pool.

Queues all the way down

A dispatch queue is the basic unit of serialization. Submit blocks to a serial queue and they run in submission order, one at a time:

dispatch_queue_t q = dispatch_queue_create("com.acme.work", DISPATCH_QUEUE_SERIAL);
dispatch_async(q, ^{ /* runs eventually, after previous blocks */ });

Concurrent queues let many blocks run in parallel:

dispatch_queue_t q = dispatch_queue_create("com.acme.parallel", DISPATCH_QUEUE_CONCURRENT);
dispatch_async(q, ^{ /* may run in parallel with others on the same queue */ });

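Every process has a set of global concurrent queues, one per QoS class. The legacy priority constants map onto four of those classes:

DISPATCH_QUEUE_PRIORITY_HIGH (QOS_CLASS_USER_INITIATED).
DISPATCH_QUEUE_PRIORITY_DEFAULT (QOS_CLASS_DEFAULT).
DISPATCH_QUEUE_PRIORITY_LOW (QOS_CLASS_UTILITY).
DISPATCH_QUEUE_PRIORITY_BACKGROUND (QOS_CLASS_BACKGROUND).

QOS_CLASS_USER_INTERACTIVE has no legacy constant; pass the QoS class itself to dispatch_get_global_queue to reach that queue.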

User-created queues target one of the global queues. Your serial queue might target the default global queue; submitted blocks run on the target's worker threads, but your queue's own atomic state (not a mutex, and not a dedicated thread) ensures only one of them runs at a time.

apple-oss-distributions/libdispatch, src/queue.c: dispatch_queue_create and the queue lifecycle.

The workqueue — kernel-side thread pool

The actual threads that run your blocks come from a kernel-managed pool called the workqueue (sometimes spelled pthread_workqueue). There is one pool per process, divided into buckets keyed by QoS.

When libdispatch needs to run a block:

It tries to acquire a free worker thread from the right QoS bucket.
If none is free and the pool has room, the kernel spawns a new worker thread.
The new (or reused) thread starts running blocks from the queue.

apple-oss-distributions/xnu, bsd/kern/pthread_shims.c: the kernel side of the workqueue (thread allocation, QoS bucketing, ASR integration).
apple-oss-distributions/libpthread, src/qos.c: the userspace side that requests workqueue threads from the kernel.

The kernel-managed pool is shared across libdispatch and direct pthread_workqueue users; the kernel manages thread counts dynamically based on system load and CPU count.

A key property: workqueue threads are kernel-managed. They are ordinary user threads, but userspace can't create or destroy them directly; libdispatch asks for threads and the kernel decides when to create, park, and retire them.

Why dispatch_sync from a worker is dangerous

dispatch_sync(target, block) blocks the calling thread until block has finished running on target. That is safe only as long as nothing target needs in order to make progress is waiting on the calling thread.

The classic mistake: the UI thread submits work to a serial queue with dispatch_sync; the block on that queue needs a result from the UI thread (for example through its own dispatch_sync to the main queue); neither side can proceed. The same cycle can form between two workers: calling dispatch_sync on a queue whose current block is waiting, directly or indirectly, on the caller deadlocks both.

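libdispatch detects the simplest case: dispatch_sync onto a serial queue that the calling thread already owns crashes with a "BUG IN CLIENT OF LIBDISPATCH" message rather than hanging. Longer cycles are not detected in general; setting DISPATCH_DEBUG=1 in the environment can surface additional warnings. Swift async/await reduces the problem at the language level, because awaiting suspends the task instead of blocking a thread.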

Voucher adoption

Every dispatch_async implicitly captures the current voucher (Mach voucher) — the QoS, resource accounting bucket, and other context the calling thread is currently using. When the block runs on a worker, the worker adopts the captured voucher.

This is how QoS propagates through async work:

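UI thread at USER_INTERACTIVE QoS calls dispatch_async(queue, ^{ ... }) on a queue that has no QoS of its own.
The async captures the current voucher and the caller's QoS. Automatic propagation caps the captured QoS at USER_INITIATED, so USER_INTERACTIVE stays reserved for the UI thread itself.
A worker thread runs the block. While running, the worker has the captured voucher adopted and runs at the propagated QoS rather than at the worker's default.
When the block returns, the worker releases the voucher and reverts to its default state.

The result: a user-interactive operation's async followups inherit the right priority. The system stays responsive even when work chains across queues. One caveat: a queue with an explicitly assigned QoS prefers its own class over the propagated one (unless the block was created with DISPATCH_BLOCK_ENFORCE_QOS_CLASS), so work sent to a queue created at background QoS generally runs at background.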

apple-oss-distributions/libdispatch, src/voucher.c: voucher capture, adoption, and release for dispatch blocks.

Dispatch sources — kqueue-backed event handlers

A dispatch source is a libdispatch object that fires a handler when a system event occurs:

DISPATCH_SOURCE_TYPE_READ / WRITE — file descriptor ready for I/O.
DISPATCH_SOURCE_TYPE_VNODE — vnode changes (file rename, write, delete).
DISPATCH_SOURCE_TYPE_PROC — process exit/fork.
DISPATCH_SOURCE_TYPE_TIMER — periodic or one-shot timer.
DISPATCH_SOURCE_TYPE_MACH_RECV — Mach port has a message.
DISPATCH_SOURCE_TYPE_DATA_ADD / OR — user-triggered coalescing.

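Under the hood, every kernel-backed dispatch source is a kqueue (kevent) registration. libdispatch registers the source's event filter with a kqueue it manages for the process and runs the source's handler block on its target queue when the event fires. DATA_ADD and DATA_OR are the exception: they are merged entirely in userspace by dispatch_source_merge_data.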

This is why macOS / iOS frameworks that need to react to I/O, such as URLSession and FileHandle observers, can build on dispatch sources rather than spawning threads to block in read().

Concurrent work and the wide-vs-narrow tradeoff

Submitting 10,000 blocks to a concurrent queue does NOT spawn 10,000 threads. The workqueue throttles concurrency based on:

CPU count (typically a worker per logical CPU plus a small overhead).
QoS — higher QoS gets more workers earlier.
Memory pressure — under pressure, the pool shrinks.

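dispatch_apply (a parallel for-loop) is the idiomatic way to spread a fixed number of iterations across the pool, but it does not raise the cap: the pool still limits concurrency to what the hardware can run. Submitting 10K blocks to a serial queue runs them strictly one at a time, with at most one worker draining the queue at any moment.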

Swift concurrency on top

On Apple platforms, Swift's async/await and Task are built on top of libdispatch's worker infrastructure. A Task runs on a "cooperative pool", which is itself a workqueue bucket; await suspends without blocking a worker (it's a continuation point); actor-isolated calls run on the actor's serial executor.

The end result: a Swift await doesn't block a worker thread. It frees the worker to run something else, then resumes on (potentially) a different worker when the awaited operation completes. The worker count stays bounded regardless of how many tasks are in flight.

This is the foundation of Swift Concurrency's scalability claim — "millions of tasks, bounded threads."

Inspecting dispatch behavior

DISPATCH_DEBUG=1 /path/to/your-app                # warn on suspicious patterns
DISPATCH_VOUCHER_DEBUG=1 /path/to/your-app        # trace voucher capture/adoption

In Instruments → System Trace, "Dispatch Activity" shows every block enqueue/dequeue. Useful for diagnosing "why is my UI laggy" — often the answer is "a serial queue is overloaded and the next block is blocked behind a slow earlier one."

What surprises newcomers

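A serial queue runs blocks on whichever worker thread is free. Same queue, potentially different threads across blocks. (If you need to know which queue you are on, tag the queue with dispatch_queue_set_specific and check with dispatch_get_specific instead of comparing threads.)
dispatch_sync from a block onto the serial queue it is already running on can never complete. libdispatch detects this case and crashes the process with a "BUG IN CLIENT OF LIBDISPATCH" message instead of hanging.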
Voucher adoption is invisible unless you're looking for it. It's what makes the system "feel responsive" without anyone having to manually pass QoS hints through every function call.
Dispatch sources are kqueue events. The same kernel event filter that powers low-level network apps powers the high-level NSURLSession.

What to read next

apple-oss-distributions/libdispatch, src/source.c: dispatch source implementation (kqueue integration, coalescing, handler dispatch).
apple-oss-distributions/libdispatch, src/queue_internal.h: the queue data structures, atomic state machines that drive enqueue/dequeue.

For related background, see the kernel synchronization article: libdispatch uses many of the same primitives the kernel uses internally.