libdispatch internals: how GCD actually dispatches your blocks
Dispatch queues, the workqueue thread pool, voucher adoption, async vs sync — what happens between your dispatch_async call and your block running on a worker thread.
Grand Central Dispatch (GCD) is the userland concurrency framework everyone on macOS/iOS uses, whether they know it or not. Every dispatch_async, every URLSession.dataTask's completion handler, every Swift Task, every +async Objective-C method — they route through libdispatch.
This article walks through what's actually happening underneath: the queue layers, the worker thread pool, voucher adoption, and the kernel-side machinery that keeps a stalled block from starving the worker pool.
Queues all the way down
A dispatch queue is the basic unit of serialization. Submit blocks to a queue and they run in submission order, one at a time:
dispatch_queue_t q = dispatch_queue_create("com.acme.work", DISPATCH_QUEUE_SERIAL);
dispatch_async(q, ^{ /* runs eventually, after previous blocks */ });
Concurrent queues let many blocks run in parallel:
dispatch_queue_t q = dispatch_queue_create("com.acme.parallel", DISPATCH_QUEUE_CONCURRENT);
dispatch_async(q, ^{ /* may run in parallel with others on the same queue */ });
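Every process has a set of global concurrent queues, one per QoS class. The legacy priority constants map onto QoS classes like this:
- DISPATCH_QUEUE_PRIORITY_HIGH (USER_INITIATED).
- DISPATCH_QUEUE_PRIORITY_DEFAULT (DEFAULT).
- DISPATCH_QUEUE_PRIORITY_LOW (UTILITY).
- DISPATCH_QUEUE_PRIORITY_BACKGROUND (BACKGROUND).
USER_INTERACTIVE has no legacy constant; request that queue with dispatch_get_global_queue(QOS_CLASS_USER_INTERACTIVE, 0).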
User-created queues target one of the global queues. Your serial queue might target the default global queue; submitted blocks run on the target's worker threads, but your queue's own atomic state (not a mutex) ensures that only one of them runs at a time.
The workqueue — kernel-side thread pool
The actual threads that run your blocks come from a kernel-managed pool called the workqueue (sometimes spelled pthread_workqueue). One pool per process, with subsystem buckets keyed by QoS.
When libdispatch needs to run a block:
- It tries to acquire a free worker thread from the right QoS bucket.
- If none is free and the pool has room, the kernel spawns a new worker thread.
- The new (or reused) thread starts running blocks from the queue.
Source: apple-oss-distributions/xnu, bsd/kern/pthread_shims.c. The kernel side of the workqueue: thread allocation, QoS bucketing, ASR integration.
Source: apple-oss-distributions/libpthread, src/qos.c. The userspace side that requests workqueue threads from the kernel.
The kernel-managed pool is shared across libdispatch and direct pthread_workqueue users; the kernel manages thread counts dynamically based on system load and CPU count.
A key property: workqueue threads are kernel-managed. They execute userspace code, but userspace can't create or destroy them directly; the kernel does, on libdispatch's behalf.
Why dispatch_sync from a worker is dangerous
dispatch_sync(target, block) blocks the calling thread until block has run on target. That is safe only as long as nothing the block needs is waiting, directly or indirectly, on the calling thread.
The danger is a wait cycle. The classic mistake: a UI thread submits work to a serial queue with dispatch_sync; the work on that queue needs a result from the UI thread, which is parked inside dispatch_sync; deadlock. From a worker thread the same cycle forms whenever a block calls dispatch_sync on a queue that is itself waiting on the caller's queue, and the stuck thread is one the pool can no longer use for anything else.
libdispatch has heuristics to detect some of these: DISPATCH_DEBUG=1 in the environment surfaces warnings about cycles. Swift's async/await avoids the problem at the language level, because await suspends a task instead of blocking a thread.
Voucher adoption
Every dispatch_async implicitly captures the current voucher (Mach voucher) — the QoS, resource accounting bucket, and other context the calling thread is currently using. When the block runs on a worker, the worker adopts the captured voucher.
This is how QoS propagates through async work:
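- UI thread at USER_INTERACTIVE QoS calls dispatch_async(work_queue, ^{ ... }), where work_queue has no QoS class of its own.
- The async captures the caller's context. USER_INTERACTIVE is propagated as USER_INITIATED, so async follow-ups do not compete with the UI thread itself.
- A worker thread runs the block with that context adopted. If the queue does have an explicitly assigned QoS class (a "background" queue, say), dispatch_async prefers the queue's class unless the block object was created with DISPATCH_BLOCK_ENFORCE_QOS_CLASS.
- When the block returns, the worker releases the voucher and reverts to its default state.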
The result: a user-interactive operation's async followups inherit the right priority. The system stays responsive even when work chains across queues.
Source: apple-oss-distributions/libdispatch, src/voucher.c. Voucher capture, adoption, and release for dispatch blocks.
Dispatch sources: kqueue-backed event handlers
A dispatch source is a libdispatch object that fires a handler when a system event occurs:
- DISPATCH_SOURCE_TYPE_READ / DISPATCH_SOURCE_TYPE_WRITE: file descriptor ready for I/O.
- DISPATCH_SOURCE_TYPE_VNODE: vnode changes (file rename, write, delete).
- DISPATCH_SOURCE_TYPE_PROC: process exit/fork.
- DISPATCH_SOURCE_TYPE_TIMER: periodic or one-shot timer.
- DISPATCH_SOURCE_TYPE_MACH_RECV: Mach port has a message.
- DISPATCH_SOURCE_TYPE_DATA_ADD / DISPATCH_SOURCE_TYPE_DATA_OR: user-triggered coalescing.
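Under the hood, a dispatch source that watches a kernel event is a kqueue registration. libdispatch maintains a per-process kqueue, registers the source's event filter, and runs the source's handler block on its target queue when the kqueue fires. The DATA_ADD/OR types are the exception: they are triggered from userspace with dispatch_source_merge_data and need no kernel event.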
This is why macOS / iOS frameworks that need to react to I/O (URLSession, FileHandle observers, NSNotificationCenter on a dispatch queue) build on dispatch sources rather than spawning threads to block in read().
Concurrent work and the wide-vs-narrow tradeoff
Submitting 10,000 blocks to a concurrent queue does NOT spawn 10,000 threads. The workqueue throttles concurrency based on:
- CPU count (typically a worker per logical CPU plus a small overhead).
- QoS — higher QoS gets more workers earlier.
- Memory pressure — under pressure, the pool shrinks.
dispatch_apply gives you a parallel for-loop: it runs the iterations concurrently and returns once all of them have finished, but the pool still caps the width based on hardware availability. Submitting 10K blocks to a serial queue runs them strictly one at a time, on one worker at any given moment.
Swift concurrency on top
Swift's async/await and Task are built on top of libdispatch's queue infrastructure. A Task runs on a "cooperative pool" which is itself a workqueue bucket; await suspends without blocking a worker (it's a continuation point); actor-isolated calls dispatch onto the actor's serial executor.
The end result: a Swift await doesn't block a worker thread. It frees the worker to run something else, then resumes on (potentially) a different worker when the awaited operation completes. The worker count stays bounded regardless of how many tasks are in flight.
This is the foundation of Swift Concurrency's scalability claim — "millions of tasks, bounded threads."
Inspecting dispatch behavior
DISPATCH_DEBUG=1 /path/to/your-app # warn on suspicious patterns
DISPATCH_VOUCHER_DEBUG=1 /path/to/your-app # trace voucher capture/adoption
In Instruments → System Trace, "Dispatch Activity" shows every block enqueue/dequeue. Useful for diagnosing "why is my UI laggy" — often the answer is "a serial queue is overloaded and the next block is blocked behind a slow earlier one."
What surprises newcomers
- A serial queue runs blocks on whichever worker thread is free. Same queue, potentially different threads across blocks. (dispatch_get_specific can identify "the" queue if you need that.)
- dispatch_sync from a worker to its own queue deadlocks instantly. libdispatch debug builds catch this.
- Voucher adoption is invisible unless you're looking for it. It's what makes the system "feel responsive" without anyone having to manually pass QoS hints through every function call.
- Dispatch sources are kqueue events. The same kernel event filter that powers low-level network apps powers the high-level NSURLSession.
What to read next
- apple-oss-distributions/libdispatch, src/source.c. Dispatch source implementation: kqueue integration, coalescing, handler dispatch.
- apple-oss-distributions/libdispatch, src/queue_internal.h. The queue data structures: atomic state machines that drive enqueue/dequeue.
And the kernel synchronization article — libdispatch uses many of the same primitives the kernel uses internally.