How fork(), exec(), and posix_spawn work on XNU

Every Unix book teaches process creation as fork() + exec(). On Linux that's a fine mental model — both are thin, well-loved syscalls. On macOS it's almost true: fork and exec exist and work, but they're the awkward way. The modern, preferred way is posix_spawn, and the reason traces straight back to Mach.

What fork() actually has to do

POSIX fork() duplicates the calling process. The parent and child both return from fork; the child has a fresh PID and inherits a copy of the parent's address space, open file descriptors, signal handlers, working directory, and credentials.

On a pure-BSD system the implementation is simple: clone the proc struct, copy the file descriptor table, set up COW for the VM, return twice. On XNU there's an extra layer — every BSD proc is married to a Mach task, and the Mach side has to be duplicated too:

Create a new Mach task with the parent's VM map cloned (COW).
Create a new Mach thread inside that task to be the child's first thread.
Allocate a new BSD proc and wire it to the new task.
Duplicate the file descriptor table.
Copy signal disposition state, credentials, the working directory, the umask, the controlling tty.
Wake the child's Mach thread; both return from fork.

apple-oss-distributions/xnu, bsd/kern/kern_fork.c: fork1, the entire fork machinery; reads as a slow tour through proc / task / fd duplication.

The thread duplication is the gnarly bit. Mach threads aren't trivially clonable; the child needs its own thread state set up so it returns from fork in userspace with the right register values. The BSD-side code reaches into the Mach side to make this happen.

Why fork is awkward on macOS

A multi-threaded program that forks is in immediate trouble. POSIX says only the forking thread exists in the child; the other threads are simply not duplicated. Any mutex those threads held at the moment of the fork is copied into the child still locked, with no owner left to release it, so the surviving thread deadlocks as soon as it tries to acquire one.

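POSIX's mitigation is pthread_atfork, which lets a library register handlers to reset its locks across a fork. It exists on macOS as well as Linux, but it only helps for libraries built to survive a fork, and on macOS the major system libraries are not: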

libdispatch (GCD) is not fork-safe. The worker threads vanish; the queue state is corrupted; any post-fork call into GCD likely deadlocks.
Foundation / Cocoa are not fork-safe. Most reach for some shared queue or Mach service connection during init.
Core Foundation is not fork-safe.

So in practice, a fork on macOS is only safe if you exec immediately, which wipes the slate, or at least restrict yourself to async-signal-safe functions between fork and exec. That means no Objective-C, no GCD, no NSLog. Most Mac apps that need to spawn helpers can't even use fork safely.

This is why posix_spawn is the preferred path on macOS, not fork+exec.

exec — replacing the address space without changing the process

execve() (and the libc wrappers like execl, execv) replaces the current process's address space with a fresh image:

Open and validate the new executable. Check Mach-O header, validate code signature, check entitlements.
Tear down the current task's VM map.
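Build a new VM map: map the executable's segments, set up the stack with argv, envp, and the Apple-specific apple[] string array (XNU has no ELF-style auxv).
Reset caught signals to their default disposition (ignored signals stay ignored) and close FDs marked FD_CLOEXEC. The umask, the working directory, and the remaining FDs carry over unchanged.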
Hand control to dyld. The kernel doesn't run the user binary directly — it always starts in dyld, the dynamic linker, which maps libraries, runs initializers, then jumps to main.

apple-oss-distributions/xnu, bsd/kern/kern_exec.c: exec_*, exec_mach_imgact; every exec on macOS goes through here.
apple-oss-distributions/xnu, bsd/kern/mach_loader.c: the Mach-O loader; parses LC_SEGMENT, LC_CODE_SIGNATURE, LC_MAIN, and sets up the new VM map.

The proc survives exec — same PID, same parent, same FDs (minus CLOEXEC), same credentials. The Mach task survives too, but its VM map is fully replaced. The original threads are terminated and a new main thread is created for the new image.

The fact that the new process always starts in dyld (not in the user binary's entry point) is why DYLD_* environment variables work: the kernel copies envp onto the new stack, and dyld reads it on its way to running user code.

posix_spawn — the modern preferred path

posix_spawn is fork+exec collapsed into a single syscall, with a structured way to express the "what should be different in the child" intent up front. Rather than:

pid_t pid = fork();
if (pid == 0) {
    // child: do various setup
    dup2(pipe_fd, 1);
    chdir("/some/path");
    execve(prog, argv, envp);
    _exit(127);  // reached only if execve failed
}

you build a posix_spawnattr_t + posix_spawn_file_actions_t describing the changes, then call:

posix_spawn(&pid, prog, &file_actions, &attrs, argv, envp);

The kernel applies all of it inside one system call, before any user code runs in the child. No half-broken state in the child. No GCD danger. No need to pretend the parent's locks aren't there.

apple-oss-distributions/xnu, bsd/kern/kern_exec.c: posix_spawn; search for the function in this file, it lives right next to exec.

On Apple platforms, posix_spawn is the path that:

launchd uses for every service it starts.
NSTask / Process (Swift) uses under the hood.
xcrun, xcodebuild, swiftc use to spawn compiler invocations.
Shell builtins like bash's $( … ) resolve to posix_spawn-equivalents through vfork+exec on Apple's libc.

Crucially, posix_spawn also has flags Apple adds beyond POSIX:

POSIX_SPAWN_SETEXEC — replace the current process (act like a plain exec).
POSIX_SPAWN_SETSID — start in a new session.
POSIX_SPAWN_START_SUSPENDED — child starts paused; you can attach a debugger.
Various _POSIX_SPAWN_OSX_* flags for sandboxing, jetsam priority, QoS.

These are how launchd configures a service exactly the way it wants it, in one syscall, with no race window.

What persists, what doesn't — quick reference

Resource	survives fork	survives exec	survives posix_spawn
PID	❌	✅	❌
Parent PID	✅	✅	✅
File descriptors	✅	✅ (no CLOEXEC)	configurable
Mach ports	❌¹	❌	❌
VM mappings	✅ (COW)	❌	❌
Threads (other than caller)	❌	❌	❌
Signal handlers	✅	reset to default	reset
Working directory	✅	✅	configurable
Credentials	✅	✅	✅

¹ Mach ports are a per-task resource. The child task has its own fresh ipc_space after fork; only ports the parent explicitly handed over via Mach IPC survive — and that's a different mechanism.

What surprises newcomers

fork is discouraged for app use on Apple platforms. The macOS fork(2) man page warns that any framework or library API should be assumed unsafe in the child until it calls exec. Use posix_spawn or NSTask.
The kernel never executes a user binary directly. Every exec starts in dyld. Even /usr/bin/true.
The child of fork doesn't inherit the parent's threads but does inherit the memory those threads were operating on — including mid-flight locks. This is the source of nearly every fork bug.
exec on a script doesn't run the script directly — the kernel reads the shebang line and execs the interpreter, then the interpreter opens the script and reads it.

What to read next

For the loader:

apple-oss-distributions/xnu, bsd/kern/mach_loader.c: Mach-O loading; segments, code-signing validation, dyld handoff.
apple-oss-distributions/dyld, dyld/main.cpp: dyld's main, where every process actually starts running its first line of user code.

And re-read the signals article once more — signal disposition is one of the trickier things fork and exec each handle differently.