The dyld shared cache and shared regions
Every system dylib pre-linked into one giant memory-mappable file, shared across every process on the system. Why a fresh process has 1 GB of virtual size but tiny resident memory.
Open Activity Monitor and look at the Virtual Memory column for any process. Even a tiny one — echo, cat, a hello-world program — sits around 1 GB of virtual size. The number doesn't change much between huge apps and trivial ones. The real number that scales with the app is Real Memory (resident set), and that's much smaller.
The explanation is the dyld shared cache: a single giant pre-linked file containing every system dylib, mapped into every process on the system at the same virtual address. This article walks through how it works, why it exists, and what it means for processes.
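The gap between virtual and resident size can also be checked from a script rather than from Activity Monitor. The sketch below assumes the standard `ps` keywords `vsz` and `rss`, both reported in KiB; the helper names are hypothetical and chosen only for this illustration.

```python
import subprocess


def parse_ps_sizes(line):
    """Parse one line of `ps -o vsz= -o rss=` output (KiB) into byte counts."""
    vsz_kib, rss_kib = (int(field) for field in line.split())
    return vsz_kib * 1024, rss_kib * 1024


def sizes_for(pid):
    """Ask ps for the virtual and resident size of one process, in bytes."""
    # Separate -o options keep the empty header text unambiguous on BSD-style ps.
    out = subprocess.run(
        ["ps", "-o", "vsz=", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps_sizes(out)
```

Calling `sizes_for(os.getpid())` from a small script should show a virtual size far above the resident size, with the shared cache mapping accounting for much of the difference.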
What's in the shared cache
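The shared cache has traditionally lived at the following path (recent macOS releases prefix it with /System/Volumes/Preboot/Cryptexes/OS):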
/System/Library/dyld/dyld_shared_cache_<arch>
On Apple Silicon, the file is around 3-5 GB. It contains:
- Every Apple-supplied .dylib in the system: libSystem, libobjc, Foundation, AppKit, UIKit, CoreFoundation, every Metal library, every Core ML library, every PDFKit / WebKit / MapKit framework. There are hundreds of them.
- The pre-resolved relocation chains between them.
- A hash table for fast symbol lookup.
- A minified version of dyld itself (the loader runs from here too).
Building the cache is a long, expensive offline process done at OS install time and at major system updates. The result: dyld can map every system dylib into a new process with one mmap call instead of hundreds of individual dlopens.
Source: apple-oss-distributions/dyld, dyld/main.cpp: dyld's entry point, where the shared cache is mapped into a new process.
Source: apple-oss-distributions/dyld, cache_builder: the offline cache builder, which runs at OS install / update.
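To make the file layout a little more concrete, here is a minimal sketch that reads the very start of a cache file. It assumes the leading fields of `dyld_cache_header` and `dyld_cache_mapping_info` as declared in dyld's `dyld_cache_format.h` (a 16-byte magic, then `mappingOffset` and `mappingCount`); later header fields differ between releases and are ignored here. The function name is hypothetical.

```python
import struct

# Assumed leading fields of dyld_cache_header:
#   char magic[16]; uint32_t mappingOffset; uint32_t mappingCount;
HEADER_PREFIX = struct.Struct("<16sII")
# Assumed dyld_cache_mapping_info:
#   uint64_t address, size, fileOffset; uint32_t maxProt, initProt;
MAPPING_INFO = struct.Struct("<QQQII")


def parse_cache_prefix(data):
    """Return (magic, mappings) parsed from the first bytes of a cache file."""
    magic, mapping_offset, mapping_count = HEADER_PREFIX.unpack_from(data, 0)
    magic = magic.rstrip(b"\0").decode("ascii")
    if not magic.startswith("dyld_v1"):
        raise ValueError(f"not a dyld shared cache: {magic!r}")
    mappings = []
    for i in range(mapping_count):
        offset = mapping_offset + i * MAPPING_INFO.size
        address, size, file_offset, max_prot, init_prot = \
            MAPPING_INFO.unpack_from(data, offset)
        mappings.append({
            "address": address,        # unslid virtual address of the region
            "size": size,              # region length in bytes
            "file_offset": file_offset,
            "max_prot": max_prot,      # VM_PROT_* bits
            "init_prot": init_prot,
        })
    return magic, mappings
```

Feeding it the first few kilobytes of a cache file should list the regions that get mapped with different protections, which is the split between code, writable data, and read-only data that the mapping step relies on.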
How it's mapped
When a new process starts, the kernel maps the shared cache into the new task's address space at a known address:
- The cache occupies a contiguous virtual range of around 1-2 GB.
- The mapping is shared across every process — the same physical pages back it everywhere.
- Permissions: text/code pages are read+execute, data pages are read+write (with COW so writes from one process don't affect others), const data is read-only.
Every process on the system maps the exact same virtual address range to the exact same physical pages for the cache. This is huge:
- Physical memory savings: every dylib is in RAM once, not per-process.
- L1/L2/L3 cache hits: dylib code executes from the same physical lines across processes; once one process warms up objc_msgSend in the cache, every process benefits.
- Page table sharing: the page-table entries for the cache region are identical across tasks; the kernel can share entire L3 page tables.
The performance win is substantial: for system libraries, process startup comes down to mapping the shared cache, which is one mmap instead of hundreds of individual dylib opens.
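The copy-on-write behaviour of the writable pages can be illustrated with an ordinary file. This is only an analogy: it uses Python's `mmap` module on a temporary file, not the kernel's shared-region mechanism, and the function name is hypothetical.

```python
import mmap
import os
import tempfile


def cow_demo():
    """Write through a private (copy-on-write) mapping and report who sees it."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"shared page contents")
        path = f.name
    try:
        with open(path, "rb") as f:
            # ACCESS_COPY is a private mapping: writes go to a private copy.
            private = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
            # A second, read-only view of the same file.
            reader = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            private[0:6] = b"SHARED"
            seen = (bytes(private[:6]), bytes(reader[:6]))
            private.close()
            reader.close()
        with open(path, "rb") as f:
            on_disk = f.read(6)
        return seen + (on_disk,)
    finally:
        os.unlink(path)
```

The writer sees its own modification, while the other view and the file on disk keep the original bytes. That is the property that lets one process dirty a data page of the cache without affecting any other process.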
Address-space slide (ASLR)
The cache's base address is randomized at boot (cache-level ASLR). Every process on the system maps it at the same address — but that address is different across boots, frustrating attackers who'd otherwise know exactly where objc_msgSend lives in any process's address space.
The slide is small (a few bits of randomness) but enough to defeat static gadget-address-based exploitation. Combined with per-process ASLR for the app's own code, the address layout is hard to predict.
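The arithmetic behind the slide is simple enough to sketch. The example assumes a page-aligned slide and 16 KiB pages; the addresses used with it are hypothetical and only illustrate the relationship between an unslid address recorded in the cache and the address observed at run time.

```python
PAGE_SIZE = 16 * 1024  # assumption: 16 KiB pages, as on Apple Silicon


def slid_address(unslid, slide):
    """Runtime address of a cache symbol, given this boot's slide."""
    if slide % PAGE_SIZE:
        raise ValueError("slide is assumed to be page-aligned")
    return unslid + slide


def recover_slide(observed, unslid):
    """Slide implied by one observed runtime address of a known symbol."""
    return observed - unslid
```

Because every process in one boot shares the same slide, learning one cached symbol's runtime address reveals the location of all the others until the next reboot. That is why the randomization only helps against attackers who cannot read an address out of any process first.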
Source: apple-oss-distributions/dyld, dyld_shared_cache: the shared-cache header and slide handling.
What's NOT in the cache
- Third-party frameworks — apps that bundle their own copies of Sparkle, Crashlytics, Swift libraries, etc. Each app pays its own cost for these.
- App's own code — the binary itself plus its bundled frameworks. These are mapped per-process.
- Swift runtime libraries when not in the cache — older macOS versions had Swift's runtime outside the cache; modern versions include it.
- DriverKit dext frameworks — separate cache for those.
This is why a Swift app with many third-party dependencies has a higher startup cost than an Objective-C-only app: the shared cache benefit applies only to system frameworks, so every bundled dependency is loaded and linked per process.
How dyld uses it
When dyld starts a process:
- The kernel has already mapped the cache (and dyld itself, which lives in the cache).
- dyld reads the main executable's LC_LOAD_DYLIB commands.
- For each declared dependency, dyld checks the cache's symbol table first.
  - If the dependency is in the cache → skip; it's already mapped and resolved.
  - If not in the cache → dlopen it individually.
- Resolve the main executable's external symbols against the loaded dylibs (cache + per-process).
- Run static initializers in dependency order.
- Call main.
For a typical app, almost all dependencies are in the cache. The expensive parts (loading, linking) were done offline at OS install; per-process startup is mostly "map binary + run initializers."
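The launch steps above can be modelled in a few lines. This is a toy sketch, not dyld's algorithm: the dependency graph, the image names, and the `plan_launch` helper are all hypothetical, and "in the cache" is reduced to membership in a set.

```python
def plan_launch(main, deps, in_cache):
    """Return (images to load individually, initializer order) for one launch.

    deps maps an image name to the images it declares as dependencies.
    """
    to_load = []      # images that are not in the cache and need their own load
    init_order = []   # dependencies always appear before their dependents
    seen = set()

    def visit(image):
        if image in seen:
            return
        seen.add(image)
        for dep in deps.get(image, []):
            visit(dep)
        if image != main and image not in in_cache:
            to_load.append(image)
        init_order.append(image)

    visit(main)
    return to_load, init_order


deps = {
    "MyApp": ["Foundation", "Sparkle"],
    "Sparkle": ["Foundation"],
    "Foundation": ["libobjc", "libSystem"],
    "libobjc": ["libSystem"],
}
in_cache = {"Foundation", "libobjc", "libSystem"}
to_load, init_order = plan_launch("MyApp", deps, in_cache)
```

Here only `Sparkle` needs an individual load, and the depth-first post-order guarantees that an image's initializers run after those of everything it depends on, with the main executable last.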
Source: apple-oss-distributions/dyld, dyld/PrebuiltLoader.cpp: pre-built loader objects in the cache, which make dyld's job O(1) per dylib for cached ones.
Inspecting the cache
A few tools:
- dyld_info -shared_cache_info /usr/bin/some_binary shows which cache file the binary will use.
- dyld_info -dependents /path/to/app lists a binary's dependencies and where each one will be resolved from.
- vmmap <pid> dumps a running process's address space; the shared cache shows up as a giant __LINKEDIT/__DATA/__TEXT region around a specific address.
dyld_info's option names have changed between macOS releases; check man dyld_info if a flag is rejected.
A running process's vmmap output shows:
SUBMAP 10000000-1d0000000 [1.5G] r-x/r-x SM=COW shared cache
That's the shared cache region: read-execute, with copy-on-write semantics on writable pages, and mapped into every process.
Why the shared cache is invalidated by upgrades
When the OS updates:
- New system dylibs ship.
- The shared cache builder runs as part of the install.
- A fresh dyld_shared_cache_* is written.
- Reboot. Every process from then on maps the new cache.
Major macOS upgrades take longer than they seem like they should because building the shared cache is expensive (the linker has to resolve every symbol across every system dylib).
What surprises newcomers
- Virtual size in Activity Monitor lies to you about real cost. The 1+ GB is mostly the shared cache, which is shared across every process.
- The cache is per-architecture. Intel Macs had Intel caches; Apple Silicon Macs have arm64e caches. Universal binaries get the right one mapped.
- You can technically bypass the cache with DYLD_SHARED_REGION=avoid (developer-only, requires SIP relaxed) for debugging, but the app loads dramatically slower.
- The cache is the reason dlopen("libSystem") is essentially free: it's already mapped.
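One way to see the last point for yourself is to compare what the filesystem reports with what the loader can do. The sketch below uses `ctypes.CDLL`, which calls `dlopen` underneath. The path `/usr/lib/libSystem.B.dylib` and the helper name are assumptions for illustration, and the function deliberately does nothing off macOS.

```python
import ctypes
import os
import sys

LIBSYSTEM = "/usr/lib/libSystem.B.dylib"  # assumed install name of libSystem


def probe_libsystem(path=LIBSYSTEM):
    """Return (exists_on_disk, loadable) for a dylib path, or None off macOS."""
    if sys.platform != "darwin":
        return None
    on_disk = os.path.exists(path)
    try:
        ctypes.CDLL(path)  # dlopen under the hood
        loadable = True
    except OSError:
        loadable = False
    return on_disk, loadable
```

On recent macOS releases the file is typically absent from disk yet still loads, because the image is served straight from the already-mapped cache.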
What to read next
For dyld's open-source implementation:
apple-oss-distributions/dyld, dyld/Loader.cpp: the Loader class, which represents each dylib (cached or otherwise) in dyld's runtime.
And the mmap article: the shared cache is a giant mmap-based mapping; understanding mmap makes the cache mechanics obvious.
For the cache's role in app startup, see the article on what happens when you launch an app; the cache mapping happens at step 3.