
The dyld shared cache and shared regions

Every system dylib pre-linked into one giant memory-mappable file, shared across every process on the system. Why a fresh process has 1 GB of virtual size but tiny resident memory.

Published 5 min read
Figure: Shared cache in a process address space. A typical macOS process address space. The dyld shared cache occupies the largest virtual region, shared across every process; the app's own segments and heap are per-process.

One process's virtual address space:

  • null guard page: 0x0…0x1000
  • main binary __TEXT + __DATA: 0x100000000…0x100100000
  • dyld shared cache (≈1.5 GB): 0x180000000…0x2A0000000 (shared; same physical pages across every process)
  • third-party frameworks (per-process): 0x2A0000000…0x2A0040000
  • malloc heap (anon, zero-fill, COW)
  • stacks, per thread: 0x16f000000…

Physical memory backing:

  • Shared cache backing: one copy in physical RAM, mapped into every process, same physical pages (dyld_shared_cache_arm64e, ≈ 3-5 GB on disk, ≈ 1-2 GB mapped per process).
  • Per-process backing: main binary, third-party frameworks, heap, stack; distinct physical pages per process.

Why this matters: Activity Monitor "Virtual" includes shared. "Real Memory" counts shared once. The huge VSize number is mostly free.

Open Activity Monitor and look at the Virtual Memory column for any process. Even a tiny one — echo, cat, a hello-world program — sits around 1 GB of virtual size. The number doesn't change much between huge apps and trivial ones. The real number that scales with the app is Real Memory (resident set), and that's much smaller.

The explanation is the dyld shared cache: a single giant pre-linked file containing every system dylib, mapped into every process on the system at the same virtual address. This article walks through how it works, why it exists, and what it means for processes.

What's in the shared cache

The shared cache lives at:

/System/Library/dyld/dyld_shared_cache_<arch>

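On Apple Silicon, the cache is around 3-5 GB on disk. (On recent macOS releases the same path sits under the OS cryptex at /System/Volumes/Preboot/Cryptexes/OS, and the cache is split across several numbered files.) It contains: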

  • Every Apple-supplied .dylib in the system: libSystem, libobjc, Foundation, AppKit, UIKit, CoreFoundation, every Metal library, every Core ML library, every PDFKit / WebKit / MapKit framework — hundreds of them.
  • The pre-resolved relocation chains between them.
  • A hash table for fast symbol lookup.
  • A minified version of dyld itself (the loader runs from here too).

Building the cache is a long, expensive offline process done at OS install time and at major system updates. The result: every system dylib can be mapped into a new process in one mapping operation instead of hundreds of individual dylib loads.
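To make the file format a little more concrete, here is a minimal sketch that decodes the first few bytes of a cache file. It assumes the layout described in dyld's dyld_cache_format.h: a 16-byte NUL-padded magic string such as "dyld_v1  arm64e", followed by two little-endian 32-bit fields (the offset and count of the mapping table). The function name and the sample bytes are made up for illustration; the sample is synthetic, not read from a real cache.

```python
import struct

def parse_cache_header_prefix(data: bytes) -> dict:
    """Decode the first 24 bytes of a dyld shared cache header (sketch)."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes of header")
    # Assumed layout: char magic[16]; uint32 mappingOffset; uint32 mappingCount.
    magic_raw, mapping_offset, mapping_count = struct.unpack_from("<16sII", data, 0)
    magic = magic_raw.rstrip(b"\x00").decode("ascii")
    if not magic.startswith("dyld_v1"):
        raise ValueError(f"not a dyld shared cache: {magic!r}")
    # The architecture name is right-aligned after the "dyld_v1" prefix.
    arch = magic[len("dyld_v1"):].strip()
    return {
        "magic": magic,
        "arch": arch,
        "mapping_offset": mapping_offset,
        "mapping_count": mapping_count,
    }

# Synthetic header bytes for illustration only.
sample = b"dyld_v1  arm64e\x00" + struct.pack("<II", 0x200, 3)
info = parse_cache_header_prefix(sample)
print(info["arch"], info["mapping_count"])
```

On a real system the same function could be fed the first 24 bytes of the cache file, provided the file is readable at the path given above.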

Source: apple-oss-distributions/dyld, dyld/main.cpp. dyld's entry point, where the shared cache is mapped into a new process.

Source: apple-oss-distributions/dyld, cache_builder. The offline cache builder, which runs at OS install / update.

How it's mapped

When a new process starts, the kernel maps the shared cache into the new task's address space at a known address:

  • The cache occupies a contiguous virtual range of around 1-2 GB.
  • The mapping is shared across every process — the same physical pages back it everywhere.
  • Permissions: text/code pages are read+execute, data pages are read+write (with COW so writes from one process don't affect others), const data is read-only.

Every process on the system maps the exact same virtual address range to the exact same physical pages for the cache. This is huge:

  • Physical memory savings: every dylib is in RAM once, not per-process.
  • CPU cache hits (L1/L2/L3): dylib code executes from the same physical lines across processes; once one process warms up objc_msgSend in the CPU caches, every process benefits.
  • Page table sharing: the page-table entries for the cache region are identical across tasks; the kernel can share entire L3 page tables.
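The physical memory saving is easy to put numbers on. The sketch below compares two accounting models: every process holding its own copy of the resident system-library pages, versus one shared copy plus per-process private pages. All figures (process count, resident sizes) are hypothetical and chosen only to show the shape of the arithmetic.

```python
MIB = 1024 ** 2

def memory_totals(n_processes: int, shared_resident: int, private_each: int):
    """Return (total without sharing, total with sharing) in bytes."""
    # Without sharing, every process pays for its own copy of the system libraries.
    without_sharing = n_processes * (shared_resident + private_each)
    # With sharing, the cache pages are resident once for the whole system.
    with_sharing = shared_resident + n_processes * private_each
    return without_sharing, with_sharing

# Hypothetical workload: 400 processes, 600 MiB of cache pages resident,
# 20 MiB of private (non-shared) memory per process.
without_sharing, with_sharing = memory_totals(400, 600 * MIB, 20 * MIB)
print(f"without sharing: {without_sharing / MIB:.0f} MiB")
print(f"with sharing:    {with_sharing / MIB:.0f} MiB")
```

This is also why summing the Virtual Memory column across processes is meaningless: the shared region is counted again in every row.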

The performance win is substantial: at process startup, loading the system libraries reduces to mapping the shared cache, which is one mapping operation instead of hundreds of individual dylib opens.

Address-space slide (ASLR)

The cache's base address is randomized at boot (cache-level ASLR). Every process on the system maps it at the same address — but that address is different across boots, frustrating attackers who'd otherwise know exactly where objc_msgSend lives in any process's address space.

The slide is small (a few bits of randomness) but enough to defeat static gadget-address-based exploitation. Combined with per-process ASLR for the app's own code, the address layout is hard to predict.
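A small sketch of the idea, not of the kernel's actual algorithm: the slide is chosen once per boot, aligned to the page size, and added to every unslid cache address. The maximum slide, the symbol offset, and the helper names below are hypothetical; the 16 KB page size matches Apple Silicon, and the base address is illustrative.

```python
import math
import secrets

PAGE_SIZE = 16 * 1024        # 16 KB pages, as on Apple Silicon
MAX_SLIDE = 1 << 30          # hypothetical slide range for this sketch
UNSLID_BASE = 0x180000000    # illustrative unslid cache base
SYMBOL_OFFSET = 0x12345678   # hypothetical offset of a function inside the cache

def pick_boot_slide(max_slide_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Pick a page-aligned slide; done once per boot in this model."""
    pages = max_slide_bytes // page_size
    return secrets.randbelow(pages) * page_size

def slid_address(unslid: int, slide: int) -> int:
    """Address at which an unslid cache address is actually mapped."""
    return unslid + slide

boot_slide = pick_boot_slide(MAX_SLIDE)

# Every process in the same boot session applies the same slide,
# so the function sits at the same address in all of them.
addr_in_proc_a = slid_address(UNSLID_BASE + SYMBOL_OFFSET, boot_slide)
addr_in_proc_b = slid_address(UNSLID_BASE + SYMBOL_OFFSET, boot_slide)

# Entropy is limited by the number of page-aligned positions available.
entropy_bits = math.log2(MAX_SLIDE // PAGE_SIZE)
print(hex(addr_in_proc_a), addr_in_proc_a == addr_in_proc_b, entropy_bits)
```

The trade-off shows up directly in the model: one slide per boot keeps the mapping shareable across processes, at the cost of an attacker who leaks the address in one process knowing it for all of them until the next reboot.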

Source: apple-oss-distributions/dyld, dyld_shared_cache. The shared-cache header and slide handling.

What's NOT in the cache

  • Third-party frameworks — apps that bundle their own copies of Sparkle, Crashlytics, Swift libraries, etc. Each app pays its own cost for these.
  • App's own code — the binary itself plus its bundled frameworks. These are mapped per-process.
  • Swift runtime libraries when not in the cache — older macOS versions had Swift's runtime outside the cache; modern versions include it.
  • DriverKit dext frameworks — separate cache for those.

This is why a Swift app with many third-party dependencies has a higher startup cost than an Objective-C-only app that links just system frameworks: the shared cache benefit applies only to system frameworks.

How dyld uses it

When dyld starts a process:

  1. The kernel has already mapped the cache (and dyld itself, which lives in the cache).
  2. dyld reads the main executable's LC_LOAD_DYLIB commands.
  3. For each declared dependency, dyld first checks whether its install path is in the cache's image list.
  4. If the dependency is in the cache → skip; it's already mapped and resolved.
  5. If not in the cache → load it from disk individually (map its segments, apply fixups).
  6. Resolve the main executable's external symbols against the loaded dylibs (cache + per-process).
  7. Run static initializers in dependency order.
  8. Call main.
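Steps 2 to 5 amount to partitioning the dependency list into "already in the cache" and "must be loaded from disk". The sketch below models that decision only; it is not dyld's code. The function name is hypothetical, the set of cached paths is a tiny hand-written stand-in for the cache's real image list, and the Sparkle path is a made-up example of a bundled third-party framework.

```python
def plan_loads(dependencies, cached_paths):
    """Split dependencies into (served from cache, loaded from disk)."""
    in_cache, from_disk = [], []
    for path in dependencies:
        if path in cached_paths:
            in_cache.append(path)    # already mapped and resolved
        else:
            from_disk.append(path)   # must be mapped and fixed up per process
    return in_cache, from_disk

# Stand-in for the cache's image list (the real one has many more entries).
cached_paths = {
    "/usr/lib/libSystem.B.dylib",
    "/usr/lib/libobjc.A.dylib",
    "/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation",
}

# Stand-in for the main executable's LC_LOAD_DYLIB entries.
dependencies = [
    "/usr/lib/libSystem.B.dylib",
    "/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation",
    "@rpath/Sparkle.framework/Versions/B/Sparkle",  # hypothetical bundled framework
]

in_cache, from_disk = plan_loads(dependencies, cached_paths)
print("from cache:", in_cache)
print("from disk: ", from_disk)
```

Only the entries in the second list cost per-process loading work, which is why the number of bundled frameworks matters more to launch time than the number of system frameworks.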

For a typical app, almost all dependencies are in the cache. The expensive parts (loading, linking) were done offline at OS install; per-process startup is mostly "map binary + run initializers."

Source: apple-oss-distributions/dyld, dyld/PrebuiltLoader.cpp. Pre-built loader objects in the cache, which are what make dyld's job O(1) per dylib for cached ones.

Inspecting the cache

A few tools:

  • dyld_info -shared_cache_info /usr/bin/some_binary — show which cache file the binary will use.
  • dyld_info -dependents /path/to/app — list a binary's dependencies and where each one will be resolved from.
  • vmmap <pid> — dump a running process's address space; the shared cache shows up as a giant __LINKEDIT/__DATA/__TEXT region around a specific address.

A running process's vmmap output includes a line of this shape (addresses illustrative):

SUBMAP                  180000000-1e0000000 [1.5G] r-x/r-x SM=COW shared cache

That's the shared cache region. Read-execute, copy-on-write semantics on writable pages, mapped-into-every-process.
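When scripting around vmmap, it can help to parse such a line and check the address range against the bracketed size. The sketch below assumes the simplified one-line shape shown in this article; real vmmap output has more columns and varies between releases, so the regular expression is an assumption, and the sample line is illustrative rather than captured from a real process.

```python
import re

# Matches the simplified shape: KIND START-END [SIZE] PROT SM=MODE DETAIL
REGION_RE = re.compile(
    r"^(?P<kind>\S+)\s+(?P<start>[0-9a-fA-F]+)-(?P<end>[0-9a-fA-F]+)\s+"
    r"\[\s*(?P<size>[^\]]+)\]\s+(?P<prot>\S+)\s+SM=(?P<share>\S+)\s+(?P<detail>.*)$"
)

def parse_region(line: str) -> dict:
    """Parse one simplified vmmap-style region line into a dict."""
    m = REGION_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized region line: {line!r}")
    start, end = int(m["start"], 16), int(m["end"], 16)
    return {
        "kind": m["kind"],
        "start": start,
        "end": end,
        "bytes": end - start,        # size derived from the address range
        "reported_size": m["size"],  # size as printed in brackets
        "prot": m["prot"],
        "share_mode": m["share"],
        "detail": m["detail"],
    }

# Illustrative line, not real output.
line = "SUBMAP                  180000000-1e0000000 [1.5G] r-x/r-x SM=COW shared cache"
region = parse_region(line)
print(region["detail"], f"{region['bytes'] / 2**30:.2f} GiB")
```

Comparing the computed byte count with the printed size is a cheap sanity check when copying region lines by hand.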

Why the shared cache is invalidated by upgrades

When the OS updates:

  1. New system dylibs ship.
  2. The shared cache builder runs as part of the install.
  3. A fresh dyld_shared_cache_* is written.
  4. Reboot — every process from then on maps the new cache.

Major macOS upgrades take longer than it seems they should because building the shared cache is expensive (the cache builder has to resolve every symbol across every system dylib).

What surprises newcomers

  • Virtual size in Activity Monitor lies to you about real cost. The 1+ GB is mostly the shared cache, which is shared across every process.
  • The cache is per-architecture. Intel Macs had Intel caches; Apple Silicon Macs have arm64e caches. Universal binaries get the right one mapped.
  • You can technically bypass the cache with DYLD_SHARED_REGION=avoid (developer-only, requires SIP relaxed) for debugging — but the app loads dramatically slower.
  • The cache is the reason dlopen("libSystem") is essentially free — it's already mapped.

For dyld's open-source implementation:

Source: apple-oss-distributions/dyld, dyld/Loader.cpp. The Loader class, which represents each dylib (cached or otherwise) in dyld's runtime.

See also the mmap article: the shared cache is a giant mmap-based mapping, and understanding mmap makes the cache mechanics obvious.

For the cache's role in app startup, see what happens when you launch an app: the cache map happens at step 3.
