Every conversation gets one working memory of a set size. Everything Claude can see at once — and reason over — has to fit on this desk.
Instructions, files, the task, and the running conversation all compete for the same desk.
Once it fills, the oldest material is summarized or dropped to make room. And loading all six profile files every session burns a third of the desk before you've even started — that's the part worth trimming.
Without help, every turn re-reads your rules, files, and task from scratch — so each reply gets slower and more expensive.
Caching keeps the stable blocks — your rules, files, and the task — ready to reuse. Claude skips re-reading them and jumps straight to the new turn.
Keep the session moving and you stay in the cache. Pause too long and it clears — the next turn pays full price again.