Claude Code /compact: Context Compression, Compaction Prompts, What Survives

How Claude Code /compact works: context compression, what survives compaction, prompt caching cost, PreCompact/PostCompact hooks, and practical recovery tips.

Claude Code /compact: Context Compression, Compaction Prompts, What Survives

There's a hidden mechanic in Claude Code that most people don't think about until it costs them an hour of work. Compaction is the automatic compression of your conversation context when you approach the token limit. I ran into it when my always-on agent "forgot" a critical rule from the beginning of a session after 200 minutes of continuous operation. Here's what I learned by digging into how it works.

Quick answer: what /compact does in Claude Code

/compact replaces your long conversation history with a structured summary so the session fits back into the context window. Claude Code then reloads most startup context from disk. The important nuance: project-root CLAUDE.md survives compaction, but older decisions, path-scoped rules, nested CLAUDE.md files, invoked skills, and subtle user preferences can still be summarized, truncated, or dropped.

So the goal is not “avoid compaction forever”. The goal is to decide what must live in durable files before compaction happens.

What is compaction and why it exists

Claude Code runs in your terminal and maintains long sessions. Unlike ChatGPT where each message is a separate request, Claude Code accumulates context: files it read, commands it executed, grep results, diffs, error messages. All of this stacks into one massive prompt.

The problem: models have context window limits. Claude Sonnet has 200K tokens, Opus goes up to 1M. But even a million tokens runs out when your agent is actively working through a codebase. Compaction kicks in when context reaches roughly 95% of the limit.

What happens: Claude Code takes the entire conversation history, sends it to a separate model call with a prompt like "compress this into key facts," and replaces the full history with a condensed summary. You don't see this happening — at some point, the agent starts working from "memory" of past interactions rather than the full transcript.

What survives /compact in Claude Code

Anthropic's newer docs are more explicit about the context window. Before you type anything, Claude Code loads startup context: system prompt, output style, project rules, CLAUDE.md, auto memory, MCP tool names, and skill descriptions. During a session, every file read, command result, diff, and tool output adds more context. When compaction happens, the conversation is replaced by a summary and startup context is reloaded.

Context sourceAfter compactionPractical consequence
System prompt / output styleUnchangedYou do not need to restate it.
Project-root CLAUDE.mdRe-read and re-injectedPut critical always-on rules here.
Nested CLAUDE.md / path rulesReload when matching files are read againDo not rely on nested rules for global constraints.
Conversation historyCompressed into summaryOld “why we chose this” context is easiest to lose.
Invoked skillsCarried forward within token budget; can be truncated/droppedPut the most important skill instructions at the top.
MCP toolsTool names/descriptions reload; previous outputs are summarizedRe-query important external facts instead of trusting an old summary.
Subagent researchOnly the returned summary remainsAsk subagents for concise, source-linked conclusions.

What survives compaction, what doesn't

From my experience running agents daily, compaction reliably preserves:

  • The current task and its immediate context
  • File names that were recently modified
  • Recent errors and their solutions
  • General project architecture

What consistently gets lost:

  • Instructions from the start of the session — "don't touch this file," "use this format"
  • Intermediate decisions — why you chose approach A over B
  • Specific code snippets discussed 50 messages ago
  • Subtle style rules — "no emoji," "no Co-Authored-By in commits"

This makes sense: compaction optimizes for "what to do next," not "why we did what we did." Decision context is the first casualty of compression.

Custom compaction prompts

Claude Code lets you override the compaction prompt. This might be the most underrated setting. In settings.json (at ~/.claude/settings.json), there's a parameter:

{
  "compactPrompt": "Preserve ALL rules from CLAUDE.md verbatim. Keep file paths, error messages, and architectural decisions. Summarize tool outputs but keep their conclusions."
}

I've tested multiple variations. The default prompt is a generic "summarize the conversation." It works fine for short sessions, but for agents running for hours, you need more precision.

My current compaction prompt:

{
  "compactPrompt": "When compacting, you MUST preserve:\n1. All rules and constraints from CLAUDE.md and system prompt — copy them verbatim\n2. Current task context and progress\n3. File paths that were modified\n4. Specific error messages and their solutions\n5. User corrections and preferences stated during the session\n\nYou MAY summarize:\n- Tool call outputs (keep conclusions, drop raw output)\n- File contents that were read (keep what was learned, drop the text)\n- Exploratory steps that led nowhere"
}

The difference is noticeable. After switching to this custom prompt, my agents stopped "forgetting" CLAUDE.md rules during long sessions.

CLAUDE.md vs compaction: where to put what

The key insight from three months of running Claude Code agents: never rely on compaction for critical rules. Everything the agent must always remember should live in CLAUDE.md.

CLAUDE.md loads as part of the system prompt at the start of every session. Compaction doesn't touch it — it exists outside the conversation history. It's the only place that's guaranteed to survive any compression.

My strategy:

  • CLAUDE.md — architectural rules, code style, prohibitions, credential paths, naming conventions. Anything that applies to every session
  • Conversation context — current task, intermediate results, discussions. It's OK to lose this during compaction
  • Project files — TODO.md, ARCHITECTURE.md, notes. The agent can re-read these if compaction erased the context

Rule of thumb: if you catch yourself repeating the same instruction to the agent — move it to CLAUDE.md. I covered CLAUDE.md configuration in detail in my Claude Code setup guide.

How to tell when compaction happened

Claude Code shows a [context compacted] indicator in the terminal. But if you're multitasking, it's easy to miss. A more reliable signal: watch the cost counter in the bottom-right corner. A sudden cost spike (compaction is a separate API call) plus a context counter reset means compaction just fired.

Another telltale sign: the agent starts re-asking things you already discussed. "Which file has the config?" — even though you edited that file 20 minutes ago. That's a clear signal that context was compressed and details were lost.

When compaction causes real problems

A concrete case from my work. I run a server-agent — Claude Code running 24/7 on a VPS, accessible via Telegram. The agent receives a message, processes it, waits for the next one. Sessions can last hours.

The problem: after compaction, the agent would "forget" access control rules. I have a permission matrix — which files can be read from which chat context. These rules were in the initial prompt, but after compression the model lost them and started responding to requests it should have blocked.

The fix: I moved all security rules into CLAUDE.local.md (which loads on top of the base CLAUDE.md). Now they're in the system prompt, not in the conversation history — compaction can't touch them.

When to run /compact manually

Manual /compact is best at natural task boundaries: after a PR is merged, after a debugging path is understood, or before switching from research to implementation. If you wait for auto-compaction, it can happen mid-task, exactly when the model is juggling the most fragile context.

Use /rewind instead when you want to abandon a bad direction entirely. Compaction summarizes the path you took; rewind discards it.

Prompt caching cost of compaction

Claude Code uses prompt caching automatically. Compaction invalidates the conversation-history layer because the next request has a new shorter history. The system prompt layer can still be reused, but the next turn after compact may feel slower or more expensive. This is fine when you are keeping useful context; it is wasteful if you compact a messy failed branch that should have been rewound.

PreCompact and PostCompact hooks

Claude Code exposes hook events around compaction: PreCompact before the compact operation and PostCompact after it completes. This is underrated for long-running agents.

A practical hook can write a tiny checkpoint before compact:

  • current task;
  • files modified;
  • commands/tests already run;
  • critical user constraints;
  • open questions.

After compact, ask Claude to read that checkpoint and verify it still remembers the constraints. This is much more reliable than hoping a generic summary preserved everything.

Auto-compaction thrashing

If Claude Code compacts and immediately fills the context window again, docs call this auto-compaction thrashing. The usual cause is a huge file or tool output being re-read. Recovery is boring but effective: read smaller line ranges, ask for summaries, move durable rules into CLAUDE.md, and compact with a focused instruction.

Practical tips for managing compaction

Customize your compactPrompt. Tailor it to your workflow. Working with data? Ask it to preserve SQL queries. Writing articles? Ask it to keep style rules. The default is too generic for specialized work.

Keep sessions short when possible. A fresh session means clean context plus full CLAUDE.md. Sometimes that's better than a long session with two compaction events. I've found that breaking work into focused 30-60 minute sessions produces better results than marathon 4-hour runs.

Use files as external memory. Ask the agent to write key decisions to a TODO.md or NOTES.md file in your project. After compaction, it can re-read them. This pattern is especially important for Obsidian-based workflows where the vault acts as long-term memory.

Put everything critical in CLAUDE.md. Rules that must always apply go there, nowhere else. Don't hope that compaction will preserve them — assume it won't.

Test compaction intentionally. Run a long session, wait for compression, then check — does the agent still remember the rules from the beginning? This will save you hours of debugging later.

Compaction isn't a bug — it's a necessary compromise. Context windows are growing, but as long as they're finite, compression will be part of working with AI agents. Better to understand how it works than to be surprised when your agent "forgets" a critical rule at the 200-minute mark.


Follow me on Twitter/X @danokhlopkov for more on AI agents and automation.