Caching & Compression

How lean-ctx caches files, compresses shell output, and manages the context window to achieve 74-99% token savings.

lean-ctx uses a multi-layered caching and compression system to minimize token usage. Understanding these layers helps you get the most out of the system.

Session Cache

Every file read via ctx_read is stored in a per-session in-memory cache with a BLAKE3 content hash. When the same file is read again:

  • Content unchanged: Returns a compact cache-hit stub (~13 tokens) instead of the full file
  • Content changed: Returns the full new content and updates the cache
  • Different mode requested: Re-reads with the new mode
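The three cases above can be sketched as a small hash-keyed cache. This is a minimal illustration, not lean-ctx's implementation: the `SessionCache` class, its `read` signature, and the stub format are hypothetical, and `hashlib.blake2b` stands in for BLAKE3 only because it ships with the Python standard library.

```python
import hashlib
from dataclasses import dataclass


def content_hash(data: bytes) -> str:
    # Stand-in hash: lean-ctx uses BLAKE3; blake2b is used here only
    # because it is available in the standard library.
    return hashlib.blake2b(data, digest_size=16).hexdigest()


@dataclass
class Entry:
    digest: str
    mode: str
    lines: int


class SessionCache:
    """Hypothetical per-session cache keyed by file path."""

    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}

    def read(self, path: str, data: bytes, mode: str = "full", fresh: bool = False) -> str:
        # `data` is the file's current content on disk, passed in by the caller.
        digest = content_hash(data)
        lines = len(data.splitlines())
        prev = self.entries.get(path)
        unchanged = prev is not None and prev.digest == digest and prev.mode == mode
        if unchanged and not fresh:
            # Content unchanged and same mode: return a compact stub.
            return f"{path} cached {lines}L"
        # Changed content, new mode, or forced refresh: return everything and re-cache.
        self.entries[path] = Entry(digest, mode, lines)
        return f"{path} {lines}L\n{data.decode()}"
```

Comparing digests rather than modification times means an edit that leaves the content byte-identical still counts as a cache hit.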

Cache Lifecycle

# 1. First read: full content cached
ctx_read src/auth.ts
→ F1=auth.ts 123L [full content...]  (~450 tokens)

# 2. Second read: cache hit
ctx_read src/auth.ts
→ F1=auth.ts cached 2t 123L  (~13 tokens, 97% saved)

# 3. File was edited externally, third read: detects change
ctx_read src/auth.ts
→ F1=auth.ts 125L [new full content...]

# 4. Force bypass cache
ctx_read src/auth.ts fresh=true
→ F1=auth.ts 125L [re-read full content...]

Cache Management

Command                                     Effect
ctx_cache action="status"                   Show cached files, sizes, and hit rates
ctx_cache action="clear"                    Clear entire session cache
ctx_cache action="invalidate" path="..."    Invalidate a specific file
ctx_read fresh=true                         Bypass cache for a single read

The cache auto-clears after 5 minutes of inactivity to prevent stale data.
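One way to picture the inactivity rule is a cache that checks the time since its last access before serving anything. The sketch below is an assumption about how such a rule could work, not a description of lean-ctx internals; `IdleCache` and its injectable `clock` parameter are hypothetical names.

```python
import time


class IdleCache:
    """Hypothetical cache that empties itself after a period of inactivity."""

    def __init__(self, idle_seconds: float = 300.0, clock=time.monotonic) -> None:
        self._idle = idle_seconds
        self._clock = clock
        self._last_access = clock()
        self._data: dict = {}

    def _touch(self) -> None:
        # Clear everything if the gap since the last access exceeds the limit.
        now = self._clock()
        if now - self._last_access > self._idle:
            self._data.clear()
        self._last_access = now

    def get(self, key):
        self._touch()
        return self._data.get(key)

    def put(self, key, value) -> None:
        self._touch()
        self._data[key] = value
```

Because every access resets the timer, an actively used session never loses its cache; only a pause longer than the limit does.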

File References (F1, F2, ...)

Each file read in a session gets a persistent short ID: F1, F2, etc. These IDs survive across the entire session and can be used instead of full paths to save tokens.

F1=auth.ts 123L      → Use "F1" instead of "src/auth.ts"
F2=server.rs 262L    → Use "F2" instead of "src/http/server.rs"
F3=db.ts 64L         → Use "F3" instead of "src/database/db.ts"

In TDD mode, long identifiers within file content are also mapped to short symbols (α1, α2, ...) for further compression.
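The file reference IDs described above amount to a two-way lookup table that hands out IDs in order of first read. The following sketch assumes that behavior; the `FileRefs` class and its method names are hypothetical and are not part of the lean-ctx API.

```python
from pathlib import PurePosixPath


class FileRefs:
    """Hypothetical session-wide table mapping file paths to short IDs."""

    def __init__(self) -> None:
        self._by_path: dict[str, str] = {}
        self._by_id: dict[str, str] = {}

    def ref_for(self, path: str) -> str:
        # Assign the next ID on first sight; reuse it on every later read.
        if path not in self._by_path:
            ref = f"F{len(self._by_path) + 1}"
            self._by_path[path] = ref
            self._by_id[ref] = path
        return self._by_path[path]

    def resolve(self, ref_or_path: str) -> str:
        # Accept either a short ID or a full path.
        return self._by_id.get(ref_or_path, ref_or_path)

    def label(self, path: str, lines: int) -> str:
        # Render the header form used in the examples above.
        return f"{self.ref_for(path)}={PurePosixPath(path).name} {lines}L"
```

IDs are never reassigned within a session in this sketch, which is what lets a short reference stay valid for the whole conversation.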

Shell Output Compression

ctx_shell applies pattern-based compression to the output of 95+ recognized developer tools. Each tool has a specialized compressor that preserves actionable information while stripping boilerplate.

How It Works

  1. Command Detection: Identifies the tool from the command string (git, npm, docker, etc.)
  2. Pattern Matching: Applies the tool-specific compression pattern
  3. Structured Output: Returns only the essential information with token savings count
  4. Fallback: Unrecognized commands get generic compression (ANSI stripping, empty line removal)
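The four steps above can be sketched as a dispatch table with a generic fallback. Everything here is illustrative: the `COMPRESSORS` table, the function names, and the rule used for `git status` are assumptions, and the real tool-specific patterns are not shown in this document.

```python
import re
import shlex

# Matches ANSI escape sequences such as color codes.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def generic_compress(output: str) -> str:
    # Fallback: strip ANSI sequences and drop empty lines.
    text = ANSI_RE.sub("", output)
    return "\n".join(line.rstrip() for line in text.splitlines() if line.strip())


def compress_git_status(output: str) -> str:
    # Hypothetical tool-specific rule: keep only lines that name a changed file.
    keep = [
        line.strip()
        for line in generic_compress(output).splitlines()
        if re.match(r"\s*(modified|new file|deleted|renamed):", line)
    ]
    return "\n".join(keep) or "clean"


# Command detection: map the leading words of the command to a compressor.
COMPRESSORS = {("git", "status"): compress_git_status}


def compress(command: str, output: str) -> str:
    key = tuple(shlex.split(command)[:2])
    handler = COMPRESSORS.get(key, generic_compress)
    return handler(output)
```

Keeping the fallback as the default of the lookup means an unrecognized command still gets cleaned up rather than passed through raw.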

Compression Examples

Command              Raw Output      Compressed     Savings
git status           ~600 tokens     ~80 tokens     87%
npm install          ~300 tokens     ~85 tokens     71%
npm test             ~2000 tokens    ~200 tokens    90%
docker compose ps    ~400 tokens     ~100 tokens    75%
kubectl get pods     ~800 tokens     ~200 tokens    75%

Error Recovery (Tee)

When a command fails (non-zero exit code), the full uncompressed output is automatically saved to ~/.lean-ctx/tee/. Use lean-ctx tee last to recover the full output. This ensures compression never hides error details.
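The tee behavior can be approximated as "run, capture, and save the raw output only on failure". The sketch below assumes that reading; `run_with_tee`, its return shape, and the timestamp-based file name are hypothetical, and the caller supplies the directory rather than relying on ~/.lean-ctx/tee/.

```python
import subprocess
import sys
import time
from pathlib import Path


def run_with_tee(argv, tee_dir):
    """Run a command; on a non-zero exit, save the full raw output to tee_dir."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    full = proc.stdout + proc.stderr
    saved = None
    if proc.returncode != 0:
        tee_dir = Path(tee_dir)
        tee_dir.mkdir(parents=True, exist_ok=True)
        # Hypothetical naming scheme: one file per failure, named by timestamp.
        saved = tee_dir / f"{time.time_ns()}.log"
        saved.write_text(full)
    return proc.returncode, full, saved


if __name__ == "__main__":
    code, _, path = run_with_tee([sys.executable, "-c", "raise SystemExit(2)"], "tee-demo")
    print(code, path)
```

Successful commands write nothing, so the tee directory only grows when there is something worth recovering.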

Tool Result Archive

When enabled, the archive system stores full tool results to disk when they exceed a token threshold. The compressed response includes an [ARCHIVE: <id>] reference that the agent can use with ctx_expand to retrieve the full content on demand.

Flow

  1. Tool result exceeds threshold (default: 500 tokens)
  2. Full result stored in ~/.lean-ctx/archive/
  3. Compressed response + archive ID sent to the agent
  4. Agent calls ctx_expand id="..." when full detail is needed
  5. Archived entries auto-expire after TTL (default: 120 minutes)
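The flow above can be sketched end to end in a few lines. This is a simplified model under stated assumptions: an in-memory dict stands in for the on-disk archive directory, tokens are estimated by counting whitespace-separated words, IDs are a truncated SHA-256 digest, and the `Archive` class with its `wrap` and `expand` methods is hypothetical.

```python
import hashlib
import time


class Archive:
    """Hypothetical archive: large results are stored and replaced by a reference."""

    def __init__(self, threshold_tokens: int = 500, ttl_minutes: int = 120, clock=time.time) -> None:
        self.threshold = threshold_tokens
        self.ttl_seconds = ttl_minutes * 60
        self._clock = clock
        # In-memory stand-in for the archive directory: id -> (expires_at, content).
        self._entries: dict[str, tuple[float, str]] = {}

    @staticmethod
    def estimate_tokens(text: str) -> int:
        # Crude assumption: one token per whitespace-separated word.
        return len(text.split())

    def wrap(self, text: str) -> str:
        tokens = self.estimate_tokens(text)
        if tokens <= self.threshold:
            return text  # Small results pass through untouched.
        archive_id = hashlib.sha256(text.encode()).hexdigest()[:6]
        self._entries[archive_id] = (self._clock() + self.ttl_seconds, text)
        return f'[ARCHIVE:{archive_id}] ({tokens} tokens) Use ctx_expand id="{archive_id}" for full content'

    def expand(self, archive_id: str):
        entry = self._entries.get(archive_id)
        if entry is None:
            return None
        expires_at, content = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL: drop it and report it as gone.
            del self._entries[archive_id]
            return None
        return content
```

Expiry is checked lazily on `expand` in this sketch; a real system with a disk limit would presumably also need a background or startup sweep.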

Configuration

# config.toml
[archive]
enabled = true
threshold_tokens = 500
ttl_minutes = 120

Zero-Loss Archive (ctx_expand)

The archive system stores large tool outputs to disk so they never consume context window space - but unlike simple truncation, nothing is lost. The full content is always available on demand via ctx_expand.

How It Works

  1. A tool result exceeds the configured token threshold
  2. The full output is written to ~/.lean-ctx/archives/ with a unique ID
  3. The model receives a compact hint instead of the full output:
    [ARCHIVE:a7f3c2] auth.ts analysis (2,847 tokens) - 14 functions, 3 classes
      Key exports: AuthService, TokenManager, validateJWT
      Use ctx_expand id="a7f3c2" for full content
  4. The agent calls ctx_expand id="a7f3c2" only when full detail is actually needed

Configuration

# config.toml
[archive]
enabled = true
threshold_tokens = 500   # Archive results larger than this
ttl_minutes = 120        # Auto-expire after 2 hours
max_disk_mb = 256        # Disk space limit for archives
mask_secrets = true      # Redact detected secrets before archiving

Option              Default    Description
enabled             true       Enable/disable the archive system
threshold_tokens    500        Minimum token count to trigger archiving
ttl_minutes         120        Time-to-live before auto-expiration
max_disk_mb         256        Maximum total disk usage for archives
mask_secrets        true       Redact API keys, tokens, and passwords before writing to disk

Secret Masking

When mask_secrets is enabled, lean-ctx scans archived content for common secret patterns (API keys, JWT tokens, connection strings, private keys) and replaces them with [REDACTED:type] placeholders before writing to disk. This ensures sensitive data never persists in the archive directory.
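Pattern-based masking of this kind is typically a list of labeled regular expressions applied in turn. The sketch below shows the idea for three of the categories named above; the patterns and the type labels (`private_key`, `jwt`, `connection_string`) are illustrative assumptions, not lean-ctx's actual rule set, and real detectors cover many more formats.

```python
import re

# Hypothetical patterns, each paired with the label used in the placeholder.
PATTERNS = [
    (
        "private_key",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.S,
        ),
    ),
    # Three base64url segments, the first starting with the usual JWT header prefix.
    ("jwt", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    # scheme://user:password@host...
    ("connection_string", re.compile(r"\b[a-z][a-z0-9+]*://[^\s:/@]+:[^\s@]+@\S+")),
]


def mask_secrets(text: str) -> str:
    # Replace every match with a typed placeholder before the text is stored.
    for kind, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text
```

Masking has to run before the write, not after, so that the unredacted text never exists on disk even briefly.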

Cache-Safe Guarantee

lean-ctx provides a cache-safe guarantee: content already present in the model's context window is never mutated or corrupted by lean-ctx operations. This is a critical invariant that prevents subtle bugs from stale or inconsistent data.

What This Means

  • No silent overwrites: Once a file is cached as F1 with a specific hash, the F1 reference always points to that exact content until the cache entry is invalidated, either explicitly or by a detected change on disk
  • Hash-based validation: Every cache hit verifies the BLAKE3 content hash - if the file changed on disk, the cache entry is invalidated and a full re-read occurs
  • Immutable archive entries: Archived content (ctx_expand IDs) is immutable once written - the same ID always returns the same content
  • No partial reads: If a read fails mid-stream, no partial content enters the cache

Doctor Cache-Safety Check

lean-ctx doctor includes a cache-safety validation step that verifies:

  • All cached file hashes match current disk content
  • No archive entries have been externally modified
  • Session state is consistent with the file reference table
  • No orphaned cache entries exist from crashed sessions

lean-ctx doctor
→ Cache safety: ✓ All 12 cached files verified
  Archive integrity: ✓ 8 entries, 0 corrupted
  Session state: ✓ Consistent
  Orphaned entries: ✓ None found
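The first of these checks, comparing cached hashes against current disk content, can be sketched as follows. This only illustrates the idea and is not how lean-ctx doctor is implemented: `find_stale` and `file_digest` are hypothetical names, and `hashlib.blake2b` again stands in for BLAKE3.

```python
import hashlib
from pathlib import Path


def file_digest(path) -> str:
    # Stand-in hash (lean-ctx uses BLAKE3).
    return hashlib.blake2b(Path(path).read_bytes(), digest_size=16).hexdigest()


def find_stale(cached: dict) -> list:
    """Return the cached paths whose file is missing or no longer matches its digest.

    `cached` maps a file path to the digest recorded when the file was cached.
    """
    stale = []
    for path, digest in cached.items():
        p = Path(path)
        if not p.is_file() or file_digest(p) != digest:
            stale.append(path)
    return sorted(stale)
```

An empty result corresponds to the "all cached files verified" line in the output above.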

Context Compaction

ctx_compress creates a checkpoint of the current session state for long conversations. It summarizes all cached files, their signatures, and the session context into a compact format that can survive context window truncation.

When to use: After 15-20 tool calls, or when approaching context window limits. lean-ctx auto-triggers checkpoints at configurable intervals.

ctx_compress
→ Session checkpoint created:
  12 files cached (F1-F12)
  3 signatures preserved
  Session context: 2 tasks, 1 workflow
  Checkpoint size: ~800 tokens (vs ~15000 tokens for full state)