lean-ctx uses a multi-layered caching and compression system to minimize token usage. Understanding these layers helps you get the most out of the system.
Session Cache
Every file read via ctx_read is stored in a per-session in-memory cache with a BLAKE3 content hash.
When the same file is read again:
- Content unchanged: Returns a compact cache-hit stub (~13 tokens) instead of the full file
- Content changed: Returns the full new content and updates the cache
- Different mode requested: Re-reads with the new mode
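The three cases above can be sketched in a few lines. This is a minimal illustration, not lean-ctx's implementation: `SessionCache`, `CacheEntry`, and the `"signatures"` mode name used in the usage note are hypothetical, the file content is passed in directly instead of being read from disk, and BLAKE2b from the Python standard library stands in for BLAKE3.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CacheEntry:
    digest: str  # content hash recorded at the last full read
    mode: str    # read mode used for that read
    lines: int


class SessionCache:
    """Hypothetical per-session, in-memory cache keyed by file path."""

    def __init__(self) -> None:
        self._entries: dict[str, CacheEntry] = {}

    @staticmethod
    def _digest(content: bytes) -> str:
        # Stand-in hash: BLAKE3 is not part of the Python standard library.
        return hashlib.blake2b(content, digest_size=32).hexdigest()

    def read(self, path: str, content: bytes, mode: str = "full", fresh: bool = False) -> str:
        digest = self._digest(content)
        lines = len(content.splitlines())
        entry = self._entries.get(path)
        # Cache hit: same content, same mode, and no forced bypass.
        if not fresh and entry and entry.digest == digest and entry.mode == mode:
            return f"{path} cached {lines}L"
        # Changed content, new mode, or fresh=True: full read, cache updated.
        self._entries[path] = CacheEntry(digest, mode, lines)
        return f"{path} {lines}L\n{content.decode()}"
```

Calling `read` twice with identical bytes returns the short stub on the second call. Changing the bytes, asking for another mode such as `"signatures"`, or passing `fresh=True` returns the full content again.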
Cache Lifecycle
# 1. First read: full content cached
ctx_read src/auth.ts
→ F1=auth.ts 123L [full content...] (~450 tokens)
# 2. Second read: cache hit
ctx_read src/auth.ts
→ F1=auth.ts cached 2t 123L (~13 tokens, 97% saved)
# 3. File was edited externally, third read: detects change
ctx_read src/auth.ts
→ F1=auth.ts 125L [new full content...]
# 4. Force bypass cache
ctx_read src/auth.ts fresh=true
→ F1=auth.ts 125L [re-read full content...]
Cache Management
| Command | Effect |
|---|---|
| ctx_cache action="status" | Show cached files, sizes, and hit rates |
| ctx_cache action="clear" | Clear entire session cache |
| ctx_cache action="invalidate" path="..." | Invalidate a specific file |
| ctx_read fresh=true | Bypass cache for a single read |
The cache auto-clears after 5 minutes of inactivity to prevent stale data.
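One way to picture the inactivity rule is a cache that checks the time since the last access before serving anything. The sketch below is an assumption about the behavior, not the actual mechanism: `IdleExpiringCache` and its `now` parameter (used to make the timing testable) are hypothetical names.

```python
import time
from typing import Optional


class IdleExpiringCache:
    """Hypothetical cache that empties itself after a period without access."""

    def __init__(self, idle_seconds: float = 300.0) -> None:  # 5 minutes
        self.idle_seconds = idle_seconds
        self._entries: dict[str, str] = {}
        self._last_access: Optional[float] = None

    def _touch(self, now: Optional[float]) -> None:
        now = time.monotonic() if now is None else now
        # Clear everything if the gap since the last access exceeds the limit.
        if self._last_access is not None and now - self._last_access > self.idle_seconds:
            self._entries.clear()
        self._last_access = now

    def put(self, key: str, value: str, now: Optional[float] = None) -> None:
        self._touch(now)
        self._entries[key] = value

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        self._touch(now)
        return self._entries.get(key)
```

Every access resets the timer, so an actively used session keeps its entries; only a gap longer than `idle_seconds` empties the cache.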
File References (F1, F2, ...)
Each file read in a session gets a persistent short ID: F1, F2, etc.
These IDs survive across the entire session and can be used instead of full paths to save tokens.
F1=auth.ts 123L → Use "F1" instead of "src/auth.ts"
F2=server.rs 262L → Use "F2" instead of "src/http/server.rs"
F3=db.ts 64L → Use "F3" instead of "src/database/db.ts"
In TDD mode, even longer identifiers within file content are mapped to short symbols
(α1, α2...) for further compression.
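The short-ID scheme for files can be sketched as a two-way lookup table. `FileRefs`, `ref_for`, and `resolve` are hypothetical names chosen for illustration; the sketch covers only the F1, F2, ... file IDs, not the in-content symbol mapping.

```python
class FileRefs:
    """Hypothetical table mapping file paths to short session IDs (F1, F2, ...)."""

    def __init__(self) -> None:
        self._by_path: dict[str, str] = {}
        self._by_ref: dict[str, str] = {}

    def ref_for(self, path: str) -> str:
        # Reuse the existing ID so a reference stays stable for the whole session.
        if path not in self._by_path:
            ref = f"F{len(self._by_path) + 1}"
            self._by_path[path] = ref
            self._by_ref[ref] = path
        return self._by_path[path]

    def resolve(self, ref_or_path: str) -> str:
        # Accept either a short ID or a full path.
        return self._by_ref.get(ref_or_path, ref_or_path)
```

Because IDs are handed out in read order and never reassigned, a later message can say `F2` and still be resolved to the same path.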
Shell Output Compression
ctx_shell applies pattern-based compression to the output of 95+ recognized developer tools.
Each tool has a specialized compressor that preserves actionable information while stripping boilerplate.
How It Works
- Command Detection: Identifies the tool from the command string (git, npm, docker, etc.)
- Pattern Matching: Applies the tool-specific compression pattern
- Structured Output: Returns only the essential information, along with a count of the tokens saved
- Fallback: Unrecognized commands get generic compression (ANSI stripping, empty line removal)
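The dispatch-then-fallback flow above might look like the following. Everything here is illustrative: the registry, the function names, and the toy `git status` rule (dropping git's parenthesized usage hints) are assumptions, not the compressors lean-ctx ships.

```python
import re

# Matches ANSI CSI escape sequences such as color codes.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def generic_compress(output: str) -> str:
    """Fallback: strip ANSI escape sequences and drop empty lines."""
    cleaned = ANSI_ESCAPE.sub("", output)
    return "\n".join(line.rstrip() for line in cleaned.splitlines() if line.strip())


def compress_git_status(output: str) -> str:
    """Toy tool-specific rule: also drop git's '(use "git ...")' hint lines."""
    lines = generic_compress(output).splitlines()
    return "\n".join(line for line in lines if not line.strip().startswith("(use "))


# Hypothetical registry keyed by the first two words of the command.
COMPRESSORS = {("git", "status"): compress_git_status}


def compress(command: str, output: str) -> str:
    key = tuple(command.split()[:2])
    # Unrecognized commands fall through to the generic compressor.
    return COMPRESSORS.get(key, generic_compress)(output)
```

Keeping the fallback as the registry default means an unknown tool still gets some compression instead of an error.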
Compression Examples
| Command | Raw Output | Compressed | Savings |
|---|---|---|---|
| git status | ~600 tokens | ~80 tokens | 87% |
| npm install | ~300 tokens | ~85 tokens | 71% |
| npm test | ~2000 tokens | ~200 tokens | 90% |
| docker compose ps | ~400 tokens | ~100 tokens | 75% |
| kubectl get pods | ~800 tokens | ~200 tokens | 75% |
Error Recovery (Tee)
When a command fails (non-zero exit code), the full uncompressed output is automatically saved
to ~/.lean-ctx/tee/. Use lean-ctx tee last to recover the full output.
This ensures compression never hides error details.
Tool Result Archive
When enabled, the archive system stores full tool results to disk when they exceed a token threshold.
The compressed response includes an [ARCHIVE: <id>] reference that the agent
can use with ctx_expand to retrieve the full content on demand.
Flow
- Tool result exceeds threshold (default: 500 tokens)
- Full result stored in ~/.lean-ctx/archive/
- Compressed response + archive ID sent to the agent
- Agent calls ctx_expand id="..." when full detail is needed
- Archived entries auto-expire after TTL (default: 120 minutes)
Configuration
# config.toml
[archive]
enabled = true
threshold_tokens = 500
ttl_minutes = 120
Zero-Loss Archive (ctx_expand)
The archive system stores large tool outputs to disk so they never consume context window space.
Unlike simple truncation, nothing is lost: the full content is always available
on demand via ctx_expand.
How It Works
- A tool result exceeds the configured token threshold
- The full output is written to ~/.lean-ctx/archives/ with a unique ID
- The model receives a compact hint instead of the full output:
[ARCHIVE:a7f3c2] auth.ts analysis (2,847 tokens) - 14 functions, 3 classes
Key exports: AuthService, TokenManager, validateJWT
Use ctx_expand id="a7f3c2" for full content
- The agent calls ctx_expand id="a7f3c2" only when full detail is actually needed
Configuration
# config.toml
[archive]
enabled = true
threshold_tokens = 500 # Archive results larger than this
ttl_minutes = 120 # Auto-expire after 2 hours
max_disk_mb = 256 # Disk space limit for archives
mask_secrets = true # Redact detected secrets before archiving
| Option | Default | Description |
|---|---|---|
| enabled | true | Enable/disable the archive system |
| threshold_tokens | 500 | Minimum token count to trigger archiving |
| ttl_minutes | 120 | Time-to-live before auto-expiration |
| max_disk_mb | 256 | Maximum total disk usage for archives |
| mask_secrets | true | Redact API keys, tokens, and passwords before writing to disk |
Secret Masking
When mask_secrets is enabled, lean-ctx scans archived content for common secret patterns
(API keys, JWT tokens, connection strings, private keys) and replaces them with
[REDACTED:type] placeholders before writing to disk. This ensures sensitive data
never persists in the archive directory.
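Pattern-based redaction of this kind is commonly done with a table of regular expressions. The patterns below are simplified examples written for this sketch; they are not the rules lean-ctx applies, and the type labels are hypothetical.

```python
import re

# Illustrative patterns only; real secret detection needs a broader rule set.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    "private_key": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
    "connection_string": re.compile(r"\b[a-z][a-z0-9+]*://[^\s:/@]+:[^\s@]+@[^\s]+"),
}


def mask_secrets(text: str) -> str:
    """Replace every match with a [REDACTED:type] placeholder."""
    for kind, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{kind}]", text)
    return text
```

Running the masking step before the write, not after, is what keeps the raw secret from ever reaching disk.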
Cache-Safe Guarantee
lean-ctx provides a cache-safe guarantee: content already present in the model's context window is never mutated or corrupted by lean-ctx operations. This is a critical invariant that prevents subtle bugs from stale or inconsistent data.
What This Means
- No silent overwrites: Once a file is cached as F1 with a specific hash, the F1 reference always points to that exact content until explicitly invalidated
- Hash-based validation: Every cache hit verifies the BLAKE3 content hash - if the file changed on disk, the cache entry is invalidated and a full re-read occurs
- Immutable archive entries: Archived content (ctx_expand IDs) is immutable once written - the same ID always returns the same content
- No partial reads: If a read fails mid-stream, no partial content enters the cache
Doctor Cache-Safety Check
lean-ctx doctor includes a cache-safety validation step that verifies:
- All cached file hashes match current disk content
- No archive entries have been externally modified
- Session state is consistent with the file reference table
- No orphaned cache entries exist from crashed sessions
lean-ctx doctor
→ Cache safety: ✓ All 12 cached files verified
Archive integrity: ✓ 8 entries, 0 corrupted
Session state: ✓ Consistent
Orphaned entries: ✓ None found
Context Compaction
ctx_compress creates a checkpoint of the current session state for long conversations.
It summarizes all cached files, their signatures, and the session context into a compact format
that can survive context window truncation.
When to use: After 15-20 tool calls, or when approaching context window limits. lean-ctx auto-triggers checkpoints at configurable intervals.
ctx_compress
→ Session checkpoint created:
12 files cached (F1-F12)
3 signatures preserved
Session context: 2 tasks, 1 workflow
Checkpoint size: ~800 tokens (vs ~15000 tokens for full state)
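To illustrate why a checkpoint is so much smaller than the full state, the sketch below keeps one summary line per cached file and drops the file bodies. The `CachedFile` type, the `build_checkpoint` function, and the output layout are hypothetical; the real checkpoint format is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class CachedFile:
    ref: str    # short ID such as "F1"
    path: str
    lines: int
    signatures: list[str] = field(default_factory=list)


def build_checkpoint(files: list[CachedFile], tasks: list[str]) -> str:
    """Summarize session state: one line per file, no file bodies."""
    out = [f"checkpoint: {len(files)} files, {len(tasks)} tasks"]
    for f in files:
        sigs = "; ".join(f.signatures) if f.signatures else "-"
        out.append(f"{f.ref}={f.path} {f.lines}L sigs: {sigs}")
    out.extend(f"task: {t}" for t in tasks)
    return "\n".join(out)
```

The summary grows with the number of files and tasks, not with file size, which is the property that lets it survive context window truncation.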