Your AI agent picks the wrong tools, ignores budgets, and wastes compute on trivial tasks.
LeanCTX routes every request through a graph-powered adaptive pipeline. It classifies task intent, fuses BM25, semantic embeddings, and graph proximity via Reciprocal Rank Fusion (RRF) for search, enforces token and cost budgets, and enriches overviews with knowledge facts and graph hotspots. Progressive throttling and compression levels (Off/Lite/Standard/Max) keep long sessions efficient.
Every Request Gets the Same Treatment
Without intelligent routing, every request gets the same generic treatment. A simple rename gets the same heavy context as a complex refactor. There is no task awareness, no budget control, no optimization.
ctx_intent classifies your task and automatically selects the optimal read mode, budget, and pipeline strategy.
10 tools
Intent Routing
Automatic task classification routes requests to the optimal processing mode.
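As a rough sketch of what intent routing can look like: the intent classes, mode names, and budget numbers below are illustrative assumptions, not LeanCTX's actual routing table.

```typescript
// Hypothetical routing table: intent class -> read mode and default token budget.
// Class names, modes, and numbers are assumptions for illustration only.
type Intent = "rename" | "bugfix" | "refactor" | "explore";

const routes: Record<Intent, { mode: string; tokenBudget: number }> = {
  rename:   { mode: "symbols-only", tokenBudget: 2_000 },
  bugfix:   { mode: "focused",      tokenBudget: 8_000 },
  refactor: { mode: "blast-radius", tokenBudget: 20_000 },
  explore:  { mode: "overview",     tokenBudget: 5_000 },
};

function route(intent: Intent) {
  return routes[intent]; // a simple rename never gets refactor-sized context
}
```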
Mode Selection
Learned mode predictor selects the best compression strategy based on task type and context.
Budget Enforcement
Token, cost, and time budgets with SLO actions: warn, throttle, or block.
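A minimal sketch of how budget enforcement with SLO actions can work; the thresholds and field names are assumptions, not LeanCTX's real policy format.

```typescript
// Budget check with escalating SLO actions. The 75% / 90% / 100% thresholds
// are illustrative, not LeanCTX's documented defaults.
type SloAction = "warn" | "throttle" | "block";

interface Budget { tokens: number; costUsd: number; timeMs: number }

function enforce(used: Budget, limit: Budget): SloAction | "ok" {
  const ratio = Math.max(
    used.tokens / limit.tokens,
    used.costUsd / limit.costUsd,
    used.timeMs / limit.timeMs,
  );
  if (ratio >= 1.0) return "block";    // hard limit reached
  if (ratio >= 0.9) return "throttle"; // close to the limit, slow down
  if (ratio >= 0.75) return "warn";    // early heads-up
  return "ok";
}
```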
Adaptive Pipeline
Six pipeline stages with per-profile toggles and real-time metrics.
Hybrid Search Fusion
Combines BM25 keyword matching, semantic embeddings, and graph proximity scores via Reciprocal Rank Fusion (RRF).
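The fusion step follows the standard RRF formula, where each ranker contributes 1 / (k + rank) per document. The sketch below uses the conventional k = 60 and made-up file names; the actual constant and signal weights inside LeanCTX are not shown here.

```typescript
// Reciprocal Rank Fusion: sum 1 / (k + rank) across all rankers per document.
function rrfFuse(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical rankings from the three retrievers for the query "auth refactor".
const bm25 = ["auth.ts", "login.ts", "session.ts"];
const semantic = ["session.ts", "auth.ts", "token.ts"];
const graph = ["auth.ts", "token.ts", "login.ts"];
console.log(rrfFuse([bm25, semantic, graph])[0]); // auth.ts ranks first
```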
Knowledge-Enriched Overview
ctx_overview surfaces relevant knowledge facts and Property Graph hotspots alongside the architecture summary.
Progressive Search Throttling
Escalating hints guide the agent when repeated searches return diminishing results, reducing wasted tokens.
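A sketch of how such escalation could be triggered; the novelty threshold, hint wording, and step counts are illustrative assumptions, not LeanCTX's actual behavior.

```typescript
// Escalate hints as consecutive searches return little new material.
// "Novelty ratio" here means the fraction of results not already seen.
function throttleHint(noveltyRatios: number[]): string | null {
  const stale = noveltyRatios.filter((r) => r < 0.2).length;
  if (stale >= 4) return "Stop searching; work with the context already loaded.";
  if (stale >= 2) return "Results are repeating; narrow the query or switch tools.";
  return null; // no hint needed yet
}
```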
Compression Levels
Unified CompressionLevel system (Off/Lite/Standard/Max) controls output density across all tools. Set via `lean-ctx compression <level>`, ideal for tuning verbosity per session.
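For a sense of how the levels trade verbosity for density, here is an illustrative mapping; the multipliers are assumptions, not LeanCTX's actual ratios.

```typescript
// Hypothetical output-density targets per compression level.
type CompressionLevel = "off" | "lite" | "standard" | "max";

const targetDensity: Record<CompressionLevel, number> = {
  off: 1.0,       // verbatim output
  lite: 0.7,      // light trimming
  standard: 0.4,  // default summarization
  max: 0.15,      // most aggressive compression
};
```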
Context Potential (Φ)
Every context item is scored by a six-factor potential function combining relevance, structure, recency, history, cost, and redundancy.
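A minimal sketch of such a six-factor score, assuming each factor is pre-normalized to [0, 1]; the weights below are illustrative, not LeanCTX's published coefficients.

```typescript
// Six-factor context potential Phi. Cost and redundancy reduce the score;
// the other four factors increase it. Weights are assumptions.
interface ContextItem {
  relevance: number;   // match against the current task
  structure: number;   // structural importance (e.g. hub file, exported symbol)
  recency: number;     // recently edited or viewed
  history: number;     // usefulness in prior sessions
  cost: number;        // normalized token cost
  redundancy: number;  // overlap with content already selected
}

function phi(item: ContextItem): number {
  return (
    0.35 * item.relevance +
    0.20 * item.structure +
    0.15 * item.recency +
    0.10 * item.history -
    0.10 * item.cost -
    0.10 * item.redundancy
  );
}
```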
Context Compiler
Greedy Φ-ranked selection builds minimal context packages within any token budget, with automatic redundancy elimination.
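A simplified sketch of greedy selection under a token budget, using scores like the hypothetical phi() above; redundancy elimination is reduced to the fact that already-selected overlap lowers a candidate's score.

```typescript
// Greedy compiler: take the highest-Phi candidates that still fit the budget.
interface Candidate { id: string; tokens: number; score: number }

function compileContext(candidates: Candidate[], budget: number): Candidate[] {
  const selected: Candidate[] = [];
  let spent = 0;
  for (const c of [...candidates].sort((a, b) => b.score - a.score)) {
    if (spent + c.tokens > budget) continue; // over budget, try smaller items
    selected.push(c);
    spent += c.tokens;
  }
  return selected;
}
```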
Context Handles
Lazy sparse references (@F1, @K3) that defer content loading until needed — zero-cost context pointers for token efficiency.
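Conceptually, a handle is a cheap pointer whose content is only materialized when dereferenced. The class and loader below are an illustrative sketch, not LeanCTX's internal representation.

```typescript
// A handle (e.g. "@F1") costs only its id in the context window until resolved.
class ContextHandle {
  private content: string | null = null;

  constructor(
    public readonly id: string,                   // e.g. "@F1"
    private readonly load: () => Promise<string>, // deferred loader
  ) {}

  async resolve(): Promise<string> {
    if (this.content === null) this.content = await this.load();
    return this.content;
  }
}

// Hypothetical usage: nothing is read from disk until resolve() is called.
const f1 = new ContextHandle("@F1", async () => "contents of src/auth.ts");
```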
Cognitive Efficiency Protocol (CEP)
A structured protocol for maximizing AI reasoning quality through optimized context delivery.
10 MCP tools
ctx_intent Structured intent input (optional) — submit compact JSON or short text; server also infers intents automatically from tool calls.
ctx_overview Task-relevant project map — use at session start.
ctx_preload Proactive context loader — caches task-relevant files, returns L-curve-optimized summary (~50-100 tokens vs ~5000 for individual reads).
ctx_prefetch Predictive prefetch — prewarm cache for blast radius files (graph + task signals) within budgets.
ctx_dedup Cross-file dedup: analyze or apply shared block references.
ctx_response Compress LLM response text (remove filler, apply TDD).
ctx_benchmark Benchmark compression modes for a file or project.
ctx_context Session context overview — cached files, seen files, session state.
ctx_routes List HTTP routes/endpoints extracted from the project. Supports Express, Flask, FastAPI, Actix, Spring, Rails, Next.js.
ctx_feedback Harness feedback for LLM output tokens/latency (local-first). Actions: record|report|json|reset|status.
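As an illustration, calling one of these tools over MCP uses the standard tools/call request; the argument fields below ("task", "scope") are assumptions, not the server's documented schema.

```typescript
// Hypothetical MCP tools/call payload for ctx_intent.
const callCtxIntent = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "ctx_intent",
    arguments: {
      task: "refactor the session middleware to use async handlers",
      scope: "src/middleware/",
    },
  },
};
```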
Every output carries proof
LeanCTX generates proof artifacts for every session: which files were read, what was compressed, which checks passed, and how tokens were spent. This makes AI work auditable, replayable, and trustworthy.
Explore Intelligence Tools