Context Engine

How LeanCTX's Context Engine architecture enriches AI context with cross-source intelligence, relevance ranking, and active inference.

The Context Engine is LeanCTX's intelligence layer that goes beyond simple file caching. It connects multiple data sources, ranks content by relevance, and proactively prefetches context your AI agent is likely to need next.

Think of it as a search engine for your entire development context (code, issues, documentation, and external APIs), unified into a single query interface.

How It Works

Ingest

Content chunks from files, providers, and APIs are indexed for BM25 full-text search.

Link

Cross-source edges connect related content (e.g., a GitHub issue mentioning a file path).

Rank

Saliency scoring combines recency, access frequency, and semantic relevance.

Predict

Active inference uses tool call patterns to prefetch context before you ask for it.

ctx_read → BM25 Index → Saliency → Hints
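The Ingest step can be illustrated with a small BM25 index. This is a minimal sketch, not LeanCTX's implementation: the `BM25Index` class, the tokenizer, and the `k1` and `b` defaults are assumptions chosen for illustration, and the indexed texts are made-up sample data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


class BM25Index:
    """Hypothetical in-memory BM25 index over content chunks."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1 = k1                # term-frequency saturation
        self.b = b                  # length normalization strength
        self.docs = {}              # chunk id -> Counter of term frequencies
        self.lengths = {}           # chunk id -> token count
        self.doc_freq = Counter()   # term -> number of chunks containing it

    def add(self, chunk_id, text):
        # Assumes each chunk id is added only once.
        tokens = tokenize(text)
        self.docs[chunk_id] = Counter(tokens)
        self.lengths[chunk_id] = len(tokens)
        for term in set(tokens):
            self.doc_freq[term] += 1

    def search(self, query, top_k=3):
        n = len(self.docs)
        if n == 0:
            return []
        avg_len = sum(self.lengths.values()) / n
        scores = {}
        for term in tokenize(query):
            df = self.doc_freq.get(term, 0)
            if df == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            for chunk_id, freqs in self.docs.items():
                f = freqs.get(term, 0)
                if f == 0:
                    continue
                length_ratio = self.lengths[chunk_id] / avg_len
                norm = f + self.k1 * (1 - self.b + self.b * length_ratio)
                scores[chunk_id] = scores.get(chunk_id, 0.0) + idf * f * (self.k1 + 1) / norm
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

Adding one chunk per file or issue and calling `search("jwt expiry")` returns the chunk ids that mention either term, best match first.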

Cross-Source Intelligence

When you read a file, the Context Engine automatically surfaces related context from other sources. A ctx_read of auth.rs might append hints from a GitHub issue about JWT expiry or a Jira ticket about the authentication refactor.

ctx_read with cross-source hints

ctx_read(path: "src/auth.rs", mode: "map")
exports: authenticate(), validate_jwt(), refresh_token()
deps: jsonwebtoken, chrono, serde
tokens: 2,400 → 120 (95% saved)

cross-source hints:
  github#142 JWT expiry not handled in refresh flow
  jira/AUTH-89 Refactor auth middleware for SSO support
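One way to picture how such hints could be produced is to scan provider items for file paths and record an edge for each match. The sketch below is an assumption about the mechanism, not LeanCTX's code: `build_edges`, `hints_for`, the path regex, and the issue bodies used in the usage note are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical pattern: path-like tokens ending in a few common extensions.
PATH_PATTERN = re.compile(r"\b[\w./-]+\.(?:rs|py|ts|js|go|md)\b")


def build_edges(items, known_files):
    """Map each known file path to the provider items that mention it.

    items: iterable of (source_id, title, body) tuples.
    known_files: set of file paths present in the project.
    """
    edges = defaultdict(list)
    for source_id, title, body in items:
        mentioned = set(PATH_PATTERN.findall(f"{title} {body}"))
        # Only exact matches against known project files become edges.
        for path in sorted(mentioned & known_files):
            edges[path].append((source_id, title))
    return dict(edges)


def hints_for(path, edges):
    """Format the items linked to a file as one hint line each."""
    return [f"{source_id} {title}" for source_id, title in edges.get(path, [])]
```

Given an item such as `("github#142", "JWT expiry not handled in refresh flow", "Repro: call refresh_token() in src/auth.rs")`, `hints_for("src/auth.rs", edges)` yields a hint line in the same shape as the output above.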

Provider Bandit

LeanCTX uses Thompson sampling to learn which providers deliver the most useful context for your workflow. Providers that consistently return relevant results get queried more often; noisy providers are automatically deprioritized.

GitHub 0.85

Jira 0.62

Internal API 0.34

Thompson sampling adjusts query probability based on historical relevance scores. Higher-scoring providers receive more queries.
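The selection loop can be sketched with a Beta-Bernoulli bandit, a common way to implement Thompson sampling. Treat this as an illustration under assumptions: the `ProviderBandit` class, the uniform Beta(1, 1) prior, and the binary relevant/not-relevant feedback are not taken from LeanCTX, whose actual reward model is not described here.

```python
import random


class ProviderBandit:
    """Hypothetical Thompson-sampling selector over context providers."""

    def __init__(self, providers, seed=None):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per provider, starting from a uniform Beta(1, 1) prior.
        self.stats = {name: [1.0, 1.0] for name in providers}

    def choose(self):
        """Sample a plausible relevance rate per provider and pick the highest draw."""
        draws = {
            name: self.rng.betavariate(alpha, beta)
            for name, (alpha, beta) in self.stats.items()
        }
        return max(draws, key=draws.get)

    def update(self, name, relevant):
        """Record whether the provider's result turned out to be relevant."""
        if relevant:
            self.stats[name][0] += 1.0
        else:
            self.stats[name][1] += 1.0

    def mean(self, name):
        """Posterior mean relevance estimate for a provider."""
        alpha, beta = self.stats[name]
        return alpha / (alpha + beta)
```

Because each choice is a random draw from the posterior, a provider with little feedback still gets queried occasionally, while one that keeps returning irrelevant results is chosen less and less often.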

Active Inference

Based on your recent tool calls, the Context Engine predicts what context you'll need next and prefetches it in the background. If you just read auth.rs and middleware.rs, it might pre-query GitHub issues tagged with 'authentication'.

Active inference prefetch

# Recent tool calls:
ctx_read src/auth.rs
ctx_read src/middleware.rs
ctx_search "authenticate"

cortex prediction:
→ prefetch github issues tagged "authentication"
→ prefetch jira sprint items for "auth"
→ preload src/routes/login.rs
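To make the idea concrete, here is a deliberately simple stand-in: a keyword-overlap heuristic that guesses the current topic from recent tool calls and turns it into prefetch requests. It is not active inference proper and not LeanCTX's predictor; `predict_topics`, `plan_prefetch`, the four-letter minimum word length, and the provider names are assumptions for illustration.

```python
import re


def predict_topics(recent_calls, top_k=1):
    """Guess the topics a session is focused on.

    recent_calls: list of (tool_name, argument) tuples, most recent last.
    A word scores one point for every call argument that contains it.
    """
    args = [arg.lower() for _, arg in recent_calls]
    candidates = set()
    for arg in args:
        # "src/auth.rs" -> src, auth, rs; words under 4 letters are dropped as noise.
        candidates.update(w for w in re.findall(r"[a-z]+", arg) if len(w) >= 4)
    scored = {word: sum(word in arg for arg in args) for word in candidates}
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


def plan_prefetch(recent_calls, providers=("github", "jira")):
    """Turn predicted topics into (provider, query) prefetch requests."""
    return [
        (provider, topic)
        for topic in predict_topics(recent_calls)
        for provider in providers
    ]
```

For the three calls listed above, "auth" appears in both `src/auth.rs` and `authenticate`, so it outranks the other words and becomes the prefetch query for each provider.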

Consolidation Engine

The consolidation engine processes provider data into all four context stores, making external knowledge a first-class citizen alongside your code. A background thread handles the heavy lifting so tool responses stay fast.

Stage	Action
Collect	Gather chunks from all active providers
Deduplicate	Remove identical or near-identical content across sources
Rank	Score by saliency (recency + frequency + semantic match)
Budget	Trim to free energy budget (configurable token cap)
Index (BM25)	Ingest chunks into BM25 — searchable via `ctx_semantic_search`
Graph	Create cross-source edges — file-to-issue/PR links for `ctx_read` hints
Knowledge	Extract facts — recallable via `ctx_knowledge` (bugs, features, data models)
Cache	Merge into session cache for instant retrieval

Consolidation is controlled by providers.auto_index, which defaults to true. Set it to false for cache-only mode.
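The Deduplicate, Rank, and Budget stages from the table can be sketched as a small pipeline. Everything specific here is an assumption: the `Chunk` fields, the saliency weights and decay formula, and the exact-match deduplication (which ignores near-identical content) are illustrative choices, not LeanCTX's actual scoring.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str
    text: str
    age_hours: float      # time since the content last changed
    access_count: int     # how often the chunk has been read
    relevance: float      # semantic match in [0, 1], assumed precomputed
    tokens: int           # token cost of including the chunk


def saliency(chunk, w_recency=0.4, w_frequency=0.2, w_relevance=0.4):
    """Weighted mix of recency, access frequency, and semantic relevance."""
    recency = 1.0 / (1.0 + chunk.age_hours / 24.0)     # decays over days
    frequency = min(chunk.access_count, 10) / 10.0     # capped at 10 reads
    return w_recency * recency + w_frequency * frequency + w_relevance * chunk.relevance


def deduplicate(chunks):
    """Drop chunks whose text matches an earlier one after normalizing case and whitespace."""
    seen, unique = set(), []
    for chunk in chunks:
        normalized = " ".join(chunk.text.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(chunk)
    return unique


def consolidate(chunks, token_budget):
    """Deduplicate, rank by saliency, then keep chunks that fit the token budget."""
    ranked = sorted(deduplicate(chunks), key=saliency, reverse=True)
    kept, used = [], 0
    for chunk in ranked:
        if used + chunk.tokens <= token_budget:
            kept.append(chunk)
            used += chunk.tokens
    return kept
```

The budget step here is greedy: it walks the ranked list and skips any chunk that would exceed the cap, so a smaller, lower-ranked chunk can still be kept after a larger one is skipped.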