Fold more in.
Smart context management for LLMs. Overcome token limits with intelligent condensation, hierarchical memory, and real-time optimization. Build agents that remember everything without paying for everything.
Everything You Need for Context Control
A complete toolkit for managing LLM context windows, from basic compression to sophisticated multi-tier memory systems.
Simple Integration, Powerful Results
Drop-in context management that works with any LLM provider.
Connect Your Agent
Integrate with a few lines of code. Works with OpenAI, Anthropic, and any OpenAI-compatible API.
Configure Your Strategy
Choose from condensation, RAG, summarization, or custom strategies. Fine-tune retention policies to match your use case.
Scale with Confidence
Monitor token usage, track context quality, and optimize costs in real-time through your dashboard.
import { Fold } from "@fold/sdk";
const fold = new Fold({ budget: 8000 });
// Before your LLM call
const optimizedMessages = await fold.prepare(conversationHistory);
// After the response
await fold.update(assistantResponse);
// That's it. Fold handles compression, storage, and retrieval.
Built for Modern AI Applications
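To make the "Configure Your Strategy" step concrete, here is a minimal sketch of what a summarization-style condensation strategy does conceptually. The `Message` shape, `condense` function, and 4-characters-per-token heuristic are illustrative assumptions, not the `@fold/sdk` API:

```typescript
// Conceptual sketch of summarization-based condensation (illustrative
// only — not the actual @fold/sdk API). Older messages collapse into a
// single summary message; recent turns are kept verbatim.

type Message = { role: "system" | "user" | "assistant"; content: string };

// Crude token estimate: ~4 characters per token (a common heuristic).
const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4);

function condense(history: Message[], budget: number, keepRecent = 4): Message[] {
  const total = history.reduce((n, m) => n + estimateTokens(m), 0);
  if (total <= budget) return history; // already under budget: nothing to do

  const recent = history.slice(-keepRecent);
  const older = history.slice(0, -keepRecent);

  // Stand-in for a real LLM-generated summary of the older turns.
  const summary: Message = {
    role: "system",
    content:
      `Summary of ${older.length} earlier messages: ` +
      older.map((m) => m.content.slice(0, 20)).join(" | "),
  };
  return [summary, ...recent];
}
```

A real strategy would call a model to produce the summary and tune `keepRecent` per use case; the shape of the operation — collapse old, keep recent, stay under budget — is the same.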
From simple chatbots to complex multi-agent systems, context management that scales with your ambitions.
Pay for intelligence, not repetition.
A "fold" is one context optimization operation. Most agents use one fold per conversation turn.
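A rough sketch of the per-turn accounting behind that definition: a fold fires only when the running history exceeds the token budget, so most turns consume at most one. The function names, trimming pass, and 4-characters-per-token heuristic below are assumptions for illustration, not Fold internals:

```typescript
// Illustrative per-turn accounting: one fold (context optimization)
// runs only when the history exceeds the token budget.

interface TurnResult {
  folded: boolean; // did this turn consume a fold?
  tokens: number;  // estimated tokens after any trimming
}

// Crude token estimate: ~4 characters per token.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

function runTurn(history: string[], budget: number): TurnResult {
  let tokens = history.reduce((n, t) => n + approxTokens(t), 0);
  if (tokens <= budget) return { folded: false, tokens }; // no fold needed

  // One optimization pass: drop oldest entries until under budget.
  while (history.length > 1 && tokens > budget) {
    tokens -= approxTokens(history.shift()!);
  }
  return { folded: true, tokens };
}
```

Under this model, a quiet turn costs nothing extra, and a long-running agent pays once per turn for staying under its budget.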