Fold SDK Documentation
Smart context management for LLMs. Reduce token costs by 50-78% without sacrificing task performance.
```bash
# Install the SDK
npm install @fold/sdk
```

```ts
// Use it in your code
import { fold } from '@fold/sdk'

const ctx = fold() // That's it!

ctx.system("You are a helpful assistant")
ctx.think("I need to search for information...")
ctx.act({ tool: "search", query: "fold sdk" }, "search")
ctx.observe("Found 3 results...", "search")

// Get optimized messages for your LLM
const messages = ctx.messages()

// Check your savings
console.log(ctx.saved())
// { tokens: 5000, percent: 45, cost: 0.05 }
```

Get Started
- Quick Start — get up and running in under 5 minutes
- Coding Agents — build agents like Claude Code or Cursor
- API Reference — complete SDK documentation

What is Fold?
Fold is an intelligent context compression platform for LLM-powered agents. LLM agents operate in loops: reason → act → observe → repeat. Each iteration appends to the context window, and because every turn re-sends the accumulated context, total token cost scales quadratically with conversation length.
A 50-turn agent conversation can cost 2,500x more than a single prompt. Fold solves this through:
- Masking — Replace old observations with placeholders (cheap, fast)
- Summarization — Compress context via LLM when needed (powerful, selective)
- Anchor Detection — Protect important turns from optimization
- Stop Signal Detection — Prevent agents from wasting tokens on impossible tasks
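The quadratic blow-up described above follows directly from each turn re-sending the full context. A minimal sketch of the arithmetic (the 1,000 tokens-per-turn figure is an assumed illustration, not a Fold measurement):

```ts
// If every turn appends k tokens and re-sends the whole context,
// turn n processes n * k tokens, so N turns process k * N * (N + 1) / 2 in total.
function cumulativeTokens(turns: number, tokensPerTurn: number): number {
  let total = 0
  for (let n = 1; n <= turns; n++) {
    total += n * tokensPerTurn // full context re-sent on turn n
  }
  return total
}

console.log(cumulativeTokens(1, 1000))  // a single prompt: 1,000 tokens
console.log(cumulativeTokens(50, 1000)) // 50 turns: 1,275,000 tokens
```

This is why masking stale observations pays off so quickly: every token removed from the context is saved again on every subsequent turn.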
Presets

```ts
fold()                // Default: 100K budget, 10-turn window
fold("chat")          // 32K budget, 20-turn window
fold("coding")        // 100K budget, 15-turn window
fold("research")      // 128K budget, 10-turn window
fold("long-running")  // 200K budget, 8-turn window

// Or custom
fold({ budget: 50_000, model: "gpt-4o", window: 15 })
```

Framework Support
- OpenAI SDK — `@fold/sdk/openai`
- Anthropic SDK — `@fold/sdk/anthropic`
- Vercel AI SDK — `@fold/sdk/vercel-ai`
- LangChain / LangGraph — `@fold/sdk/langchain`
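To build intuition for the `window` option and the masking strategy, here is a minimal self-contained sketch: keep the most recent `window` turns verbatim and replace the content of older observations with a placeholder. The `Turn` type, `mask` function, and placeholder text are illustrative inventions, not the SDK's internals:

```ts
type Turn = { role: "think" | "act" | "observe"; content: string }

// Illustrative only: turns older than the window have their (often very long)
// tool observations replaced by a short placeholder, mimicking Fold's
// masking strategy. Recent turns and all reasoning turns are kept intact.
function mask(turns: Turn[], window: number): Turn[] {
  const cutoff = turns.length - window
  return turns.map((turn, i) =>
    i < cutoff && turn.role === "observe"
      ? { ...turn, content: "[observation masked]" }
      : turn
  )
}

const history: Turn[] = [
  { role: "think", content: "I should search." },
  { role: "observe", content: "Found 3 results... (very long tool output)" },
  { role: "think", content: "Result 2 looks relevant." },
  { role: "observe", content: "Page contents... (very long tool output)" },
]

// With a 2-turn window, only the oldest observation is masked.
console.log(mask(history, 2))
```

In the real SDK, anchor detection would additionally exempt important turns from this kind of optimization, and summarization would kick in when masking alone cannot meet the token budget.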