Fold SDK Documentation

Smart context management for LLMs. Reduce token costs by 50-78% without sacrificing task performance.

# Install the SDK
npm install @fold/sdk
// Use it in your code
import { fold } from '@fold/sdk'

const ctx = fold()  // That's it!

ctx.system("You are a helpful assistant")
ctx.think("I need to search for information...")
ctx.act({ tool: "search", query: "fold sdk" }, "search")
ctx.observe("Found 3 results...", "search")

// Get optimized messages for your LLM
const messages = ctx.messages()

// Check your savings
console.log(ctx.saved())
// { tokens: 5000, percent: 45, cost: 0.05 }

Get Started

  • Quick Start: Get up and running in under 5 minutes
  • Coding Agents: Build agents like Claude Code or Cursor
  • API Reference: Complete SDK documentation

What is Fold?

Fold is an intelligent context compression platform for LLM-powered agents. LLM agents operate in loops: reason → act → observe → repeat. Each iteration appends to the context window, and because the full history is re-sent on every turn, total token costs grow quadratically with the number of turns.
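That growth can be sketched with a few lines of arithmetic (token counts here are illustrative, not Fold's accounting):

```typescript
// Sketch: cumulative input tokens for an agent loop in which every turn
// re-sends the entire history. Illustrative numbers only.
function cumulativeInputTokens(turns: number, tokensPerTurn: number): number {
  let history = 0; // tokens accumulated in the context window so far
  let total = 0;   // total input tokens billed across all turns
  for (let i = 0; i < turns; i++) {
    history += tokensPerTurn; // each turn appends ~tokensPerTurn to context
    total += history;         // and the whole history is re-sent as input
  }
  return total;
}

const single = cumulativeInputTokens(1, 500);  // one prompt
const fifty = cumulativeInputTokens(50, 500);  // a 50-turn loop
console.log(fifty / single); // grows roughly with the square of the turn count
```

Masking or summarizing old turns shrinks `history`, which is exactly where the quadratic blow-up comes from.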

A 50-turn agent conversation can cost 2,500x more than a single prompt. Fold solves this through:

  • Masking — Replace old observations with placeholders (cheap, fast)
  • Summarization — Compress context via LLM when needed (powerful, selective)
  • Anchor Detection — Protect important turns from optimization
  • Stop Signal Detection — Prevent agents from wasting tokens on impossible tasks
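A rough sketch of the masking idea, the first technique above. The turn shape and placeholder text are assumed for illustration and are not Fold's internals:

```typescript
// Sketch (assumed types, not the SDK's source): tool observations older
// than the recency window are swapped for a cheap placeholder string.
type Turn = { role: "assistant" | "tool"; content: string };

function maskOldObservations(turns: Turn[], window: number): Turn[] {
  const cutoff = turns.length - window; // turns before this index are "old"
  return turns.map((t, i) =>
    t.role === "tool" && i < cutoff
      ? { ...t, content: "[observation masked]" } // placeholder, not the real result
      : t
  );
}
```

Masking is cheap because it needs no LLM call; summarization, by contrast, spends tokens to compress but preserves more information.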

Presets

fold()              // Default: 100K budget, 10 turn window
fold("chat")        // 32K budget, 20 turn window
fold("coding")      // 100K budget, 15 turn window
fold("research")    // 128K budget, 10 turn window
fold("long-running") // 200K budget, 8 turn window

// Or custom
fold({ budget: 50_000, model: "gpt-4o", window: 15 })
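The presets above boil down to budget/window pairs. A table of the values listed, as a plain object (the shape and names here are illustrative, not the SDK's source):

```typescript
// Budgets and windows transcribed from the preset list above;
// object shape and key names are assumptions for illustration.
const presetConfigs: Record<string, { budget: number; window: number }> = {
  default:        { budget: 100_000, window: 10 },
  chat:           { budget: 32_000,  window: 20 },
  coding:         { budget: 100_000, window: 15 },
  research:       { budget: 128_000, window: 10 },
  "long-running": { budget: 200_000, window: 8 },
};
```

Note the trade-off: long-running agents get the largest budget but the smallest turn window, since aggressive masking matters most when conversations never end.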

Framework Support

  • OpenAI SDK: @fold/sdk/openai
  • Anthropic SDK: @fold/sdk/anthropic
  • Vercel AI SDK: @fold/sdk/vercel-ai
  • LangChain / LangGraph: @fold/sdk/langchain