@fold/sdk API Reference

Complete reference for all functions, classes, and types in the Fold SDK.

Easy API

The simplest way to use Fold. Handles all optimization automatically.

fold(options?)
Create a context optimizer with sensible defaults
import { fold } from '@fold/sdk'

// Default: 100K budget, 10 turn window
const ctx = fold()

// Use a preset
const ctx = fold("coding")

// Custom options
const ctx = fold({
  budget: 50_000,
  model: "gpt-4o",
  window: 15
})

Presets

| Preset | Budget | Window |
|---|---|---|
| (default) | 100K | 10 turns |
| "chat" | 32K | 20 turns |
| "coding" | 100K | 15 turns |
| "research" | 128K | 10 turns |
| "long-running" | 200K | 8 turns |

Returns

FoldContext - A context optimizer instance

restore(data)
Restore a context from saved state
import { fold, restore } from '@fold/sdk'

// Save context
const ctx = fold()
ctx.observe("Hello!", "user")
const saved = ctx.save()
localStorage.setItem('context', JSON.stringify(saved))

// Later, restore it
const data = JSON.parse(localStorage.getItem('context'))
const restored = restore(data)

Parameters

  • data - Serialized context from ctx.save()

FoldContext Methods

Methods available on the context object returned by fold().

Content Methods
Add content to the context

system(prompt: string)

Set the system prompt. Called once at the start.

ctx.system("You are a helpful coding assistant.")

think(content: string)

Add reasoning or thoughts. Protected from aggressive optimization.

ctx.think("I should search for the configuration file first.")

act(action: object, tool?: string)

Add an action or tool call. Include the tool name for better categorization.

ctx.act({ path: "/app/config.json" }, "read_file")
ctx.act({ query: "SELECT * FROM users" }, "database")

observe(result: string, source?: string)

Add an observation or result. These are the most aggressively optimized.

ctx.observe("File contents: { port: 3000 }", "read_file")
ctx.observe("Query returned 42 rows", "database")

Output Methods
Get optimized context for your LLM

messages()

Get optimized messages array for LLM APIs. Compatible with OpenAI and Anthropic.

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: ctx.messages(),  // Optimized!
})

compile()

Get full compilation result with messages and optimization stats.

const { messages, optimization, stopSignal } = ctx.compile()
console.log(optimization.savingsPercent) // 45

Stop Signal Methods
Detect when to stop the agent loop

stop()

Check if the agent should stop. Returns true when the agent is stuck in a loop, failing repeatedly, or has achieved its goal.

while (true) {
  // ... agent logic
  if (ctx.stop()) break
}

reason()

Get the stop reason when stop() returns true.

if (ctx.stop()) {
  const signal = ctx.reason()
  console.log(signal.reason)      // "repeated_failure"
  console.log(signal.confidence)  // 0.85
}

Stop Reasons

| Reason | Description |
|---|---|
| repeated_failure | 3+ consecutive errors |
| no_progress | Same action repeated 5x |
| goal_achieved | Success indicators detected |
| turn_limit | 50+ turns executed |

Statistics Methods
Track optimization and savings

saved()

Quick summary of token savings and cost reduction.

console.log(ctx.saved())
// { tokens: 5000, percent: 45, cost: 0.05 }

stats()

Detailed statistics about context and optimization.

console.log(ctx.stats())
// {
//   turnCount: 25,
//   totalTokens: 12000,
//   maskedTokens: 7000,
//   summarizedTokens: 0,
//   compressionRatio: 0.58
// }

Persistence Methods
Save and restore context across sessions

save()

Export context for persistence. Returns a serializable object.

const data = ctx.save()
localStorage.setItem('context', JSON.stringify(data))

Advanced API (ContextSession)

For full control over optimization behavior, use ContextSession directly.

new ContextSession(config)
Create a context session with full configuration
import { ContextSession } from '@fold/sdk'

const session = new ContextSession({
  budget: 100_000,
  model: 'claude-sonnet-4-5-20250929',
  masking: {
    enabled: true,
    strategy: 'hybrid',
    windowSize: 15,
    preserveAnchors: true,
  },
  summarization: {
    enabled: true,
    trigger: 'budget_pressure',
    threshold: 0.8,
  },
})

Configuration Options

| Option | Type | Description |
|---|---|---|
| budget | number | Token budget limit |
| model | string | Model name for tokenizer selection |
| masking.enabled | boolean | Enable observation masking |
| masking.strategy | string | "rolling_window" \| "token_budget" \| "relevance_scored" \| "hybrid" |
| masking.windowSize | number | Number of turns to keep unmasked |
| masking.preserveAnchors | boolean | Preserve anchor turns when masking |
| summarization.enabled | boolean | Enable automatic summarization |
| summarization.trigger | string | "budget_pressure" \| "turn_count" |
| summarization.threshold | number | Budget fraction that triggers summarization (e.g. 0.8) |

Masking Strategies

rolling_window
Keep the last N turns full, mask earlier observations with placeholders. Best for general use.
token_budget
Mask oldest observations until under budget. Best for strict token limits.
relevance_scored
Score relevance of each turn and mask lowest first. Best for query-heavy tasks.
hybrid
Combine window + budget + relevance strategies. Maximum efficiency (recommended).
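
For instance, a session pinned to a hard token cap might favor token_budget over the hybrid default shown earlier (a configuration sketch; same shape as the constructor example above):

```typescript
import { ContextSession } from '@fold/sdk'

const session = new ContextSession({
  budget: 32_000,
  model: 'gpt-4o',
  masking: {
    enabled: true,
    strategy: 'token_budget',  // mask oldest observations until under budget
  },
})
```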

Summarizers

Built-in summarizer factories for different LLM providers.

import {
  createOpenAISummarizer,      // OpenAI API
  createAnthropicSummarizer,   // Claude API
  createFetchSummarizer,       // Any OpenAI-compatible API
  createSimpleSummarizer,      // Offline/heuristic-based
} from '@fold/sdk/summarizers'

// OpenAI
session.setSummarizationCallback(
  createOpenAISummarizer(new OpenAI(), { model: 'gpt-4o-mini' })
)

// Groq (fast inference)
session.setSummarizationCallback(
  createFetchSummarizer({
    url: 'https://api.groq.com/openai/v1/chat/completions',
    apiKey: process.env.GROQ_API_KEY,
    model: 'llama-3.1-8b-instant',
  })
)

// Local Ollama
session.setSummarizationCallback(
  createFetchSummarizer({
    url: 'http://localhost:11434/v1/chat/completions',
    apiKey: 'ollama',
    model: 'llama3.2',
  })
)
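
By analogy with the OpenAI factory above, the Claude and offline variants might be wired up as follows. The signatures here are assumed to mirror createOpenAISummarizer and are not confirmed by this reference; check the package's types before relying on them:

```typescript
import Anthropic from '@anthropic-ai/sdk'
import {
  createAnthropicSummarizer,
  createSimpleSummarizer,
} from '@fold/sdk/summarizers'

// Claude-backed summarization (assumed signature: client + options)
session.setSummarizationCallback(
  createAnthropicSummarizer(new Anthropic(), { model: 'claude-3-5-haiku-latest' })
)

// Offline/heuristic fallback for tests or air-gapped environments
// (assumes the factory takes no required arguments)
session.setSummarizationCallback(createSimpleSummarizer())
```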