Integrations
MCP Server Integration
Build MCP (Model Context Protocol) servers that use Fold for persistent, optimized context management. Works with Claude Code, Cursor, and any MCP-compatible tool.
What is MCP?
The Model Context Protocol (MCP) is an open protocol that allows AI assistants to interact with external tools and data sources. Both Claude Code and Cursor support MCP servers.
Why Fold + MCP?
Persistent Context
MCP servers run as separate processes. Fold helps maintain optimized context across multiple tool calls and sessions.
Cost Efficiency
When your MCP server needs to call LLMs internally, Fold reduces token costs by 50-78%.
Session Management
Track and restore context across restarts using Fold's save/restore functionality.
Stop Signal Detection
Prevent runaway agent loops with Fold's built-in failure and loop detection.
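Fold's detection internals aren't documented here, but the idea behind loop detection can be illustrated with a toy sketch: treat an agent as looping when it issues the same tool call several times in a row. This is a hypothetical illustration (the `ToolCall` type, `isLooping` function, and threshold are invented for this example), not Fold's actual algorithm:

```typescript
// Hypothetical illustration of loop detection (not Fold's actual algorithm):
// flag an agent as looping when the same tool call repeats `threshold` times in a row.
type ToolCall = { tool: string; args: string }

function isLooping(history: ToolCall[], threshold = 3): boolean {
  if (history.length < threshold) return false
  const recent = history.slice(-threshold)
  const key = (c: ToolCall) => `${c.tool}:${c.args}`
  // Looping if the last `threshold` calls are all identical
  return recent.every((c) => key(c) === key(recent[0]))
}
```

A real detector would also consider near-duplicate calls and repeated failures, but the same windowed check over recent history applies.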
Installation
pnpm add @fold/sdk
Basic MCP Server with Fold
Here's a minimal MCP server that uses Fold to manage context sessions:
import { Server } from "@modelcontextprotocol/sdk/server/index.js"
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"
import { ListToolsRequestSchema, CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js"
import { fold } from '@fold/sdk'

const sessions = new Map<string, ReturnType<typeof fold>>()

const server = new Server({
  name: "fold-context-manager",
  version: "1.0.0",
}, {
  capabilities: { tools: {} },
})

// Register tools
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "context_observe",
      description: "Add an observation to the context (e.g., file contents, API response)",
      inputSchema: {
        type: "object",
        properties: {
          sessionId: { type: "string", description: "Session identifier" },
          content: { type: "string", description: "The content to add" },
          source: { type: "string", description: "Source of the observation" }
        },
        required: ["sessionId", "content"]
      }
    },
    {
      name: "context_stats",
      description: "Get context statistics and savings",
      inputSchema: {
        type: "object",
        properties: {
          sessionId: { type: "string" }
        },
        required: ["sessionId"]
      }
    }
  ]
}))

// Handle tool calls
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name } = request.params
  const args = request.params.arguments as { sessionId: string; content?: string; source?: string }

  // Get or create session
  let ctx = sessions.get(args.sessionId)
  if (!ctx && name !== "context_stats") {
    ctx = fold("long-running")
    sessions.set(args.sessionId, ctx)
  }

  switch (name) {
    case "context_observe":
      ctx!.observe(args.content!, args.source)
      return {
        content: [{
          type: "text",
          text: `Added to context. Saved ${ctx!.saved().percent.toFixed(0)}% tokens.`
        }]
      }
    case "context_stats":
      if (!ctx) {
        return { content: [{ type: "text", text: "Session not found" }] }
      }
      return {
        content: [{
          type: "text",
          text: JSON.stringify(ctx.stats(), null, 2)
        }]
      }
    default:
      return { content: [{ type: "text", text: "Unknown tool" }] }
  }
})

// Start server
const transport = new StdioServerTransport()
await server.connect(transport)
Advanced: AI-Powered MCP Tool
Here's an MCP server that uses Fold to manage context for an internal LLM:
import { Server } from "@modelcontextprotocol/sdk/server/index.js"
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"
import { ListToolsRequestSchema, CallToolRequestSchema } from "@modelcontextprotocol/sdk/types.js"
import { fold } from '@fold/sdk'
import OpenAI from 'openai'

const openai = new OpenAI()
const sessions = new Map<string, ReturnType<typeof fold>>()

const server = new Server({
  name: "fold-ai-assistant",
  version: "1.0.0",
}, {
  capabilities: { tools: {} },
})

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: "analyze_code",
    description: "Analyze code with context-aware AI. Remembers previous analyses.",
    inputSchema: {
      type: "object",
      properties: {
        sessionId: { type: "string" },
        code: { type: "string", description: "Code to analyze" },
        question: { type: "string", description: "Question about the code" }
      },
      required: ["sessionId", "code", "question"]
    }
  }]
}))

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const args = request.params.arguments as { sessionId: string; code: string; question: string }

  // Get or create Fold session
  let ctx = sessions.get(args.sessionId)
  if (!ctx) {
    ctx = fold("coding")
    ctx.system(`You are a code analysis assistant.
Provide clear, actionable insights about code quality,
bugs, and improvements.`)
    sessions.set(args.sessionId, ctx)
  }

  // Add the code and the question as observations
  ctx.observe(`Code to analyze:
\`\`\`
${args.code}
\`\`\``, "user")
  ctx.observe(args.question, "user")

  // Call the LLM with optimized context
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: ctx.messages(), // Fold-optimized!
  })

  const answer = response.choices[0].message.content ?? ""
  ctx.think(answer)

  return {
    content: [{
      type: "text",
      text: `${answer}
---
Context savings: ${ctx.saved().percent.toFixed(0)}%`
    }]
  }
})

// Start server
const transport = new StdioServerTransport()
await server.connect(transport)
Using with Claude Code
Add your MCP server to Claude Code's configuration:
~/.config/claude-code/config.json
// ~/.config/claude-code/config.json
{
"mcpServers": {
"fold-context": {
"command": "node",
"args": ["/path/to/your/fold-mcp-server.js"],
"env": {
"OPENAI_API_KEY": "your-key-here"
}
}
}
}
Using with Cursor
Add your MCP server to Cursor's settings:
.cursor/mcp.json
// .cursor/mcp.json (in your project root)
{
"mcpServers": {
"fold-context": {
"command": "node",
"args": ["./mcp-servers/fold-context.js"]
}
}
}
Best Practices
Use session IDs consistently
Use consistent session IDs (e.g., workspace path, project name) to maintain context across tool calls.
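One way to get a consistent session ID is to derive it from the workspace path. The `sessionIdFor` helper below is a hypothetical example (not part of Fold or MCP), assuming only Node's built-in `crypto` module:

```typescript
import { createHash } from "node:crypto"

// Hypothetical helper: derive a stable session ID from the workspace path,
// so every tool call from the same project maps to the same Fold session.
function sessionIdFor(workspacePath: string): string {
  // Normalize trailing slashes so "/home/me/app/" and "/home/me/app" agree
  const normalized = workspacePath.replace(/\/+$/, "")
  return createHash("sha256").update(normalized).digest("hex").slice(0, 16)
}
```

Because the ID is a pure function of the path, restarts and parallel tool calls all land in the same session without any coordination.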
Persist sessions to disk
Use ctx.save() to persist sessions to disk periodically, and restore them on server restart.
Clean up old sessions
Implement a TTL or LRU cache for sessions to prevent memory leaks in long-running servers.
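A minimal sketch of such a cache, assuming invented names (`SessionCache`, `maxSessions`, `ttlMs`) rather than anything Fold ships. It exploits the fact that a JavaScript `Map` preserves insertion order, so the front of the map is always the least recently used entry:

```typescript
// Hypothetical TTL + LRU session cache for a long-running MCP server.
class SessionCache<T> {
  private entries = new Map<string, { value: T; lastUsed: number }>()

  constructor(private maxSessions = 100, private ttlMs = 30 * 60 * 1000) {}

  get(id: string): T | undefined {
    const e = this.entries.get(id)
    if (!e) return undefined
    if (Date.now() - e.lastUsed > this.ttlMs) {
      this.entries.delete(id) // expired: drop it
      return undefined
    }
    // Refresh and re-insert so this entry moves to the back (most recently used)
    e.lastUsed = Date.now()
    this.entries.delete(id)
    this.entries.set(id, e)
    return e.value
  }

  set(id: string, value: T): void {
    this.entries.delete(id)
    this.entries.set(id, { value, lastUsed: Date.now() })
    // Evict from the front (least recently used) once over the cap
    while (this.entries.size > this.maxSessions) {
      const oldest = this.entries.keys().next().value as string
      this.entries.delete(oldest)
    }
  }
}
```

Replacing the bare `Map` in the earlier servers with a cache like this bounds memory regardless of how many distinct session IDs clients send.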