Integrations
Vercel AI SDK Integration
Integrate Fold with the Vercel AI SDK for automatic context optimization. Works with generateText, streamText, and React hooks.
Installation
pnpm add @fold/sdk
Server-Side Usage
Use withFold() in server actions or API routes:
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { withFold } from '@fold/sdk/vercel-ai'

export async function POST(request: Request) {
  const { messages } = await request.json()

  // Create optimizer
  const fold = withFold({ budget: 100_000 })

  const result = await streamText({
    model: openai('gpt-4o'),
    messages: fold.optimize(messages), // Optimized!
  })

  // Log savings
  console.log(fold.saved())
  // { tokens: 12000, percent: 62, cost: 0.12 }

  return result.toDataStreamResponse()
}
React Client Hook
Use useFold() in React components with Vercel AI's useChat:
'use client'
import { useChat } from 'ai/react'
import { useFold } from '@fold/sdk/vercel-ai'
export function Chat() {
  const fold = useFold({ budget: 50_000 })

  const { messages, input, handleInputChange, handleSubmit } = useChat({
    // Optimize messages before sending
    experimental_prepareRequestBody: ({ messages }) => ({
      messages: fold.optimize(messages),
    }),
  })

  return (
    <div className="flex flex-col h-screen">
      {/* Messages */}
      <div className="flex-1 overflow-y-auto p-4">
        {messages.map((m) => (
          <div key={m.id} className="mb-4">
            <strong>{m.role}:</strong> {m.content}
          </div>
        ))}
      </div>

      {/* Savings indicator */}
      <div className="px-4 py-2 bg-muted text-sm">
        Saved {fold.saved().percent.toFixed(0)}% tokens
      </div>

      {/* Input */}
      <form onSubmit={handleSubmit} className="p-4 border-t">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Say something..."
          className="w-full p-2 border rounded"
        />
      </form>
    </div>
  )
}
With generateText
Non-streaming text generation:
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { withFold } from '@fold/sdk/vercel-ai'
const fold = withFold({ budget: 100_000 })
const { text } = await generateText({
  model: openai('gpt-4o'),
  messages: fold.optimize(messages),
})

console.log(text)
console.log(fold.saved())
Multi-Provider Support
Fold works with any Vercel AI provider:
import { streamText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { anthropic } from '@ai-sdk/anthropic'
import { google } from '@ai-sdk/google'
import { withFold } from '@fold/sdk/vercel-ai'
const fold = withFold({ budget: 100_000 })
const optimized = fold.optimize(messages)
// OpenAI
await streamText({
  model: openai('gpt-4o'),
  messages: optimized,
})

// Anthropic
await streamText({
  model: anthropic('claude-sonnet-4-5-20250929'),
  messages: optimized,
})

// Google
await streamText({
  model: google('gemini-1.5-pro'),
  messages: optimized,
})
API Reference
withFold(options)
Creates a server-side context optimizer.
Options
| Option | Type | Description |
|---|---|---|
| budget | number | Token budget (required) |
| model | string | Model for tokenization (default: "gpt-4o") |
| window | number | Turns to keep unmasked (default: 10) |
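For reference, here is a constructor call with every option spelled out; the model and window values shown are just the documented defaults:

import { withFold } from '@fold/sdk/vercel-ai'

const fold = withFold({
  budget: 100_000, // Token budget (required)
  model: 'gpt-4o', // Model used for tokenization (default)
  window: 10,      // Recent turns kept unmasked (default)
})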
Returns
- optimize(messages) - Optimize the messages array
- saved() - Get savings: { tokens, percent, cost }
- stats() - Get detailed statistics
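A rough sketch of how the three methods combine in one request cycle; the exact shape of stats() isn't documented above, so the final log is illustrative:

const fold = withFold({ budget: 100_000 })

// Optimize the messages array before sending it to the model
const optimized = fold.optimize(messages)

// Savings are available after optimize() has run
const { tokens, percent, cost } = fold.saved()
console.log(`Saved ${tokens} tokens (${percent}%), about $${cost}`)

// Detailed statistics (shape not shown here)
console.log(fold.stats())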
useFold(options)
React hook for client-side optimization.
Options
Same options as withFold()
Returns
- optimize(messages) - Optimize messages (memoized)
- saved() - Get current savings
- reset() - Reset optimizer state
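reset() isn't exercised in the chat example above. A minimal sketch, assuming you want to drop accumulated optimizer state when the user starts a new conversation (the onNewChat wiring is hypothetical):

'use client'

import { useFold } from '@fold/sdk/vercel-ai'

export function ChatControls({ onNewChat }: { onNewChat: () => void }) {
  const fold = useFold({ budget: 50_000 })

  // Hypothetical handler: clear optimizer state for a fresh session
  const handleNewChat = () => {
    fold.reset()
    onNewChat()
  }

  return <button onClick={handleNewChat}>New chat</button>
}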
Best Practices
Optimize on the server when possible
Use withFold() in API routes or server actions. This keeps your token optimization logic server-side.
Use the client hook for real-time feedback
useFold() is great for showing users their savings in real time. Use it with useChat's prepare callback.
Persist context for long sessions
For multi-turn conversations across page reloads, use fold() directly with its save() and restore() methods, as sketched below.
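A minimal sketch of that pattern, assuming fold() is exported from the package root and that save() returns a JSON-serializable snapshot; the storage key is illustrative:

import { fold } from '@fold/sdk'

const optimizer = fold({ budget: 100_000 })

// After a turn, persist the optimizer's state
// (assumes save() returns a JSON-serializable snapshot)
localStorage.setItem('fold-session', JSON.stringify(optimizer.save()))

// After a page reload, restore the saved state before optimizing again
const saved = localStorage.getItem('fold-session')
if (saved) {
  optimizer.restore(JSON.parse(saved))
}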