Vercel AI SDK Integration
Automatically track all Vercel AI SDK calls with full input/output logging, token usage, and tool executions. Two setup paths are available, depending on your use case.
What Gets Tracked Automatically
| Category | Captured |
|---|---|
| LLM calls | Model, provider, prompt, system message, messages |
| Responses | Generated text, finish reason |
| Token usage | Input tokens, output tokens, total tokens |
| Tool calls | Tool name, input args, output result per call |
| Latency | Duration in milliseconds |
Installation
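# Core SDK, Vercel AI SDK, and the Anthropic provider
npm install @runcascade/cascade-sdk ai @ai-sdk/anthropic
# OpenAI provider (if you use OpenAI models)
npm install @ai-sdk/openai
# zod (imported by the tool examples on this page)
npm install zod
# For Next.js (optional, for automatic OTEL registration)
npm install @vercel/otel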
Setup Options
There are two ways to configure tracing depending on your needs.
Option A: initTracing (recommended for scripts and multi-agent systems)
Use this when you need session tracking, multi-agent traces grouped under one root, or explicit control over flushing. Best for standalone scripts, Lambda, and multi-agent pipelines.
import { initTracing, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-agent', // visible in dashboard
  apiKey: process.env.CASCADE_API_KEY, // or set CASCADE_API_KEY env var
  endpoint: process.env.CASCADE_ENDPOINT,
});

const result = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: 'Hello',
  experimental_telemetry: { isEnabled: true },
});

// Required in serverless (Lambda, Vercel Functions) before returning response
await flushTracing();
Basic Usage
After initTracing has run, enable experimental_telemetry on every AI SDK call. No other per-call configuration is needed; Cascade captures the call automatically.
import { initTracing, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-agent',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

const { text } = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: 'What is the capital of France?',
  experimental_telemetry: { isEnabled: true },
});

await flushTracing();
Tool calls are automatically traced as child spans with tool.name, tool.input, and tool.output:
import { generateText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const { text } = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: "What's the weather in San Francisco?",
  tools: {
    getWeather: tool({
      description: 'Get weather for a location',
      inputSchema: z.object({ location: z.string() }),
      execute: async ({ location }) => ({ temperature: 72, conditions: 'sunny' }),
    }),
  },
  experimental_telemetry: { isEnabled: true },
});
Single Trace with traceRun
Important: Without traceRun, every generateText call becomes its own separate trace in the dashboard, named ai.generateText. To give your traces a meaningful name and group related AI calls together, always wrap them in traceRun.
Wrap a generateText call (including its tool executions) under a named root trace using traceRun.
import { initTracing, traceRun, flushTracing } from '@runcascade/cascade-sdk';
import { generateText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

initTracing({
  project: 'my-agent',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

await traceRun('WeatherAgent', {}, async () => {
  await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    prompt: "What's the weather in Tokyo?",
    tools: {
      getWeather: tool({
        description: 'Get weather for a city',
        inputSchema: z.object({ city: z.string() }),
        execute: async ({ city }) => ({ city, temperature: 22, conditions: 'sunny' }),
      }),
    },
    experimental_telemetry: { isEnabled: true },
  });
});

await flushTracing();
This produces one trace:
WeatherAgent (root)
  ai.generateText
    ai.generateText.doGenerate ← LLM decides to call tool
    getWeather ← tool executes
    ai.generateText.doGenerate ← LLM generates final answer
Multi-Agent Traces (Single Trace)
Important: Without traceRun, each generateText call in a multi-agent pipeline creates a completely separate trace. To see all agents as one unified execution, wrap all of them inside a single traceRun. The outer traceRun becomes the root, and every generateText call inside becomes a child.
Wrap multiple generateText calls under one root trace using traceRun. Use traceAgent to label each agent — this creates named agent spans in the dashboard so you can tell which LLM calls and tool calls belong to which agent.
import { initTracing, traceRun, traceAgent, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-agent',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

await traceRun('ResearchOrchestrator', {}, async () => {
  // Phase 1: Planner agent (traceAgent labels this section)
  const planResult = await traceAgent('PlannerAgent', {}, async () => {
    const result = await generateText({
      model: anthropic('claude-sonnet-4-20250514'),
      prompt: 'Plan a research approach for: AI in healthcare',
      experimental_telemetry: { isEnabled: true },
    });
    return result.text;
  });

  // Phase 2: Researcher agent (separate labelled section)
  await traceAgent('ResearcherAgent', {}, async () => {
    await generateText({
      model: anthropic('claude-sonnet-4-20250514'),
      prompt: `Based on this plan: ${planResult}\nNow gather supporting data.`,
      experimental_telemetry: { isEnabled: true },
    });
  });
});

await flushTracing();
This produces one trace with clearly labelled agents:
ResearchOrchestrator (root)
  PlannerAgent
    ai.generateText
      ai.generateText.doGenerate
  ResearcherAgent
    ai.generateText
      ai.generateText.doGenerate
traceAgent is optional — without it the agents still appear under the root, but all ai.generateText spans look identical in the sidebar. With traceAgent, each agent gets its own named span with a distinct icon, making it easy to navigate complex multi-agent traces.
Sessions (Multi-Turn Conversations)
Group multiple traces from one conversation under a session. Each turn is its own trace, all linked together on the Sessions page.
import { initTracing, traceRun, setSessionId, endSession, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-chatbot',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

const sessionId = `chat-${Date.now()}`;
setSessionId(sessionId);

// Turn 1
await traceRun('ChatTurn', { session_id: sessionId, turn: 1 }, async () => {
  await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    messages: [{ role: 'user', content: 'What flights are available to London?' }],
    experimental_telemetry: { isEnabled: true, metadata: { session_id: sessionId } },
  });
});

// Turn 2
await traceRun('ChatTurn', { session_id: sessionId, turn: 2 }, async () => {
  await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    messages: [
      { role: 'user', content: 'What flights are available to London?' },
      { role: 'assistant', content: 'Here are the available flights...' },
      { role: 'user', content: 'Book the cheapest one' },
    ],
    experimental_telemetry: { isEnabled: true, metadata: { session_id: sessionId } },
  });
});

await endSession(sessionId);
await flushTracing();
Key rule: Always wrap each turn in traceRun with session_id in metadata. Without traceRun, setSessionId has no effect because the Vercel AI SDK creates its own root spans independently.
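The session ID appears in two places per turn: the traceRun metadata and the experimental_telemetry metadata. One way to keep them in sync is to build both objects from a single helper. The sketch below is an illustration only: `sessionTurnOptions` and `SessionTurnOptions` are hypothetical names, not part of the Cascade SDK, and only the `session_id`, `turn`, `isEnabled`, and `metadata` fields are taken from the example above.

```typescript
// Hypothetical helper (not part of @runcascade/cascade-sdk): builds the two
// option objects a session turn needs from one session ID.
interface SessionTurnOptions {
  // Intended as the second argument to traceRun.
  runMetadata: { session_id: string; turn: number };
  // Intended as the experimental_telemetry value on the generateText call.
  telemetry: { isEnabled: true; metadata: { session_id: string } };
}

function sessionTurnOptions(sessionId: string, turn: number): SessionTurnOptions {
  if (sessionId.length === 0) {
    throw new Error('sessionId must be a non-empty string');
  }
  if (!Number.isInteger(turn) || turn < 1) {
    throw new Error('turn must be a positive integer');
  }
  return {
    runMetadata: { session_id: sessionId, turn },
    telemetry: { isEnabled: true, metadata: { session_id: sessionId } },
  };
}

// Example: options for the second turn of a conversation.
const secondTurn = sessionTurnOptions('chat-123', 2);
console.log(JSON.stringify(secondTurn.runMetadata));
```

Under these assumptions, a turn would pass `secondTurn.runMetadata` as the traceRun metadata and `secondTurn.telemetry` as `experimental_telemetry`, so the two values cannot drift apart.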
Next.js API Routes
// instrumentation.ts (project root or src/)
import { initTracing } from '@runcascade/cascade-sdk';

export function register() {
  initTracing({
    project: 'my-nextjs-app',
    apiKey: process.env.CASCADE_API_KEY,
    endpoint: process.env.CASCADE_ENDPOINT,
  });
}

// app/api/chat/route.ts
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { flushTracing } from '@runcascade/cascade-sdk';

export async function POST(request: Request) {
  const { message } = await request.json();
  const { text } = await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    messages: [
      { role: 'system', content: 'You are helpful.' },
      { role: 'user', content: message },
    ],
    experimental_telemetry: { isEnabled: true },
  });

  // Required: flush before returning in serverless
  await flushTracing();
  return Response.json({ response: text });
}
Environment Variables
| Variable | Required | Description |
|---|---|---|
| CASCADE_API_KEY | Yes | Your Cascade API key (csk_live_...) |
| CASCADE_ENDPOINT | No | Custom backend URL. Defaults to https://api.runcascade.com/v1/traces |
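To fail fast on a missing key, an application could resolve these variables once at startup. The following is a minimal sketch, assuming a hypothetical helper named `resolveCascadeEnv`; it is not part of the Cascade SDK. The variable names and the default endpoint are taken from the table above, and the key value in the example is a placeholder.

```typescript
// Hypothetical helper: resolves Cascade settings from an environment map.
// It is illustrative only and not provided by @runcascade/cascade-sdk.
const DEFAULT_CASCADE_ENDPOINT = 'https://api.runcascade.com/v1/traces';

interface CascadeEnvConfig {
  apiKey: string;
  endpoint: string;
}

function resolveCascadeEnv(env: Record<string, string | undefined>): CascadeEnvConfig {
  const apiKey = env['CASCADE_API_KEY'];
  if (!apiKey) {
    // CASCADE_API_KEY is the only required variable, so stop early.
    throw new Error('CASCADE_API_KEY is required');
  }
  // An unset or empty CASCADE_ENDPOINT falls back to the documented default.
  const endpoint = env['CASCADE_ENDPOINT'] || DEFAULT_CASCADE_ENDPOINT;
  return { apiKey, endpoint };
}

// Example with a placeholder key and no custom endpoint.
const exampleConfig = resolveCascadeEnv({ CASCADE_API_KEY: 'csk_live_placeholder' });
console.log(exampleConfig.endpoint);
```

In a Node.js application the argument would typically be `process.env`, with the resulting `apiKey` and `endpoint` forwarded to initTracing.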
Span Hierarchy
The Vercel AI SDK creates the following OTEL span structure, all captured by Cascade:
ai.generateText: container span for the full generateText() call
  ai.generateText.doGenerate: one actual HTTP call to the LLM
  <tool-name>: each tool execution (named by tool function)
  ai.generateText.doGenerate: follow-up LLM call after tool results
For streamText, the pattern is ai.streamText → ai.streamText.doStream.
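The hierarchy above can be modelled as a plain tree, which is handy when reasoning about what a trace should look like. This is an illustrative sketch only: `SpanNode` and `renderSpanTree` are hypothetical names rather than SDK types, and `getWeather` stands in for the tool name from the earlier example.

```typescript
// Illustrative model of the span hierarchy; not an SDK type.
interface SpanNode {
  name: string;
  children: SpanNode[];
}

// Returns one line per span, indented two spaces per nesting level.
function renderSpanTree(node: SpanNode, depth: number = 0): string[] {
  const lines: string[] = ['  '.repeat(depth) + node.name];
  for (const child of node.children) {
    lines.push(...renderSpanTree(child, depth + 1));
  }
  return lines;
}

const leaf = (name: string): SpanNode => ({ name, children: [] });

// One generateText call that triggers a single tool execution.
const toolCallTrace: SpanNode = {
  name: 'ai.generateText',
  children: [
    leaf('ai.generateText.doGenerate'), // LLM decides to call the tool
    leaf('getWeather'), // tool executes
    leaf('ai.generateText.doGenerate'), // follow-up LLM call
  ],
};

console.log(renderSpanTree(toolCallTrace).join('\n'));
```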
Flushing in Serverless
If you use traceRun, it automatically calls forceFlush when the root span ends. For simple scripts with a single traceRun, no extra flush call is needed.
If you don’t use traceRun (or in serverless where you need a hard guarantee before returning the HTTP response), always call await flushTracing() explicitly:
// ✅ With traceRun: flush is automatic, but an explicit call is a safe extra guarantee
await traceRun('MyAgent', {}, async () => {
  await generateText({ ..., experimental_telemetry: { isEnabled: true } });
});
// flushTracing() optional here for scripts, recommended for serverless

// ✅ Without traceRun: always flush manually
const result = await generateText({ ..., experimental_telemetry: { isEnabled: true } });
await flushTracing();
return Response.json({ text: result.text });

// ❌ No traceRun, no flush: root span lost in serverless
const result = await generateText({ ... });
return Response.json({ text: result.text });
The SDK also registers a process.on('beforeExit') hook as a safety net, but explicit flushing is more reliable in serverless environments where the process is killed externally.
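The manual examples above flush only on the success path. If a handler throws before reaching flushTracing(), that call is skipped. One possible pattern is to wrap the handler so the flush runs in a `finally` block. The sketch below assumes a hypothetical wrapper named `withFlush`, which is not part of the Cascade SDK; the flush function is injected so the example stays self-contained, and in an application it would be flushTracing.

```typescript
// Hypothetical wrapper: runs a handler and always awaits a flush afterwards,
// whether the handler resolves or throws.
type Flush = () => Promise<void>;

function withFlush<Args extends unknown[], Result>(
  handler: (...args: Args) => Promise<Result>,
  flush: Flush,
): (...args: Args) => Promise<Result> {
  return async (...args: Args): Promise<Result> => {
    try {
      return await handler(...args);
    } finally {
      // Runs on success and on error, before the caller sees the outcome.
      await flush();
    }
  };
}

// Example with a stand-in flush that counts how often it ran.
let flushCount = 0;
const countingFlush: Flush = async () => {
  flushCount += 1;
};
const greet = withFlush(async (name: string) => `hello ${name}`, countingFlush);
greet('cascade').then((message) => console.log(message, flushCount));
```

Note one trade-off in this sketch: if the flush itself rejects, that error replaces the handler's result. A production wrapper might catch and log flush failures instead.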