Vercel AI SDK Integration

Automatically track every Vercel AI SDK call with full input/output logging, token usage, and tool executions. Two setup paths are available, depending on your use case.

What Gets Tracked Automatically

Category      Captured
LLM calls     Model, provider, prompt, system message, messages
Responses     Generated text, finish reason
Token usage   Input tokens, output tokens, total tokens
Tool calls    Tool name, input args, output result per call
Latency       Duration in milliseconds

Installation

npm install @runcascade/cascade-sdk ai @ai-sdk/anthropic
# OpenAI
npm install @ai-sdk/openai
# For Next.js (optional, for automatic OTEL registration)
npm install @vercel/otel
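
The optional @vercel/otel package can handle OTEL provider registration for you in Next.js. A minimal sketch of that registration, assuming @vercel/otel's registerOTel API (the serviceName value is illustrative; note that the Next.js API Routes section below registers Cascade with initTracing in instrumentation.ts, so treat this as an alternative rather than a requirement):

```typescript
// instrumentation.ts: sketch of automatic OTEL registration via @vercel/otel.
// Illustrative only; this guide's Next.js section registers Cascade with
// initTracing instead, so use one approach, not both.
import { registerOTel } from '@vercel/otel';

export function register() {
  registerOTel({ serviceName: 'my-nextjs-app' });
}
```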

Setup Options

There are two ways to configure tracing. The first, shown below, is calling initTracing yourself: use it when you need session tracking, multi-agent traces grouped under one root, or explicit control over flushing. It is the best fit for standalone scripts, Lambda, and multi-agent pipelines. The second is registering tracing from a Next.js instrumentation.ts file, covered in the Next.js API Routes section.
import { initTracing, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-agent',                          // visible in dashboard
  apiKey: process.env.CASCADE_API_KEY,           // or set CASCADE_API_KEY env var
  endpoint: process.env.CASCADE_ENDPOINT,
});

const result = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: 'Hello',
  experimental_telemetry: { isEnabled: true },
});

// Required in serverless (Lambda, Vercel Functions) before returning response
await flushTracing();

Basic Usage

Enable experimental_telemetry on every AI call. That’s all that’s needed — Cascade captures everything automatically.
import { initTracing, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-agent',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

const { text } = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: 'What is the capital of France?',
  experimental_telemetry: { isEnabled: true },
});

await flushTracing();

With Tools

Tool calls are automatically traced as child spans with tool.name, tool.input, and tool.output:
import { generateText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const { text } = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: "What's the weather in San Francisco?",
  tools: {
    getWeather: tool({
      description: 'Get weather for a location',
      inputSchema: z.object({ location: z.string() }),
      execute: async ({ location }) => ({ temperature: 72, conditions: 'sunny' }),
    }),
  },
  experimental_telemetry: { isEnabled: true },
});

Single Trace with traceRun

Important: Without traceRun, every generateText call becomes its own separate trace in the dashboard, named ai.generateText. To give your traces a meaningful name and group related AI calls together, always wrap them in traceRun.
Wrap a generateText call (including its tool executions) under a named root trace using traceRun.
import { initTracing, traceRun, flushTracing } from '@runcascade/cascade-sdk';
import { generateText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

initTracing({
  project: 'my-agent',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

await traceRun('WeatherAgent', {}, async () => {
  await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    prompt: "What's the weather in Tokyo?",
    tools: {
      getWeather: tool({
        description: 'Get weather for a city',
        inputSchema: z.object({ city: z.string() }),
        execute: async ({ city }) => ({ city, temperature: 22, conditions: 'sunny' }),
      }),
    },
    experimental_telemetry: { isEnabled: true },
  });
});

await flushTracing();
This produces one trace:
WeatherAgent (root)
  ai.generateText
    ai.generateText.doGenerate   ← LLM decides to call tool
    getWeather                   ← tool executes
    ai.generateText.doGenerate   ← LLM generates final answer

Multi-Agent Traces (Single Trace)

Important: Without traceRun, each generateText call in a multi-agent pipeline creates a completely separate trace. To see all agents as one unified execution, wrap all of them inside a single traceRun. The outer traceRun becomes the root, and every generateText call inside becomes a child.
Wrap multiple generateText calls under one root trace using traceRun. Use traceAgent to label each agent — this creates named agent spans in the dashboard so you can tell which LLM calls and tool calls belong to which agent.
import { initTracing, traceRun, traceAgent, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-agent',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

await traceRun('ResearchOrchestrator', {}, async () => {

  // Phase 1: Planner agent — traceAgent labels this section
  const planResult = await traceAgent('PlannerAgent', {}, async () => {
    const result = await generateText({
      model: anthropic('claude-sonnet-4-20250514'),
      prompt: 'Plan a research approach for: AI in healthcare',
      experimental_telemetry: { isEnabled: true },
    });
    return result.text;
  });

  // Phase 2: Researcher agent — separate labelled section
  await traceAgent('ResearcherAgent', {}, async () => {
    await generateText({
      model: anthropic('claude-sonnet-4-20250514'),
      prompt: `Based on this plan: ${planResult}\nNow gather supporting data.`,
      experimental_telemetry: { isEnabled: true },
    });
  });

});

await flushTracing();
This produces one trace with clearly labelled agents:
ResearchOrchestrator (root)
  PlannerAgent
    ai.generateText
      ai.generateText.doGenerate
  ResearcherAgent
    ai.generateText
      ai.generateText.doGenerate
traceAgent is optional — without it the agents still appear under the root, but all ai.generateText spans look identical in the sidebar. With traceAgent, each agent gets its own named span with a distinct icon, making it easy to navigate complex multi-agent traces.

Sessions (Multi-Turn Conversations)

Group multiple traces from one conversation under a session. Each turn is its own trace, all linked together on the Sessions page.
import { initTracing, traceRun, setSessionId, endSession, flushTracing } from '@runcascade/cascade-sdk';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

initTracing({
  project: 'my-chatbot',
  apiKey: process.env.CASCADE_API_KEY,
  endpoint: process.env.CASCADE_ENDPOINT,
});

const sessionId = `chat-${Date.now()}`;
setSessionId(sessionId);

// Turn 1
await traceRun('ChatTurn', { session_id: sessionId, turn: 1 }, async () => {
  await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    messages: [{ role: 'user', content: 'What flights are available to London?' }],
    experimental_telemetry: { isEnabled: true, metadata: { session_id: sessionId } },
  });
});

// Turn 2
await traceRun('ChatTurn', { session_id: sessionId, turn: 2 }, async () => {
  await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    messages: [
      { role: 'user', content: 'What flights are available to London?' },
      { role: 'assistant', content: 'Here are the available flights...' },
      { role: 'user', content: 'Book the cheapest one' },
    ],
    experimental_telemetry: { isEnabled: true, metadata: { session_id: sessionId } },
  });
});

await endSession(sessionId);
await flushTracing();
Key rule: Always wrap each turn in traceRun with session_id in metadata. Without traceRun, setSessionId has no effect because the Vercel AI SDK creates its own root spans independently.
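
The `chat-${Date.now()}` pattern above is fine for demos, but two conversations starting in the same millisecond would share an id. A small sketch of a collision-safe alternative using Node's built-in crypto.randomUUID (the newSessionId helper name is ours, not part of the SDK):

```typescript
// Collision-safe session ids: randomUUID avoids the risk of two
// conversations starting in the same millisecond sharing a Date.now() id.
// Requires a runtime with crypto.randomUUID (Node 14.17+).
import { randomUUID } from 'node:crypto';

function newSessionId(prefix = 'chat'): string {
  return `${prefix}-${randomUUID()}`;
}
```

Pass the result to setSessionId and to the session_id metadata exactly as in the example above.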

Next.js API Routes

// instrumentation.ts (project root or src/)
import { initTracing } from '@runcascade/cascade-sdk';

export function register() {
  initTracing({
    project: 'my-nextjs-app',
    apiKey: process.env.CASCADE_API_KEY,
    endpoint: process.env.CASCADE_ENDPOINT,
  });
}
// app/api/chat/route.ts
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { flushTracing } from '@runcascade/cascade-sdk';

export async function POST(request: Request) {
  const { message } = await request.json();

  const { text } = await generateText({
    model: anthropic('claude-sonnet-4-20250514'),
    messages: [
      { role: 'system', content: 'You are helpful.' },
      { role: 'user', content: message },
    ],
    experimental_telemetry: { isEnabled: true },
  });

  // Required: flush before returning in serverless
  await flushTracing();

  return Response.json({ response: text });
}

Environment Variables

Variable          Required  Description
CASCADE_API_KEY   Yes       Your Cascade API key (csk_live_...)
CASCADE_ENDPOINT  No        Custom backend URL. Defaults to https://api.runcascade.com/v1/traces
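
A typical shell setup (the key shown is the placeholder format from the table above; set CASCADE_ENDPOINT only when pointing at a non-default backend):

```shell
# Required: your Cascade API key
export CASCADE_API_KEY="csk_live_..."

# Optional: only needed to override the default https://api.runcascade.com/v1/traces
export CASCADE_ENDPOINT="https://api.runcascade.com/v1/traces"
```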

Span Hierarchy

The Vercel AI SDK creates the following OTEL span structure, all captured by Cascade:
ai.generateText          — container span for the full generateText() call
  ai.generateText.doGenerate  — one actual HTTP call to the LLM
  <tool-name>                 — each tool execution (named by tool function)
  ai.generateText.doGenerate  — follow-up LLM call after tool results
For streamText, the structure is the same: ai.streamText is the container span and ai.streamText.doStream is the streaming LLM call.

Flushing in Serverless

If you use traceRun, it automatically calls forceFlush when the root span ends, so simple scripts with a single traceRun need no extra flush call. If you don't use traceRun, or you are in a serverless environment where you need a hard guarantee before returning the HTTP response, always call await flushTracing() explicitly:
// ✅ With traceRun: flush is automatic, but an explicit call is a safe extra guarantee
await traceRun('MyAgent', {}, async () => {
  await generateText({ ..., experimental_telemetry: { isEnabled: true } });
});
// flushTracing() optional here for scripts, recommended for serverless

// ✅ Without traceRun — always flush manually
const result = await generateText({ ..., experimental_telemetry: { isEnabled: true } });
await flushTracing();
return Response.json({ text: result.text });

// ❌ No traceRun, no flush — root span lost in serverless
const result = await generateText({ ... });
return Response.json({ text: result.text });
The SDK also registers a process.on('beforeExit') hook as a safety net, but explicit flushing is more reliable in serverless environments where the process is killed externally.