Sessions & Tracing

Cascade uses OpenTelemetry under the hood to capture the full execution tree of your agent. Every LLM call, tool invocation, sub-agent delegation, and function call is recorded as a span in a hierarchical trace.

Initialize tracing

Call init_tracing() once at the top of your application, before any traced code runs. No further setup is required.

from cascade import init_tracing

# With a project name (recommended - organizes traces in the dashboard)
init_tracing(project="customer_support_chatbot")

Trace a run

Wrap your agent’s entry point with trace_run() to create the root span. Any execution triggered from within the block is captured, including LLM calls, tool invocations, and sub-agent delegations that occur in nested functions or framework code. For example, if your agent logic lives in a single call (e.g. plan = planner.run(task)), placing that call inside the trace_run() block is enough: everything that run() does is traced as child spans.

from cascade import init_tracing, trace_run

init_tracing(project="my_agent")

with trace_run("MyAgent") as run:
    # Everything inside this block is traced
    result = my_agent.execute(task)

Trace sub-agents

If you have multiple agents or sub-agents, use trace_agent() to create a named sub-agent span. All tool calls and LLM calls inside the block are automatically tagged with the agent name.

from cascade import trace_run, trace_agent

with trace_run("Orchestrator"):
    with trace_agent("PlannerAgent"):
        plan = planner.run(task)

    with trace_agent("ExecutorAgent"):
        result = executor.run(plan)
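The automatic tagging described above can be pictured as a scoped value that every span reads when it is recorded. The following is a rough sketch under that assumption; agent_scope, record_span, and recorded are hypothetical names, and Cascade may implement tagging differently.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical: the name of the agent whose block is currently executing.
_agent_name = contextvars.ContextVar("agent_name", default=None)
recorded = []  # (span name, agent name) pairs, in call order


@contextmanager
def agent_scope(name):
    """Tag everything recorded inside the block with this agent name."""
    token = _agent_name.set(name)
    try:
        yield
    finally:
        _agent_name.reset(token)


def record_span(span_name):
    # The span picks up the agent name from context, not from an argument.
    recorded.append((span_name, _agent_name.get()))


def plan():
    record_span("llm_call")


def execute():
    record_span("tool:run_step")


with agent_scope("PlannerAgent"):
    plan()
with agent_scope("ExecutorAgent"):
    execute()
record_span("cleanup")  # outside any agent block, so no agent tag

print(recorded)
```

Because the tag is reset when each block exits, work done by one agent is never attributed to the next one.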

Trace multi-turn sessions

For multi-turn conversations, group traces under a session so they appear together in the dashboard. Create a session ID, call set_session_id() to set it in context, and call end_session() when the conversation ends. Example:

from cascade import init_tracing, trace_run, trace_agent, set_session_id, end_session
import uuid
import time

init_tracing(project="chatbot")

# Create a session ID for this conversation
session_id = f"chat-{uuid.uuid4().hex[:8]}-{int(time.time())}"
set_session_id(session_id)

try:
    turn = 0
    while True:
        user_input = input("You: ")
        if user_input.lower() in ("quit", "exit"):
            break
        with trace_run("ChatTurn", metadata={"turn": turn}):
            with trace_agent("Assistant"):
                reply = agent.respond(user_input)
        print(reply)
        turn += 1
finally:
    end_session(session_id)

Each trace_run() inherits the session ID from context.

Trace tools

Decorate any function with @tool to trace it as a tool call. The decorator works with both sync and async functions.

import aiohttp
from cascade import tool

@tool
def search_database(query: str) -> list:
    """Search the knowledge base for relevant documents."""
    return db.search(query)  # db is your own database client

@tool
async def call_api(endpoint: str, payload: dict) -> dict:
    """Make an external API call."""
    async with aiohttp.ClientSession() as session:
        # POST, because the request carries a JSON body
        async with session.post(endpoint, json=payload) as resp:
            return await resp.json()

# Custom span name
@tool(name="VectorSearch")
def search(query: str) -> list:
    ...

The @tool decorator automatically records:

Input parameters
Output value
Execution time
Errors (with full exception info)
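For readers curious how one decorator can cover sync functions, async functions, and the optional name argument while recording the four items listed above, here is a minimal standard-library sketch. It is not the @tool implementation: traced_tool and spans are hypothetical names, and the real decorator reports to the trace rather than to a list.

```python
import asyncio
import functools
import inspect
import time

spans = []  # hypothetical in-memory sink standing in for the trace backend


def traced_tool(func=None, *, name=None):
    """Usable both as @traced_tool and as @traced_tool(name="...")."""
    if func is None:
        return lambda f: traced_tool(f, name=name)

    span_name = name or func.__name__
    signature = inspect.signature(func)

    def start(args, kwargs):
        # Bind arguments to parameter names so inputs are recorded by name.
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        return {"name": span_name, "input": dict(bound.arguments)}, time.perf_counter()

    def finish(record, started, output=None, error=None):
        record["output"] = output
        record["error"] = repr(error) if error is not None else None
        record["duration_ms"] = (time.perf_counter() - started) * 1000
        spans.append(record)

    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            record, started = start(args, kwargs)
            try:
                result = await func(*args, **kwargs)
            except Exception as exc:
                finish(record, started, error=exc)
                raise  # record the error, then let it propagate unchanged
            finish(record, started, output=result)
            return result
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        record, started = start(args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            finish(record, started, error=exc)
            raise
        finish(record, started, output=result)
        return result
    return sync_wrapper


@traced_tool
def add(a, b=1):
    return a + b


@traced_tool(name="Divide")
async def divide(a, b):
    return a / b


add(2)
try:
    asyncio.run(divide(1, 0))
except ZeroDivisionError:
    pass

for record in spans:
    print(record["name"], record["input"], record["output"], record["error"])
```

The wrapper re-raises after recording, so decorating a function does not change which exceptions its callers see.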

Trace functions

Use @function for internal utility functions that support your tools. These appear as distinct “function” spans in the trace tree.

import json

import requests
from cascade import tool, function

@function
def parse_response(raw: str) -> dict:
    """Parse the raw API response into structured data."""
    return json.loads(raw)

@tool
def fetch_and_parse(url: str) -> dict:
    raw = requests.get(url).text
    return parse_response(raw)  # ← traced as a child span

Wrap LLM clients

Use wrap_llm_client() to automatically trace every LLM call (prompts, completions, token counts, latency, and cost). You wrap the client once; the calls you already make on it stay unchanged.

Anthropic:

from cascade import init_tracing, wrap_llm_client
from anthropic import Anthropic

init_tracing(project="my_agent")

client = wrap_llm_client(Anthropic())
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
# Automatically traced!

OpenAI:

from cascade import init_tracing, wrap_llm_client
from openai import OpenAI

init_tracing(project="my_agent")

client = wrap_llm_client(OpenAI())
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Automatically traced!

OpenRouter (via the OpenAI client):

from cascade import init_tracing, wrap_llm_client
from openai import OpenAI

init_tracing(project="my_agent")

client = wrap_llm_client(OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
))
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Automatically traced!
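The reason call sites stay the same after wrapping can be illustrated with a small proxy object. This is only a sketch of the general technique, assuming a wrapper that forwards attribute access and times each method call. TracedProxy, FakeClient, FakeMessages, and calls are hypothetical names, and nothing here describes how wrap_llm_client() is actually built.

```python
import time

calls = []  # hypothetical sink for recorded LLM calls


class TracedProxy:
    """Forward attribute access to the wrapped object and time every method call."""

    def __init__(self, target, path=""):
        self._target = target
        self._path = path

    def __getattr__(self, attr):
        # Only reached for names not found on the proxy itself.
        value = getattr(self._target, attr)
        path = f"{self._path}.{attr}" if self._path else attr
        if callable(value):
            def traced(*args, **kwargs):
                started = time.perf_counter()
                result = value(*args, **kwargs)
                calls.append({
                    "method": path,
                    "kwargs": kwargs,
                    "latency_ms": (time.perf_counter() - started) * 1000,
                })
                return result
            return traced
        if hasattr(value, "__dict__"):
            # Namespace object such as client.messages: keep proxying below it.
            return TracedProxy(value, path)
        return value


class FakeMessages:
    """Stand-in for a provider SDK namespace; makes no network calls."""

    def create(self, model, messages):
        return {"model": model, "text": "Hi!"}


class FakeClient:
    def __init__(self):
        self.messages = FakeMessages()


client = TracedProxy(FakeClient())
response = client.messages.create(
    model="example-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["text"], calls[0]["method"])
```

The call on the proxied client is written exactly as it would be on the unwrapped one, which is the property the examples above depend on.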

What gets captured per LLM call

| Attribute | Description |
| --- | --- |
| `llm.model` | Model name (e.g. `claude-sonnet-4-20250514`, `gpt-4`) |
| `llm.provider` | Provider (`anthropic`, `openai`) |
| `llm.prompt` | Input prompt/messages |
| `llm.completion` | Full completion text |
| `llm.input_tokens` | Input/prompt token count |
| `llm.output_tokens` | Output/completion token count |
| `llm.total_tokens` | Total tokens |
| `llm.latency_ms` | Latency in milliseconds |
| `llm.cost_usd` | Estimated cost in USD |
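As a worked example of how the token and cost attributes relate, the sketch below derives llm.total_tokens and an estimated llm.cost_usd from input and output token counts. The price table, the model name example-model, and the helper llm_attributes are all hypothetical. How Cascade actually estimates cost is not specified here.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens; substitute your provider's rates.
PRICES_PER_MILLION = {
    "example-model": {"input": 3.00, "output": 15.00},
}


@dataclass
class Usage:
    model: str
    input_tokens: int
    output_tokens: int


def llm_attributes(usage):
    """Build token and cost attributes from a usage record."""
    price = PRICES_PER_MILLION[usage.model]
    cost = (
        usage.input_tokens * price["input"]
        + usage.output_tokens * price["output"]
    ) / 1_000_000
    return {
        "llm.model": usage.model,
        "llm.input_tokens": usage.input_tokens,
        "llm.output_tokens": usage.output_tokens,
        "llm.total_tokens": usage.input_tokens + usage.output_tokens,
        "llm.cost_usd": round(cost, 6),
    }


attrs = llm_attributes(Usage("example-model", input_tokens=1200, output_tokens=300))
print(attrs)
```

With these assumed rates, 1,200 input tokens and 300 output tokens give 1,500 total tokens and an estimate of (1200 × 3.00 + 300 × 15.00) / 1,000,000 = 0.0081 USD.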

Streaming support

Streaming is fully supported for both Anthropic and OpenAI:

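from cascade import init_tracing, wrap_llm_client
from anthropic import Anthropic
from openai import OpenAI

init_tracing(project="my_agent")
messages = [{"role": "user", "content": "Hello!"}]

# Anthropic streaming
anthropic_client = wrap_llm_client(Anthropic())
with anthropic_client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,
) as stream:
    for event in stream:
        ...
    final = stream.get_final_message()

# OpenAI streaming
openai_client = wrap_llm_client(OpenAI())
stream = openai_client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    stream=True,
)
for chunk in stream:
    ...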
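Tracing a stream differs from tracing a normal call because the full completion and the latency are only known once the last chunk has arrived. One common way to handle this, sketched here as an assumption rather than as Cascade's actual approach, is to wrap the iterator, pass chunks through unchanged, and record the span when iteration ends. traced_stream, fake_model_stream, and completed are hypothetical names.

```python
import time

completed = []  # hypothetical sink for finished stream spans


def traced_stream(chunks, span_name="llm_stream"):
    """Yield chunks unchanged and record the span once the stream ends."""
    # Generator bodies run lazily, so timing starts at the first iteration.
    started = time.perf_counter()
    parts = []
    try:
        for chunk in chunks:
            parts.append(chunk)
            yield chunk
    finally:
        # Runs on normal exhaustion and also if the consumer stops early.
        completed.append({
            "name": span_name,
            "completion": "".join(parts),
            "chunks": len(parts),
            "latency_ms": (time.perf_counter() - started) * 1000,
        })


def fake_model_stream():
    """Stand-in for a provider stream; makes no network calls."""
    yield from ["Hel", "lo", "!"]


received = list(traced_stream(fake_model_stream()))
print(received, completed[0]["completion"])
```

The consumer sees the same chunks in the same order, while the recorded span holds the assembled completion.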

Full example

A complete example combining a traced run, sub-agents, tools, and a wrapped LLM client:

from cascade import init_tracing, trace_run, trace_agent, wrap_llm_client, tool
from anthropic import Anthropic

# 1. Initialize
init_tracing(project="travel_agent")
client = wrap_llm_client(Anthropic())

# 2. Define tools
@tool
def search_flights(origin: str, destination: str, date: str) -> list:
    """Search available flights."""
    return flight_api.search(origin, destination, date)

@tool
def book_flight(flight_id: str) -> dict:
    """Book a specific flight."""
    return flight_api.book(flight_id)

# 3. Run with tracing
with trace_run("TravelAgent"):
    with trace_agent("PlannerAgent"):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=[
                {
                    "role": "user",
                    "content": "Find me a flight from NYC to Tokyo next Friday"
                }
            ],
        )
        flights = search_flights("JFK", "NRT", "2026-02-20")

    with trace_agent("BookingAgent"):
        booking = book_flight(flights[0]["id"])
