Cascade uses OpenTelemetry under the hood to capture the full execution tree of your agent. Every LLM call, tool invocation, sub-agent delegation, and function call is recorded as a span in a hierarchical trace.
Initialize tracing
Call init_tracing() once at the top of your application. Everything else is automatic.
from cascade import init_tracing
# With a project name (recommended - organizes traces in the dashboard)
init_tracing(project="customer_support_chatbot")
Trace a run
Wrap your agent’s entry point with trace_run() to create the root span. Any execution triggered from within the block is captured, including LLM calls, tool invocations, and sub-agent delegations that occur in nested functions or framework code. For example, if your agent logic lives in a single call (e.g. plan = planner.run(task)), placing that call inside trace_run() is enough: everything inside run() is traced as child spans.
from cascade import init_tracing, trace_run
init_tracing(project="my_agent")
with trace_run("MyAgent") as run:
    # Everything inside this block is traced
    result = my_agent.execute(task)
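To picture how nested calls end up as child spans without being handed a parent explicitly, here is a minimal stdlib-only sketch. It is not Cascade's implementation: the `Span` class, the `span()` context manager, and the `render()` helper are hypothetical names used only to illustrate context-based parenting.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical stand-in for a tracer: remembers which span is active right now.
_current_span = contextvars.ContextVar("current_span", default=None)


class Span:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        if parent is not None:
            parent.children.append(self)


@contextmanager
def span(name):
    """Open a span whose parent is whatever span is currently active."""
    node = Span(name, parent=_current_span.get())
    token = _current_span.set(node)
    try:
        yield node
    finally:
        # Restore the previous span so siblings attach to the right parent.
        _current_span.reset(token)


def call_llm():
    with span("llm_call"):
        pass


def run_agent():
    # No span object is passed in; the parent is discovered from context.
    with span("tool:search"):
        pass
    call_llm()


def render(node, depth=0):
    """Return the tree as indented lines, one per span."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


with span("MyAgent") as root:
    run_agent()

tree = render(root)
print("\n".join(tree))
```

Because the active span lives in a context variable, code several calls deep attaches itself to the root without any changes to its signature.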
Trace sub-agents
If you have multiple agents or sub-agents, use trace_agent() to create a named sub-agent span. All tool calls and LLM calls inside the block are automatically tagged with the agent name.
from cascade import trace_run, trace_agent
with trace_run("Orchestrator"):
    with trace_agent("PlannerAgent"):
        plan = planner.run(task)
    with trace_agent("ExecutorAgent"):
        result = executor.run(plan)
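The automatic tagging described above can be thought of as a value carried in context while the block is open. The following is a rough sketch under that assumption, not Cascade's actual mechanism; `agent_scope()` and `record_call()` are hypothetical names.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical: the name of the agent whose block is currently open.
_agent_name = contextvars.ContextVar("agent_name", default=None)
recorded = []


@contextmanager
def agent_scope(name):
    """Tag everything recorded inside the block with this agent name."""
    token = _agent_name.set(name)
    try:
        yield
    finally:
        _agent_name.reset(token)


def record_call(kind, name):
    # The caller never passes the agent name; it is read from context.
    recorded.append({"kind": kind, "name": name, "agent": _agent_name.get()})


with agent_scope("PlannerAgent"):
    record_call("llm", "plan")
with agent_scope("ExecutorAgent"):
    record_call("tool", "run_step")
record_call("tool", "cleanup")  # outside any agent block, so untagged

agents = [entry["agent"] for entry in recorded]
print(agents)
```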
Trace multi-turn sessions
For multi-turn conversations, group traces under a session so they appear together in the dashboard. Create a session ID, call set_session_id() to set it in context, and call end_session() when the conversation ends.
Example:
from cascade import init_tracing, trace_run, trace_agent, set_session_id, end_session
import uuid
import time
init_tracing(project="chatbot")
# Create a session ID for this conversation
session_id = f"chat-{uuid.uuid4().hex[:8]}-{int(time.time())}"
set_session_id(session_id)
try:
    turn = 0
    while True:
        user_input = input("You: ")
        if user_input.lower() in ("quit", "exit"):
            break
        with trace_run("ChatTurn", metadata={"turn": turn}):
            with trace_agent("Assistant"):
                reply = agent.respond(user_input)
        print(reply)
        turn += 1
finally:
    end_session(session_id)
Each trace_run() inherits the session ID from context.
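As an illustration of what inheriting the session ID from context means for grouping, the sketch below stores the ID once and lets every later run pick it up. It is a simplified model only; `set_session()`, `record_run()`, and `traces_by_session` are hypothetical and do not reflect how Cascade stores traces.

```python
import contextvars
import uuid
from collections import defaultdict

# Hypothetical: the session the current conversation belongs to.
_session_id = contextvars.ContextVar("session_id", default=None)
traces_by_session = defaultdict(list)


def set_session(session_id):
    _session_id.set(session_id)


def record_run(name, metadata=None):
    """Store a run under whichever session is active in context."""
    traces_by_session[_session_id.get()].append(
        {"name": name, "metadata": metadata or {}}
    )


session = f"chat-{uuid.uuid4().hex[:8]}"
set_session(session)

# Each turn is recorded separately, yet all of them land in one group.
for turn in range(3):
    record_run("ChatTurn", {"turn": turn})

turns = [run["metadata"]["turn"] for run in traces_by_session[session]]
print(session, turns)
```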
Trace tools
Decorate any function with @tool to trace it as a tool call. The decorator works with both sync and async functions.
import aiohttp
from cascade import tool

@tool
def search_database(query: str) -> list:
    """Search the knowledge base for relevant documents."""
    return db.search(query)

@tool
async def call_api(endpoint: str, payload: dict) -> dict:
    """Make an external API call."""
    async with aiohttp.ClientSession() as session:
        async with session.get(endpoint, json=payload) as resp:
            return await resp.json()

# Custom span name
@tool(name="VectorSearch")
def search(self, query: str) -> list:
    ...
The @tool decorator automatically records:
- Input parameters
- Output value
- Execution time
- Errors (with full exception info)
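For readers curious how a decorator can record these four things for both sync and async functions, here is a minimal sketch. It is not the real @tool: `traced_tool` and the in-memory `spans` list are hypothetical, and a real tracer would export spans rather than append to a list.

```python
import asyncio
import functools
import inspect
import time

spans = []  # hypothetical in-memory sink for recorded calls


def traced_tool(func=None, *, name=None):
    """Usable as @traced_tool or @traced_tool(name="...")."""

    def decorate(fn):
        span_name = name or fn.__name__

        def start(args, kwargs):
            # Bind arguments to parameter names so inputs are recorded by name.
            bound = inspect.signature(fn).bind(*args, **kwargs)
            bound.apply_defaults()
            record = {"name": span_name, "inputs": dict(bound.arguments)}
            return record, time.perf_counter()

        def finish(record, started, output=None, error=None):
            record["duration_ms"] = (time.perf_counter() - started) * 1000
            record["output"] = output
            record["error"] = repr(error) if error is not None else None
            spans.append(record)

        if inspect.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def async_wrapper(*args, **kwargs):
                record, started = start(args, kwargs)
                try:
                    result = await fn(*args, **kwargs)
                except Exception as exc:
                    finish(record, started, error=exc)
                    raise
                finish(record, started, output=result)
                return result
            return async_wrapper

        @functools.wraps(fn)
        def sync_wrapper(*args, **kwargs):
            record, started = start(args, kwargs)
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                finish(record, started, error=exc)
                raise
            finish(record, started, output=result)
            return result
        return sync_wrapper

    # Bare @traced_tool receives the function directly; @traced_tool(...) does not.
    return decorate(func) if func is not None else decorate


@traced_tool
def add(a, b=1):
    return a + b


@traced_tool(name="Echo")
async def echo(text):
    return text


@traced_tool
def fail():
    raise ValueError("boom")


add(2)
asyncio.run(echo("hi"))
try:
    fail()
except ValueError:
    pass  # the error is recorded, then re-raised to the caller
```

The wrapper re-raises after recording, so tracing never changes whether the caller sees the exception.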
Trace functions
Use @function for internal utility functions that support your tools. These appear as distinct “function” spans in the trace tree.
import json
import requests
from cascade import tool, function

@function
def parse_response(raw: str) -> dict:
    """Parse the raw API response into structured data."""
    return json.loads(raw)

@tool
def fetch_and_parse(url: str) -> dict:
    raw = requests.get(url).text
    return parse_response(raw)  # ← traced as a child span
Wrap LLM clients
Use wrap_llm_client() to automatically trace every LLM call (prompts, completions, token counts, latency, and cost) with zero changes to your existing code.
Anthropic
from cascade import init_tracing, wrap_llm_client
from anthropic import Anthropic
init_tracing(project="my_agent")
client = wrap_llm_client(Anthropic())
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
# Automatically traced!
OpenAI
from cascade import init_tracing, wrap_llm_client
from openai import OpenAI
init_tracing(project="my_agent")
client = wrap_llm_client(OpenAI())
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Automatically traced!
OpenRouter
from cascade import init_tracing, wrap_llm_client
from openai import OpenAI
init_tracing(project="my_agent")
client = wrap_llm_client(OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
))
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Automatically traced!
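One way a wrapper can trace calls while leaving call sites unchanged is a transparent proxy that forwards attribute access and times any method invoked through it. The sketch below assumes that approach purely for illustration; it says nothing about how wrap_llm_client() is actually built. `TracedProxy`, `FakeClient`, and `FakeMessages` are hypothetical, and the fake client stands in for a real SDK so the example runs offline.

```python
import time

calls = []  # hypothetical in-memory record of traced calls


class TracedProxy:
    """Forward attribute access to the wrapped object and time method calls."""

    def __init__(self, target, path=""):
        self._target = target
        self._path = path

    def __getattr__(self, attr):
        # Only reached for names not defined on the proxy itself.
        value = getattr(self._target, attr)
        path = f"{self._path}.{attr}" if self._path else attr
        if callable(value):
            def timed(*args, **kwargs):
                started = time.perf_counter()
                result = value(*args, **kwargs)
                calls.append({
                    "method": path,
                    "latency_ms": (time.perf_counter() - started) * 1000,
                    "kwargs": kwargs,
                })
                return result
            return timed
        if isinstance(value, (str, int, float, bool, type(None))):
            return value
        # Namespaces such as client.messages are wrapped recursively.
        return TracedProxy(value, path)


class FakeMessages:
    def create(self, *, model, messages):
        return {"model": model, "text": "Hello!"}


class FakeClient:
    def __init__(self):
        self.messages = FakeMessages()


client = TracedProxy(FakeClient())
# The call site looks exactly like it would on the unwrapped client.
response = client.messages.create(
    model="demo-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["text"], calls[0]["method"])
```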
What gets captured per LLM call
| Attribute | Description |
|---|---|
| llm.model | Model name (e.g. claude-sonnet-4-20250514, gpt-4) |
| llm.provider | Provider (anthropic, openai) |
| llm.prompt | Input prompt/messages |
| llm.completion | Full completion text |
| llm.input_tokens | Input/prompt token count |
| llm.output_tokens | Output/completion token count |
| llm.total_tokens | Total tokens |
| llm.latency_ms | Latency in milliseconds |
| llm.cost_usd | Estimated cost in USD |
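To make the derived attributes concrete, here is a worked sketch of how a total token count and a cost estimate could be computed from the raw counts. The price table is entirely hypothetical (the numbers are placeholders, not real provider rates), and `llm_attributes()` is an illustrative helper rather than part of Cascade.

```python
from dataclasses import dataclass

# Hypothetical USD prices per million tokens; placeholder values, not real rates.
PRICES_PER_MTOK = {"demo-model": {"input": 3.00, "output": 15.00}}


@dataclass
class Usage:
    model: str
    input_tokens: int
    output_tokens: int


def llm_attributes(usage, latency_ms):
    """Build a span attribute dict using the attribute names from the table."""
    price = PRICES_PER_MTOK[usage.model]
    cost = (
        usage.input_tokens * price["input"]
        + usage.output_tokens * price["output"]
    ) / 1_000_000
    return {
        "llm.model": usage.model,
        "llm.input_tokens": usage.input_tokens,
        "llm.output_tokens": usage.output_tokens,
        "llm.total_tokens": usage.input_tokens + usage.output_tokens,
        "llm.latency_ms": latency_ms,
        "llm.cost_usd": round(cost, 6),
    }


attrs = llm_attributes(Usage("demo-model", 1200, 300), latency_ms=850.0)
print(attrs["llm.total_tokens"], attrs["llm.cost_usd"])
```

With these placeholder prices, 1,200 input tokens and 300 output tokens come to (1200 × 3.00 + 300 × 15.00) / 1,000,000 = 0.0081 USD.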
Streaming support
Streaming is fully supported for both Anthropic and OpenAI:
# Anthropic streaming
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages
) as stream:
    for event in stream:
        ...
    final = stream.get_final_message()

# OpenAI streaming
stream = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    stream=True
)
for chunk in stream:
    ...
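Tracing a stream is harder than tracing a single response because the completion only exists once the caller has consumed every chunk. One possible approach, sketched below with plain generators, is to pass chunks through unchanged while accumulating them and to report once the stream is exhausted. `trace_stream()` and `fake_stream()` are hypothetical, and this is not a description of Cascade's internals.

```python
import time


def trace_stream(chunks, on_complete):
    """Yield chunks unchanged; report the assembled text when the stream ends."""
    started = time.perf_counter()
    first_chunk_ms = None
    parts = []
    for chunk in chunks:
        if first_chunk_ms is None:
            # Time until the first chunk arrives.
            first_chunk_ms = (time.perf_counter() - started) * 1000
        parts.append(chunk)
        yield chunk
    on_complete({
        "completion": "".join(parts),
        "chunk_count": len(parts),
        "first_chunk_ms": first_chunk_ms,
        "latency_ms": (time.perf_counter() - started) * 1000,
    })


def fake_stream():
    # Stand-in for a provider stream so the example runs offline.
    yield from ["Hel", "lo", "!"]


finished = []
seen = [chunk for chunk in trace_stream(fake_stream(), finished.append)]
print(seen, finished[0]["completion"])
```

In this sketch the summary is only reported if the caller reads the stream to the end; handling an abandoned stream would need extra care, for example a try/finally around the loop.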
Full example
A complete example combining all tracing features:
from cascade import init_tracing, trace_run, trace_agent, wrap_llm_client, tool
from anthropic import Anthropic
# 1. Initialize
init_tracing(project="travel_agent")
client = wrap_llm_client(Anthropic())
# 2. Define tools
# flight_api is a placeholder for your own flight-search client
@tool
def search_flights(origin: str, destination: str, date: str) -> list:
    """Search available flights."""
    return flight_api.search(origin, destination, date)

@tool
def book_flight(flight_id: str) -> dict:
    """Book a specific flight."""
    return flight_api.book(flight_id)

# 3. Run with tracing
with trace_run("TravelAgent"):
    with trace_agent("PlannerAgent"):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=[
                {
                    "role": "user",
                    "content": "Find me a flight from NYC to Tokyo next Friday"
                }
            ],
        )
        flights = search_flights("JFK", "NRT", "2026-02-20")
    with trace_agent("BookingAgent"):
        booking = book_flight(flights[0]["id"])