Safety in Cascade is the process of ensuring that AI agents behave within explicitly defined boundaries during execution. Rather than prescribing a fixed definition of “safe”, Cascade allows teams to define their own safety policies that are continuously evaluated against agent behavior at runtime. Policies are evaluated using traces captured from LLM calls, tool usage, and reasoning steps, producing structured signals that inform enforcement decisions. This approach gives teams precise, context-aware control over what their agents are allowed to do and what behavior must be prevented.

How Safety Works in Cascade

Safety in Cascade is policy-driven and signal-based. As an agent executes, Cascade’s DeepStream system captures detailed traces of its behavior. These traces are evaluated in real time against user-defined safety policies. The evaluation produces Safety Signals that indicate policy compliance, behavioral drift, and semantic classification of agent actions. These signals enable teams to monitor agent behavior, identify violations, and take appropriate action based on the configured enforcement mode.

Safety Policies

Cascade supports three policy types, each designed to address a different class of agent behavior.

Tool Policies

Tool policies control which tools an agent can use. They are evaluated at invocation time to prevent access to destructive or sensitive operations.
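
For illustration, a tool policy file might look something like the sketch below. The actual schema is not documented in this section, so every field name here (name, type, rules, tool, action) is an assumption rather than Cascade's published format.

```json
{
  "name": "restrict-destructive-tools",
  "type": "tool",
  "description": "Deny destructive operations; allow all other tools.",
  "rules": [
    { "tool": "delete_file", "action": "deny" },
    { "tool": "drop_table", "action": "deny" },
    { "tool": "*", "action": "allow" }
  ]
}
```

An explicit allow-all fallback keeps the deny rules easy to audit; a stricter posture would invert the default and enumerate allowed tools instead.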

Categorization Policies

Categorization policies classify agent outputs into semantic categories and block out-of-policy content such as harmful or unsafe responses.
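
A categorization policy might pair a set of semantic categories with the subset that should be blocked. As with the tool policy sketch above, this is hypothetical; the field names are invented for illustration.

```json
{
  "name": "block-unsafe-output",
  "type": "categorization",
  "description": "Classify agent outputs and block unsafe categories.",
  "categories": ["safe", "harmful", "pii-disclosure"],
  "blocked_categories": ["harmful", "pii-disclosure"]
}
```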

Semantic Policies

Semantic policies define behavioral constraints in natural language, enforcing complex, context-dependent rules through semantic reasoning.
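
Because semantic policies are expressed in natural language, the body of such a policy is essentially a constraint string evaluated in context. Again, the exact schema is assumed here, not documented:

```json
{
  "name": "no-unconfirmed-refunds",
  "type": "semantic",
  "description": "Context-dependent behavioral constraint.",
  "constraint": "Never issue a refund without first confirming the order ID with the customer."
}
```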
Policies are defined as JSON configuration files and can be attached globally or per workflow. They can be updated without changing agent code, enabling rapid iteration and safe rollout.
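
How attachment is expressed is likewise not specified in this section, but a global versus per-workflow scope could plausibly be declared alongside the policy list. In the sketch below, the scope, workflows, policies, and enforcement_mode fields are all assumptions, with enforcement_mode echoing the configured enforcement mode mentioned earlier:

```json
{
  "scope": "workflow",
  "workflows": ["checkout-agent", "support-triage"],
  "policies": ["restrict-destructive-tools", "block-unsafe-output"],
  "enforcement_mode": "block"
}
```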

Safety Signals

Safety Signals are the runtime indicators Cascade uses to evaluate agent behavior. There are three primary signal types:
  • Policy Violation Signals: Indicate when agent behavior does not comply with active safety policies. These signals are generated in real time and drive enforcement actions.
  • Drift Signals: Surface statistically significant changes in agent behavior across executions. These signals help teams understand how changes in prompts, tools, or models affect behavior over time.
  • Classification Signals: Label agent behavior into semantic categories. These classifications provide structured inputs that categorization policies act upon.
Safety Signals are aggregated into metrics that provide visibility into agent behavior and policy effectiveness across time and workflows.
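
To make the signal types concrete, a policy violation signal emitted at runtime might carry a payload along these lines. The structure and field names are illustrative assumptions, not Cascade's published schema:

```json
{
  "signal_type": "policy_violation",
  "policy": "restrict-destructive-tools",
  "trace_id": "tr_0123",
  "timestamp": "2026-01-15T09:42:07Z",
  "details": {
    "tool": "delete_file",
    "arguments": { "path": "/data/reports" },
    "action_taken": "block"
  }
}
```

A drift signal would presumably follow the same shape, with details describing the historical baseline and the observed deviation rather than a tool call.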

Runtime Evaluation

Safety evaluation happens continuously while an agent runs. As traces are generated, Cascade:
  • Extracts relevant behavioral signals from tool calls, reasoning traces, and outputs
  • Evaluates all active policies against the current execution context
  • Classifies agent behavior into semantic categories
  • Detects drift against historical baselines
  • Produces safety findings tied to the trace
This enables safety decisions to be made using full execution context rather than static prompts or offline checks.
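
The findings produced by this loop might bundle the extracted signals with the trace they were derived from, in the spirit of the sketch below. As with the earlier examples, the schema is assumed for illustration only:

```json
{
  "finding_id": "sf_001",
  "trace_id": "tr_0123",
  "severity": "high",
  "signals": [
    { "signal_type": "policy_violation", "policy": "restrict-destructive-tools" },
    { "signal_type": "classification", "category": "destructive-operation" }
  ],
  "context": { "step": 7, "tool_call": "delete_file", "enforcement_mode": "block" }
}
```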

Who Safety Is For

The Safety system is designed for:
  • Developers building agent workflows
  • Platform teams managing agent behavior at scale
  • Product teams defining acceptable outcomes
  • Compliance and security teams monitoring behavioral risk

Next steps