From 69dce1c34d88f693fc0c94195d260653e879e382 Mon Sep 17 00:00:00 2001
From: Wania Kazmi <112770629+Wania-Kazmi@users.noreply.github.com>
Date: Thu, 15 May 2025 17:40:09 +0500
Subject: [PATCH 1/2] docs: clarify input and output guardrails in agent documentation + Correct Output guardrail first step.

Enhanced the documentation for input and output guardrails by specifying that input guardrails apply only to the first agent in a sequence, while output guardrails apply only to the last agent. This clarification improves understanding of how guardrails function in agent chains.
---
 docs/guardrails.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/guardrails.md b/docs/guardrails.md
index 2f0be0f2..849f3465 100644
--- a/docs/guardrails.md
+++ b/docs/guardrails.md
@@ -17,19 +17,19 @@ Input guardrails run in 3 steps:
 
 !!! Note
 
-    Input guardrails are intended to run on user input, so an agent's guardrails only run if the agent is the *first* agent. You might wonder, why is the `guardrails` property on the agent instead of passed to `Runner.run`? It's because guardrails tend to be related to the actual Agent - you'd run different guardrails for different agents, so colocating the code is useful for readability.
+    Input guardrails are intended to run on user input, so an agent's guardrails only run if the agent is the *first* agent. **In a sequence or chain of agents, the 'first agent' is the entry point – the agent that directly receives the initial user's input. Therefore, input guardrails only check this agent’s input.** You might wonder, why is the `guardrails` property on the agent instead of passed to `Runner.run`? It's because guardrails tend to be related to the actual Agent - you'd run different guardrails for different agents, so colocating the code is useful for readability.
 
 ## Output guardrails
 
 Output guardrails run in 3 steps:
 
-1. First, the guardrail receives the same input passed to the agent.
+1. First, the guardrail receives the same output passed to the agent.
 2. Next, the guardrail function runs to produce a [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput], which is then wrapped in an [`OutputGuardrailResult`][agents.guardrail.OutputGuardrailResult]
 3. Finally, we check if [`.tripwire_triggered`][agents.guardrail.GuardrailFunctionOutput.tripwire_triggered] is true. If true, an [`OutputGuardrailTripwireTriggered`][agents.exceptions.OutputGuardrailTripwireTriggered] exception is raised, so you can appropriately respond to the user or handle the exception.
 
 !!! Note
 
-    Output guardrails are intended to run on the final agent output, so an agent's guardrails only run if the agent is the *last* agent. Similar to the input guardrails, we do this because guardrails tend to be related to the actual Agent - you'd run different guardrails for different agents, so colocating the code is useful for readability.
+    Output guardrails are intended to run on the final agent output, so an agent's guardrails only run if the agent is the *last* agent. **In a sequence or chain of agents, the 'last agent' is the one that produces the final output returned to the user. Therefore, output guardrails only check this agent’s output.** Similar to the input guardrails, we do this because guardrails tend to be related to the actual Agent - you'd run different guardrails for different agents, so colocating the code is useful for readability.
 
 ## Tripwires
 

From 90977ce0e218b364f0ba0d732293d7bdd242e6b2 Mon Sep 17 00:00:00 2001
From: Wania Kazmi <112770629+Wania-Kazmi@users.noreply.github.com>
Date: Thu, 15 May 2025 20:24:45 +0500
Subject: [PATCH 2/2] docs: refine output guardrail description for clarity

Updated the documentation for output guardrails to specify that the guardrail receives the output produced by the agent, enhancing clarity in the guardrail process.
---
 docs/guardrails.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/guardrails.md b/docs/guardrails.md
index 849f3465..a7012fd8 100644
--- a/docs/guardrails.md
+++ b/docs/guardrails.md
@@ -23,7 +23,7 @@ Input guardrails run in 3 steps:
 
 Output guardrails run in 3 steps:
 
-1. First, the guardrail receives the same output passed to the agent.
+1. First, the guardrail receives the same output produced by the agent.
 2. Next, the guardrail function runs to produce a [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput], which is then wrapped in an [`OutputGuardrailResult`][agents.guardrail.OutputGuardrailResult]
 3. Finally, we check if [`.tripwire_triggered`][agents.guardrail.GuardrailFunctionOutput.tripwire_triggered] is true. If true, an [`OutputGuardrailTripwireTriggered`][agents.exceptions.OutputGuardrailTripwireTriggered] exception is raised, so you can appropriately respond to the user or handle the exception.
 
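
Reviewer note: the behaviour these two patches document (three output-guardrail steps, applied only to the last agent in a chain) can be illustrated with a small, dependency-free sketch. The classes below are simplified stand-ins that only borrow the names used in the docs; they are not the SDK's real classes, and `SketchAgent`, `run_chain`, `no_secrets`, and `always_trip` are hypothetical helpers invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


# Simplified stand-ins mirroring names from the docs; NOT the SDK's classes.
@dataclass
class GuardrailFunctionOutput:
    output_info: Any
    tripwire_triggered: bool


@dataclass
class OutputGuardrailResult:
    agent_output: Any
    output: GuardrailFunctionOutput


class OutputGuardrailTripwireTriggered(Exception):
    def __init__(self, result: OutputGuardrailResult) -> None:
        super().__init__("output guardrail tripwire triggered")
        self.guardrail_result = result


GuardrailFn = Callable[[Any], GuardrailFunctionOutput]


@dataclass
class SketchAgent:  # hypothetical minimal agent, for illustration only
    name: str
    respond: Callable[[str], str]
    output_guardrails: List[GuardrailFn]


def run_chain(agents: List[SketchAgent], user_input: str) -> str:
    """Pass text through each agent in order (assumes a non-empty list)."""
    text = user_input
    for agent in agents:
        text = agent.respond(text)
    # Only the last agent's output guardrails are consulted.
    for guardrail in agents[-1].output_guardrails:
        # Step 1: the guardrail receives the output produced by the agent.
        # Step 2: its function result is wrapped in an OutputGuardrailResult.
        result = OutputGuardrailResult(agent_output=text, output=guardrail(text))
        # Step 3: a triggered tripwire raises an exception.
        if result.output.tripwire_triggered:
            raise OutputGuardrailTripwireTriggered(result)
    return text


def no_secrets(output: Any) -> GuardrailFunctionOutput:
    leaked = "SECRET" in str(output)
    return GuardrailFunctionOutput(output_info={"leaked": leaked}, tripwire_triggered=leaked)


def always_trip(output: Any) -> GuardrailFunctionOutput:
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=True)


# `triage` is never the last agent here, so its guardrail is skipped.
triage = SketchAgent("triage", lambda s: s.upper(), [always_trip])
writer = SketchAgent("writer", lambda s: f"Answer: {s}", [no_secrets])
```

Calling `run_chain([triage, writer], "hello")` returns normally even though `triage` carries a guardrail that always trips, because only `writer` produces the final output. Passing `"secret"` instead makes the final output contain `SECRET`, so the caller would need to catch `OutputGuardrailTripwireTriggered`.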