feat: Add Human in the Loop confirmation strategies for tool execution in Agent #369
Conversation
- Some renaming and example how I'd like the interface to look
- Alternative implementation formatting
- Continue work on alternative
- Fixes
- Add more examples
- Change name of file
Pull Request Test Coverage Report for Build 18378714213

Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

💛 - Coveralls
Some high-level comments:

- Skimming other frameworks, it doesn't seem like there's a single right way to do HITL (tool level or Agent level?). If this approach also helps with serialization headaches, it could be a good direction.
- I see a design tension between giving users the freedom to provide their own `ConfirmationStrategy` (there is a protocol for this, and `ConfirmationResult.action` is a string) and requiring a specific, stricter strategy (the current implementation of `_handle_confirmation_strategies` only expects confirm/reject/modify). If we want to leave users free to implement their own strategies, I would suggest delegating most of the logic in `_handle_confirmation_strategies` to the `ConfirmationStrategy` implementation. (Otherwise, we could be stricter in defining `ConfirmationResult` and remove the protocol...)
- About names: maybe we can rename `UserInterface` → `ConfirmationUI`, but if we go with the approach proposed in this PR, I recommend checking in with Bilge.
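To make the second point concrete, here is a rough sketch of the two directions. This is not the PR's actual code; the field names and the `run` signature are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Literal, Optional, Protocol


# Option A: open-ended protocol. The strategy owns the semantics of `action`,
# so the Agent should delegate handling to the strategy instead of branching
# on a fixed set of action strings.
@dataclass
class ConfirmationResult:
    action: str  # free-form: a custom strategy could return e.g. "defer" or "escalate"
    new_tool_params: Optional[Dict[str, Any]] = None
    feedback: Optional[str] = None


class ConfirmationStrategy(Protocol):
    def run(self, tool_name: str, tool_params: Dict[str, Any]) -> ConfirmationResult:
        ...


# Option B: stricter contract. _handle_confirmation_strategies can safely branch
# on the action, but custom strategies lose the freedom to define new actions.
@dataclass
class StrictConfirmationResult:
    action: Literal["confirm", "reject", "modify"]
    new_tool_params: Optional[Dict[str, Any]] = None
    feedback: Optional[str] = None
```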
* Exploring supporting hitl using breakpoints
* Add example script
* Update example so using only break points to stop and start execution works
* Keep working on the examples
* Updating example
* Making progress, the confirm, and modify both work now. The rejection option still has a problem.
* Get example to work
* Work on some of the TODOs
* Add tool_id to ToolExecutionDecision
* Use the serialized chat messages instead of just the tool calls
* Minor formatting update
* Refactoring
* Refactor to move confirmation strategy logic to Agent instead of ToolInvoker
* More refactoring and prep for getting BreakpointConfirmationStrategy to work
* Add missing run_async to Agent and remove unused test file
* Add missing import
* First version of HiTL + Breakpoints is working!
* Some refactoring and adding a TODO
* Refactoring example script to be a bit more robust
* Fixed some bugs
* Some cleanup
* Rename file
* Fix a bug to get multiple sequential tool calls to work
* Cleanup
* Cleanup and formatting
* Cleanup and refactoring
* Refactoring
* More refactoring
* More refactoring and updated _get_tool_calls_and_descriptions properly creates the final set of tool arguments
* Refactoring based on comments
* More cleanup
* Update example
* Formatting
* fix license header
* Fix sede bug
* PR comments and simplification
* Do more monkey patching
* PR comments
* PR comments
* Refactoring and cleanup of utilities in strategies.py
* Fix issue. Use tool name as fall back to match tool execution decision to tool call
* ignore pylint import issues
* Resolve todo
* Two tool call question now works as expected
* Fixing typing and formatting
* Formatting
* Update readme and docs
* remove tool invoker docs since not needed anymore
* Add some integration tests for Agent using confirmation strategies
* Add more unit tests
* Update haystack_experimental/components/agents/human_in_the_loop/breakpoint.py
  Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
* Refactoring based on PR comments
* formatting and types

---------

Co-authored-by: Stefano Fiorucci <stefanofiorucci@gmail.com>
@@ -0,0 +1,186 @@
# SPDX-FileCopyrightText: 2022-present deepset GmbH <info@deepset.ai>
@mpangrazzi as we discussed, I'd appreciate it if you could take a look at this script (from the point of view of integrating HiTL into Hayhooks).
@anakin87 I have increased the test coverage, specifically focusing on the human_in_the_loop folder. I've also added integration tests for Agent using confirmation strategies (both blocking and breakpoint confirmation strategies).
From my point of view, this PR looks good. Thanks for adding the tests!
I'd still appreciate reviews from others, especially to confirm that the `BreakpointConfirmationStrategy` (which introduces a lot of complexity) truly unlocks the Hayhooks/platform use cases.
From my conversation with @mpangrazzi, the ability to pass a callback to be triggered by a tool call (basically what the …
@sjrl @anakin87 It should be possible, but IMHO the best way to be sure this works is to try it out and check how clean the resulting implementation is. We may want to try, e.g.:

We can also skip using AG-UI for point 3, but we still need websocket support on pipeline wrappers (to be able to do bi-directional communication). I'll try to find some time next week to try it out.
Sounds good! @mpangrazzi and @anakin87, I'll go ahead and merge this then, and we can create follow-up PRs/issues if we find that we need to make changes to get this to work with Hayhooks.
👍
Related Issues
Proposed Changes:
This PR introduces a human-in-the-loop (HITL) mechanism for tool execution in Agent, allowing users to confirm, reject, or modify tool invocations interactively.
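As a minimal sketch of what acting on such a decision could look like (the `ToolExecutionDecision` field names below are assumptions for illustration, not the PR's actual dataclass definition):

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class ToolExecutionDecision:
    tool_name: str
    execute: bool  # confirm/modify -> True, reject -> False
    final_tool_params: Optional[Dict[str, Any]] = None  # possibly user-modified arguments
    feedback: Optional[str] = None  # e.g. the reason given on rejection


def apply_decision(decision: ToolExecutionDecision, tool_fn: Callable[..., Any]) -> str:
    """Run the tool if the user confirmed it, otherwise report the rejection back to the LLM."""
    if decision.execute:
        result = tool_fn(**(decision.final_tool_params or {}))
        return f"Tool '{decision.tool_name}' returned: {result}"
    return f"Tool '{decision.tool_name}' was rejected by the user: {decision.feedback}"
```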
TODOs
Key Changes
New module `haystack_experimental/components/agents/human_in_the_loop/`:

- `dataclasses.py`: Added the `ConfirmationUIResult` and `ToolExecutionDecision` dataclasses to standardize user feedback and tool execution decisions.
- `policies.py`: Added the `ConfirmationPolicy` base class and concrete policies: `AlwaysAskPolicy`, `NeverAskPolicy`, and `AskOncePolicy`.
- `user_interfaces.py`: Added the `ConfirmationUI` base class and two implementations: `RichConsoleUI` (using the Rich library) and `SimpleConsoleUI` (using standard input), supporting interactive user feedback and parameter modification. Refer to `hitl_multi_agent_example.py` to see a case where this occurs.
- `protocol.py`: Added the `ConfirmationStrategy` protocol. I'm not 100% sure if this should be a protocol or a base class.
- `strategies.py`: Added `HumanInTheLoopStrategy` to orchestrate the confirmation policy and the confirmation UI, returning a `ToolExecutionDecision`.

`haystack_experimental/components/tools/tool_invoker.py`:

- Updated `ToolInvoker` to support per-tool confirmation strategies.

`haystack_experimental/components/agents/agent.py`:

- Made `Agent` use the new experimental `ToolInvoker` and added `confirmation_strategies` explicitly.

How did you test it?
Provided an example script called `hitl_intro_example.py`, which shows how to use these new utility functions to convert existing tools into ones that ask for confirmation. I've also reproduced it here:

Single Agent Example highlighting different Confirmation Policies and Confirmation UIs

Multi-Agent Example: Tools within Sub-Agents use HiTL Confirmation
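The full scripts live in the PR; below is a minimal wiring sketch based on the class names listed under Key Changes. The `human_in_the_loop` import paths, the constructor arguments, and the exact shape of `confirmation_strategies` are assumptions, not the final API.

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import tool

from haystack_experimental.components.agents import Agent
from haystack_experimental.components.agents.human_in_the_loop import (
    AlwaysAskPolicy,
    HumanInTheLoopStrategy,
    RichConsoleUI,
)


@tool
def delete_file(path: str) -> str:
    """Delete a file from disk."""
    return f"Deleted {path}"


agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
    tools=[delete_file],
    # Ask for confirmation in the console every time this tool is about to run
    # (assumed shape: one strategy per tool name).
    confirmation_strategies={
        "delete_file": HumanInTheLoopStrategy(
            confirmation_policy=AlwaysAskPolicy(),
            confirmation_ui=RichConsoleUI(),
        ),
    },
)

result = agent.run(messages=[ChatMessage.from_user("Please delete /tmp/old.log")])
```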
Notes for the reviewer
@julian-risch this now follows your idea more closely of passing the tool confirmation strategies to the Agent (or ToolInvoker) rather than applying them at the tool level. I think I like this design better since it has better separation and makes the various policies easier to understand.
Checklist
The PR title starts with one of: `fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.