✨ Feature Request: Allow Separate Models for Tool Execution and Final Response in OpenAI Agent SDK
Summary

Please add support in the OpenAI Agent SDK to split model usage between:
- One model for tool reasoning/execution (e.g., `gpt-3.5-turbo`)
- Another model for final response generation (e.g., `o1`)
Motivation
In many real-world agent workflows, the agent needs to:
- Execute one or more tools (e.g., retrieve context, classify intent, generate a prompt)
- Use the output to generate a high-quality final response
Using a single high-end model (like `o1`) for all steps increases cost unnecessarily. Using a cheaper model (like `gpt-3.5-turbo`) for all steps reduces the final response quality.
Proposed API

```python
Agent(
    name="Vector Assistant",
    reasoning_model="gpt-3.5-turbo",  # Handles all tool execution and logic
    response_model="o1",              # Used only once, after tools run
    ...
)
```
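To make the intended control flow concrete, here is a minimal sketch of the two-phase routing this proposal implies. The `Agent` class and `call_model` function below are hypothetical stand-ins, not the SDK's actual classes; a real implementation would issue API requests where the stub returns a tagged string:

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a real model invocation; a real integration
# would call the OpenAI API here.
def call_model(model: str, prompt: str) -> str:
    return f"[{model}] {prompt}"

@dataclass
class Agent:
    name: str
    reasoning_model: str  # Proposed: drives all tool execution and logic
    response_model: str   # Proposed: used exactly once, after tools run
    tools: dict = field(default_factory=dict)

    def run(self, user_input: str) -> str:
        # Phase 1: the cheaper reasoning model plans each tool call.
        tool_outputs = []
        for tool_name, tool in self.tools.items():
            plan = call_model(self.reasoning_model,
                              f"use {tool_name} on: {user_input}")
            tool_outputs.append(tool(plan))
        # Phase 2: the high-end response model generates the final
        # answer a single time, seeded with the collected tool output.
        context = "\n".join(tool_outputs)
        return call_model(self.response_model, f"{user_input}\n{context}")

agent = Agent(
    name="Vector Assistant",
    reasoning_model="gpt-3.5-turbo",
    response_model="o1",
    tools={"retrieve": lambda q: f"docs for {q}"},
)
print(agent.run("What is vector search?"))
```

Under this split, the expensive model is billed for only one completion per user turn regardless of how many tool calls the reasoning model makes.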
Labels: New feature or request