AI Agent Framework and ChatGXY 2.0 #21434

dannon · Dec 11, 2025

Pull Request: AI Agent Framework for Galaxy

Summary

This PR introduces an AI-powered multi-agent framework for Galaxy. The focus is on establishing the framework and APIs - the agents themselves are functional but will continue to evolve, and of course we want to add more. This provides the foundation for intelligent assistance in error analysis, tool recommendations, custom tool creation, and training material discovery. This is NOT meant to be merged as-is -- we want to iterate on the framework and basic agents with a wider audience and this will likely be trimmed down prior to merge. The GTN agent works well and is useful, but I don't love the deployment mechanism and we want to use chatgxy-materials embeddings (likely) anyway, which can happen in a separate branch.

Key Points

Framework-first approach: The core value here is the infrastructure - agent base classes, registry, service layer, API endpoints, and configuration system. Individual agents will improve over time.
Dual API surface: Both conversational (/api/chat) and direct programmatic access (/api/ai/agents/*) endpoints
Flexible configuration: Per-agent model/temperature settings with sensible defaults
Model-agnostic: Works with OpenAI, Anthropic, Google, and local models (vLLM, Ollama)

What's Included

Framework Components

lib/galaxy/agents/base.py - Base agent classes with pydantic-ai integration
lib/galaxy/agents/registry.py - Dynamic agent registration and lookup
lib/galaxy/managers/agents.py - Centralized AgentService for execution
lib/galaxy/schema/agents.py - Pydantic API schemas
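
As a rough illustration of how base classes and a registry can fit together, here is a minimal sketch. It is not the code in lib/galaxy/agents: the names `BaseAgent`, `AgentRegistry`, `register`, `get`, and `list_agents` are hypothetical, and the real base classes wrap pydantic-ai, which this sketch leaves out entirely.

```python
from typing import Dict, List, Type


class BaseAgent:
    """Hypothetical base class; the real one integrates pydantic-ai."""

    # Each concrete agent declares its own identifier explicitly.
    agent_type: str = ""

    def process(self, query: str) -> str:
        raise NotImplementedError


class AgentRegistry:
    """Maps agent_type strings to agent classes (illustrative only)."""

    def __init__(self) -> None:
        self._agents: Dict[str, Type[BaseAgent]] = {}

    def register(self, agent_cls: Type[BaseAgent]) -> Type[BaseAgent]:
        if not agent_cls.agent_type:
            raise ValueError(f"{agent_cls.__name__} must set agent_type")
        self._agents[agent_cls.agent_type] = agent_cls
        return agent_cls  # returning the class lets this double as a decorator

    def get(self, agent_type: str) -> BaseAgent:
        try:
            return self._agents[agent_type]()
        except KeyError:
            raise KeyError(f"Unknown agent type: {agent_type}") from None

    def list_agents(self) -> List[str]:
        return sorted(self._agents)


registry = AgentRegistry()


@registry.register
class ErrorAnalysisAgent(BaseAgent):
    agent_type = "error_analysis"

    def process(self, query: str) -> str:
        return f"[error_analysis] would analyze: {query}"
```

Registering by an explicit `agent_type` string keeps lookup independent of class names, so a class can be renamed without changing the identifier callers use.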

API Endpoints

GET /api/ai/agents - List available agents
POST /api/ai/agents/query - Unified query with auto-routing
POST /api/ai/agents/error-analysis - Direct error analysis
POST /api/ai/agents/tool-recommendation - Direct tool recommendations
POST /api/ai/agents/custom-tool - Direct custom tool creation
Enhanced /api/chat for ChatGXY
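
For anyone who wants to poke at the programmatic endpoints from a script, the sketch below builds (but does not send) a request to the unified query endpoint. The request body is an assumption: the `query` field name is a guess and the actual schema lives in lib/galaxy/schema/agents.py. The `x-api-key` header and the localhost URL are likewise assumptions about the target Galaxy instance.

```python
import json
import urllib.request


def build_agent_query(galaxy_url: str, api_key: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the unified query endpoint."""
    # The "query" field name is assumed; check the real request schema.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=galaxy_url.rstrip("/") + "/api/ai/agents/query",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


req = build_agent_query("http://localhost:8080", "YOUR_API_KEY", "Why did my job fail?")
print(req.get_method(), req.full_url)
# To send it against a running Galaxy: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the payload easy to inspect while the schema is still changing.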

Initial Agents (5)

Router - Analyzes user queries and routes them to the appropriate specialist agent. Uses intent classification to determine whether someone needs help debugging, finding tools, learning, etc.
Error Analysis - Examines job failures, stderr output, and tool context to diagnose what went wrong and suggest fixes. Can recommend alternative tools if the current one isn't suitable.
Tool Recommendation - Helps users find the right Galaxy tool for their analysis task. Searches the toolbox and provides guidance on parameters and usage.
Custom Tool - Generates Galaxy tool XML/YAML definitions from natural language descriptions. Useful for wrapping command-line tools or scripts.
GTN Training Helper - Searches the Galaxy Training Network using full-text search to find relevant learning materials for any topic.
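
To make the routing idea concrete, here is a deliberately simplified stand-in. The actual Router agent uses an LLM for intent classification; this sketch swaps that for plain keyword matching so it runs without a model. The agent names, the keyword lists, and the default agent are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical keyword hints per agent; the real router asks an LLM instead.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "error_analysis": ("error", "failed", "stderr", "crash"),
    "tool_recommendation": ("which tool", "recommend", "find a tool"),
    "custom_tool": ("wrap", "create a tool", "tool xml"),
    "gtn_training": ("tutorial", "training", "learn"),
}


def route(query: str, default: str = "tool_recommendation") -> str:
    """Return the agent whose keywords appear most often in the query."""
    text = query.lower()
    scores = {
        agent: sum(keyword in text for keyword in keywords)
        for agent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=lambda agent: scores[agent])
    # Fall back to the default agent when nothing matched at all.
    return best if scores[best] > 0 else default


print(route("My job failed with an error in stderr"))
```

The point is only the shape of the contract: a query goes in, the name of a specialist agent comes out, and there is a defined fallback when no intent is recognized.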

UI Integration

ChatGXY as full-page activity in activity bar
GalaxyWizard integration for error analysis
Action suggestions with interactive buttons

Configuration

# Minimal - uses defaults (these are the pre-existing config options)
ai_api_key: ${OPENAI_API_KEY}
ai_model: gpt-4o-mini

# Or new per-agent configuration falling back to 'default'
inference_services:
  default:
    model: "llama-4-scout"
    api_key: "sk-local-test-master-key"
    api_base_url: "http://localhost:4000/v1/"
    temperature: 0.7
    max_tokens: 2000
  custom_tool:
    model: "anthropic:claude-sonnet-4-5"
    api_key: "sk-ant-api03-SUPERSECRETKEY"
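
The "falling back to 'default'" behaviour can be sketched as a dictionary overlay. This assumes the fallback is applied per key, which the description does not state; the function name `resolve_agent_config` is hypothetical. Note that with a per-key merge an agent would also inherit `api_base_url` from `default`, which may not suit a different provider, so the real implementation may well differ.

```python
from typing import Any, Dict

# Mirrors the YAML example above, already parsed into a dict (API keys omitted).
inference_services: Dict[str, Dict[str, Any]] = {
    "default": {
        "model": "llama-4-scout",
        "api_base_url": "http://localhost:4000/v1/",
        "temperature": 0.7,
        "max_tokens": 2000,
    },
    "custom_tool": {"model": "anthropic:claude-sonnet-4-5"},
}


def resolve_agent_config(services: Dict[str, Dict[str, Any]], agent_type: str) -> Dict[str, Any]:
    """Overlay an agent's own settings on top of the 'default' entry."""
    merged = dict(services.get("default", {}))  # copy so the input is not mutated
    merged.update(services.get(agent_type, {}))
    return merged


print(resolve_agent_config(inference_services, "custom_tool")["model"])
print(resolve_agent_config(inference_services, "router")["model"])
```

An agent with no entry of its own (here `router`) simply gets the `default` block.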

What's NOT Included (Future Work)

Dataset analyzer agent (Junhao has a branch based on this one using this framework to implement it)
MCP (Model Context Protocol) integration - separated into galaxy-mcp branch
Workflow generation from natural language
Per-user API key management
Additional specialized agents

Testing

To make this easy, we plan to deploy on test and will allow folks to primarily exercise it that way.
There is the start of a comprehensive pytest suite with mocked and live LLM modes
mypy and lint checks are passing
Tested with Claude (Sonnet and Haiku) and local LLMs via LiteLLM

# Run mocked tests
pytest lib/galaxy_test/api/test_agents.py -k Mocked -v

# Run live LLM tests
GALAXY_TEST_ENABLE_LIVE_LLM=1 pytest lib/galaxy_test/api/test_agents.py -k LiveLLM -v
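
The environment-variable gate used for the live tests can be sketched as follows. This is an illustration using the standard library's `unittest.skipUnless` rather than the pytest markers the suite itself uses; with pytest, `pytest.mark.skipif` plays the same role. The helper name `live_llm_enabled` and the set of accepted truthy values are assumptions.

```python
import os
import unittest


def live_llm_enabled(environ=os.environ) -> bool:
    """True when GALAXY_TEST_ENABLE_LIVE_LLM is set to a truthy value."""
    # The accepted values here are an assumption for this sketch.
    value = environ.get("GALAXY_TEST_ENABLE_LIVE_LLM", "")
    return value.lower() in ("1", "true", "yes")


@unittest.skipUnless(live_llm_enabled(), "set GALAXY_TEST_ENABLE_LIVE_LLM=1 to run")
class TestLiveLLMExample(unittest.TestCase):
    def test_placeholder(self):
        # A real test would call an agent against a configured model here.
        self.assertTrue(live_llm_enabled())
```

Gating on an explicit opt-in keeps CI runs free of network calls and API costs unless someone asks for them.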

Notes

The GTN search database (gtn_search.db) needs to be built at deployment time - not included in repo (well, it is now, as John noticed, but it won't be)
Agents gracefully handle models without structured output support
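
One way to read "gracefully handle models without structured output support" is: try to interpret the reply as structured data, and fall back to plain text when that fails. The sketch below shows that pattern only; it is not the framework's implementation, and `ParsedReply`, `parse_reply`, and the `content`/`confidence` keys are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedReply:
    content: str
    confidence: Optional[str] = None
    structured: bool = False


def parse_reply(raw: str) -> ParsedReply:
    """Prefer a JSON object with a 'content' key; otherwise keep the raw text."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict) and "content" in data:
        return ParsedReply(str(data["content"]), data.get("confidence"), True)
    # Model replied with free text (or unexpected JSON): pass it through.
    return ParsedReply(raw.strip())
```

Callers can then branch on `structured` when they need fields such as a confidence level, and still show something useful when the model only returns prose.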

jmchilton · Dec 11, 2025

lib/galaxy/agents/tools.py

+        Returns:
+            List of tool categories
+        """
+        return [


I'd love to see these agents come in one at a time after the framework - I would feel better critiquing them in that context. If the cost of getting the infrastructure in is this though - I guess I'm fine - I would love to see progress and we could clean this up later.

Yes, that's totally fine, we just have to have some set of things to test initially. Plan is to prune it down and open separate PRs.

jmchilton · Dec 11, 2025

Binary file added (BIN, +24.8 MB): lib/galaxy/agents/gtn/data/gtn_search.db

I think this is still in and the PR description says it is out?

dannon · Dec 11, 2025

It is for now so folks don't have to build it. It'll be rebased out (probably along with the agent itself, to be included separately)

jmchilton · Dec 11, 2025

I've pushed 967ba75 that makes some changes to the tests. Please feel free to rebase or squash it as much as you'd like. I've removed some tests to get things green but I've tracked them in #21437.

There are many parts of the code that do unconditional imports of pydantic-ai - I think it really needs to be a hard dependency. For instance, pytest collection fails the way things are now (https://github.com/galaxyproject/galaxy/actions/runs/20148161399/job/57834129303?pr=21434). Also, I needed to install a pytest asyncio library to get these tests to run; that will probably need to be a dev dependency - I guess CI will make that clear.

I think the remaining tasks are:

Client linting.
Fix dependencies (pydantic-ai should be unconditional).
Mark the APIs as beta and for consumption only by the UI.
Drop GTN agent (restore in a separate PR after this is merged).
Drop tool recommendation agent (restore in a separate PR after this is merged).

- Move unit tests out of integration test file and into test/unit.
- Move remaining "API" tests into test/integration because they use ``self.app`` and mock out the API - they are what we would call integration tests in Galaxy.
- Have unit and integration tests use the same common environment variables, all now prefixed with GALAXY_TEST_ (we do this for object store stuff).
- Small tweak to orchestrator.py to make the unit tests pass.
- Clean up the mocked tests method to work cleanly without any infrastructure - some tests that used the LLM had gotten in there. Tracked the tests I removed as galaxyproject#21437.

Changed the agent endpoints to return proper AgentResponse types instead of Dict[str, Any], which gives us typed responses in the client. Updated execute_agent in the manager to return the schema object directly. Fixed type assertion in GalaxyWizard for accessing metadata.error.

- Add get_agent callable to GalaxyAgentDependencies to avoid circular imports when agents call other agents via tool functions. A bit more contrived than the late import it replaces, but may be strictly cleaner architecturally.
- Replace regex-based agent_type derivation with explicit class attribute requirement for clarity
- Improve _validate_model_capabilities docstring
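
The get_agent change described above is a form of dependency injection: a tool function receives a lookup callable through its dependencies instead of importing the other agent's module. A minimal sketch of the idea, with a stand-in `Deps` dataclass in place of GalaxyAgentDependencies and plain functions in place of agents (all names here other than `get_agent` are hypothetical):

```python
from dataclasses import dataclass
from typing import Callable, Dict

AgentFn = Callable[[str], str]


@dataclass
class Deps:
    """Stand-in for GalaxyAgentDependencies; only the get_agent field is shown."""

    get_agent: Callable[[str], AgentFn]


def error_analysis(query: str) -> str:
    return f"diagnosis for: {query}"


def make_delegating_tool(deps: Deps) -> AgentFn:
    # The tool resolves its collaborator through deps at call time, so this
    # module never has to import the error-analysis module directly.
    def delegate(query: str) -> str:
        return deps.get_agent("error_analysis")(query)

    return delegate


AGENTS: Dict[str, AgentFn] = {"error_analysis": error_analysis}
deps = Deps(get_agent=AGENTS.__getitem__)
tool = make_delegating_tool(deps)
print(tool("job 42 failed"))
```

Compared with a late import inside the function body, the lookup is explicit in the dependencies object and easy to replace with a stub in tests.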

github-actions bot added area/documentation area/testing area/API area/admin area/dependencies area/testing/api area/client labels Dec 11, 2025

dannon changed the title from "Agent-based AI" to "AI Agent Framework and ChatGXY 2.0" Dec 11, 2025

github-advanced-security bot found potential problems Dec 11, 2025

lib/galaxy/agents/orchestrator.py (dismissed)
lib/galaxy/agents/orchestrator.py (dismissed)
lib/galaxy/webapps/galaxy/api/chat.py (fixed)

jmchilton reviewed Dec 11, 2025

client/src/entry/analysis/router.js (resolved)

jmchilton reviewed Dec 11, 2025

client/webpack.config.js (outdated, resolved)

jmchilton reviewed Dec 11, 2025

lib/galaxy_test/api/test_agents.py (outdated, resolved)

jmchilton reviewed Dec 11, 2025

lib/galaxy_test/driver/driver_util.py (outdated, resolved)

mvdbeek reviewed Dec 11, 2025

lib/galaxy/agents/prompts/custom_tool_text.md (outdated, resolved)

dannon closed this Dec 11, 2025

dannon reopened this Dec 11, 2025

dannon force-pushed the agent-based-ai branch 3 times, most recently from e65c242 to 643ec7c, December 11, 2025 15:56

guerler reviewed Dec 12, 2025

lib/galaxy/agents/base.py (outdated, resolved)

guerler reviewed Dec 12, 2025

lib/galaxy/agents/base.py (resolved)

guerler reviewed Dec 12, 2025

lib/galaxy/agents/base.py (outdated, resolved)

dannon force-pushed the agent-based-ai branch 4 times, most recently from bcce7d5 to 306dae4, December 19, 2025 20:20

dannon added 22 commits December 22, 2025 16:40

Fix AgentResponse usage and mypy unused-ignore warnings

fe6e5c2

route_and_execute now returns AgentResponse instead of dict, fixed the dict-style access that was breaking agent queries. Also added unused-ignore to type comments so mypy passes whether or not pydantic_ai is installed.

Fix test imports for optional pydantic_ai dependency

886dac8

Added pytestmark_live_llm to unittest_utils for skipping live LLM tests. Added importorskip for pydantic_ai in unit tests so they skip gracefully when the optional dependency isn't installed.

Add unstable parameter for marking experimental API endpoints

d55b7e3

Mark AI/chat API endpoints as unstable

20f5915

Apply unstable marker to existing experimental endpoints to standardize things a bit.

34df7f4

Fix AgentResponse return type in chat.py and regenerate schema

521d8f2

Remove unused ConfidenceLevel import

ee05be1

Add explicit agent_type to all agent subclasses

358d0b4

Remove tool recommendation agent for separate PR

684bb54

Tool-rec code was too entangled across commits to cleanly rebase out, so just deleting the files and updating references instead. The agent-based-ai-tool-rec branch has the full implementation for a follow-up PR.

Remove more GTN agent references (also extracted separately)

541c6b0

The GTN agent was extracted to agent-based-ai-gtn branch previously, but references remained in tests and prompts. Cleaning those up now.

Add pydantic-ai as required dependency

c66ac4f

Move pydantic-ai from conditional to required dependencies since the agents code imports it unconditionally.

Don't expose raw exception details to users in chat API

7b8d9ae

Replace str(e) in user-facing responses with generic error messages. The detailed exception info is still logged server-side for debugging.

Re-solve dependencies with uv

ec77f44

Fixes protobuf version conflict for Python 3.9 (temporalio requires protobuf<6) and picks up a few minor version bumps.

Update client api schema.

2cde8d2

Make pydantic-ai a required dependency (remove optional checks)

c628cd1

Since pydantic-ai is now a required dependency, remove all the HAS_PYDANTIC_AI conditional checks and simplify the imports. Also fix ModelRequest/ModelResponse construction to use proper parts-based API.

Fix mypy type errors in agent and chat code

7f0775b

Use proper pydantic-ai types and APIs, fix list typing in chat history, correct Agent.run_sync() usage. Add generic type parameters to Agent class attributes and fix provider variable naming in base.py.

Fix _create_agent return types in agent subclasses

b9857d3

Match base class Agent[GalaxyAgentDependencies, Any] signature. Output types vary at runtime based on model structured output support.

Add type ignore for skipped test with outdated pydantic-ai API

ea116ac

Add pydantic-ai to galaxy-app package dependencies

4b79efd
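
The change in 7b8d9ae above (generic messages for users, full details in the server log) follows a common pattern, sketched here. This is not the code in chat.py; `safe_chat_response`, `GENERIC_ERROR`, and the response dictionary shape are hypothetical.

```python
import logging
from typing import Callable, Dict, Optional

log = logging.getLogger(__name__)

GENERIC_ERROR = "Sorry, something went wrong while processing your request."


def safe_chat_response(handler: Callable[[str], str], query: str) -> Dict[str, Optional[str]]:
    """Run handler(query); on failure log details and return a generic message."""
    try:
        return {"response": handler(query), "error": None}
    except Exception:
        # log.exception records the full traceback server-side; nothing from
        # the exception itself is copied into the user-facing response.
        log.exception("Chat request failed")
        return {"response": GENERIC_ERROR, "error": "internal_error"}
```

Returning `str(e)` directly can leak file paths, hostnames, or key fragments from provider errors, which is why the message shown to the user is fixed.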

dannon force-pushed the agent-based-ai branch from 504df8d to 4b79efd, December 22, 2025 21:40

dannon added 2 commits December 22, 2025 18:55

Revert mypy and social-auth-core version bumps

ec6e8a7

Add agents symlink to galaxy-app package

80c1652

dannon force-pushed the agent-based-ai branch from 7153967 to 80c1652, December 23, 2025 00:00

Fix mypy errors for optional huggingface imports

b73e2e8

dannon force-pushed the agent-based-ai branch from 31510eb to b73e2e8, December 23, 2025 00:50

dannon added 2 commits December 22, 2025 21:28

Fix mypy error for npz.zip.filelist optional access

5af2580

Add pytest-asyncio to packages/app test dependencies

505a93f
