A production-ready demo showcasing Pydantic AI core features for experienced developers. This agent converts natural language questions into SQL queries, executes them, and returns type-safe structured results.
New to this project? Start with docs/QUICKSTART.md for setup and validation. Then explore docs/NAVIGATION.md for a complete learning path.
The agent uses two registered tools to solve problems:
- `write_sql_draft()` - Generates SQL from natural language questions
- `execute_sql_query()` - Safely executes validated SQL and returns results
The agent autonomously decides which tools to use and chains them together.
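For orientation, here is a minimal sketch of a full run, assuming the `agent`, `AgentDeps`, and `Database` objects defined under `src/` (exact import paths and attribute names may differ from the repo):

```python
import asyncio

from src.agent import agent, AgentDeps   # assumed locations
from src.database import Database

async def main() -> None:
    deps = AgentDeps(db=Database(":memory:"))  # inject the SQLite layer
    result = await agent.run(
        "Who are the top 3 customers by lifetime value?", deps=deps
    )
    # result.output is a validated AnalysisResult
    # (older Pydantic AI versions expose it as result.data)
    print(result.output.summary)

asyncio.run(main())
```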
All agent responses are validated against Pydantic models:
```python
class AnalysisResult(BaseModel):
    user_question: str
    query: str                          # The executed SQL
    summary: str                        # Human-readable findings
    insights: list[str]                 # Key observations
    row_count: int                      # Result size
    sample_data: list[dict[str, Any]]   # Data preview
```

If the agent violates the schema, Pydantic raises a validation error and the agent retries automatically.
The agent receives dependencies through RunContext:
```python
@agent.tool
async def execute_sql_query(
    ctx: RunContext[AgentDeps],  # Type-safe context
    sql: str,
    description: str = "",
) -> QueryResult:
    result = ctx.deps.db.execute_query(sql)  # Access injected DB
    return QueryResult(...)
```

- Function signatures provide IDE auto-completion
- Pydantic validates all inputs/outputs
- Type hints catch errors at write-time, not runtime
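The dependency container itself can be a plain dataclass; a minimal sketch of one possible shape (the real definition lives in `src/agent.py` and may differ):

```python
from dataclasses import dataclass

from src.database import Database  # assumed location

@dataclass
class AgentDeps:
    db: Database  # the SQLite layer injected into every tool call
```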
| Document | Purpose |
|---|---|
| docs/QUICKSTART.md | 5-minute setup and validation — Start here! |
| docs/NAVIGATION.md | Project overview and learning paths |
| docs/MCP_SERVER_GUIDE.md | MCP pattern explanation and tool documentation |
| docs/MCP_IMPLEMENTATION_GUIDE.md | Deep-dive code walkthrough and architecture |
| docs/MCP_DEMO_SUMMARY.md | What was built: features and test results |
| docs/EXTENSIONS.md | Advanced features: FastAPI, caching, monitoring, multi-agent patterns |
| docs/CI_CD_SETUP.md | Travis CI & GitHub Actions pipeline configuration |
| docs/EXTERNAL_MCP_INTEGRATION.md | Integrating SaaS MCP endpoints (Context7, custom services) |
| docs/CONTEXT7_TUTORIAL.md | Quick tutorial: using Context7 documentation AI service |
```
pyagent/
├── src/
│   ├── __init__.py                  # Package definition
│   ├── models.py                    # Pydantic output schemas
│   ├── database.py                  # SQLite layer with sample data
│   ├── agent.py                     # Agent definition and tools
│   ├── mcp_server.py                # Local MCP analytics server
│   ├── external_mcp_client.py       # SaaS MCP client (Context7, custom)
│   └── main.py                      # CLI interface
├── tests/                           # Unit and integration tests
├── docs/                            # Complete documentation guides
│   ├── QUICKSTART.md                # Setup and validation (start here)
│   ├── NAVIGATION.md                # Overview and learning paths
│   ├── MCP_SERVER_GUIDE.md          # Tool documentation
│   ├── MCP_IMPLEMENTATION_GUIDE.md  # Deep-dive walkthrough
│   ├── MCP_DEMO_SUMMARY.md          # Feature summary
│   ├── EXTENSIONS.md                # Advanced patterns
│   ├── CI_CD_SETUP.md               # Pipeline configuration
│   └── EXTERNAL_MCP_INTEGRATION.md  # SaaS MCP integration
├── examples_mcp.py                  # MCP server usage examples
├── examples_external_mcp.py         # External/SaaS MCP integration examples
├── .github/workflows/               # GitHub Actions workflows
├── .travis.yml                      # Travis CI configuration
├── pyproject.toml                   # Dependencies and build config
├── .env.example                     # Environment template
├── .gitignore                       # Git ignore rules
├── LICENSE                          # MIT License
└── README.md                        # This file
```
```bash
# Clone the repository
cd /home/drew/code/pyagent

# Create virtual environment (optional but recommended)
python3.10 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e .
```

```bash
# Copy the template
cp .env.example .env
```

Edit `.env` and add your OpenAI API key (get it from https://platform.openai.com/api-keys):

```env
OPENAI_API_KEY=sk-proj-xxxxx...
```

Run the demo:

```bash
python -m src.main
```

The demo includes 4 pre-built examples:
Question: "Who are the top 3 customers by lifetime value? Include their email and country."
Expected Output:
- Generated SQL query with proper JOINs and ORDER BY
- Results showing customers sorted by spending
- Insights about high-value customer locations
Question: "How many orders are in each status (delivered, pending, shipped)? Show the count for each status."
Expected Output:
- SQL with GROUP BY clause (example sketched below)
- Order counts per status
- Distribution insights
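The generated SQL varies from run to run, but for this example it might resemble the query below (table and column names are assumptions about the sample schema):

```python
result = db.execute_query(
    "SELECT status, COUNT(*) AS order_count "
    "FROM orders GROUP BY status ORDER BY order_count DESC"
)
print(result["rows"])  # e.g. [{"status": "delivered", "order_count": ...}, ...]
```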
Question: "What is the average order value for products in each category?"
Expected Output:
- SQL with JOINs and aggregations
- Average spend per product category
- Category comparison insights
Question: "Which customers have made more than one purchase? Show their name and order count."
Expected Output:
- SQL with GROUP BY and HAVING (example sketched below)
- Repeat customer identification
- Pattern insights
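A plausible shape for the repeat-customer query (again, identifiers are assumptions about the sample schema):

```python
result = db.execute_query(
    "SELECT c.name, COUNT(o.id) AS order_count "
    "FROM customers c JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.id, c.name HAVING COUNT(o.id) > 1 "
    "ORDER BY order_count DESC"
)
```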
You can also ask your own questions about the data!
The demo includes a pre-populated SQLite database:
Customers:
- IDs 1-8 with names, emails, signup dates, countries
- Lifetime value tracking from $1,800 to $6,100

Products:
- Electronics: Laptop Pro, Wireless Mouse, USB-C Cable, Mechanical Keyboard
- Office: Desk Lamp, Ergonomic Chair, Standing Desk
- Stationery: Notebook Set

Orders:
- Various statuses: delivered, pending, shipped
- Links customers to products with quantities and amounts
- Date range: January 2024 - March 2024
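To poke at the sample data directly, you can list the tables with a standard SQLite catalog query (using the `Database` API shown later in this README):

```python
db = Database(":memory:")
tables = db.execute_query("SELECT name FROM sqlite_master WHERE type = 'table'")
print(tables["rows"])  # expect entries for customers, products, and orders
```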
```python
agent = Agent(
    model="anthropic:claude-3-5-sonnet-20241022",
    deps_type=AgentDeps,         # Injected dependencies
    output_type=AnalysisResult,  # Type-safe output
    system_prompt="""...""",     # Instructions
)

@agent.tool
async def execute_sql_query(
    ctx: RunContext[AgentDeps],
    sql: str,
) -> QueryResult:
    """Tool description for LLM - shown to agent"""
    result = ctx.deps.db.execute_query(sql)
    return QueryResult(...)
```

Key Points:
- `model=` specifies Claude (easily swap for GPT-4, Gemini, etc.)
- `deps_type=` provides type-safe dependency injection
- `output_type=` guarantees structured JSON output
- The `@agent.tool` decorator registers functions as LLM-callable tools
```python
db = Database(":memory:")  # In-memory SQLite
result = db.execute_query("SELECT...")

# Returns:
{
    "columns": ["id", "name", ...],
    "rows": [{"id": 1, "name": "Alice", ...}, ...],
    "row_count": 42
}
```

Safety Features:
- `validate_sql()` checks for SQL injection and dangerous operations (sketched below)
- Only SELECT queries allowed (no UPDATE/DELETE/DROP)
- Parsed with `sqlglot` for syntax validation
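A minimal sketch of what the `sqlglot` check might look like; the real `validate_sql()` in `src/database.py` may differ:

```python
import sqlglot
from sqlglot import exp

def validate_sql(sql: str) -> None:
    # parse_one raises a ParseError on invalid syntax
    parsed = sqlglot.parse_one(sql, read="sqlite")
    if not isinstance(parsed, exp.Select):
        raise ValueError("Only SELECT statements are allowed")
```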
```python
class AnalysisResult(BaseModel):
    user_question: str                  # Original question
    query: str                          # Generated SQL
    summary: str                        # Human-readable summary
    insights: list[str]                 # Key findings
    row_count: int                      # Number of rows
    sample_data: list[dict[str, Any]]   # Data preview
```

All fields are validated by Pydantic; the agent retries on validation failure.
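You can trigger the validation yourself to see what the agent sees on a malformed response:

```python
from pydantic import ValidationError

try:
    AnalysisResult(
        user_question="q", query="SELECT 1", summary="s",
        insights=[], row_count="not a number", sample_data=[],
    )
except ValidationError as err:
    print(err)  # row_count must be an int; this is where the agent retries
```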
Run the included tests:
```bash
# Install dev dependencies
pip install -e .[dev]

# Run tests
pytest tests/

# Run with coverage
pytest --cov=src tests/
```

Register additional tools with the same decorator pattern:

```python
@agent.tool
async def get_customer_graph(
    ctx: RunContext[AgentDeps],
    customer_id: int,
) -> dict[str, Any]:
    """Build network graph of customer connections."""
    # Implementation
    return graph_data
```

Change one line to use a different provider:
```python
# Using OpenAI (default)
agent = Agent(model="openai:gpt-4o", ...)

# Try with Anthropic Claude
agent = Agent(model="anthropic:claude-3-5-sonnet-20241022", ...)

# Try with Google Gemini
agent = Agent(model="google-gla:gemini-2.0-flash", ...)

# Try with local Ollama
agent = Agent(model="ollama:llama2", ...)
```

Add custom validation beyond the Pydantic model with a result validator:

```python
@agent.result_validator
def validate_insights(result: AnalysisResult) -> AnalysisResult:
    """Custom validation beyond Pydantic model."""
    if len(result.insights) == 0:
        raise ValueError("Must provide at least one insight")
    return result
```
- Function Calling: The agent sees tool definitions and decides when to call them
  - Tools are Python functions with docstrings
  - The agent sees docstrings and uses them as instructions
  - Return types become part of the agent's context
- Structured Outputs: Pydantic models ensure type-safe responses
  - The agent is constrained to return valid JSON
  - Automatic retry if validation fails
  - IDE auto-completion for result fields
- Type Safety: Full type annotations enable:
  - Write-time error detection
  - IDE auto-completion for tool parameters
  - Runtime validation of all inputs/outputs
MIT License - See LICENSE file
This demo covers essential patterns for production use:
✅ Error Handling: Query validation and graceful failures
✅ Type Safety: Full Pydantic validation of inputs/outputs
✅ Scalability: Async/await support for concurrent requests
✅ Observability: Structured outputs enable detailed logging
✅ Security: SQL injection prevention with query validation
For production, additionally consider:
- Model fallback strategies (use a cheaper model if the primary fails; see the sketch after this list)
- Rate limiting and quota management
- Usage tracking and cost monitoring
- Fine-tuned prompts for your specific domain
- Logging and observability integration
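For the fallback item above, recent Pydantic AI releases include a `FallbackModel`; a sketch, assuming it is available in your installed version (check the constructor against your version's docs):

```python
from pydantic_ai import Agent
from pydantic_ai.models.fallback import FallbackModel

# Try the primary model first; fall back to a cheaper one on failure
model = FallbackModel("openai:gpt-4o", "openai:gpt-4o-mini")
agent = Agent(model=model, deps_type=AgentDeps, output_type=AnalysisResult)
```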
Q: How does the agent decide which tool to use?
A: The LLM reads tool docstrings and signatures, then decides which tool is appropriate for the user's question.
Q: What happens if the SQL query is invalid?
A: The database validation layer returns an error, which is fed back to the agent for retry.
Q: Can I use this with my own database?
A: Yes! Modify the `Database` class in `src/database.py` to connect to your actual database.
Q: Does it work with other LLMs?
A: Yes! Change the `model` parameter in `src/agent.py` to any supported provider (OpenAI, Google, Groq, etc.).