Build your own AI SRE agents. The open source toolkit for the AI era ✨
Updated May 6, 2026 - Python
AI-powered SRE platform for automated incident investigation
A production-ready framework for composing AI agents from declarative TOML configuration, with MCP tool integration, RAG pipelines, and an OpenAI-compatible web API.
Open-source AI SRE agent that investigates production incidents using episodic memory and a Neo4j knowledge graph. 46 production skills. Self-hosted.
AI SRE tools for RCA, Incident Response, Cost-Saving, Infra management, DevOps and more
Unpage is the open source framework for building SRE agents with infrastructure context and secure access to any dev tool.
Multi-strategy RAG system achieving 74% Recall@10 on MultiHop-RAG. Combines RAPTOR hierarchical retrieval, knowledge graphs, HyDE, BM25, and Cohere neural reranking.
An open-source AI agent for infrastructure debugging.
A curated list of 100+ AI-powered tools, platforms, and resources for Site Reliability Engineering (SRE) — agents, incident management, observability, AIOps, chaos engineering, and more.
Synthetic production incidents and RCA evaluation for AI SRE agents.
Homebrew formulae that allow installation of Tracer tools through the Homebrew package manager.
AI-powered SRE Copilot for incident response, RAG knowledge operations, and autonomous AIOps diagnosis.
Build a vector database from scratch in C++. Compare HNSW, KD-Tree, and Brute Force search algorithms with a local RAG pipeline and web visualization.
Curate and explore a comprehensive list of AI-driven tools and resources tailored for Site Reliability Engineering tasks and challenges.