A comprehensive Kubernetes observability toolkit built on the Model Context Protocol (MCP) for Site Reliability Engineering.


martinimarcello00/k8s-observability-mcp

☸️ K8s Observability MCP

A small MCP server that lets you explore Kubernetes metrics, logs, traces, and service graph data through simple tools.

  • 🐍 Python 3.13
  • 📈 Prometheus
  • 🔎 Jaeger
  • 🕸️ Neo4j
  • ☸️ Kubernetes API

Features

  • 📊 Get pod/service metrics (instant and range)
  • 📜 Read pod/service logs with important-line filtering
  • 🔗 Service map from Neo4j (uses/depends)
  • 🧭 Cluster overview (pods and services)
  • 🧵 Trace summaries and details from Jaeger

Requirements

  • 🐍 Python 3.13+
  • 📦 Poetry
  • ☸️ Access to your cluster (kubeconfig on this machine)
  • 📈 Prometheus URL
  • 🔎 Jaeger URL
  • 🕸️ Neo4j URI, user, password

Setup

  • Install (Poetry)
    poetry install
  • Configure env
    cp .env.example .env
    # edit .env with your values

Run

poetry run python mcp_server.py

Then connect with your MCP client to use the tools.
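
How the connection is set up depends on the client. The sketch below assumes the server speaks MCP over stdio (this README does not state the transport) and that the client accepts an `mcpServers`-style JSON configuration, as several desktop MCP clients do. The entry name `k8s-observability` is a hypothetical label.

```python
import json


def build_client_config() -> dict:
    """Build an mcpServers-style entry that starts the server with Poetry.

    Assumptions: stdio transport, and a client that understands this shape.
    The client must launch the command from the repository root so that
    Poetry finds pyproject.toml and mcp_server.py; how the working directory
    is set is client-specific and not shown here.
    """
    return {
        "mcpServers": {
            # Hypothetical label; any unique name should work.
            "k8s-observability": {
                "command": "poetry",
                "args": ["run", "python", "mcp_server.py"],
            }
        }
    }


# JSON text that could be pasted into a client's configuration file.
config_json = json.dumps(build_client_config(), indent=2)
```

The command and arguments mirror the Run step above, so the client starts the server exactly as you would from a shell.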

Tools

🔍 Kubernetes Resource Inspection

  • get_pods_from_service(service)

    • Returns all pods belonging to a specific service
    • Shows pod names and current status (Running, Pending, etc.)
  • get_cluster_pods_and_services()

    • Comprehensive cluster overview
    • Lists all pods and services with counts
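
Kubernetes links a Service to its pods through the Service's label selector. The sketch below illustrates that matching rule with plain dictionaries standing in for API objects. Whether `get_pods_from_service` is implemented this way is an assumption, and the field names (`labels`, `name`, `status`) are hypothetical.

```python
def pods_for_selector(selector: dict[str, str], pods: list[dict]) -> list[dict]:
    """Return name and status for pods whose labels contain every selector pair."""
    if not selector:
        # A Service without a selector does not pick pods by label.
        return []
    matched = []
    for pod in pods:
        labels = pod.get("labels", {})
        if all(labels.get(key) == value for key, value in selector.items()):
            matched.append({"name": pod["name"], "status": pod["status"]})
    return matched


# Illustrative data only; these pods do not come from a real cluster.
sample_pods = [
    {"name": "frontend-1", "status": "Running", "labels": {"app": "frontend"}},
    {"name": "frontend-2", "status": "Pending", "labels": {"app": "frontend"}},
    {"name": "cart-1", "status": "Running", "labels": {"app": "cart"}},
]
frontend_pods = pods_for_selector({"app": "frontend"}, sample_pods)
```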

📊 Metrics & Observability

  • get_metrics(resource_name, resource_type)

    • Retrieves instant Prometheus metrics for a pod or service
    • Parameters:
      • resource_name: The exact name of the Kubernetes resource
      • resource_type: Either "pod" or "service"
    • Returns CPU, memory, network, and thread metrics, plus container specifications
  • get_metrics_range(resource_name, resource_type, time_range_minutes)

    • Retrieves historical Prometheus metrics over a specified time range
    • Parameters:
      • resource_name: The exact name of the Kubernetes resource
      • resource_type: Either "pod" or "service"
      • time_range_minutes: Historical lookback in minutes (minimum 1)
  • get_logs(resource_name, resource_type, tail=100, important=True)

    • Retrieves pod or service logs with optional keyword filtering
    • Parameters:
      • resource_name: The exact name of the Kubernetes resource
      • resource_type: Either "pod" or "service"
      • tail: Number of recent log lines to retrieve (default: 100)
      • important: If true, filter for ERROR, WARN, CRITICAL keywords (default: true)

🔗 Service Dependencies & Graph

  • get_services_used_by(service)

    • Returns downstream services called by the given service
    • Shows service dependency chain (who calls whom)
  • get_dependencies(service)

    • Retrieves infrastructure dependencies for a service
    • Includes databases, caches, message queues, etc.
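
The Notes section names two relationship types in the Neo4j graph, CALLS and USES. As a sketch, the two tools above could map onto Cypher queries like the ones built here. The `name` property, the query shape, and the helper name `graph_query` are assumptions; the actual schema is described in service-graph/README.md.

```python
ALLOWED_RELATIONSHIPS = {"CALLS", "USES"}


def graph_query(service: str, relationship: str) -> tuple[str, dict]:
    """Return Cypher text and parameters for one hop from `service`."""
    # The relationship type goes into the query text, so it is checked
    # against an allow-list first; the service name stays a parameter.
    if relationship not in ALLOWED_RELATIONSHIPS:
        raise ValueError(f"unsupported relationship: {relationship}")
    query = (
        f"MATCH (s {{name: $service}})-[:{relationship}]->(d) "
        "RETURN d.name AS name ORDER BY name"
    )
    return query, {"service": service}


# Downstream services (CALLS) and infrastructure dependencies (USES)
# for a hypothetical service named "frontend".
calls_query, calls_params = graph_query("frontend", "CALLS")
uses_query, uses_params = graph_query("frontend", "USES")
```

The returned pair can be passed to whichever Neo4j client you use.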

🧵 Distributed Tracing

  • get_traces(service_name, only_errors=False)

    • Retrieves traces for a specific service from Jaeger
    • Parameters:
      • service_name: The name of the service to retrieve traces for
      • only_errors: If true, return only traces containing errors (default: false)
    • Returns: traceID, latency_ms, has_error, service sequence
  • get_trace(trace_id)

    • Retrieves detailed information for a specific trace by ID
    • Parameters:
      • trace_id: The unique trace ID to retrieve
    • Includes all spans with timestamps, durations, tags, and errors
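
The summary fields returned by `get_traces` can be derived from a trace in the JSON shape that Jaeger's query service returns (spans with `startTime` and `duration` in microseconds, plus a `processes` map). The sketch below shows one way to compute them. The function name, the `service_sequence` key, and the rule for detecting errors (an `error` tag set to true) are assumptions, not the server's confirmed logic.

```python
def summarize_trace(trace: dict) -> dict:
    """Reduce a Jaeger-style trace to ID, latency, error flag, and service order."""
    spans = sorted(trace.get("spans", []), key=lambda span: span["startTime"])
    if not spans:
        raise ValueError("trace has no spans")
    processes = trace["processes"]

    # Latency: earliest span start to latest span end, microseconds to ms.
    start = spans[0]["startTime"]
    end = max(span["startTime"] + span["duration"] for span in spans)

    # Services in order of first activity, collapsing consecutive repeats.
    sequence: list[str] = []
    for span in spans:
        service = processes[span["processID"]]["serviceName"]
        if not sequence or sequence[-1] != service:
            sequence.append(service)

    has_error = any(
        tag.get("key") == "error" and tag.get("value") in (True, "true")
        for span in spans
        for tag in span.get("tags", [])
    )
    return {
        "traceID": trace["traceID"],
        "latency_ms": (end - start) / 1000,
        "has_error": has_error,
        "service_sequence": sequence,
    }


# Illustrative trace; IDs, services, and timings are made up.
sample_trace = {
    "traceID": "abc123",
    "processes": {"p1": {"serviceName": "frontend"}, "p2": {"serviceName": "cart"}},
    "spans": [
        {"processID": "p2", "startTime": 2000, "duration": 2000,
         "tags": [{"key": "error", "type": "bool", "value": True}]},
        {"processID": "p1", "startTime": 1000, "duration": 5000, "tags": []},
    ],
}
summary = summarize_trace(sample_trace)
```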

Notes

  • Uses your default kubeconfig. Set TARGET_NAMESPACE in .env to scope queries.

  • 🕸️ Service graph docs: see service-graph/README.md for how the Neo4j service graph is built (Jaeger CALLS + static USES), how to load it, and an image of the resulting graph.
