GitHub - schmitech/orbit: A self-hosted AI infrastructure for private RAG and multi-model applications.

ORBIT — Open Retrieval-Based Inference Toolkit

Self-hosted AI infrastructure for private RAG and multi-model applications.

Tutorial | Docker Guide | Cookbook | Docs

Teams want AI connected to real business data without sending everything to a SaaS vendor, rewriting applications for every model provider, or maintaining fragile glue code between LLMs, databases, APIs, and files.

ORBIT gives you one OpenAI-compatible gateway for private RAG, model routing, retrieval adapters, conversations, tools, and production controls. Run it on your infrastructure, connect the systems you already use, and choose local or hosted models per workload.

You can build:

Private RAG over documents, databases, APIs, and internal knowledge sources.
OpenAI-compatible applications that can switch between local models and hosted providers.
Agent and MCP tools that expose controlled access to business data and actions.
Agentic tool-calling loops over external MCP servers — filesystem, GitHub, Slack, SharePoint, Postgres, and more — invoked as a skill during any conversation.
AI media generation pipelines — images and videos from text prompts — wired into the same adapter and conversation system.

A Typical ORBIT Workflow

Connect Postgres, internal PDFs, and a REST API.
Run ORBIT on your own infrastructure.
Query those sources through one OpenAI-compatible API.
Keep sensitive data under your control.
Switch between local and hosted models without changing your app.

Get Running In 60 Seconds

git clone https://github.com/schmitech/orbit.git && cd orbit/docker
docker compose up -d

Then test the OpenAI-compatible chat API:

curl -X POST http://localhost:3000/v1/chat \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: default-key' \
  -H 'X-Session-ID: local-test' \
  -d '{
    "messages": [{"role": "user", "content": "Summarize ORBIT in one sentence."}],
    "stream": false
  }'

ORBIT listens on port 3000. The admin panel is available at localhost:3000/admin with the default login admin / admin123.
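The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the same endpoint, headers, and default key shown in the curl example; the helper names build_chat_request and send are illustrative and not part of ORBIT. The response schema is not described here, so the sketch returns the parsed JSON without interpreting it.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
ORBIT_URL = "http://localhost:3000/v1/chat"


def build_chat_request(prompt, api_key="default-key", session_id="local-test"):
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        ORBIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,
            "X-Session-ID": session_id,
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON body (needs a running ORBIT)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the containers are up:
#   print(send(build_chat_request("Summarize ORBIT in one sentence.")))
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before anything goes over the network.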

For GPU acceleration:

docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

Adapter wiring and sample domains live in config/adapters/ and examples/intent-templates/. See the full Docker Guide for GPU mode, volumes, and configuration.

Demos

multimodal-image-generation.mp4

Upload PDFs, documents, and images, then ask questions across all of them in a single conversation. Context is preserved across turns, and local model deployments keep files and queries on your infrastructure.

hr-github.mp4

Query structured databases with natural language and generate dynamic charts via cross-adapter image generation skills.

mcp-tool-demo.mp4

MCP agent skill: ORBIT connects to an external MCP filesystem server, runs a multi-step tool-calling loop, and returns a grounded answer.

es-logs.mp4

Ask application logs plain-English questions and have ORBIT translate them into Elasticsearch Query DSL for error analysis, latency triage, and operational summaries.

sensitive-data.mp4

Private local AI model analyzing sensitive PII data offline.

svg-rendering.mp4

Native SVG rendering.

second-opinion.mp4

Runtime model switching during a conversation, with the chat history carried over to the new model.

business-analytics-demo.mp4

Conversation threading with multi-turn follow-ups on the same result set. Source: examples/intent-templates/duckdb-intent-template/examples/analytics.

image-skill.mp4

Cross-adapter skills for tasks such as image generation during a conversation.

chart-regression.mp4

OrbitChat rendering live charts from LLM output with no client-side charting code required.

image-skills.mp4

Image generation as a cross-adapter skill with conversation and thread context.

puppy.mp4

Text-to-video generation using Google Veo 2, invoked as a cross-adapter skill. The prompt is automatically enriched with motion, camera movement, and lighting detail before generation. The video is persisted server-side and delivered by URL, so raw bytes are not embedded in the chat response.

Why ORBIT?

Common problem	What ORBIT provides
One SDK per provider, with rewrites when you switch	One OpenAI-compatible API across local and hosted providers
Separate systems for inference, retrieval, tools, and chat history	One gateway for model calls, adapters, tools, conversations, and clients
RAG limited to vector search over clean documents	Retrieval over SQL, NoSQL, HTTP, GraphQL, files, web content, and vector stores
Glue scripts between prompts and business systems	Intent adapters, composite adapters, diagnostics, and reusable templates
Privacy-sensitive data sent through third-party services by default	Self-hosted deployment with local models, local embeddings, API keys, RBAC, audit logs, and rate limits
Provider failures cascading into application failures	Circuit breakers, failover, parallel fan-out, and quota-aware throttling

Core Capabilities

AI Gateway

Route requests across local and hosted models.
Use one OpenAI-compatible chat API with existing clients and SDKs.
Stream responses with failover, moderation hooks, and rate limiting.
Connect through the web chat, Node SDK, or OpenAI-style clients.
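The failover behavior mentioned above can be pictured as trying providers in order until one succeeds. This is a conceptual sketch, not ORBIT's routing code: ProviderError, complete_with_failover, and the two stand-in providers are hypothetical names introduced for illustration.

```python
class ProviderError(Exception):
    """Raised by a provider callable when it cannot serve the request."""


def complete_with_failover(prompt, providers):
    """Try each (name, callable) pair in order and return the first success.

    Returns (provider_name, response_text); raises RuntimeError if all fail.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this provider failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_local(prompt):
    """Stand-in for a local model that is currently unavailable."""
    raise ProviderError("local model not loaded")


def hosted(prompt):
    """Stand-in for a hosted provider that answers normally."""
    return f"echo: {prompt}"


result = complete_with_failover("hello", [("local", flaky_local), ("hosted", hosted)])
print(result)  # ('hosted', 'echo: hello')
```

The caller sees a single response and the name of the provider that produced it, which is the property that lets an application stay unchanged when a provider is down.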

Retrieval And Adapters

Retrieve context from databases, APIs, files, web content, and vector stores.
Use intent-based retrieval for natural-language questions over structured data.
Fan one prompt across multiple sources with composite adapters.
Build and debug data-backed assistants with template diagnostics and autocomplete.
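Fanning one prompt across several sources, as composite adapters are described as doing, can be sketched with a thread pool. This is an illustration under stated assumptions: fan_out and the three stand-in sources are hypothetical, and real adapters would query databases or APIs rather than return fixed rows.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(question, sources, max_workers=4):
    """Run every source callable on the same question in parallel.

    `sources` maps a source name to a callable that takes the question and
    returns a list of rows. A failing source is recorded as an error and
    does not abort the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                results[name] = {"ok": True, "rows": future.result()}
            except Exception as exc:
                results[name] = {"ok": False, "error": str(exc)}
    return results


def failing_source(question):
    """Stand-in for a source that times out."""
    raise TimeoutError("upstream timed out")


sources = {
    "orders_db": lambda q: [{"order_id": 1, "status": "shipped"}],
    "tickets_api": lambda q: [{"ticket": "T-7", "state": "open"}],
    "legacy_system": failing_source,
}
merged = fan_out("What is the status of order 1?", sources)
for name, outcome in merged.items():
    print(name, outcome)
```

Recording per-source errors instead of raising keeps a slow or broken backend from hiding the answers the other sources did return.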

Private RAG And Conversations

Run private RAG over PDFs, documents, images, manuals, contracts, and knowledge bases.
Reuse retrieved context through conversation threading and cached datasets.
Handle multilingual conversations across 100+ languages.
Keep deployments self-hosted for privacy-sensitive environments.

Media Generation

Generate images from text prompts using DALL-E, Stability AI, and other providers.
Generate videos from text prompts using Google Veo 2 — prompts are automatically enriched with motion, camera movement, and lighting detail before generation.
Generated media is persisted server-side and delivered via stable URLs, so large binary payloads stay out of API responses.
Image and video generation follow the same adapter type system as every other capability — no special-casing required.
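One way to get stable URLs for persisted media is to name each file after a hash of its contents. ORBIT's actual storage layout is not described here, so everything in this sketch is an assumption: persist_media, the /media/ URL prefix, and the temporary directory are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def persist_media(data, extension, media_root, base_url):
    """Write bytes under a content-hash filename and return a URL for them."""
    digest = hashlib.sha256(data).hexdigest()
    filename = f"{digest}.{extension}"
    path = Path(media_root) / filename
    if not path.exists():
        # Identical content maps to the same file, so it is written only once.
        path.write_bytes(data)
    return f"{base_url.rstrip('/')}/{filename}"


with tempfile.TemporaryDirectory() as root:
    base = "http://localhost:3000/media/"  # hypothetical URL prefix
    url_a = persist_media(b"fake-png-bytes", "png", root, base)
    url_b = persist_media(b"fake-png-bytes", "png", root, base)
    same_url = url_a == url_b
    stored_files = sorted(p.name for p in Path(root).iterdir())

print(same_url, len(stored_files))  # True 1
```

A response then carries only the short URL, and the client fetches the file separately when it needs to display it.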

Tools, Agents, And Production Controls

Expose controlled tools through MCP for agent clients.
Connect outward to external MCP servers and run a multi-step tool-calling loop inside a conversation — the model calls tools, ORBIT executes them, feeds results back, and repeats until a final answer is produced.
Invoke specialized adapters during a conversation with cross-adapter skills.
Operate with API keys, RBAC, quotas, audit logs, rate limits, and circuit breakers.
Add voice assistant support through audio adapters.

Who Is ORBIT For?

Developers building internal AI apps that need real company data, not isolated chat.
Teams that need private RAG over documents, databases, APIs, and operational systems.
Companies avoiding long-term lock-in to a single LLM provider or hosted AI platform.
Engineers connecting AI to SQL, NoSQL, REST, GraphQL, files, and document stores.
Builders who want OpenAI-compatible APIs with self-hosted control.

ORBIT is probably more than you need if you only want a thin wrapper around one LLM provider.

What Makes ORBIT Different?

ORBIT is not only a model router. It handles the layers that usually become custom infrastructure in production RAG systems: retrieval, tools, adapters, conversations, access control, and operational safeguards.

Retrieval beyond vector search: use intent templates and adapters for structured databases, APIs, files, web content, and vector stores. See: Intent SQL RAG.
Data source support: query SQL, MongoDB, Elasticsearch, REST, GraphQL, DuckDB, files, and composite sources through one gateway. See: Composite adapters.
Local and hosted models: run private workloads on Ollama, llama.cpp, vLLM, or other local providers, while still supporting hosted LLMs where appropriate.
Production controls included: use API keys, RBAC, quotas, audit logging, moderation, rate limits, and circuit breakers. See: Rate limiting.
Agent-ready protocol support: expose ORBIT-backed chat, RAG, and adapter tools through MCP for agent clients. See: MCP / OpenClaw walkthrough.
MCP client for agentic tool calling: ORBIT can also connect outward to external MCP servers (filesystem, GitHub, Slack, SharePoint, Postgres, and more), discover their tools, and run a bounded multi-step tool-calling loop. This is exposed as the mcp-agent skill on any adapter and works with OpenAI, Anthropic, Gemini, xAI, and llama.cpp. See: MCP agent skill.
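Among the production controls listed above, a circuit breaker is the one that stops a failing provider from being called over and over. The class below is a generic sketch of the pattern, not ORBIT's code; CircuitBreaker, its parameters, and the fake clock are all hypothetical.

```python
import time


class CircuitBreaker:
    """Open after repeated failures, then allow calls again after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Once the cool-down has passed, calls are allowed again (half-open);
        # a further failure re-opens the breaker and restarts the cool-down.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()


now = [0.0]  # fake clock so the example runs instantly
breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0, clock=lambda: now[0])
breaker.record_failure()
breaker.record_failure()
blocked = not breaker.allow()  # breaker is open right after the second failure
now[0] = 10.0
retry_allowed = breaker.allow()  # cool-down elapsed, a trial call may proceed
print(blocked, retry_allowed)  # True True
```

Injecting the clock keeps the example deterministic; in a real service the default time.monotonic would be used.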

Example Use Cases

Use case	Start here
Chat with a local model through an OpenAI-compatible API	Step-by-step tutorial
Ask Postgres, MySQL, MongoDB, DuckDB, or Elasticsearch questions in natural language	Database copilot
Query SQL + NoSQL + REST APIs in one prompt	Composite adapters
Upload files and get grounded answers	File-upload RAG
Deploy a private AI gateway for regulated data	Private gateway cookbook
Run ORBIT as an MCP tool server for agents	MCP / OpenClaw walkthrough
Connect to external MCP servers and run agentic tool-calling loops	MCP agent skill
Build a full-duplex voice assistant	PersonaPlex voice assistant
Generate images and videos from text prompts	Cross-adapter skills

Architecture And Adapters

ORBIT sits between clients, models, and data sources. Clients call the OpenAI-compatible API, ORBIT authenticates and routes the request, adapters retrieve or act on external data, and the selected model generates the response with the retrieved context.
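The request flow in the paragraph above (authenticate, route, retrieve, generate) can be sketched as one small function. The data structures here are assumptions made for illustration: handle_request, the key-to-adapter mapping, and the stand-in retriever and model are hypothetical and do not reflect ORBIT's internal types.

```python
def handle_request(api_key, question, api_keys, adapters, models):
    """Authenticate, route to an adapter, retrieve context, then generate."""
    # 1. Authenticate: an unknown key is rejected before any work happens.
    adapter_name = api_keys.get(api_key)
    if adapter_name is None:
        raise PermissionError("unknown API key")
    # 2. Route: the key selects the adapter (assumed one adapter per key).
    adapter = adapters[adapter_name]
    # 3. Retrieve: the adapter fetches context for the question.
    context = adapter["retrieve"](question)
    # 4. Generate: the adapter's configured model answers using that context.
    model = models[adapter["model"]]
    return model(question, context)


api_keys = {"default-key": "simple-chat"}
adapters = {
    "simple-chat": {
        "retrieve": lambda q: ["ORBIT is a self-hosted gateway."],
        "model": "local",
    }
}
models = {"local": lambda q, ctx: f"{q} -> {len(ctx)} context item(s)"}

reply = handle_request("default-key", "What is ORBIT?", api_keys, adapters, models)
print(reply)  # What is ORBIT? -> 1 context item(s)
```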

Layer	Coverage
Clients and protocols	Web chat, Node SDK, OpenAI-compatible SDKs, MCP
Model routing	Hosted providers, local providers, streaming, failover, runtime model selection
Retrieval adapters	SQL, NoSQL, REST, GraphQL, files, web content, vector stores, composite adapters
RAG workflow	Intent templates, diagnostics, autocomplete, cached datasets, conversation threading
Media generation	Images (DALL-E, Stability AI), videos (Google Veo 2), server-side persistence, URL delivery
MCP client / agent	Connect to external MCP servers, discover tools, run bounded multi-step tool-calling loops as a skill
Operations	API keys, RBAC, audit logs, quotas, rate limits, moderation, circuit breakers, admin UI

Compatibility overview

ORBIT supports:

Local and hosted LLM providers.
SQL and NoSQL databases.
REST and GraphQL APIs.
File, web, and vector-based retrieval.
Local and hosted embedding providers.
Reranking, moderation, and guardrail integrations.
OpenAI-compatible clients and MCP-compatible tools.

See the Documentation and Cookbook for full setup details and integration coverage.

Deployment Options

Docker Compose

git clone https://github.com/schmitech/orbit.git && cd orbit/docker
docker compose up -d

This starts ORBIT with Ollama and SmolLM2, pulls models automatically, and exposes the API on port 3000. The web admin UI is at /admin on the same host.

Connect orbitchat from your host:

ORBIT_ADAPTER_KEYS='{"simple-chat":"default-key"}' npx orbitchat
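The ORBIT_ADAPTER_KEYS value above is a JSON object mapping an adapter name to its API key. When there are several adapters, generating the assignment avoids hand-quoting mistakes. This is a small convenience sketch; adapter_keys_env is a hypothetical helper, not part of orbitchat.

```python
import json
import shlex


def adapter_keys_env(mapping):
    """Return a shell assignment for ORBIT_ADAPTER_KEYS from an adapter-to-key dict."""
    # Compact separators reproduce the spacing used in the command above.
    value = json.dumps(mapping, separators=(",", ":"))
    return f"ORBIT_ADAPTER_KEYS={shlex.quote(value)}"


assignment = adapter_keys_env({"simple-chat": "default-key"})
print(assignment)  # ORBIT_ADAPTER_KEYS='{"simple-chat":"default-key"}'
```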

Pre-Built Image

docker pull schmitech/orbit:basic
docker run -d --name orbit-basic -p 3000:3000 schmitech/orbit:basic

If Ollama runs on your host, add -e OLLAMA_HOST=host.docker.internal:11434 so the container can reach it. The basic image includes simple-chat only.

Release Tarball

Download the current release from GitHub Releases, then install:

curl -LO https://github.com/schmitech/orbit/releases/download/v2.7.1/orbit-2.7.1.tar.gz
tar -xzf orbit-2.7.1.tar.gz && cd orbit-2.7.1
cp env.example .env && ./install/setup.sh
./bin/orbit.sh start && tail -f ./logs/orbit.log

orbit-admin.mp4

The ORBIT Admin Panel provides real-time monitoring of system health, adapter states, and inference performance.

Clients

Client	Description
Web Chat	React chat UI
Node SDK	Node client, or use any OpenAI-compatible SDK

Documentation

Step-by-Step Tutorial — Chat with your own data in minutes.
Cookbook — Recipes for database copilots, private gateways, file RAG, voice assistants, fault tolerance, and MCP agents.
Adapter Configuration — Configure adapters, models, and routing behavior.
Server Documentation — API, server setup, and MCP protocol details.
Docker Guide — Docker Compose, GPU mode, volumes, and configuration.

Roadmap

More ready-to-run adapter templates for common business systems.
More MCP recipes for agent platforms and desktop clients.
Persistent MCP server connections and per-server circuit breakers for the MCP agent skill.
Human-in-the-loop tool approval for write operations in the MCP agent loop.
Expanded evaluation, tracing, and observability workflows.
Admin UI improvements for configuration, diagnostics, and operations.
Additional deployment templates for private cloud and regulated environments.

Contributing

Contributions are welcome. Check the issues for good first tasks, or open a new issue to discuss your idea.

ORBIT is licensed under Apache 2.0, so you can build commercial products on top of it. If ORBIT helps you build private RAG, agent tools, or AI gateway infrastructure, a star helps others find the project and follow its development.

License

Apache 2.0 — see LICENSE.
