schmitech/orbit


ORBIT — Open Retrieval-Based Inference Toolkit

Self-hosted AI infrastructure for private RAG and multi-model applications.



Tutorial  |  Docker Guide  |  Cookbook  |  Docs

Maintained by Remsy Schmilinsky

Teams want AI connected to real business data without sending everything to a SaaS vendor, rewriting applications for every model provider, or maintaining fragile glue code between LLMs, databases, APIs, and files.

ORBIT gives you one OpenAI-compatible gateway for private RAG, model routing, retrieval adapters, conversations, tools, and production controls. Run it on your infrastructure, connect the systems you already use, and choose local or hosted models per workload.

You can build:

  • Private RAG over documents, databases, APIs, and internal knowledge sources.
  • OpenAI-compatible applications that can switch between local models and hosted providers.
  • Agent and MCP tools that expose controlled access to business data and actions.
  • Agentic tool-calling loops over external MCP servers — filesystem, GitHub, Slack, SharePoint, Postgres, and more — invoked as a skill during any conversation.
  • AI media generation pipelines — images and videos from text prompts — wired into the same adapter and conversation system.

A Typical ORBIT Workflow

  1. Connect Postgres, internal PDFs, and a REST API.
  2. Run ORBIT on your own infrastructure.
  3. Query those sources through one OpenAI-compatible API.
  4. Keep sensitive data under your control.
  5. Switch between local and hosted models without changing your app.

Get Running In 60 Seconds

git clone https://github.com/schmitech/orbit.git && cd orbit/docker
docker compose up -d

Then test the OpenAI-compatible chat API:

curl -X POST http://localhost:3000/v1/chat \
  -H 'Content-Type: application/json' \
  -H 'X-API-Key: default-key' \
  -H 'X-Session-ID: local-test' \
  -d '{
    "messages": [{"role": "user", "content": "Summarize ORBIT in one sentence."}],
    "stream": false
  }'
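The same request can be sent from Python using only the standard library. This is a minimal sketch that reuses the endpoint, headers, and default key from the curl command above; `build_chat_request` and `send_chat` are illustrative helper names, not part of any ORBIT SDK, and the response is returned as parsed JSON because its exact shape is not described here.

```python
import json
import urllib.request

ORBIT_URL = "http://localhost:3000/v1/chat"  # endpoint taken from the curl example


def build_chat_request(prompt, api_key="default-key", session_id="local-test"):
    """Build (but do not send) the POST request used in the curl example."""
    body = {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        ORBIT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,
            "X-Session-ID": session_id,
        },
        method="POST",
    )


def send_chat(prompt, timeout=60):
    """Send the request to a running ORBIT instance and return the parsed JSON."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Docker stack running, `send_chat("Summarize ORBIT in one sentence.")` should perform the same call as the curl command.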

ORBIT listens on port 3000. The admin panel is available at localhost:3000/admin with the default login admin / admin123.

For GPU acceleration:

docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

Adapter wiring and sample domains live in config/adapters/ and examples/intent-templates/. See the full Docker Guide for GPU mode, volumes, and configuration.


Demos

multimodal-image-generation.mp4

Upload PDFs, documents, and images, then ask questions across all of them in a single conversation. Context is preserved across turns, and local model deployments keep files and queries on your infrastructure.

hr-github.mp4

Query structured databases with natural language and generate dynamic charts via cross-adapter image generation skills.

mcp-tool-demo.mp4

MCP agent skill: ORBIT connects to an external MCP filesystem server, runs a multi-step tool-calling loop, and returns a grounded answer.

es-logs.mp4

Ask application logs plain-English questions and have ORBIT translate them into Elasticsearch Query DSL for error analysis, latency triage, and operational summaries.


sensitive-data.mp4

Private local AI model analyzing sensitive PII data offline.

svg-rendering.mp4

Native SVG rendering.

second-opinion.mp4

Runtime model switching during a conversation, with the chat history carried over to the new model.

business-analytics-demo.mp4

Conversation threading with multi-turn follow-ups on the same result set. Source: examples/intent-templates/duckdb-intent-template/examples/analytics.

image-skill.mp4

Cross-adapter skills for tasks such as image generation during a conversation. Learn more.

chart-regression.mp4

OrbitChat rendering live charts from LLM output with no client-side charting code required.

image-skills.mp4

Image generation as a cross-adapter skill with conversation and thread context.

puppy.mp4

Text-to-video generation using Google Veo 2, invoked as a cross-adapter skill. The prompt is automatically enriched with motion, camera movement, and lighting detail before generation. Video is persisted server-side and streamed back without sending raw bytes over the wire.


Why ORBIT?

| Common problem | What ORBIT provides |
| --- | --- |
| One SDK per provider, with rewrites when you switch | One OpenAI-compatible API across local and hosted providers |
| Separate systems for inference, retrieval, tools, and chat history | One gateway for model calls, adapters, tools, conversations, and clients |
| RAG limited to vector search over clean documents | Retrieval over SQL, NoSQL, HTTP, GraphQL, files, web content, and vector stores |
| Glue scripts between prompts and business systems | Intent adapters, composite adapters, diagnostics, and reusable templates |
| Privacy-sensitive data sent through third-party services by default | Self-hosted deployment with local models, local embeddings, API keys, RBAC, audit logs, and rate limits |
| Provider failures cascading into application failures | Circuit breakers, failover, parallel fan-out, and quota-aware throttling |

Core Capabilities

AI Gateway

  • Route requests across local and hosted models.
  • Use one OpenAI-compatible chat API with existing clients and SDKs.
  • Stream responses with failover, moderation hooks, and rate limiting.
  • Connect through the web chat, Node SDK, or OpenAI-style clients.
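The failover behaviour mentioned in the list above can be pictured as trying providers in order until one succeeds. The sketch below is a generic illustration of that pattern, not ORBIT's implementation; `ProviderError`, `complete_with_failover`, and the two stub providers are hypothetical names introduced here.

```python
class ProviderError(Exception):
    """Raised by a provider callable when it cannot serve the request."""


def complete_with_failover(prompt, providers):
    """Try each (name, callable) pair in order and return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky_local(prompt):
    """Stub for a local model that is currently unavailable."""
    raise ProviderError("local model unavailable")


def hosted_stub(prompt):
    """Stub for a hosted provider that simply echoes the prompt."""
    return f"echo: {prompt}"
```

Calling `complete_with_failover("hi", [("local", flaky_local), ("hosted", hosted_stub)])` skips the failing local stub and returns the hosted result along with the name of the provider that served it.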

Retrieval And Adapters

  • Retrieve context from databases, APIs, files, web content, and vector stores.
  • Use intent-based retrieval for natural-language questions over structured data.
  • Fan one prompt across multiple sources with composite adapters.
  • Build and debug data-backed assistants with template diagnostics and autocomplete.
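Fanning one prompt across several sources, as composite adapters do, amounts to querying each source concurrently and collecting the results per source. The following is a rough sketch of that idea under stated assumptions: `fan_out` and the stub sources are hypothetical, and real adapters would query databases or APIs rather than return canned strings.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(prompt, sources):
    """Query every source concurrently; return {source_name: result_or_error}."""

    def safe_call(fetch):
        try:
            return {"ok": True, "data": fetch(prompt)}
        except Exception as exc:  # one failing source must not sink the others
            return {"ok": False, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(safe_call, fetch) for name, fetch in sources.items()}
        return {name: future.result() for name, future in futures.items()}


def sql_stub(prompt):
    """Stand-in for a SQL-backed source."""
    return f"sql rows for: {prompt}"


def rest_stub(prompt):
    """Stand-in for a REST-backed source."""
    return f"rest payload for: {prompt}"


def broken_stub(prompt):
    """Stand-in for a source that is down."""
    raise ConnectionError("source unreachable")
```

Isolating each source's errors means a single unreachable backend degrades the answer instead of failing the whole request.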

Private RAG And Conversations

  • Run private RAG over PDFs, documents, images, manuals, contracts, and knowledge bases.
  • Reuse retrieved context through conversation threading and cached datasets.
  • Handle multilingual conversations across 100+ languages.
  • Keep deployments self-hosted for privacy-sensitive environments.

Media Generation

  • Generate images from text prompts using DALL-E, Stability AI, and other providers.
  • Generate videos from text prompts using Google Veo 2 — prompts are automatically enriched with motion, camera movement, and lighting detail before generation.
  • Generated media is persisted server-side and delivered via stable URLs, keeping large binary payloads off the wire.
  • Image and video generation follow the same adapter type system as every other capability — no special-casing required.

Tools, Agents, And Production Controls

  • Expose controlled tools through MCP for agent clients.
  • Connect outward to external MCP servers and run a multi-step tool-calling loop inside a conversation — the model calls tools, ORBIT executes them, feeds results back, and repeats until a final answer is produced.
  • Invoke specialized adapters during a conversation with cross-adapter skills.
  • Operate with API keys, RBAC, quotas, audit logs, rate limits, and circuit breakers.
  • Add voice assistant support through audio adapters.
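The multi-step tool-calling loop described above (the model requests a tool, the tool runs, the result is fed back, and the cycle repeats until a final answer) can be sketched in a few lines. Everything here is illustrative: the message dictionaries, `run_tool_loop`, `scripted_model`, and the `read_file` stub are assumptions made for the example and do not reflect ORBIT's internal types or the MCP wire format.

```python
def run_tool_loop(model, tools, user_message, max_steps=5):
    """Alternate between model and tools until a final answer or the step limit."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = model(messages)
        if reply["type"] == "final":
            return reply["content"]
        # The model asked for a tool: execute it and feed the result back.
        tool = tools[reply["tool"]]
        result = tool(**reply["arguments"])
        messages.append({"role": "tool", "name": reply["tool"], "content": str(result)})
    raise RuntimeError(f"no final answer after {max_steps} steps")


def scripted_model(messages):
    """Stand-in model: request one tool call, then answer from its result."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"type": "tool_call", "tool": "read_file", "arguments": {"path": "notes.txt"}}
    return {"type": "final", "content": "The file says: " + tool_results[-1]["content"]}


def read_file(path):
    """Stub tool; a real MCP filesystem server would read from disk."""
    return {"notes.txt": "ship on Friday"}[path]
```

The `max_steps` bound is what keeps the loop from running indefinitely when a model keeps requesting tools without ever producing an answer.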

Who Is ORBIT For?

  • Developers building internal AI apps that need real company data, not isolated chat.
  • Teams that need private RAG over documents, databases, APIs, and operational systems.
  • Companies avoiding long-term lock-in to a single LLM provider or hosted AI platform.
  • Engineers connecting AI to SQL, NoSQL, REST, GraphQL, files, and document stores.
  • Builders who want OpenAI-compatible APIs with self-hosted control.

ORBIT is probably more than you need if you only want a thin wrapper around one LLM provider.


What Makes ORBIT Different?

ORBIT is not only a model router. It handles the layers that usually become custom infrastructure in production RAG systems: retrieval, tools, adapters, conversations, access control, and operational safeguards.

  • Retrieval beyond vector search: use intent templates and adapters for structured databases, APIs, files, web content, and vector stores. See: Intent SQL RAG.
  • Data source support: query SQL, MongoDB, Elasticsearch, REST, GraphQL, DuckDB, files, and composite sources through one gateway. See: Composite adapters.
  • Local and hosted models: run private workloads on Ollama, llama.cpp, vLLM, or other local providers, while still supporting hosted LLMs where appropriate.
  • Production controls included: use API keys, RBAC, quotas, audit logging, moderation, rate limits, and circuit breakers. See: Rate limiting.
  • Agent-ready protocol support: expose ORBIT-backed chat, RAG, and adapter tools through MCP for agent clients. See: MCP / OpenClaw walkthrough.
  • MCP client for agentic tool calling: ORBIT can also connect outward to external MCP servers (filesystem, GitHub, Slack, SharePoint, Postgres, and more), discover their tools, and run a bounded multi-step tool-calling loop. This is exposed as the mcp-agent skill on any adapter and works with OpenAI, Anthropic, Gemini, xAI, and llama.cpp. See: MCP agent skill.
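Of the production controls listed above, the circuit breaker is the least self-explanatory: after repeated failures it stops calling a backend for a cool-down period instead of letting every request wait on it. The class below is a generic sketch of that pattern and makes no claim about how ORBIT implements it; `CircuitBreaker`, its parameters, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Open after `failure_threshold` consecutive failures; retry after `reset_after` seconds."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

Wrapping each provider or data source in its own breaker keeps one failing dependency from slowing down requests that do not need it.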

Example Use Cases

| Use case | Start here |
| --- | --- |
| Chat with a local model through an OpenAI-compatible API | Step-by-step tutorial |
| Ask Postgres, MySQL, MongoDB, DuckDB, or Elasticsearch questions in natural language | Database copilot |
| Query SQL + NoSQL + REST APIs in one prompt | Composite adapters |
| Upload files and get grounded answers | File-upload RAG |
| Deploy a private AI gateway for regulated data | Private gateway cookbook |
| Run ORBIT as an MCP tool server for agents | MCP / OpenClaw walkthrough |
| Connect to external MCP servers and run agentic tool-calling loops | MCP agent skill |
| Build a full-duplex voice assistant | PersonaPlex voice assistant |
| Generate images and videos from text prompts | Cross-adapter skills |

Architecture And Adapters

ORBIT sits between clients, models, and data sources. Clients call the OpenAI-compatible API, ORBIT authenticates and routes the request, adapters retrieve or act on external data, and the selected model generates the response with the retrieved context.
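The request flow in the paragraph above (authenticate, route, retrieve, generate) can be summarised as a small pipeline. This is a conceptual sketch only: `handle_request`, the dictionary shapes, and the idea that an API key maps to a single adapter are assumptions made for the example, not a description of ORBIT's code.

```python
def handle_request(request, api_keys, adapters, models):
    """Walk one request through authenticate -> route -> retrieve -> generate."""
    # 1. Authenticate: look up which adapter this API key is allowed to use.
    adapter_name = api_keys.get(request["api_key"])
    if adapter_name is None:
        return {"status": 401, "error": "invalid API key"}

    # 2. Route: pick the adapter and the model configured for it.
    adapter = adapters[adapter_name]
    model = models[adapter["model"]]

    # 3. Retrieve: ask the adapter for context relevant to the latest message.
    question = request["messages"][-1]["content"]
    context = adapter["retrieve"](question)

    # 4. Generate: let the selected model answer using the retrieved context.
    return {"status": 200, "content": model(question, context)}


# Hypothetical stand-ins for configuration, retrieval, and inference.
API_KEYS = {"default-key": "simple-chat"}
ADAPTERS = {"simple-chat": {"model": "local", "retrieve": lambda q: ["doc about " + q]}}
MODELS = {"local": lambda q, ctx: f"answer to {q} using {len(ctx)} document(s)"}
```

Keeping authentication and routing ahead of retrieval means an unauthorised request never touches a data source.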

| Layer | Coverage |
| --- | --- |
| Clients and protocols | Web chat, Node SDK, OpenAI-compatible SDKs, MCP |
| Model routing | Hosted providers, local providers, streaming, failover, runtime model selection |
| Retrieval adapters | SQL, NoSQL, REST, GraphQL, files, web content, vector stores, composite adapters |
| RAG workflow | Intent templates, diagnostics, autocomplete, cached datasets, conversation threading |
| Media generation | Images (DALL-E, Stability AI), videos (Google Veo 2), server-side persistence, URL delivery |
| MCP client / agent | Connect to external MCP servers, discover tools, run bounded multi-step tool-calling loops as a skill |
| Operations | API keys, RBAC, audit logs, quotas, rate limits, moderation, circuit breakers, admin UI |

Compatibility overview

ORBIT supports:

  • Local and hosted LLM providers.
  • SQL and NoSQL databases.
  • REST and GraphQL APIs.
  • File, web, and vector-based retrieval.
  • Local and hosted embedding providers.
  • Reranking, moderation, and guardrail integrations.
  • OpenAI-compatible clients and MCP-compatible tools.

See the Documentation and Cookbook for full setup details and integration coverage.


Deployment Options

Docker Compose

git clone https://github.com/schmitech/orbit.git && cd orbit/docker
docker compose up -d

This starts ORBIT with Ollama and SmolLM2, pulls models automatically, and exposes the API on port 3000. The web admin UI is at /admin on the same host.

Connect orbitchat from your host:

ORBIT_ADAPTER_KEYS='{"simple-chat":"default-key"}' npx orbitchat

Pre-Built Image

docker pull schmitech/orbit:basic
docker run -d --name orbit-basic -p 3000:3000 schmitech/orbit:basic

If Ollama runs on your host, add -e OLLAMA_HOST=host.docker.internal:11434 so the container can reach it. The basic image includes simple-chat only.

Release Tarball

Download the current release from GitHub Releases, then install:

curl -LO https://github.com/schmitech/orbit/releases/download/v2.7.1/orbit-2.7.1.tar.gz
tar -xzf orbit-2.7.1.tar.gz && cd orbit-2.7.1
cp env.example .env && ./install/setup.sh
./bin/orbit.sh start && tail -f ./logs/orbit.log

orbit-admin.mp4

The ORBIT Admin Panel provides real-time monitoring of system health, adapter states, and inference performance.


Clients

| Client | Description |
| --- | --- |
| Web Chat | React chat UI |
| Node SDK | Node client, or use any OpenAI-compatible SDK |

Documentation


Roadmap

  • More ready-to-run adapter templates for common business systems.
  • More MCP recipes for agent platforms and desktop clients.
  • Persistent MCP server connections and per-server circuit breakers for the MCP agent skill.
  • Human-in-the-loop tool approval for write operations in the MCP agent loop.
  • Expanded evaluation, tracing, and observability workflows.
  • Admin UI improvements for configuration, diagnostics, and operations.
  • Additional deployment templates for private cloud and regulated environments.

Contributing

Contributions are welcome. Check the issues for good first tasks, or open a new issue to discuss your idea.

ORBIT is licensed under Apache 2.0, so you can build commercial products on top of it. If ORBIT helps you build private RAG, agent tools, or AI gateway infrastructure, a star helps others find the project and follow its development.


License

Apache 2.0 — see LICENSE.
