This guide will help you get started with ORBIT using docker-compose. The setup uses separate containers for better isolation, flexibility, and GPU configuration.
docker-compose.yml
├── ollama (official ollama/ollama image, port 11434)
├── ollama-init (one-shot: pulls smollm2 + nomic-embed-text models)
└── orbit (lean Python server image, port 3000)
- Ollama runs in its own container with a persistent volume for models
- ORBIT server is a lean Python image (no Ollama, no Node.js bundled)
- orbitchat is installed separately on the host via npm install -g orbitchat
- ORBIT server with core functionality
- simple-chat adapter - a conversational chatbot adapter
- Ollama (separate container) with auto-pulled models:
- smollm2 - chat/inference model (ultra-fast, ~1.2GB)
- nomic-embed-text - embeddings model
- Default database - pre-configured so no API key creation needed
- Default configuration - optimized for quick start
- Automatic GPU/CPU detection - selects optimal preset at runtime
No external API keys or cloud services required!
- Docker (version 20.10 or higher) with Docker Compose v2
- 4GB+ RAM available for Docker
- 3GB+ disk space for images and models
- Node.js (optional, for orbitchat web interface)
You can run ORBIT with Docker in two ways:
| Option | Use when |
|---|---|
| Docker Compose (below) | You want Ollama + models + ORBIT in one go (recommended). |
| Pre-built image only | You already have Ollama on the host or elsewhere; you only need the ORBIT server. |
cd docker
docker compose up -d

This starts three containers:
- ollama - LLM inference server (port 11434)
- ollama-init - pulls required models then exits
- orbit - ORBIT API server (port 3000)
# Check container status
docker compose ps

You should see ollama and orbit as healthy, and ollama-init as exited (0).
# Test the health endpoint
curl http://localhost:3000/health
# Verify models are available
curl http://localhost:11434/api/tags

Install and run the orbitchat web interface from your host machine:
npm install -g orbitchat
ORBIT_ADAPTER_KEYS='{"simple-chat":"default-key"}' orbitchat

Then open http://localhost:5173 in your browser.
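
If the UI or an API call cannot connect right after startup, the stack may still be initializing. The sketch below is a small hypothetical helper, `retry_until` (not part of ORBIT), that reruns any command until it succeeds. It assumes a POSIX shell; the attempt count and delay are arbitrary illustration values.

```shell
# retry_until ATTEMPTS DELAY COMMAND...
# Hypothetical helper: runs COMMAND until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `retry_until 30 2 curl -fsS http://localhost:3000/health && echo "ORBIT is ready"` polls the health endpoint for up to about a minute before giving up.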
curl -X POST http://localhost:3000/v1/chat \
-H 'Content-Type: application/json' \
-H 'X-API-Key: default-key' \
-H 'X-Session-ID: test-session' \
-d '{
"messages": [
{"role": "user", "content": "Hello, what is 2+2?"}
],
"stream": false
}'

To run only the ORBIT server from Docker Hub (no Ollama or models inside the image):
docker pull schmitech/orbit:basic
docker run -d --name orbit-basic -p 3000:3000 schmitech/orbit:basic

The server will listen on port 3000 but needs an LLM backend to handle chat:
- Ollama on your host: use host.docker.internal so the container can reach it:
  docker run -d --name orbit-basic -p 3000:3000 \
  -e OLLAMA_HOST=host.docker.internal:11434 \
  schmitech/orbit:basic
- Ollama in another container or remote: set OLLAMA_HOST to that address (e.g. ollama:11434 or http://your-ollama-host:11434).
The basic image includes the simple-chat adapter only. For the full stack (Ollama + model pull + ORBIT), use the Docker Compose setup described above.
To enable NVIDIA GPU acceleration for Ollama:
cd docker
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

This requires:
- NVIDIA GPU with compatible drivers
- NVIDIA Container Toolkit installed
The ORBIT server automatically detects the GPU and selects the appropriate preset (smollm2-1.7b-gpu vs smollm2-1.7b-cpu).
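
For scripted startups, the choice between the plain and the GPU compose command can be made on the host. This is a minimal sketch, not ORBIT's own detection logic: `has_nvidia_gpu` and `compose_file_args` are hypothetical names, and the check only confirms that `nvidia-smi` can list a GPU. It does not verify that the NVIDIA Container Toolkit is installed.

```shell
# Hypothetical check: succeeds when nvidia-smi exists and lists a GPU.
# It inspects the host driver only, not the NVIDIA Container Toolkit.
has_nvidia_gpu() {
  command -v nvidia-smi >/dev/null 2>&1 &&
    nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

# Prints the -f arguments for docker compose, adding the GPU override
# file from this guide only when a GPU was detected.
compose_file_args() {
  if has_nvidia_gpu; then
    echo "-f docker-compose.yml -f docker-compose.gpu.yml"
  else
    echo "-f docker-compose.yml"
  fi
}
```

From the docker directory, `docker compose $(compose_file_args) up -d` then starts the stack with or without the GPU file. The command substitution is left unquoted on purpose so that the shell splits it into separate arguments.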
You can also force a specific preset:
# In docker-compose.yml, change the orbit service environment:
environment:
  - ORBIT_PRESET=smollm2-1.7b-gpu

Set these in docker-compose.yml under the orbit service:
- ORBIT_PRESET - Override GPU auto-detection (default: auto)
  - auto - detect GPU and select appropriate preset
  - smollm2-1.7b-gpu - force GPU preset
  - smollm2-1.7b-cpu - force CPU preset
  - Any preset name from ollama.yaml
- OLLAMA_HOST - Ollama service address (default: ollama:11434)
- ORBIT_DEFAULT_ADMIN_PASSWORD - Admin password for CLI access (default: admin123)
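
Instead of editing docker-compose.yml directly, a variable such as ORBIT_PRESET can be set in an override file, which Docker Compose merges automatically when it is named docker-compose.override.yml and no -f flag is given. The function below is a hypothetical sketch that writes such a file; it assumes the service is named orbit, as in this guide.

```shell
# write_preset_override FILE PRESET
# Hypothetical helper: writes a Compose override file that pins
# ORBIT_PRESET for the orbit service.
write_preset_override() {
  cat > "$1" <<EOF
services:
  orbit:
    environment:
      - ORBIT_PRESET=$2
EOF
}
```

Running `write_preset_override docker-compose.override.yml smollm2-1.7b-cpu` in the docker directory and then `docker compose up -d` applies the forced preset. When compose files are passed explicitly with -f (as in the GPU command), the override file must be added with its own -f flag, because automatic loading only happens without -f.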
Docker-compose volumes are configured by default:
- ollama-data - Ollama models (persists across restarts, no re-download)
- orbit-data - ORBIT application data
- orbit-logs - ORBIT server logs
Models are pulled only once, by ollama-init. Later docker compose down && docker compose up -d cycles reuse the cached models.
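
To confirm that a model is still cached after a restart, the /api/tags response can be checked from a script. This sketch assumes the response is JSON with a "name" field per model (for example "smollm2:latest"); `has_model` is a hypothetical helper and uses a plain grep match rather than a real JSON parser.

```shell
# has_model NAME
# Hypothetical helper: reads an /api/tags JSON response on stdin and
# succeeds if a model called NAME (with any tag, such as :latest) is listed.
has_model() {
  grep -Eq "\"name\":[[:space:]]*\"$1(:[^\"]*)?\""
}
```

For example, `curl -fsS http://localhost:11434/api/tags | has_model smollm2 && echo "smollm2 is cached"`.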
# Login as admin (will prompt for password, default: admin123)
./docker/orbit-docker.sh --container orbit-server login
# Create a default API key with a simple prompt
./docker/orbit-docker.sh --container orbit-server cli key create \
--adapter simple-chat \
--name "Default Chat Key" \
--prompt-name "Default Assistant Prompt" \
--prompt-text "You are a helpful assistant. Be concise and friendly."
# List API keys
./docker/orbit-docker.sh --container orbit-server cli key list
# Check status
./docker/orbit-docker.sh --container orbit-server status

The same operations can be run directly with docker exec:
# Login as admin
docker exec -it orbit-server python /orbit/bin/orbit.py login --username admin
# Create an API key
docker exec -it orbit-server python /orbit/bin/orbit.py key create \
--adapter simple-chat \
--name "Default Key" \
--prompt-name "Default Prompt" \
--prompt-text "You are a helpful assistant. Be concise and friendly."
# List API keys
docker exec -it orbit-server python /orbit/bin/orbit.py key list

The CLI supports both --prompt-text (direct string) and --prompt-file (file path).
curl -X POST http://localhost:3000/v1/chat \
-H 'Content-Type: application/json' \
-H 'X-API-Key: default-key' \
-H 'X-Session-ID: my-session' \
-d '{
"messages": [
{"role": "user", "content": "Explain quantum computing in simple terms"}
],
"stream": false
}'

curl -X POST http://localhost:3000/v1/chat \
-H 'Content-Type: application/json' \
-H 'X-API-Key: default-key' \
-H 'X-Session-ID: my-session' \
-d '{
"messages": [
{"role": "user", "content": "Tell me a short story"}
],
"stream": true
}' \
--no-buffer

cd docker
# Stop all containers
docker compose down
# Stop and remove volumes (deletes models and data)
docker compose down -v

Check the logs:
docker compose logs # All services
docker compose logs orbit # ORBIT server only
docker compose logs ollama # Ollama only

Change ports in docker-compose.yml:
services:
  orbit:
    ports:
      - "3001:3000" # Use host port 3001 instead
  ollama:
    ports:
      - "11435:11434" # Use host port 11435 instead

Increase Docker's memory limit in Docker Desktop settings, or ensure at least 4GB RAM is available.
The ollama-init service pulls models on first start. Check its logs:
docker compose logs ollama-init

If it failed, you can manually pull models:
docker exec orbit-ollama ollama pull smollm2
docker exec orbit-ollama ollama pull nomic-embed-text:latest

Ensure the ollama service is healthy:
docker compose ps
curl http://localhost:11434/api/tags

The ORBIT entrypoint waits for Ollama to be ready, but if Ollama takes too long to start, try restarting:
docker compose restart orbit

- Docker (version 20.10 or higher)
- Git (to clone the repository)
- Docker Hub account (if publishing)
- 4GB+ disk space (for the build process - no models bundled in image)
cd docker
chmod +x publish.sh
# Build only
./publish.sh --build
# Build and publish to Docker Hub
./publish.sh --publish
# Build and publish with version tag
./publish.sh --publish --tag v1.0.0

The build creates a lean server-only image (no Ollama, no Node.js, no models). Models are pulled at runtime by the ollama-init service.
./publish.sh --build # Build the Docker image
./publish.sh --publish # Build and push to Docker Hub
./publish.sh --publish --tag v1.0.0 # Build, push, and tag version
./publish.sh --help # Show help

Build fails with missing config files:
ls install/default-config/ollama.yaml
ls install/default-config/inference.yaml
ls install/orbit.db.default

Docker build runs out of memory: Increase Docker's memory limit to at least 4GB.
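
The three ls checks for missing config files can be folded into one preflight step before a build. This is a hedged sketch: `check_build_inputs` is a hypothetical function, and it assumes the three paths listed above are relative to the directory passed as its argument (presumably the repository root).

```shell
# check_build_inputs [ROOT]
# Hypothetical preflight: prints each required build input missing under
# ROOT (default: current directory). Returns 1 if any file is missing.
check_build_inputs() {
  root=${1:-.}
  missing=0
  for f in install/default-config/ollama.yaml \
           install/default-config/inference.yaml \
           install/orbit.db.default; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}
```

A typical use would be `check_build_inputs . && (cd docker && ./publish.sh --build)`, so the build only starts when all three files are present.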
Once you're comfortable with the basic setup, you can:
- Explore the full ORBIT capabilities with additional adapters
- Customize the configuration for your needs
- Add more Ollama models by editing the ollama-init command
- Integrate with your applications using the API
For more information, see the main README.md.