OME 0.1.4
📦 Container Images
The following container images are available:
# OME Manager
docker pull ghcr.io/moirai-internal/ome-manager:v0.1.4
# Model Agent
docker pull ghcr.io/moirai-internal/model-agent:v0.1.4
# OME Agent
docker pull ghcr.io/moirai-internal/ome-agent:v0.1.4
# Multinode Prober
docker pull ghcr.io/moirai-internal/multinode-prober:v0.1.4
⎈ Helm Installation
Option 1: OCI Registry (Recommended)
# Install directly from OCI registry
helm install ome-crd oci://ghcr.io/moirai-internal/charts/ome-crd --version 0.1.4 --namespace ome --create-namespace
helm install ome oci://ghcr.io/moirai-internal/charts/ome-resources --version 0.1.4 --namespace ome
Option 2: GitHub Releases
# Add the OME Helm repository
helm repo add ome https://github.com/sgl-project/ome/releases/download/v0.1.4
helm repo update
# Install OME
helm install ome-crd ome/ome-crd --namespace ome --create-namespace
helm install ome ome/ome-resources --namespace ome
📋 Changelog
📄 Software Bill of Materials
SBOMs are available in both SPDX and CycloneDX formats for:
- Source code
- All container images
🔐 Signatures
All container images are signed with cosign. Verify with:
cosign verify ghcr.io/moirai-internal/ome-manager:v0.1.4 --certificate-identity-regexp=https://github.com/sgl-project/ome/.github/workflows/release.yaml@refs/tags/.* --certificate-oidc-issuer=https://token.actions.githubusercontent.com
What's Changed
- [misc] cleanup docs by @slin1237 in #197
- fix: benchmark container volume mounts update when base model exists by @carlory in #198
- style: 🔨 optimize import order and more readable. by @yafengio in #191
- [misc] update kimi k2 runtime by @slin1237 in #200
- [misc] update default kimi isvc with ingres disabled by @slin1237 in #201
- [misc] update kimi k2 runtime to remove volumes and env by @slin1237 in #202
- [WIP] Update kimi-k2-pd-rt.yaml by @Atream in #203
- Fix error message for invalid deployment mode by @carlory in #207
- [Bugfix]: verify KEDA ScaledObject CRD is registered before deletion by @bindrad in #206
- [Docs] Fix serving_runtime doc by @bcfre in #218
- Bump cross-env from 7.0.3 to 10.0.0 in /site by @dependabot[bot] in #219
- chart: support for user-defined model agent tolerations by @my-git9 in #221
- [Docs] Improve readability and logical flow of installation guide by @JiangJiaWei1103 in #224
- [Bugfix] Wait for the OME controller Pod to become ready by @JiangJiaWei1103 in #228
- [misc] fix github page version by @slin1237 in #232
- [Misc] Add PVC support to OME agent replica by @beiguo218 in #229
- add acceleratorClass crd by @pallasathena92 in #215
- [misc] upgrade lws version to latest by @slin1237 in #231
- [misc] add oss gpt models by @slin1237 in #233
- [misc] add oss gpt bf16 models by @slin1237 in #234
- [misc] add ut for oss 120b models by @slin1237 in #235
- [BUG]Fix BenchmarkJob Deprecated Storage Command by @YouNeedCryDear in #239
- [Bug]fix openai api violation for accelerator crd by @pallasathena92 in #242
- [BUG]Fix api-backend default value for benchmark job by @YouNeedCryDear in #241
- [Misc] Add checksum upload support in ome agent replica when using OCI as target by @beiguo218 in #240
- Bump actions/download-artifact from 4 to 5 by @dependabot[bot] in #237
- [oep] add proposal for accelerator aware runtime selection by @slin1237 in #129
- Extend ServingRuntime API for Accelerator Support by @pallasathena92 in #246
- [Core] Add local filesystem storage support for model weights by @slin1237 in #253
- feat: support set global hub for chart images by @my-git9 in #251
- add runtime-type helper function test by @pallasathena92 in #249
- [bugfix] add back omitempty on spec and status by @slin1237 in #254
- [bugfix] fix accelerator class Quantity spec by @slin1237 in #256
- [docs] Update Bootstrap utility classes following Bootstrap 5 breaking changes by @yankay in #255
- [bugfix] patch deepseek runtime by @slin1237 in #259
- Support extra volumes and mounts in model-agent by @abatilo in #260
- Extend inferenceService API for Accelerator Support by @pallasathena92 in #258
- [misc] improve hf hub module by @slin1237 in #261
- [core] implement storage interface for multi cloud support by @slin1237 in #262
- [core] implement oci storage by @slin1237 in #263
- [core] implement s3 storage base structure by @slin1237 in #264
- [misc] storage bug fixes and code clean up by @slin1237 in #265
- [core] add s3 and minio storage provider and download option support by @slin1237 in #266
- [core] gcp storage provider base structure by @slin1237 in #267
- Bump actions/setup-go from 5 to 6 by @dependabot[bot] in #272
- Bump actions/setup-node from 4 to 5 by @dependabot[bot] in #271
- Bump actions/github-script from 7 to 8 by @dependabot[bot] in #270
- [Task5]accelerator class controller by @pallasathena92 in #276
- Bump actions/checkout from 4 to 5 by @dependabot[bot] in #245
- [Misc] Add XET-Core Rust Integration for HuggingFace Hub Downloads by @beiguo218 in #277
- [Misc] Add Dockerfile for XET Rust binding linux build by @beiguo218 in #278
- [Misc] Add XET-Core Go binding for HuggingFace Hub Downloads by @pallasathena92 in #279
- [Docs] Add READMEs for Xet binding by @beiguo218 in #280
- [Misc]add static library .a in gitignore by @pallasathena92 in #281
- [core] add pdb controller in ome by @pallasathena92 in #283
- [Misc] Updates in model agent for model deletion: avoid model path deletion when it is referred by other models by @beiguo218 in #284
- [Misc] Migrate to use XET-Core Rust based HF Hub in ome-agent replica by @beiguo218 in #282
- [Core]add runtime selector logic with Awareness AcceleratorClasses by @pallasathena92 in #285
- [Bugfix] Fix list_files fn in XET-Core Rust Hub implementation - implement recursive file listing by @beiguo218 in #286
- Bump cross-env from 10.0.0 to 10.1.0 in /site by @dependabot[bot] in #287
- [Core]Integrate the accelerator‑aware runtime into the InferenceService con… by @pallasathena92 in #289
- [Docs]update code owner by @pallasathena92 in #291
- Bump github/codeql-action from 3 to 4 by @dependabot[bot] in #292
- [OEP]OEP-0005: Model Context Protocol (MCP) Support Design by @YouNeedCryDear in #290
- [Core]Integrate the accelerator‑aware runtime into the InferenceServi… by @pallasathena92 in #293
- [API][Misc] Add new model capabilities and update its determination logic by @beiguo218 in #294
- [Misc] Add more OCI instance types in instance_type_util by @beiguo218 in #295
- [Misc] Add model config parsing support for more model types by @beiguo218 in #296
- [Docs] add readme for AC and update readme for runtime selector by @pallasathena92 in #297
- [Misc][Helm] Add retry logic to scout cache sync by @beiguo218 in #299
- Bump actions/setup-node from 5 to 6 by @dependabot[bot] in #298
- [Helm]add rbac for pdb by @pallasathena92 in #300
- [Bugfix]fix runnerspec during migration by @pallasathena92 in #301
- [Bugfix]fix runnerspec during migration unit test by @pallasathena92 in #302
- Bump actions/download-artifact from 5 to 6 by @dependabot[bot] in #305
- Bump actions/upload-artifact from 4 to 5 by @dependabot[bot] in #304
- [Core] modify old predictor deployment cleanup logic by @pallasathena92 in #303
- [Misc] Support qwen3_vl model for config parsing; Improve qwen3 vision model parameter estimation logic by @beiguo218 in #307
- Add instance type for Nebius by @Kangyan-Zhou in #309
- [Bugfix] Download Model with Xet HF Hub Client by @XinyueZhang369 in #313
- Make OME Deployment Flexible with GPU only option; Improve Build Cache and Docker Build by @Kangyan-Zhou in #310
- Bump autoprefixer from 10.4.21 to 10.4.22 in /site by @dependabot[bot] in #315
- unify retry logic in basemodel controller and update Kong docs link by @ErikJiang in #316
- [Misc] update error handle during processbaseLable by @pallasathena92 in #318
- [Misc] refactor router status update using util function by @pallasathena92 in #320
- Validate ClusterServingRuntime and ServingRuntime resources only reference known accelerator classes by @Kangyan-Zhou in #317
- [Bugfix] fix MergeNodeSelector with different componentType by @pallasathena92 in #322
- Bump actions/checkout from 5 to 6 by @dependabot[bot] in #321
- [Misc]revert ome-agent dockerfile by @pallasathena92 in #323
New Contributors
- @Atream made their first contribution in #203
- @bindrad made their first contribution in #206
- @bcfre made their first contribution in #218
- @my-git9 made their first contribution in #221
- @JiangJiaWei1103 made their first contribution in #224
- @yankay made their first contribution in #255
- @Kangyan-Zhou made their first contribution in #309
- @XinyueZhang369 made their first contribution in #313
- @ErikJiang made their first contribution in #316
Full Changelog: v0.1.3...v0.1.4