Fastpull is a lazy-loading snapshotter that starts massive AI/ML container images (>10 GB) in seconds.
AI/ML container images like CUDA, vLLM, and SGLang are large (10 GB+). Traditional Docker pulls take 7-10 minutes, causing:
- 20-30% GPU capacity wasted from overprovisioning
- SLA breaches during traffic spikes
Fastpull uses lazy loading to pull only the files needed to start the container, then fetches the remaining layers on demand. This accelerates start times by roughly 10x.
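The lazy-loading idea itself can be sketched in a few lines of shell. This is a toy illustration only, not fastpull's actual mechanism: a file is fetched from "remote" storage the first time it is read and served from a local cache afterwards.

```shell
#!/bin/sh
# Toy lazy fetch: copy a "remote" file into a local cache only on first access.
CACHE_DIR=${CACHE_DIR:-./cache}

lazy_read() {
  remote=$1
  local_copy="$CACHE_DIR/$(basename "$remote")"
  if [ ! -f "$local_copy" ]; then   # first access: fetch and cache
    mkdir -p "$CACHE_DIR"
    cp "$remote" "$local_copy"      # stand-in for a registry fetch
  fi
  cat "$local_copy"                 # later accesses hit the cache
}
```

Fastpull applies the same principle at the filesystem-snapshot level, so the container can start before the full image has been downloaded.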
For more information, check out the fastpull release blog.
- VM image: works on Debian 12+, Ubuntu, and AL2023 VMs with GPUs; mileage on other AMIs may vary.
- Python >= 3.10, pip, python3-venv, Docker, CUDA drivers, and the NVIDIA Container Toolkit installed
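A quick preflight sketch for the tooling above (the actual installer may perform different or additional checks; `need` is a hypothetical helper name):

```shell
#!/bin/sh
# Preflight sketch: check that required tools are on PATH before installing.
need() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing: $1" >&2; return 1; }
}

check_prereqs() {
  rc=0
  for tool in python3 pip3 docker nvidia-smi; do
    need "$tool" || rc=1
  done
  return $rc
}
```

Run `check_prereqs` before the install step; it prints each missing tool and returns non-zero if anything is absent.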
1. Install fastpull

```shell
git clone https://github.com/tensorfuse/fastpull.git
cd fastpull/
sudo python3 scripts/setup.py
```

You should see: "✅ Fastpull installed successfully on your VM"
2. Run containers
Fastpull requires your images to be in a special format. You can either choose from our templates of pre-built images, such as vLLM, TensorRT, and SGLang, or build your own from a Dockerfile.
Test with vLLM, TensorRT, or SGLang:

```shell
fastpull quickstart tensorrt
fastpull quickstart vllm
fastpull quickstart sglang
```

Each of these runs twice: once with fastpull optimisations, and once the way Docker normally runs it.
After the quickstart runs are complete, we also run `fastpull clean --all`, which cleans up the downloaded images.
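If you want a rough wall-clock comparison of your own commands, independent of fastpull's built-in reporting, a tiny helper like this works in any POSIX shell:

```shell
#!/bin/sh
# Report wall-clock seconds for an arbitrary command.
elapsed() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}
```

For example, `elapsed fastpull quickstart vllm` prints how many seconds the whole quickstart took.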
First, authenticate with your registry.

For ECR:

```shell
aws configure
aws ecr get-login-password --region us-east-1 | sudo nerdctl login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
```

For GAR:

```shell
gcloud auth login
gcloud auth print-access-token | sudo nerdctl login <REGION>-docker.pkg.dev --username oauth2accesstoken --password-stdin
```

For Docker Hub:

```shell
sudo docker login
```
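To confirm the login actually landed in your Docker config, a rough check is to look for the registry host in `config.json` (note: with credential helpers, the token may be stored elsewhere; `has_cred` is a hypothetical helper name):

```shell
#!/bin/sh
# Rough check: does the Docker config.json mention the registry host?
has_cred() {
  grep -q "\"$1\"" "${DOCKER_CONFIG:-$HOME/.docker}/config.json" 2>/dev/null
}
```

Usage: `has_cred ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com && echo "logged in"`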
Build and push from your Dockerfile:

Note
- We support `--registry gar`, `--registry ecr`, and `--registry dockerhub`.
- For `<TAG>`, you can use any name that's convenient, e.g. `v1` or `latest`.
- Two images are created: the overlayfs image with tag `<TAG>`, and the fastpull image with tag `<TAG>-fastpull`.
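For example, given a hypothetical repository URL, the two resulting references differ only in the tag suffix:

```shell
#!/bin/sh
# Hypothetical repository URL; one build produces two references from one tag.
ref="123456789012.dkr.ecr.us-east-1.amazonaws.com/myrepo:v1"
overlayfs_image="$ref"               # normal overlayfs image
fastpull_image="${ref}-fastpull"     # lazy-loadable fastpull image
echo "$fastpull_image"
```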
```shell
# Build and push image
fastpull build --registry <REGISTRY> --dockerfile-path <DOCKERFILE-PATH> --repository-url <ECR/GAR-REPO-URL>:<TAG>
```

To get the run time for your container, you can use either:
Completion Time

Use this if the workload has a defined end point:

```shell
fastpull run --benchmark-mode completion [--FLAGS] <REPO-URL>:<TAG>
fastpull run --benchmark-mode completion --mode normal [--FLAGS] <REPO-URL>:<TAG>
```

Server Endpoint Readiness Time

Use this if you're running a server that responds with a 200 SUCCESS once it is up:

```shell
fastpull run --benchmark-mode readiness --readiness-endpoint localhost:<PORT>/<ENDPOINT> [--FLAGS] <REPO-URL>:<TAG>
fastpull run --benchmark-mode readiness --readiness-endpoint localhost:<PORT>/<ENDPOINT> --mode normal [--FLAGS] <REPO-URL>:<TAG>
```
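Conceptually, a readiness benchmark just polls the endpoint until it answers with a success status. A minimal sketch of that idea (assuming `curl` is available; this is not fastpull's internal probe):

```shell
#!/bin/sh
# Poll an HTTP endpoint until it returns a 2xx status, or give up.
# Usage: wait_ready <url> [max_tries]
wait_ready() {
  url=$1
  max=${2:-30}
  i=0
  while [ "$i" -lt "$max" ]; do
    if curl -sf -o /dev/null "$url"; then
      return 0                      # server answered successfully
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1                          # never became ready
}
```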
Note
- When running for readiness, you must publish the right port, e.g. `-p 8000:8000`, and use `--readiness-endpoint localhost:8000/health`.
- Use `--mode normal` to run plain Docker; running without this flag runs with fastpull optimisations.
- For `[--FLAGS]`, you can use any Docker-compatible flags, e.g. `--gpus all`, `-p PORT:PORT`, `-v <VOLUME_MOUNT>`.
- If using GPUs, make sure you add `--gpus all` as a `fastpull run` flag.
To get accurate cold start numbers, run the clean command after each run:

```shell
fastpull clean --all
```
Results show the startup and completion/readiness times:
Example Output

```
==================================================
BENCHMARK SUMMARY
==================================================
Time to Container Start: 141.295s
Time to Readiness: 329.367s
Total Elapsed Time: 329.367s
==================================================
```

- Tested on GKE
- Tested with COS Operating System for the nodes
- In your K8s cluster, create a GPU node pool. For GKE, ensure Workload Identity is enabled on your cluster.
- Install NVIDIA GPU drivers. For COS:

```shell
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded-latest.yaml
```

- Install the containerd config updater daemonset:

```shell
kubectl apply -f https://raw.githubusercontent.com/tensorfuse/fastpull-gke/main/containerd-daemonset.yaml
```

- Install the Helm chart. For COS:
```shell
helm upgrade --install fastpull-snapshotter oci://registry-1.docker.io/tensorfuse/fastpull-snapshotter \
  --version 0.0.10-gke-helm \
  --create-namespace \
  --namespace fastpull-snapshotter \
  --set 'tolerations[0].key=nvidia.com/gpu' \
  --set 'tolerations[0].operator=Equal' \
  --set 'tolerations[0].value=present' \
  --set 'tolerations[0].effect=NoSchedule' \
  --set 'affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key=cloud.google.com/gke-accelerator' \
  --set 'affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].operator=Exists'
```
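If you prefer a values file to the long `--set` list, the same toleration and node affinity can be expressed as YAML (a sketch equivalent to the flags above) and passed to Helm with `-f values.yaml`:

```yaml
tolerations:
  - key: nvidia.com/gpu
    operator: Equal
    value: present
    effect: NoSchedule
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: cloud.google.com/gke-accelerator
              operator: Exists
```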
- Build your images, which can be done in one of two ways:
a. On a standalone VM (preferably Ubuntu), install fastpull and build your image.

b. Build in a container:

First, authenticate to your registry and ensure ~/.docker/config.json is updated:

```shell
# for AWS
aws configure
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com

# for GCP
gcloud auth login
gcloud auth print-access-token | sudo nerdctl login <REGION>-docker.pkg.dev --username oauth2accesstoken --password-stdin
```

Then build using our image:

```shell
docker run --rm --privileged \
  -v /path/to/dockerfile-dir:/workspace:ro \
  -v ~/.docker/config.json:/root/.docker/config.json:ro \
  tensorfuse/fastpull-builder:latest \
  REGISTRY/REPO/IMAGE:TAG
```

This creates `IMAGE:TAG` (normal) and `IMAGE:TAG-fastpull` (fastpull-optimized). Use the `-fastpull` tag in your pod spec. See the builder documentation for details.
- Create the pod spec for the image we created. For COS, use a pod spec like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test-a100-fastpull
spec:
  tolerations:
    - operator: Exists
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-a100 # Use your GPU type
  runtimeClassName: runc-fastpull
  containers:
    - name: debug-container
      image: IMAGE_PATH:<TAG>-fastpull # USE FASTPULL IMAGE
      resources:
        limits:
          nvidia.com/gpu: 1
      env:
        - name: LD_LIBRARY_PATH
          value: /usr/local/cuda/lib64:/usr/local/nvidia/lib64 # NOTE: this path may vary depending on the base image
```

- Run a pod with this spec:

```shell
kubectl apply -f <POD-SPECFILE>.yaml
```

We welcome contributions! Submit a Pull Request or join our Slack community.
Built with ❤️ by the TensorFuse team


