diff --git a/.gitignore b/.gitignore index 78e1837..d230f33 100644 --- a/.gitignore +++ b/.gitignore @@ -160,8 +160,4 @@ Thumbs.db *.bak *.backup -# Claude specific files -CLAUDE.md - -# Installation scripts -scripts/install_cuda_nvidia.sh +results/ diff --git a/README.md b/README.md index 7269f2d..380bb7a 100644 --- a/README.md +++ b/README.md @@ -6,11 +6,10 @@
# Start massive AI/ML container images 10x faster with lazy-loading snapshotter +[![Join Slack](https://img.shields.io/badge/Join_Slack-2EB67D?style=for-the-badge&logo=slack&logoColor=white)](https://join.slack.com/t/tensorfusecommunity/shared_invite/zt-30r6ik3dz-Rf7nS76vWKOu6DoKh5Cs5w) +[![Read our Blog](https://img.shields.io/badge/Read_our_Blog-ff9800?style=for-the-badge&logo=RSS&logoColor=white)](https://tensorfuse.io/docs/blogs/blog) - - - -[Installation](#install-fastpull-on-a-vm) • [Results](#understanding-test-results) +[Installation](#install-fastpull-on-a-vm) • [Results](#understanding-test-results) • [Detailed Usage](docs/fastpull.md)
@@ -29,25 +28,26 @@ AI/ML container images like CUDA, vLLM, and sglang are large (10 GB+). Tradition #### The Solution -Fastpull uses lazy-loading to pull only the files needed to start the container, then fetches remaining layers on demand. This accelerates start times by 10x. See the results below: +Fastpull uses lazy-loading to pull only the files needed to start the container, then fetches remaining layers on demand. This accelerates start times by 10x. See the results below:
*(benchmark comparison image)*
+You can now: +- [Install Fastpull on a VM](#install-fastpull-on-a-vm) +- [Install Fastpull on Kubernetes](#install-fastpull-on-a-kubernetes-cluster) + For more information, check out the [fastpull blog release](https://tensorfuse.io/docs/blogs/reducing_gpu_cold_start). --- ## Install fastpull on a VM -> **Note:** For Kubernetes installation, [contact us](mailto:agam@tensorfuse.io) for early access to our helm chart. - ### Prerequisites -- Debian or Ubuntu VM with GPU -- Docker and CUDA driver installed -- Registry authentication configured (GAR, ECR, etc.) +- VM Image: Works on Debian 12+, Ubuntu, AL2023 VMs with GPU, mileage on other AMIs may vary. +- Python>=3.10, pip, python3-venv, [Docker](https://docs.docker.com/engine/install/), [CUDA drivers](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/), [Nvidia Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) installed ### Installation Steps @@ -56,81 +56,191 @@ For more information, check out the [fastpull blog release](https://tensorfuse.i ```bash git clone https://github.com/tensorfuse/fastpull.git cd fastpull/ -sudo python3 scripts/install_snapshotters.py - -# Verify installation -sudo systemctl status nydus-snapshotter-fuse.service +sudo python3 scripts/setup.py ``` You should see: **"✅ Fastpull installed successfully on your VM"** **2. Run containers** -Fastpull requires your images to be in a special format. You can either choose from our template of pre-built images like vLLM, TensorRT, and SGlang or build your own using a Dockerfile. +Fastpull requires your images to be in a special format. You can either choose from our template of pre-built images like vLLM, TensorRT, and SGlang or build your own using a Dockerfile. 
-Option A: Use pre-built images
+#### Use pre-built images

Test with vLLM, TensorRT, or Sglang:

```bash
-python3 scripts/benchmark/test-bench-vllm.py \
-    --image public.ecr.aws/s6z9f6e5/tensorfuse/fastpull/vllm:latest-nydus \
-    --snapshotter nydus
+fastpull quickstart tensorrt
+fastpull quickstart vllm
+fastpull quickstart sglang
```

-Option B: Build custom images
+Each of these runs twice: once with fastpull optimisations, and once the way Docker normally runs it.
+After the quickstart runs are complete, we also run `fastpull clean --all`, which cleans up the downloaded images.
+
+#### Build custom images
+
+First, authenticate with your registry.
+For ECR:
+```
+aws configure;
+aws ecr get-login-password --region us-east-1 | sudo nerdctl login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
+
+```
+
+For GAR:
+```
+gcloud auth login;
+gcloud auth print-access-token | sudo nerdctl login -docker.pkg.dev --username oauth2accesstoken --password-stdin
+```
+For Dockerhub:
+```
+sudo docker login
+```
+
+Build and push from your Dockerfile:
+
+> [!NOTE]
+> - We support --registry gar, --registry ecr, --registry dockerhub
+> - For ``, you can use any name that's convenient, e.g. `v1`, `latest`
+> - Two images are created: one is the overlayfs image with tag `` and the other is the fastpull image with tag `-fastpull`

-Build from your Dockerfile:
```bash
-# Build image
-python3 scripts/build.py --dockerfile 
+# Build and push image
+fastpull build --registry --dockerfile-path --repository-url :
+```
+
+### Benchmarking with Fastpull
+
+To get the run time for your container, you can use either:
+
+Completion Time

-# Push to registry
-python3 scripts/push.py \
-    --registry_type \
-    --account_id 
+
+Use if the workload has a defined end point
+```
+fastpull run --benchmark-mode completion [--FLAGS] :
+fastpull run --benchmark-mode completion --mode normal [--FLAGS] :
```

-# Run with fastpull
-python3 scripts/fastpull.py --image 
+Server Endpoint Readiness Time

---
+Use if you're preparing a server and it responds with a 200 SUCCESS response once the server is up
+```
+fastpull run --benchmark-mode readiness --readiness-endpoint localhost:/ [--FLAGS] :
+fastpull run --benchmark-mode readiness --readiness-endpoint localhost:/ --mode normal [--FLAGS] :
+```
+
+> [!NOTE]
+> - When running for readiness, you must publish the right port, e.g. `-p 8000:8000`, and use `--readiness-endpoint localhost:8000/health`
+> - Use `--mode normal` to run with normal Docker; running without this flag runs with fastpull optimisations
+> - For `[--FLAGS]` you can use any Docker-compatible flags, e.g. `--gpus all`, `-p PORT:PORT`, `-v `
+> - If using GPUs, make sure you add `--gpus all` as a fastpull run flag

-## Understanding Test Results
+#### Cleaning after a run
+
+To get the right cold-start numbers, run the clean command after each run:
+```
+fastpull clean --all
+```

-Results show timing breakdown across startup phases:
+### Understanding Test Results

-- **Time to first log:** Container start to entrypoint execution
-- **First log to model download start:** Initialization time
-- **Model download time:** Downloading weights (e.g., Qwen-3-8b, 16GB)
-- **Model load time:** Loading weights into GPU
-- **CUDA compilation/graph capture:** Optimization phase
-- **Total end-to-end time:** Container start to server ready
+Results show the startup and completion/readiness times:

Example Output

```bash
-=== VLLM TIMING SUMMARY ===
-Container Startup Time: 2.145s
-Container to First Log: 15.234s
-Engine Initialization: 45.123s
-Weights Download Start: 67.890s
-Weights Download Complete: 156.789s
-Weights Loaded: 198.456s
-Graph Capture Complete: 245.678s
-Server Ready: 318.435s
-Total Test Time: 325.678s
-
-BREAKDOWN:
-Container to First Log: 15.234s
-First Log to Weight Download Start: 52.656s
-Weight Download Start to Complete: 88.899s
-Weight Download Complete to Weights Loaded: 41.667s
-Weights Loaded to Server Ready: 119.979s
+==================================================
+BENCHMARK SUMMARY
+==================================================
+Time to Container Start: 141.295s
+Time to Readiness: 329.367s
+Total Elapsed Time: 329.367s
+==================================================
```
+
+---
+
+## Install fastpull on a Kubernetes Cluster
+
+### Prerequisites
+- Tested on GKE
+- Tested with COS Operating System for the nodes
+
+### Installation
+1. In your K8s cluster, create a GPU Nodepool. For GKE, ensure Workload Identity is enabled on your cluster.
+2. Install Nvidia GPU drivers. For COS:
+```bash
+kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded-latest.yaml
+```
+3. Install the containerd config updater daemonset: `kubectl apply -f https://raw.githubusercontent.com/tensorfuse/fastpull-gke/main/containerd-daemonset.yaml`
+4. Install the [Helm Chart](https://hub.docker.com/repository/docker/tensorfuse/fastpull-snapshotter/general). For COS:
+```bash
+helm upgrade --install fastpull-snapshotter oci://registry-1.docker.io/tensorfuse/fastpull-snapshotter \
+--version 0.0.10-gke-helm \
+--create-namespace \
+--namespace fastpull-snapshotter \
+--set 'tolerations[0].key=nvidia.com/gpu' \
+--set 'tolerations[0].operator=Equal' \
+--set 'tolerations[0].value=present' \
+--set 'tolerations[0].effect=NoSchedule' \
+--set 'affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key=cloud.google.com/gke-accelerator' \
+--set 'affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].operator=Exists'
+```
+5. Build your images, which can be done in one of two ways:
+
+   a. On a standalone VM, preferably running Ubuntu, [install fastpull](#installation-steps) and [build your image](#build-custom-images)
+
+   b. 
Build in a container:
+
+   First, authenticate to your registry and ensure `~/.docker/config.json` is updated:
+   ```bash
+   # for AWS
+   aws configure
+   aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com
+   # for GCP
+   gcloud auth login
+   gcloud auth print-access-token | sudo nerdctl login -docker.pkg.dev --username oauth2accesstoken --password-stdin
+   ```
+   Then build using our image:
+   ```bash
+   docker run --rm --privileged \
+     -v /path/to/dockerfile-dir:/workspace:ro \
+     -v ~/.docker/config.json:/root/.docker/config.json:ro \
+     tensorfuse/fastpull-builder:latest \
+     REGISTRY/REPO/IMAGE:TAG
+   ```
+   This creates `IMAGE:TAG` (normal) and `IMAGE:TAG-fastpull` (fastpull-optimized). Use the `-fastpull` tag in your pod spec. See the [builder documentation](scripts/builder/README.md) for details.
+
+6. Create the pod spec for the image we created. For COS, use a pod spec like this:
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: gpu-test-a100-fastpull
+spec:
+  tolerations:
+  - operator: Exists
+  nodeSelector:
+    cloud.google.com/gke-accelerator: nvidia-tesla-a100 # Use your GPU Type
+  runtimeClassName: runc-fastpull
+  containers:
+  - name: debug-container
+    image: IMAGE_PATH:-fastpull # USE FASTPULL IMAGE
+    resources:
+      limits:
+        nvidia.com/gpu: 1
+    env:
+    - name: LD_LIBRARY_PATH
+      value: /usr/local/cuda/lib64:/usr/local/nvidia/lib64 # NOTE: This path may vary depending on the base image
+```
+7. Run a pod with this spec:
+```bash
+kubectl apply -f .yaml
+```
+
+
---
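The readiness benchmark used throughout this README boils down to polling an HTTP endpoint until it returns 200 and recording the elapsed time. A minimal standalone Python sketch of that measurement loop (an illustration only, not fastpull's actual implementation; the local test server and the `time_to_readiness` helper are hypothetical stand-ins for a container's health endpoint):

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import URLError
from urllib.request import urlopen

READY_AFTER = 0.5  # simulate a server that needs 0.5s to warm up
start = time.monotonic()

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Return 503 until the simulated warm-up period has passed
        if time.monotonic() - start < READY_AFTER:
            self.send_response(503)
        else:
            self.send_response(200)
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/health"

def time_to_readiness(url: str, timeout: float = 30.0, interval: float = 0.1) -> float:
    """Poll `url` until it returns HTTP 200; return elapsed seconds."""
    t0 = time.monotonic()
    while time.monotonic() - t0 < timeout:
        try:
            if urlopen(url, timeout=2).status == 200:
                return time.monotonic() - t0
        except URLError:
            pass  # not ready yet (connection refused or non-2xx status)
        time.sleep(interval)
    raise TimeoutError(f"{url} never returned 200 within {timeout}s")

elapsed = time_to_readiness(url)
print(f"Time to Readiness: {elapsed:.3f}s")
server.shutdown()
```

The real CLI additionally records container-start time from `ctr events`; this sketch only covers the HTTP-polling half of the measurement.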
@@ -145,4 +255,4 @@ We welcome contributions! Submit a Pull Request or join our [Slack community](ht [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) -
\ No newline at end of file + diff --git a/docs/fastpull.md b/docs/fastpull.md new file mode 100644 index 0000000..cf3e839 --- /dev/null +++ b/docs/fastpull.md @@ -0,0 +1,312 @@ +# FastPull CLI - Quick Reference + +The new unified `fastpull` command-line interface for building and running containers with lazy-loading snapshotters. + +## Installation + +The setup script automatically detects your OS (Ubuntu/Debian/RHEL/CentOS/Fedora) and installs all dependencies including `python3-venv` and `wget`. + +```bash +# Full installation (containerd + Nydus + CLI) +sudo python3 scripts/setup.py + +# Install only CLI (if containerd/Nydus already installed) +sudo python3 scripts/setup.py --cli-only + +# Verify installation +fastpull --version +``` + +**Supported Package Managers:** +- `apt` (Ubuntu/Debian) +- `yum` (RHEL/CentOS 7) +- `dnf` (RHEL/CentOS 8+/Fedora) + +## Commands + +### `fastpull quickstart` - Quick Benchmark Comparisons + +Run pre-configured benchmarks to quickly compare snapshotter performance. + +#### Available Workloads + +**TensorRT:** +```bash +sudo fastpull quickstart tensorrt +sudo fastpull quickstart tensorrt --output-dir ./results +``` + +**vLLM:** +```bash +sudo fastpull quickstart vllm +sudo fastpull quickstart vllm --output-dir ./results +``` + +**SGLang:** +```bash +sudo fastpull quickstart sglang +sudo fastpull quickstart sglang --output-dir ./results +``` + +Each quickstart automatically: +1. Runs with FastPull mode (Nydus snapshotter) +2. Runs with Normal mode (OverlayFS snapshotter) +3. Measures readiness benchmarking for startup performance +4. **Auto-cleans containers and images after completion** + +--- + +### `fastpull run` - Run Containers with Benchmarking + +Run containers with FastPull (Nydus) or Normal (OverlayFS) mode. 
+ +#### Basic Usage + +```bash +# Run with FastPull mode (default, auto-adds -nydus suffix to tag) +fastpull run myapp:latest + +# Run with Normal mode (OverlayFS, no suffix) +fastpull run --mode normal myapp:latest + +# Run with GPU support +fastpull run myapp:latest --gpus all -p 8080:8080 +``` + +#### Benchmarking Modes + +**Readiness Mode** - Poll HTTP endpoint until 200 response: +```bash +fastpull run \ + myapp:latest \ + --benchmark-mode readiness \ + --readiness-endpoint http://localhost:8080/health \ + -p 8080:8080 +``` + +**Completion Mode** - Wait for container to exit: +```bash +fastpull run \ + myapp:latest \ + --benchmark-mode completion +``` + +**Export Metrics** - Save results to JSON: +```bash +fastpull run \ + myapp:latest \ + --benchmark-mode readiness \ + --readiness-endpoint http://localhost:8080/health \ + --output-json results.json \ + -p 8080:8080 +``` + +#### Supported Flags + +- `--mode` - Run mode: nydus (default, adds -nydus suffix), normal (overlayfs, no suffix) +- `IMAGE` - Container image to run (positional argument, required) +- `--benchmark-mode` - Options: none, completion, readiness (default: none) +- `--readiness-endpoint` - HTTP endpoint for health checks +- `--output-json` - Export metrics to JSON file +- `--name` - Container name +- `-p, --publish` - Publish ports (repeatable) +- `-e, --env` - Environment variables (repeatable) +- `-v, --volume` - Bind mount volumes (repeatable) +- `--gpus` - GPU devices (e.g., "all") +- `--rm` - Auto-remove container on exit +- `-d, --detach` - Run in background + +**Note:** Any additional arguments after the image are passed through to nerdctl. 
+ +#### Pass-through Examples + +```bash +# Custom entrypoint +fastpull run myapp:latest --entrypoint /bin/bash + +# Command override +fastpull run myapp:latest python script.py --arg1 value1 + +# Additional nerdctl flags +fastpull run myapp:latest --privileged --network host +``` + +--- + +### `fastpull build` - Build and Push Images in Multiple Formats + +Build Docker and snapshotter-optimized images, then push to registry. + +#### Basic Usage + +```bash +# Build Docker and Nydus (default) and push +fastpull build --dockerfile-path ./app --repository-url myapp:latest + +# Build specific formats +fastpull build \ + --dockerfile-path ./app \ + --repository-url myapp:v1 \ + --format docker,nydus +``` + +#### Build Options + +```bash +# No cache +fastpull build --dockerfile-path ./app --repository-url myapp:latest --no-cache + +# With build arguments +fastpull build \ + --dockerfile-path ./app \ + --repository-url myapp:latest \ + --build-arg VERSION=1.0 \ + --build-arg ENV=prod + +# Custom Dockerfile +fastpull build \ + --dockerfile-path ./app \ + --repository-url myapp:latest \ + --dockerfile Dockerfile.prod +``` + +#### Supported Flags + +- `--dockerfile-path` - Path to Dockerfile directory (required) +- `--repository-url` - Full image reference including registry, repository, and tag (required) +- `--format` - Comma-separated formats: docker, nydus (default: docker,nydus) +- `--no-cache` - Build without cache +- `--build-arg` - Build arguments (repeatable) +- `--dockerfile` - Dockerfile name (default: Dockerfile) + +**Note:** Images are automatically pushed to the registry after building. + +--- + +### `fastpull clean` - Remove Local Images and Artifacts + +Clean up local container images and stopped containers. 
+ +#### Basic Usage + +```bash +# Clean all images and containers (requires confirmation) +fastpull clean --all + +# Clean only images +fastpull clean --images + +# Clean only stopped containers +fastpull clean --containers + +# Target specific snapshotter +fastpull clean --all --snapshotter nydus +fastpull clean --all --snapshotter overlayfs + +# Dry run to see what would be removed +fastpull clean --all --dry-run + +# Force removal without confirmation +fastpull clean --all --force +``` + +#### Supported Flags + +- `--images` - Remove all images +- `--containers` - Remove stopped containers +- `--all` - Remove both images and containers +- `--snapshotter` - Target specific snapshotter: nydus, overlayfs, all (default: all) +- `--dry-run` - Show what would be removed without removing +- `--force` - Force removal without confirmation + +--- + +## Complete Workflow Example + +```bash +# 1. Build and push images in multiple formats +fastpull build \ + --dockerfile-path ./my-app \ + --repository-url 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.0 \ + --format docker,nydus + +# 2. Run with benchmarking (FastPull mode, auto-adds -nydus suffix) +fastpull run \ + 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.0 \ + --benchmark-mode readiness \ + --readiness-endpoint http://localhost:8000/health \ + --output-json benchmark-results.json \ + -p 8000:8000 \ + --gpus all +``` + +--- + +## Benchmarking Metrics + +When using `--benchmark-mode`, fastpull tracks: + +1. **Time to Container Start** - Using `ctr events` to monitor container lifecycle +2. 
**Time to Readiness/Completion**: + - **Readiness mode**: Polls HTTP endpoint until 200 response + - **Completion mode**: Waits for container to exit + +Example output: + +**FastPull mode (Nydus):** +``` +================================================== +FASTPULL BENCHMARK SUMMARY +================================================== +Time to Container Start: 2.34s +Time to Readiness: 45.67s +Total Elapsed Time: 48.01s +================================================== +``` + +**Normal mode (OverlayFS):** +``` +================================================== +NORMAL BENCHMARK SUMMARY +================================================== +Time to Container Start: 13.64s +Time to Readiness: 387.77s +Total Elapsed Time: 387.77s +================================================== +``` + +--- + +## Uninstallation + +```bash +# Remove fastpull CLI +sudo python3 scripts/setup.py --uninstall +``` + +--- + +## Backwards Compatibility + +The original scripts remain unchanged and continue to work: +- `scripts/build_push.py` +- `scripts/benchmark/test-bench-vllm.py` +- `scripts/benchmark/test-bench-sglang.py` +- `scripts/install_snapshotters.py` + +--- + +## Service Management + +After installation, the Nydus snapshotter service is renamed to `fastpull.service`: + +```bash +# Check status +systemctl status fastpull.service + +# Restart service +sudo systemctl restart fastpull.service + +# View logs +journalctl -u fastpull.service -f +``` diff --git a/images/alpine-loop/Dockerfile b/images/alpine-loop/Dockerfile new file mode 100644 index 0000000..bfcce27 --- /dev/null +++ b/images/alpine-loop/Dockerfile @@ -0,0 +1,3 @@ +FROM alpine:latest + +CMD ["/bin/sh", "-c", "for i in $(seq 1 1000); do echo \"Iteration $i\"; done; echo \"Loop complete\""] diff --git a/pyproject.toml b/pyproject.toml new file mode 100644 index 0000000..17a98c1 --- /dev/null +++ b/pyproject.toml @@ -0,0 +1,41 @@ +[build-system] +requires = ["setuptools>=61.0", "wheel"] +build-backend = 
"setuptools.build_meta" + +[project] +name = "fastpull" +version = "0.1.0" +description = "Accelerate AI/ML container startup with lazy-loading snapshotters" +readme = "README.md" +requires-python = ">=3.7" +license = {text = "MIT"} +authors = [ + {name = "TensorFuse", email = "saurabh@tensorfuse.io"} +] +keywords = ["containers", "docker", "fastpull", "snapshotter", "ml", "ai"] +classifiers = [ + "Development Status :: 4 - Beta", + "Intended Audience :: Developers", + "Topic :: Software Development :: Build Tools", + "License :: OSI Approved :: MIT License", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.7", + "Programming Language :: Python :: 3.8", + "Programming Language :: Python :: 3.9", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", +] + +[project.urls] +Homepage = "https://github.com/tensorfuse/fastpull" +Documentation = "https://github.com/tensorfuse/fastpull/blob/main/docs/fastpull.md" +Repository = "https://github.com/tensorfuse/fastpull" +Issues = "https://github.com/tensorfuse/fastpull/issues" + +[project.scripts] +fastpull = "scripts.fastpull.cli:main" + +[tool.setuptools.packages.find] +where = ["."] +include = ["scripts.fastpull*"] +exclude = ["docs*", "images*"] diff --git a/scripts/benchmark/benchmark_base.py b/scripts/benchmark/benchmark_base.py deleted file mode 100644 index c3f1594..0000000 --- a/scripts/benchmark/benchmark_base.py +++ /dev/null @@ -1,782 +0,0 @@ -#!/usr/bin/env python3 -""" -Generic Benchmark Framework Base Class -Provides common functionality for all ML application benchmarks. 
-""" - -import argparse -import json -import os -import queue -import re -import requests -import signal -import subprocess -import sys -import threading -import time -from abc import ABC, abstractmethod -from datetime import datetime, timezone -from typing import Dict, List, Optional, Tuple - - -def run_command(cmd, check=True, capture_output=False): - """Run a shell command and handle errors.""" - try: - if capture_output: - result = subprocess.run(cmd, shell=True, check=check, capture_output=True, text=True) - return result.stdout.strip() - else: - subprocess.run(cmd, shell=True, check=check) - except subprocess.CalledProcessError as e: - print(f"Error running command: {cmd}") - print(f"Error: {e}") - if capture_output and e.stdout: - print(f"Stdout: {e.stdout}") - if capture_output and e.stderr: - print(f"Stderr: {e.stderr}") - raise - - -def check_aws_credentials(): - """Check if AWS credentials are configured.""" - try: - run_command("aws sts get-caller-identity", capture_output=True) - print("✓ AWS credentials are configured") - return True - except: - print("Warning: AWS credentials not configured. 
Please run 'aws configure' first.") - return False - - -def docker_login_ecr(account=None, region="us-east-1"): - """Login to ECR using both docker and nerdctl.""" - print("Checking AWS credentials and logging into ECR...") - - if not check_aws_credentials(): - print("Skipping ECR login due to missing AWS credentials") - return False - - if not account: - # Try to get account from AWS STS - try: - account_info = run_command("aws sts get-caller-identity --query Account --output text", capture_output=True) - account = account_info.strip() - print(f"Auto-detected AWS account: {account}") - except: - print("Could not auto-detect AWS account ID") - return False - - try: - password = run_command(f"aws ecr get-login-password --region {region}", capture_output=True) - registry = f"{account}.dkr.ecr.{region}.amazonaws.com" - - # Login with docker - login_cmd = f"echo '{password}' | docker login -u AWS --password-stdin {registry}" - run_command(login_cmd, check=False) - - # Login with nerdctl - login_cmd = f"echo '{password}' | nerdctl login -u AWS --password-stdin {registry}" - run_command(login_cmd, check=False) - - # Login with sudo nerdctl - login_cmd = f"echo '{password}' | sudo nerdctl login -u AWS --password-stdin {registry}" - run_command(login_cmd, check=False) - - print("✓ Successfully logged into ECR") - return True - - except Exception as e: - print(f"Warning: Could not login to ECR: {e}") - return False - - -def construct_ecr_image(repo: str, tag: str, snapshotter: str, region: str = "us-east-1") -> str: - """Construct ECR image URL from repo, tag, and snapshotter.""" - try: - # Get AWS account ID - account_info = run_command("aws sts get-caller-identity --query Account --output text", capture_output=True) - account = account_info.strip() - - # Add snapshotter suffix to tag (except for overlayfs/native which use base tag) - if snapshotter in ["overlayfs", "native"]: - final_tag = tag - else: - final_tag = f"{tag}-{snapshotter}" - - return 
f"{account}.dkr.ecr.{region}.amazonaws.com/{repo}:{final_tag}" - - except Exception as e: - raise ValueError(f"Could not construct ECR image URL: {e}. Ensure AWS credentials are configured.") - - -class BenchmarkBase(ABC): - """Abstract base class for all benchmarks.""" - - def __init__(self, image: str, container_name: str, snapshotter: str = "nydus", port: int = 8080, model_mount_path: str = None): - self.image = image - self.container_name = container_name - self.snapshotter = snapshotter - self.port = port - self.model_mount_path = model_mount_path - self.start_time = None - self.phases = {} - self.log_queue = queue.Queue() - self.should_stop = threading.Event() - - # Container events monitoring - self.ctr_events_queue = queue.Queue() - self.ctr_events_thread = None - self.container_create_time = None - self.container_start_time = None - self.container_startup_duration = None - - # Health endpoint polling - self.health_thread = None - self.health_ready_time = None - self.health_ready_event = threading.Event() - self.interrupted = False - - # Initialize phases from subclass - self._init_phases() - - @abstractmethod - def _init_phases(self) -> None: - """Initialize the phases dictionary for the specific application.""" - pass - - @abstractmethod - def analyze_log_line(self, line: str, timestamp: float) -> Optional[str]: - """Analyze a log line and return detected phase. Must be implemented by subclass.""" - pass - - - @abstractmethod - def get_default_image(self, snapshotter: str) -> str: - """Get default image for the snapshotter. Must be implemented by subclass.""" - pass - - def get_health_endpoint(self) -> Optional[str]: - """Get health endpoint for the application. Override in subclasses.""" - return None - - def supports_health_polling(self) -> bool: - """Check if this application supports health endpoint polling. 
Override in subclasses.""" - return False - - def get_elapsed_time(self) -> float: - """Get elapsed time since start in seconds.""" - if self.start_time is None: - return 0.0 - return time.time() - self.start_time - - def start_ctr_events_monitor(self): - """Start monitoring containerd events in a separate thread.""" - def monitor_events(): - try: - cmd = ["sudo", "ctr", "events"] - process = subprocess.Popen( - cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, - text=True, - bufsize=1, - universal_newlines=True - ) - - while not self.should_stop.is_set(): - line = process.stdout.readline() - if not line: - if process.poll() is not None: - break - time.sleep(0.1) - continue - - self.ctr_events_queue.put((time.time(), line.strip())) - - process.terminate() - process.wait() - - except Exception as e: - print(f"Error monitoring ctr events: {e}") - - self.ctr_events_thread = threading.Thread(target=monitor_events, daemon=True) - self.ctr_events_thread.start() - return self.ctr_events_thread - - def process_ctr_events(self): - """Process containerd events to track container lifecycle timing.""" - while not self.should_stop.is_set(): - try: - timestamp, line = self.ctr_events_queue.get(timeout=1.0) - - # Parse containerd event line - # Format: TIMESTAMP NAMESPACE EVENT_TYPE DATA - parts = line.split(' ', 3) - if len(parts) < 4: - continue - - event_timestamp_str = f"{parts[0]} {parts[1]}" - namespace = parts[2] - event_type = parts[3] - - # Parse the event timestamp - try: - # Remove timezone info for parsing, then add it back - ts_clean = event_timestamp_str.replace(" +0000 UTC", "") - event_time = datetime.fromisoformat(ts_clean.replace(' ', 'T')) - event_time = event_time.replace(tzinfo=timezone.utc) - event_timestamp = event_time.timestamp() - except: - event_timestamp = timestamp # Fallback to capture time - - # Look for task start event (any task since only one container is running) - if "/tasks/start" in event_type and self.container_start_time is None: - 
self.container_start_time = event_timestamp - if self.container_create_time: - self.container_startup_duration = self.container_start_time - self.container_create_time - elapsed = event_timestamp - self.start_time if self.start_time else 0 - print(f"[{elapsed:.3f}s] ✓ CONTAINER START (startup: {self.container_startup_duration:.3f}s)") - break # We found what we needed - stop monitoring - - except queue.Empty: - continue - except KeyboardInterrupt: - break - - def cleanup_container(self): - """Remove any existing container with the same name.""" - try: - nerdctl_snapshotter = self.get_nerdctl_snapshotter() - cmd = ["sudo", "nerdctl", "--snapshotter", nerdctl_snapshotter, "rm", "-f", self.container_name] - subprocess.run(cmd, capture_output=True, check=False) - except Exception as e: - print(f"Warning: Could not cleanup container: {e}") - - def start_container(self) -> bool: - """Start the container and return success status.""" - try: - # Start ctr events monitoring before container creation - print("Starting containerd events monitoring...") - self.start_ctr_events_monitor() - - # Start processing events in background - events_thread = threading.Thread(target=self.process_ctr_events, daemon=True) - events_thread.start() - - # Small delay to ensure events monitoring is ready - time.sleep(0.5) - - nerdctl_snapshotter = self.get_nerdctl_snapshotter() - cmd = [ - "sudo", "nerdctl", "--snapshotter", nerdctl_snapshotter, "run", - "--name", self.container_name, - "--gpus", "all", - "--detach", - "--publish", f"{self.port}:8000" - ] - - # Add volume mounts if model mount path is provided - if self.model_mount_path: - cmd.extend([ - "--volume", f"{self.model_mount_path}/huggingface:/workspace/huggingface", - "--volume", f"{self.model_mount_path}/hf-xet-cache:/workspace/hf-xet-cache" - ]) - - cmd.append(self.image) - - print(f"Running command: {' '.join(cmd)}") - # Set container creation time just before running nerdctl command - self.container_create_time = time.time() - if 
self.start_time is not None:
-                elapsed = self.container_create_time - self.start_time
-                print(f"[{elapsed:.3f}s] ✓ CONTAINER CREATE (nerdctl run started)")
-            else:
-                print("No start time is set")
-
-            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
-            return True
-
-        except subprocess.CalledProcessError as e:
-            print(f"Error starting container: {e}")
-            print(f"STDERR: {e.stderr}")
-            return False
-
-    def monitor_logs(self):
-        """Monitor container logs in a separate thread."""
-        def log_reader():
-            try:
-                cmd = ["sudo", "nerdctl", "--snapshotter", self.snapshotter, "logs", "-f", self.container_name]
-                process = subprocess.Popen(
-                    cmd,
-                    stdout=subprocess.PIPE,
-                    stderr=subprocess.STDOUT,
-                    text=True,
-                    bufsize=1,
-                    universal_newlines=True
-                )
-
-                while not self.should_stop.is_set():
-                    line = process.stdout.readline()
-                    if not line:
-                        if process.poll() is not None:
-                            break
-                        time.sleep(0.1)
-                        continue
-
-                    self.log_queue.put((time.time(), line.strip()))
-
-                process.terminate()
-
-            except Exception as e:
-                print(f"Error monitoring logs: {e}")
-
-        log_thread = threading.Thread(target=log_reader, daemon=True)
-        log_thread.start()
-        return log_thread
-
-    def start_health_polling(self):
-        """Start health endpoint polling in a separate thread."""
-        if not self.supports_health_polling():
-            return None
-
-        def health_poller():
-            endpoint = self.get_health_endpoint()
-            if not endpoint:
-                return
-
-            url = f"http://localhost:{self.port}/{endpoint}"
-            print(f"Starting health polling for endpoint: {url}")
-
-            # Poll with 0.1 second intervals, timeout after 20 minutes
-            start_time = time.time()
-            timeout = 20 * 60  # 20 minutes
-
-            while not self.should_stop.is_set() and not self.health_ready_event.is_set():
-                if time.time() - start_time > timeout:
-                    print(f"Health polling timed out after {timeout}s")
-                    break
-
-                # Check for interrupt
-                if self.interrupted:
-                    print("Health polling interrupted by user")
-                    break
-
-                try:
-                    response = requests.get(url, timeout=5)
-                    if response.status_code == 200:
-                        self.health_ready_time = time.time() - self.start_time
-                        elapsed = self.health_ready_time
-                        print(f"[{elapsed:.3f}s] ✓ SERVER READY (HTTP 200)")
-
-                        # Set server ready time from health check
-                        self.phases["server_ready"] = self.health_ready_time
-                        self.health_ready_event.set()
-                        break
-
-                except requests.exceptions.RequestException:
-                    # Connection failed, server not ready yet
-                    pass
-
-                time.sleep(0.1)  # Wait 0.1 seconds before next poll
-
-        self.health_thread = threading.Thread(target=health_poller, daemon=True)
-        self.health_thread.start()
-        return self.health_thread
-
-    def process_logs(self, timeout: int = 1200):
-        """Process logs and detect phases."""
-        print("Monitoring container logs...")
-        log_thread = self.monitor_logs()
-
-        # Start health polling if supported
-        health_thread = None
-
-        start_monitoring = time.time()
-
-        while time.time() - start_monitoring < timeout:
-            try:
-                timestamp, line = self.log_queue.get(timeout=1.0)
-                elapsed = timestamp - self.start_time
-
-                # Detect first log
-                if "first_log" in self.phases and self.phases["first_log"] is None:
-                    self.phases["first_log"] = elapsed
-                    print(f"[{elapsed:.3f}s] ✓ FIRST LOG")
-
-                    # Start health polling after first log if supported
-                    if self.supports_health_polling() and not health_thread:
-                        health_thread = self.start_health_polling()
-
-                phase = self.analyze_log_line(line, timestamp)
-
-                if phase:
-                    print(f"[{elapsed:.3f}s] ✓ {phase.upper().replace('_', ' ')}")
-
-                print(f"[{elapsed:.3f}s] {line}")
-
-                # Check if we should stop monitoring
-                if self._should_stop_monitoring(elapsed):
-                    break
-
-            except queue.Empty:
-                # Check if we should stop monitoring even when no new logs
-                elapsed = time.time() - self.start_time
-                if self._should_stop_monitoring(elapsed):
-                    break
-                continue
-            except KeyboardInterrupt:
-                print("\nReceived interrupt signal...")
-                break
-
-        self.should_stop.set()
-
-    def _should_stop_monitoring(self, elapsed: float) -> bool:
-        """Determine if we should stop monitoring logs. Should be overridden by subclasses."""
-        # For applications that support health polling, stop only after health check succeeds
-        if self.supports_health_polling():
-            return self.health_ready_event.is_set()
-        return False
-
-
-    def stop_container(self):
-        """Stop and remove the container. Wait for health check or timeout first."""
-        try:
-            # For applications that support health polling, wait for health check or timeout
-            # But skip waiting if interrupted by user
-            if (self.supports_health_polling() and not self.health_ready_event.is_set()
-                    and not self.interrupted):
-                print("Waiting for health check success or timeout before stopping container...")
-                timeout = 20 * 60  # 20 minutes
-                if self.health_ready_event.wait(timeout):
-                    if not self.interrupted:
-                        print("Health check succeeded, proceeding with container stop")
-                    else:
-                        print("Interrupted during health check, proceeding with container stop")
-                else:
-                    print("Health check timed out, proceeding with container stop")
-            elif self.interrupted:
-                print("Skipping health check wait due to interrupt, proceeding with container stop")
-
-            self.should_stop.set()
-            # Stop the container
-            cmd_stop = ["sudo", "nerdctl", "--snapshotter", self.snapshotter, "stop", self.container_name]
-            subprocess.run(cmd_stop, capture_output=True, check=False, timeout=30)
-
-            # Wait for container to fully stop
-            time.sleep(2)
-
-            # Remove the container
-            cmd_rm = ["sudo", "nerdctl", "--snapshotter", self.snapshotter, "rm", self.container_name]
-            subprocess.run(cmd_rm, capture_output=True, check=False, timeout=60)
-
-            # Wait a moment for the container removal to be fully processed
-            time.sleep(2)
-
-        except Exception as e:
-            print(f"Warning: Could not stop/remove container cleanly: {e}")
-
-    def get_nerdctl_snapshotter(self) -> str:
-        """Get the correct snapshotter name for nerdctl commands."""
-        # Map estargz to stargz for nerdctl compatibility
-        if self.snapshotter == "estargz":
-            return "stargz"
-        return self.snapshotter
-
-    def cleanup_soci_snapshotter(self):
-        """Perform SOCI-specific cleanup: remove state directory and restart service."""
-        if self.snapshotter != "soci":
-            return
-
-        try:
-            print("Performing SOCI-specific cleanup...")
-
-            # Remove SOCI state directory
-            print("Removing SOCI state directory...")
-            cmd_rm = ["sudo", "rm", "-rf", "/var/lib/soci-snapshotter-grpc/"]
-            result = subprocess.run(cmd_rm, capture_output=True, text=True, check=False, timeout=30)
-
-            if result.returncode == 0:
-                print("SOCI state directory removed successfully")
-            else:
-                print(f"Warning: Could not remove SOCI state directory: {result.stderr}")
-
-            # Restart SOCI snapshotter service
-            print("Restarting SOCI snapshotter service...")
-            cmd_restart = ["sudo", "systemctl", "restart", "soci-snapshotter-grpc.service"]
-            result = subprocess.run(cmd_restart, capture_output=True, text=True, check=False, timeout=30)
-
-            if result.returncode == 0:
-                print("SOCI snapshotter service restarted successfully")
-                # Give the service a moment to start
-                time.sleep(2)
-            else:
-                print(f"Warning: Could not restart SOCI snapshotter service: {result.stderr}")
-
-        except Exception as e:
-            print(f"Warning: Could not perform SOCI cleanup: {e}")
-
-    def cleanup_images(self):
-        """Remove the image to ensure fresh pulls for testing."""
-        try:
-            print(f"Removing image {self.image} for clean testing...")
-
-            nerdctl_snapshotter = self.get_nerdctl_snapshotter()
-
-            # First, try with image name/tag
-            cmd_rmi = ["sudo", "nerdctl", "--snapshotter", nerdctl_snapshotter, "rmi", self.image]
-            result = subprocess.run(cmd_rmi, capture_output=True, text=True, check=False, timeout=60)
-
-            if result.returncode == 0:
-                print("Image removed successfully")
-                return
-
-            # If that fails, get the image ID and try with that
-            print("Trying to remove by image ID...")
-            cmd_images = ["sudo", "nerdctl", "--snapshotter", nerdctl_snapshotter, "images", "--format", "{{.ID}}", self.image]
-            images_result = subprocess.run(cmd_images, capture_output=True, text=True, check=False, timeout=30)
-
-            if images_result.returncode == 0 and images_result.stdout.strip():
-                image_id = images_result.stdout.strip().split('\n')[0]
-                cmd_rmi_id = ["sudo", "nerdctl", "--snapshotter", nerdctl_snapshotter, "rmi", image_id]
-                id_result = subprocess.run(cmd_rmi_id, capture_output=True, text=True, check=False, timeout=60)
-
-                if id_result.returncode == 0:
-                    print(f"Image removed successfully using ID: {image_id}")
-                else:
-                    print(f"Could not remove image by ID: {id_result.stderr}")
-            else:
-                print(f"Note: Could not find or remove image: {result.stderr}")
-
-        except Exception as e:
-            print(f"Warning: Could not remove image: {e}")
-
-    def print_summary(self, total_time: float):
-        """Print timing summary."""
-        print("\n" + "="*50)
-        print(f"{self.__class__.__name__.replace('Benchmark', '').upper()} TIMING SUMMARY")
-        print("="*50)
-
-        for label, value in self._get_summary_items(total_time):
-            if label == "":
-                print()  # Empty line
-            elif label.endswith(":") and value is None:
-                print(label)  # Section header
-            elif value is not None:
-                print(f"{label:<30} {value:.3f}s")
-            else:
-                print(f"{label:<30} N/A")
-
-        print("="*50)
-
-    def _get_summary_items(self, total_time: float) -> List[Tuple[str, Optional[float]]]:
-        """Get summary items for printing. Must be overridden by subclasses."""
-        items = []
-
-        # Add container startup time at the beginning
-        items.append(("Container Startup Time:", self.container_startup_duration))
-
-        for phase_key, phase_value in self.phases.items():
-            label = phase_key.replace('_', ' ').title() + ":"
-            items.append((label, phase_value))
-        items.append(("Total Test Time:", total_time))
-        return items
-
-    def run_benchmark(self) -> Dict[str, Optional[float]]:
-        """Run the complete benchmark."""
-        app_name = self.__class__.__name__.replace('Benchmark', '')
-        print(f"=== {app_name} Startup Timing Test ===")
-        print(f"Image: {self.image}")
-        print(f"Snapshotter: {self.snapshotter}")
-        print(f"Port: {self.port}")
-        print()
-
-        # Check AWS credentials and login to ECR if needed
-        if ".ecr." in self.image:
-            print("ECR image detected, attempting AWS login...")
-            # Extract region from image URL if possible, otherwise use default
-            region = "us-east-1"  # Default region
-            if hasattr(self, '_region'):
-                region = self._region
-            docker_login_ecr(region=region)
-
-        # Cleanup
-        print("Cleaning up existing containers...")
-        self.cleanup_container()
-        self.cleanup_soci_snapshotter()
-
-        # Start timing
-        self.start_time = time.time()
-        start_datetime = datetime.fromtimestamp(self.start_time)
-        print(f"Test started at: {start_datetime.strftime('%Y-%m-%d %H:%M:%S')}")
-        print()
-
-        try:
-            # Start container
-            print("Starting container...")
-            if not self.start_container():
-                print("Failed to start container")
-                return self.phases
-
-            # Wait a moment for container to initialize
-            time.sleep(2)
-
-            # Monitor logs
-            self.process_logs()
-
-        except KeyboardInterrupt:
-            print("\nBenchmark interrupted by user")
-            self.interrupted = True
-            self.should_stop.set()
-            self.health_ready_event.set()  # Stop waiting for health check
-        except Exception as e:
-            print(f"Error during benchmark: {e}")
-        finally:
-            # Cleanup
-            print("\nCleaning up...")
-            self.stop_container()
-            if hasattr(self, '_keep_image') and not self._keep_image:
-                self.cleanup_images()
-            self.cleanup_soci_snapshotter()
-
-        # Calculate total time and print summary
-        total_time = time.time() - self.start_time
-        self.print_summary(total_time)
-
-        return self.phases
-
-    def create_arg_parser(self, description: str) -> argparse.ArgumentParser:
-        """Create standard argument parser for benchmarks."""
-        parser = argparse.ArgumentParser(description=description)
-
-        # Image specification - either full image or repo + tag
-        image_group = parser.add_mutually_exclusive_group()
-        image_group.add_argument(
-            "--image",
-            help="Full container image to test (e.g., registry.com/repo:tag-snapshotter)"
-        )
-        image_group.add_argument(
-            "--repo",
-            help="ECR repository name (e.g., my-vllm-app). Will construct full ECR URL automatically"
-        )
-
-        parser.add_argument(
-            "--tag",
-            default="latest",
-            help="Image tag base (default: latest). Snapshotter suffix will be appended (e.g., latest-nydus)"
-        )
-        parser.add_argument(
-            "--region",
-            default="us-east-1",
-            help="AWS region for ECR (default: us-east-1)"
-        )
-        parser.add_argument(
-            "--container-name",
-            default=f"{self.__class__.__name__.lower().replace('benchmark', '')}-timing-test",
-            help="Name for the test container"
-        )
-        parser.add_argument(
-            "--snapshotter",
-            default="nydus",
-            choices=["nydus", "overlayfs", "native", "soci", "estargz"],
-            help="Snapshotter to use"
-        )
-        parser.add_argument(
-            "--port",
-            type=int,
-            default=self.port,
-            help=f"Local port to bind (default: {self.port})"
-        )
-        parser.add_argument(
-            "--model-mount-path",
-            help="Path to local SSD directory to mount for model storage (e.g., /mnt/ssd/models)"
-        )
-        parser.add_argument(
-            "--output-json",
-            help="Output results to JSON file"
-        )
-        parser.add_argument(
-            "--keep-image",
-            action="store_true",
-            help="Don't remove image after test (faster for repeated runs)"
-        )
-        return parser
-
-    def save_results(self, results: Dict[str, Optional[float]], output_file: str,
-                     image: str, snapshotter: str):
-        """Save results to JSON file."""
-        output_data = {
-            "application": self.__class__.__name__.replace('Benchmark', '').lower(),
-            "snapshotter": snapshotter,
-            "image": image,
-            "timestamp": datetime.now().isoformat(),
-            "phases": results,
-            "container_startup_duration": self.container_startup_duration,
-            "health_ready_time": self.health_ready_time,
-            "supports_health_polling": self.supports_health_polling()
-        }
-
-        with open(output_file, 'w') as f:
-            json.dump(output_data, f, indent=2)
-
-        print(f"\nResults saved to: {output_file}")
-
-    def setup_signal_handler(self):
-        """Setup graceful interrupt handling."""
-        def signal_handler(sig, frame):
-            print("\nReceived interrupt signal, cleaning up...")
-            self.interrupted = True
-            self.should_stop.set()
-            self.health_ready_event.set()  # Stop waiting for health check
-            # Don't exit immediately, let cleanup happen
-
-        signal.signal(signal.SIGINT, signal_handler)
-
-    def main(self, description: str) -> int:
-        """Main execution method for benchmark scripts."""
-        parser = self.create_arg_parser(description)
-        args = parser.parse_args()
-
-        # Determine image to use
-        if args.image:
-            # Full image provided
-            final_image = args.image
-        elif args.repo:
-            # Construct ECR image from repo + tag + snapshotter
-            final_image = construct_ecr_image(args.repo, args.tag, args.snapshotter, args.region)
-            print(f"Constructed ECR image: {final_image}")
-        else:
-            # Fall back to default image from subclass
-            final_image = self.get_default_image(args.snapshotter)
-
-        # Update instance with parsed arguments
-        self.image = final_image
-        self.container_name = args.container_name
-        self.snapshotter = args.snapshotter
-        self.port = args.port
-        self.model_mount_path = args.model_mount_path
-        self._keep_image = args.keep_image
-        self._region = args.region
-
-        # Setup signal handling
-        self.setup_signal_handler()
-
-        # Override image cleanup if requested
-        if args.keep_image:
-            self.cleanup_images = lambda: print("Keeping image as requested")
-
-        # Run benchmark
-        results = self.run_benchmark()
-
-        # Output JSON if requested
-        if args.output_json:
-            self.save_results(results, args.output_json, self.image, args.snapshotter)
-
-
-        return 0 if self._is_successful(results) else 1
-
-
-    def _is_successful(self, results: Dict[str, Optional[float]]) -> bool:
-        """Determine if benchmark was successful. Can be overridden by subclasses."""
-        # Default: successful if we have first_log timing
-        return results.get("first_log") is not None
\ No newline at end of file
diff --git a/scripts/benchmark/test-bench-sglang.py b/scripts/benchmark/test-bench-sglang.py
deleted file mode 100644
index f63978e..0000000
--- a/scripts/benchmark/test-bench-sglang.py
+++ /dev/null
@@ -1,248 +0,0 @@
-#!/usr/bin/env python3
-"""
-SGLang Inference Server Benchmark
-Measures container startup and SGLang readiness times with different snapshotters.
-
-LOG PATTERN DETECTION & PHASES:
-===============================
-
-This benchmark monitors SGLang inference server logs and detects the following phases:
-
-1. SGLANG_INIT (SGLang Framework Initialization)
-   - Patterns: "starting sglang", "sglang server", "initializing sglang", "launch_server"
-   - Detects: SGLang framework startup and server initialization
-
-2. WEIGHTS_DOWNLOAD (Weight Download Start)
-   - Patterns: "load weight begin"
-   - Detects: Beginning of model weight loading process
-
-3. WEIGHTS_DOWNLOAD_COMPLETE (Weight Download Complete)
-   - Patterns: "loading safetensors checkpoint shards: 0%"
-   - Detects: First safetensors checkpoint loading starts
-
-4. WEIGHTS_LOADED (Weights Loaded)
-   - Patterns: "load weight end"
-   - Detects: Completion of weight loading phase
-
-5. KV_CACHE_ALLOCATED (KV Cache Setup)
-   - Patterns: "kv cache is allocated", "kv cache allocated"
-   - Detects: Key-value cache memory allocation for inference
-
-6. GRAPH_CAPTURE_BEGIN (CUDA Graph Start)
-   - Patterns: "capture cuda graph begin", "capturing cuda graph"
-   - Detects: Beginning of CUDA graph capture for optimization
-
-7. GRAPH_CAPTURE_END (CUDA Graph Complete)
-   - Patterns: "capture cuda graph end", "cuda graph capture complete"
-   - Detects: CUDA graph capture completion
-
-8. SERVER_LOG_READY (Server Log Ready)
-   - Patterns: "starting server", "server starting", "uvicorn", "listening on"
-   - Detects: HTTP/API server initialization (log-based)
-
-9. SERVER_READY (Server Ready)
-   - Tested via HTTP requests to /health_generate endpoint with 0.1s polling
-   - Detects: API actually responding with valid HTTP 200 responses
-
-MONITORING BEHAVIOR:
-===================
-- Timeout: 20 minutes (model loading and optimization can be slow)
-- Container Status: Monitors container health during startup
-- Health Polling: Polls /health_generate endpoint every 0.1 seconds after first log
-- Success Criteria: HTTP 200 response from health endpoint
-- Port: Maps container port 8000 to specified local port
-- Stop Condition: Immediately after health endpoint returns 200
-
-EXAMPLE LOG FLOW:
-================
-[20.145s] starting sglang → sglang_init
-[119.058s] load weight begin → weights_download
-[200.525s] loading safetensors checkpoint shards: 0% → weights_download_complete
-[233.778s] load weight end → weights_loaded
-[233.828s] kv cache is allocated → kv_cache_allocated
-[245.123s] capture cuda graph begin → graph_capture_begin
-[267.890s] capture cuda graph end → graph_capture_end
-[289.456s] starting server → server_log_ready
-[291.789s] HTTP 200 /health_generate → server_ready
-"""
-
-import requests
-import json
-import time
-from typing import Dict, Optional
-from benchmark_base import BenchmarkBase
-
-
-class SGLangBenchmark(BenchmarkBase):
-    def __init__(self, image: str = "", container_name: str = "sglang-timing-test",
-                 snapshotter: str = "nydus", port: int = 8000):
-        super().__init__(image, container_name, snapshotter, port)
-
-    def get_health_endpoint(self) -> str:
-        """Get health endpoint for SGLang application."""
-        return "health_generate"
-
-    def supports_health_polling(self) -> bool:
-        """SGLang application supports health endpoint polling."""
-        return True
-
-    def _should_stop_monitoring(self, elapsed: float) -> bool:
-        """Custom stop monitoring logic for SGLang."""
-        # Use base class logic for health polling apps
-        return super()._should_stop_monitoring(elapsed)
-
-    def _init_phases(self) -> None:
-        """Initialize the phases dictionary for SGLang."""
-        self.phases = {
-            "first_log": None,
-            "sglang_init": None,
-            "model_loading": None,
-            "weights_download": None,
-            "weights_download_complete": None,
-            "weights_loaded": None,
-            "kv_cache_allocated": None,
-            "graph_capture_begin": None,
-            "graph_capture_end": None,
-            "model_loaded": None,
-            "server_log_ready": None,
-            "server_ready": None
-        }
-
-    def analyze_log_line(self, line: str, timestamp: float) -> Optional[str]:
-        """Analyze a log line and return detected phase."""
-        elapsed = timestamp - self.start_time
-        line_lower = line.lower()
-
-        # SGLang initialization
-        if self.phases["sglang_init"] is None:
-            if any(pattern in line_lower for pattern in [
-                "starting sglang", "sglang server", "initializing sglang", "launch_server"
-            ]):
-                self.phases["sglang_init"] = elapsed
-                return "sglang_init"
-
-        # Weight download start (was "load weight begin")
-        if self.phases["weights_download"] is None:
-            if "load weight begin" in line_lower:
-                self.phases["weights_download"] = elapsed
-                return "weights_download"
-
-        # Weight download complete (first loading safetensors)
-        if self.phases["weights_download_complete"] is None:
-            if "loading safetensors checkpoint shards:" in line_lower and "0%" in line_lower:
-                self.phases["weights_download_complete"] = elapsed
-                return "weights_download_complete"
-
-        # Weights loaded (was "load weight end")
-        if self.phases["weights_loaded"] is None:
-            if "load weight end" in line_lower:
-                self.phases["weights_loaded"] = elapsed
-                return "weights_loaded"
-
-        # KV cache allocation
-        if self.phases["kv_cache_allocated"] is None:
-            if any(pattern in line_lower for pattern in [
-                "kv cache is allocated", "kv cache allocated"
-            ]):
-                self.phases["kv_cache_allocated"] = elapsed
-                return "kv_cache_allocated"
-
-        # CUDA graph capture begin
-        if self.phases["graph_capture_begin"] is None:
-            if any(pattern in line_lower for pattern in [
-                "capture cuda graph begin", "capturing cuda graph"
-            ]):
-                self.phases["graph_capture_begin"] = elapsed
-                return "graph_capture_begin"
-
-        # CUDA graph capture end
-        if self.phases["graph_capture_end"] is None:
-            if any(pattern in line_lower for pattern in [
-                "capture cuda graph end", "cuda graph capture complete"
-            ]):
-                self.phases["graph_capture_end"] = elapsed
-                return "graph_capture_end"
-
-        # Server log ready pattern
-        if self.phases["server_log_ready"] is None:
-            if any(pattern in line_lower for pattern in [
-                "starting server", "server starting", "uvicorn", "listening on"
-            ]):
-                self.phases["server_log_ready"] = elapsed
-                return "server_log_ready"
-
-        return None
-
-    def test_api_readiness(self, timeout: int = 120) -> bool:
-        """SGLang benchmark doesn't test API readiness - stops after server ready."""
-        print("Skipping API readiness test - stopping after server ready detection")
-        return True
-
-    def get_default_image(self, snapshotter: str) -> str:
-        """Get default image for the snapshotter. Users should now use --repo parameter instead."""
-        raise ValueError(
-            "No default image configured. Please specify either:\n"
-            " --repo (e.g., --repo my-sglang-app)\n"
-            " --image (e.g., --image registry.com/repo:tag)\n"
-            "\nExample: python test-bench-sglang.py --repo saurabh-sglang-test --tag latest --snapshotter nydus"
-        )
-
-
-
-    def _get_summary_items(self, total_time):
-        """Get summary items for printing."""
-        items = [
-            ("Container Startup Time:", self.container_startup_duration),
-            ("Container to First Log:", self.phases["first_log"]),
-            ("SGLang Initialization:", self.phases["sglang_init"]),
-            ("Weight Download Start:", self.phases["weights_download"]),
-            ("Weight Download Complete:", self.phases["weights_download_complete"]),
-            ("Weights Loaded:", self.phases["weights_loaded"]),
-            ("KV Cache Allocated:", self.phases["kv_cache_allocated"]),
-            ("Graph Capture Begin:", self.phases["graph_capture_begin"]),
-            ("Graph Capture End:", self.phases["graph_capture_end"]),
-            ("Server Log Ready:", self.phases["server_log_ready"]),
-            ("Server Ready:", self.phases["server_ready"]),
-            ("Total Test Time:", total_time)
-        ]
-
-        # Add breakdown section
-        items.append(("", None))  # Empty line separator
-        items.append(("BREAKDOWN:", None))
-
-        # Calculate breakdowns
-        if self.phases["first_log"] is not None:
-            items.append(("Container to First Log:", self.phases["first_log"]))
-
-        if self.phases["first_log"] is not None and self.phases["weights_download"] is not None:
-            first_to_download = self.phases["weights_download"] - self.phases["first_log"]
-            items.append(("First Log to Weight Download Start:", first_to_download))
-
-        if self.phases["weights_download"] is not None and self.phases["weights_download_complete"] is not None:
-            download_duration = self.phases["weights_download_complete"] - self.phases["weights_download"]
-            items.append(("Weight Download Start to Complete:", download_duration))
-
-        if self.phases["weights_download_complete"] is not None and self.phases["weights_loaded"] is not None:
-            download_to_loaded = self.phases["weights_loaded"] - self.phases["weights_download_complete"]
-            items.append(("Weight Download Complete to Weights Loaded:", download_to_loaded))
-
-        if self.phases["weights_loaded"] is not None and self.phases["server_ready"] is not None:
-            loaded_to_ready = self.phases["server_ready"] - self.phases["weights_loaded"]
-            items.append(("Weights Loaded to Server Ready:", loaded_to_ready))
-
-        return items
-
-    def _is_successful(self, results: Dict[str, Optional[float]]) -> bool:
-        """Determine if benchmark was successful."""
-        return results.get("server_ready") is not None
-
-
-def main():
-    benchmark = SGLangBenchmark()
-    return benchmark.main("SGLang Container Startup Benchmark")
-
-
-if __name__ == "__main__":
-    import sys
-    import subprocess
-    sys.exit(main())
\ No newline at end of file
diff --git a/scripts/benchmark/test-bench-tensorrt.py b/scripts/benchmark/test-bench-tensorrt.py
deleted file mode 100755
index 2f5586f..0000000
--- a/scripts/benchmark/test-bench-tensorrt.py
+++ /dev/null
@@ -1,218 +0,0 @@
-#!/usr/bin/env python3
-"""
-TensorRT-LLM Startup Timing Benchmark
-Measures container startup and TensorRT-LLM readiness times with different snapshotters.
-
-LOG PATTERN DETECTION & PHASES:
-===============================
-
-This benchmark monitors TensorRT-LLM server logs and detects the following phases:
-
-1. ENGINE_INIT (TensorRT-LLM Engine Initialization)
-   - Patterns: "PyTorchConfig(", "TensorRT-LLM version", "KV cache quantization"
-   - Detects: TensorRT-LLM engine initialization and configuration
-
-2. WEIGHT_DOWNLOAD_START (Weight Download Start)
-   - Patterns: "Prefetching", "checkpoint files", "Use.*GB for model weights"
-   - Detects: Beginning of model weight download/prefetching to memory
-
-3. WEIGHT_DOWNLOAD_COMPLETE (Weight Download Complete)
-   - Patterns: "Loading /workspace/huggingface", first model loading line
-   - Detects: All model weights downloaded and loading starts
-
-4. WEIGHTS_LOADED (Weight Loading Complete)
-   - Patterns: "Loading weights: 100%", "Model init total"
-   - Detects: Model weights fully loaded into memory
-
-5. MODEL_LOADED (Model Fully Loaded)
-   - Patterns: "Autotuning process ends", "Autotuner Cache size", memory configuration
-   - Detects: Complete model initialization with autotuning and optimization
-
-6. SERVER_LOG_READY (Server Log Ready)
-   - Patterns: "Started server process", "Waiting for application startup"
-   - Detects: Uvicorn/FastAPI server initialization (log-based)
-
-7. SERVER_READY (Server Ready)
-   - Tested via HTTP requests to /health endpoint with 0.1s polling
-   - Detects: API actually responding with valid HTTP 200 responses
-
-MONITORING BEHAVIOR:
-===================
-- Timeout: 25 minutes (model loading and autotuning can be very slow)
-- Container Status: Monitors container health during startup
-- Health Polling: Polls /health endpoint every 0.1 seconds after first log
-- Success Criteria: HTTP 200 response from health endpoint
-- Port: Maps container port 8000 to specified local port
-- Stop Condition: Immediately after health endpoint returns 200
-
-EXAMPLE LOG FLOW:
-================
-[10.230s] Starting TensorRT-LLM server → first_log
-[73.120s] PyTorchConfig( → engine_init
-[76.780s] Prefetching 15.26GB checkpoint → weight_download_start
-[130.450s] Loading /workspace/huggingface → weight_download_complete
-[156.670s] Loading weights: 100% → weights_loaded
-[324.456s] Autotuning process ends → model_loaded
-[325.789s] Started server process → server_log_ready
-[326.012s] HTTP 200 /health → server_ready
-"""
-
-import requests
-import json
-import time
-from typing import Dict, Optional
-from benchmark_base import BenchmarkBase
-
-
-class TensorRTBenchmark(BenchmarkBase):
-    def __init__(self, image: str = "", container_name: str = "tensorrt-timing-test",
-                 snapshotter: str = "nydus", port: int = 8080):
-        super().__init__(image, container_name, snapshotter, port)
-
-    def get_health_endpoint(self) -> str:
-        """Get health endpoint for TensorRT application."""
-        return "health"
-
-    def supports_health_polling(self) -> bool:
-        """TensorRT application supports health endpoint polling."""
-        return True
-
-    def _should_stop_monitoring(self, elapsed: float) -> bool:
-        """Custom stop monitoring logic for TensorRT-LLM."""
-        # Use base class logic for health polling apps
-        return super()._should_stop_monitoring(elapsed)
-
-    def _is_successful(self, results: Dict[str, Optional[float]]) -> bool:
-        """Determine if benchmark was successful."""
-        return results.get("server_ready") is not None
-
-    def _init_phases(self) -> None:
-        """Initialize the phases dictionary for TensorRT-LLM."""
-        self.phases = {
-            "first_log": None,
-            "engine_init": None,
-            "weight_download_start": None,
-            "weight_download_complete": None,
-            "weights_loaded": None,
-            "model_loaded": None,
-            "server_log_ready": None,
-            "server_ready": None
-        }
-
-    def analyze_log_line(self, line: str, timestamp: float) -> Optional[str]:
-        """Analyze a log line and return detected phase."""
-        elapsed = timestamp - self.start_time
-        line_lower = line.lower()
-
-        # TensorRT-LLM engine initialization
-        if self.phases["engine_init"] is None:
-            if any(pattern in line_lower for pattern in [
-                "pytorchconfig(", "tensorrt-llm version", "kv cache quantization"
-            ]):
-                self.phases["engine_init"] = elapsed
-                return "engine_init"
-
-        # Weight download start
-        if self.phases["weight_download_start"] is None:
-            if any(pattern in line_lower for pattern in [
-                "prefetching", "checkpoint files", "gb for model weights"
-            ]):
-                self.phases["weight_download_start"] = elapsed
-                return "weight_download_start"
-
-        # Weight download complete and loading starts
-        if self.phases["weight_download_complete"] is None:
-            if any(pattern in line_lower for pattern in [
-                "loading /workspace/huggingface"
-            ]):
-                self.phases["weight_download_complete"] = elapsed
-                return "weight_download_complete"
-
-        # Weights loading complete
-        if self.phases["weights_loaded"] is None:
-            if any(pattern in line_lower for pattern in [
-                "loading weights: 100%", "model init total"
-            ]):
-                self.phases["weights_loaded"] = elapsed
-                return "weights_loaded"
-
-        # Model fully loaded (autotuning complete, memory configured)
-        if self.phases["model_loaded"] is None:
-            if any(pattern in line_lower for pattern in [
-                "autotuning process ends", "autotuner cache size",
-                "max_seq_len=", "max_num_requests=", "allocated.*gib for max tokens"
-            ]):
-                self.phases["model_loaded"] = elapsed
-                return "model_loaded"
-
-        # Server log ready pattern
-        if self.phases["server_log_ready"] is None:
-            if any(pattern in line_lower for pattern in [
-                "started server process", "waiting for application startup"
-            ]):
-                self.phases["server_log_ready"] = elapsed
-                return "server_log_ready"
-
-        return None
-
-    def test_api_readiness(self, timeout: int = 120) -> bool:
-        """TensorRT benchmark doesn't test API readiness - stops after server ready."""
-        print("Skipping API readiness test - stopping after server ready detection")
-        return True
-
-    def get_default_image(self, snapshotter: str) -> str:
-        """Get default image for the snapshotter. Users should now use --repo parameter instead."""
-        raise ValueError(
-            "No default image configured. Please specify either:\n"
-            " --repo (e.g., --repo my-tensorrt-app)\n"
-            " --image (e.g., --image registry.com/repo:tag)\n"
-            "\nExample: python test-bench-tensorrt.py --repo my-tensorrt-app --tag latest --snapshotter nydus"
-        )
-
-    def _get_summary_items(self, total_time):
-        """Get summary items for the timing summary."""
-        items = [
-            ("Container Startup Time:", self.container_startup_duration),
-            ("Container to First Log:", self.phases["first_log"]),
-            ("Engine Initialization:", self.phases["engine_init"]),
-            ("Weight Download Start:", self.phases["weight_download_start"]),
-            ("Weight Download Complete:", self.phases["weight_download_complete"]),
-            ("Weights Loaded:", self.phases["weights_loaded"]),
-            ("Model Loaded:", self.phases["model_loaded"]),
-            ("Server Log Ready:", self.phases["server_log_ready"]),
-            ("Server Ready:", self.phases["server_ready"]),
-            ("Total Test Time:", total_time)
-        ]
-
-        # Add breakdown section
-        items.append(("", None))  # Empty line separator
-        items.append(("BREAKDOWN:", None))
-
-        # Calculate breakdowns
-        if self.phases["first_log"] is not None:
-            items.append(("Container to First Log:", self.phases["first_log"]))
-
-        if self.phases["first_log"] is not None and self.phases["weight_download_start"] is not None:
-            first_to_download = self.phases["weight_download_start"] - self.phases["first_log"]
-            items.append(("First Log to Weight Download Start:", first_to_download))
-
-        if self.phases["weight_download_start"] is not None and self.phases["weight_download_complete"] is not None:
-            download_duration = self.phases["weight_download_complete"] - self.phases["weight_download_start"]
-            items.append(("Weight Download Start to Complete:", download_duration))
-
-        if self.phases["weight_download_complete"] is not None and self.phases["weights_loaded"] is not None:
-            download_to_loaded = self.phases["weights_loaded"] - self.phases["weight_download_complete"]
-            items.append(("Weight Download Complete to Weights Loaded:", download_to_loaded))
-
-        if self.phases["weights_loaded"] is not None and self.phases["server_ready"] is not None:
-            loaded_to_ready = self.phases["server_ready"] - self.phases["weights_loaded"]
-            items.append(("Weights Loaded to Server Ready:", loaded_to_ready))
-
-        return items
-
-
-if __name__ == "__main__":
-    import sys
-
-    benchmark = TensorRTBenchmark()
-    sys.exit(benchmark.main("TensorRT-LLM Container Startup Benchmark"))
\ No newline at end of file
diff --git a/scripts/benchmark/test-bench-vllm.py b/scripts/benchmark/test-bench-vllm.py
deleted file mode 100755
index efa1fca..0000000
--- a/scripts/benchmark/test-bench-vllm.py
+++ /dev/null
@@ -1,219 +0,0 @@
-#!/usr/bin/env python3
-"""
-vLLM Startup Timing Benchmark
-Measures container startup and vLLM readiness times with different snapshotters.
-
-LOG PATTERN DETECTION & PHASES:
-===============================
-
-This benchmark monitors vLLM inference server logs and detects the following phases:
-
-1. ENGINE_INIT (vLLM Engine Initialization)
-   - Patterns: "initializing a v1 llm engine", "waiting for init message", "v1 llm engine"
-   - Detects: vLLM V1 engine initialization start
-
-2. MODEL_LOADING (Model Loading Start)
-   - Patterns: "starting to load model", "loading model from scratch"
-   - Detects: Beginning of model loading process
-
-3. WEIGHTS_DOWNLOAD (Weight Download)
-   - Patterns: "time spent downloading weights", "downloading weights"
-   - Detects: Model weight download completion (if needed)
-
-4. WEIGHTS_LOADED (Weight Loading Complete)
-   - Patterns: "loading weights took", "loading safetensors checkpoint shards: 100%"
-   - Detects: Model weights fully loaded into memory
-
-5. MODEL_LOADED (Model Fully Loaded)
-   - Patterns: "model loading took", "init engine", "engine.*took.*seconds"
-   - Detects: Complete model initialization and engine setup
-
-6. GRAPH_CAPTURE (CUDA Graph Optimization)
-   - Patterns: "graph capturing finished", "capturing cuda graph shapes: 100%"
-   - Detects: CUDA graph capture completion for optimization
-
-7. SERVER_LOG_READY (Server Log Ready)
-   - Patterns: "started server process"
-   - Detects: FastAPI/Uvicorn server process started (log-based)
-
-8. SERVER_READY (Server Ready)
-   - Tested via HTTP requests to /health endpoint with 0.1s polling
-   - Detects: API actually responding with valid HTTP 200 responses
-
-MONITORING BEHAVIOR:
-===================
-- Timeout: 20 minutes (model loading can be slow)
-- Container Status: Monitors container health during startup
-- Health Polling: Polls /health endpoint every 0.1 seconds after first log
-- Success Criteria: HTTP 200 response from health endpoint
-- Port: Maps container port 8000 to specified local port
-- Stop Condition: Immediately after health endpoint returns 200
-
-EXAMPLE LOG FLOW:
-================
-[15.230s] initializing a v1 llm engine → engine_init
-[45.120s] starting to load model → model_loading
-[67.340s] downloading weights → weights_download
-[156.780s] loading weights took 89.44s → weights_loaded
-[198.450s] model loading took 153.33s → model_loaded
-[245.670s] graph capturing finished → graph_capture
-[318.429s] started server process → server_log_ready
-[318.435s] HTTP 200 /health → server_ready
-"""
-
-import requests
-import json
-import time
-from typing import Dict, Optional
-from benchmark_base import BenchmarkBase
-
-
-class VLLMBenchmark(BenchmarkBase):
-    def __init__(self, image: str = "", container_name: str = "vllm-timing-test",
-                 snapshotter: str = "nydus", port: int = 8080):
-        super().__init__(image, container_name, snapshotter, port)
-
-    def get_health_endpoint(self) -> str:
-        """Get health endpoint for vLLM application."""
-        return "health"
-
-    def supports_health_polling(self) -> bool:
-        """vLLM application supports health endpoint polling."""
-        return True
-
-    def _should_stop_monitoring(self, elapsed: float) -> bool:
-        """Custom stop monitoring logic for vLLM."""
-        # Use base class logic for health polling apps
-        return super()._should_stop_monitoring(elapsed)
-
-    def _is_successful(self, results: Dict[str, Optional[float]]) -> bool:
-        """Determine if benchmark was successful."""
-        return results.get("server_ready") is not None
-
-    def _init_phases(self) -> None:
-        """Initialize the phases dictionary for vLLM."""
-        self.phases = {
-            "first_log": None,
-            "engine_init": None,
-            "weights_download": None,
-            "weights_download_complete": None,
-            "weights_loaded": None,
-            "graph_capture": None,
-            "server_log_ready": None,
-            "server_ready": None
-        }
-
-    def analyze_log_line(self, line: str, timestamp: float) -> Optional[str]:
-        """Analyze a log line and return detected phase."""
-        elapsed = timestamp - self.start_time
-        line_lower = line.lower()
-
-        # Engine initialization (vLLM V1 engine)
-        if self.phases["engine_init"] is None:
-            if any(pattern in line_lower for pattern in [
-                "initializing a v1 llm engine", "waiting for init message", "v1 llm engine"
-            ]):
-                self.phases["engine_init"] = elapsed
-                return "engine_init"
-
-        # Weights download start (was model loading start)
-        if self.phases["weights_download"] is None:
-            if any(pattern in line_lower for pattern in [
-                "starting to load model", "loading model from scratch"
-            ]):
-                self.phases["weights_download"] = elapsed
-                return "weights_download"
-
-        # Weights download complete
-        if self.phases["weights_download_complete"] is None:
-            if any(pattern in line_lower for pattern in [
-                "time spent downloading weights", "downloading weights"
-            ]):
-                self.phases["weights_download_complete"] = elapsed
-                return "weights_download_complete"
-
-        # Weights loaded patterns
-        if self.phases["weights_loaded"] is None:
-            if any(pattern in line_lower for pattern in [
-                "loading weights took", "loading safetensors checkpoint shards: 100%"
-            ]):
-                self.phases["weights_loaded"] = elapsed
-                return "weights_loaded"
-
-        # CUDA graph capture
- if self.phases["graph_capture"] is None: - if any(pattern in line_lower for pattern in [ - "graph capturing finished", "capturing cuda graph shapes: 100%" - ]): - self.phases["graph_capture"] = elapsed - return "graph_capture" - - # Server log ready pattern (vLLM/FastAPI specific) - if self.phases["server_log_ready"] is None: - if "started server process" in line_lower: - self.phases["server_log_ready"] = elapsed - return "server_log_ready" - - return None - - def test_api_readiness(self, timeout: int = 120) -> bool: - """vLLM benchmark uses health polling instead of direct API test.""" - print("Using health polling instead of direct API test") - return True - - def get_default_image(self, snapshotter: str) -> str: - """Get default image for the snapshotter. Users should now use --repo parameter instead.""" - raise ValueError( - "No default image configured. Please specify either:\n" - " --repo (e.g., --repo my-vllm-app)\n" - " --image (e.g., --image registry.com/repo:tag)\n" - "\nExample: python test-bench-vllm.py --repo saurabh-vllm-test --tag latest --snapshotter nydus" - ) - - def _get_summary_items(self, total_time): - """Get summary items for the timing summary.""" - items = [ - ("Container Startup Time:", self.container_startup_duration), - ("Container to First Log:", self.phases["first_log"]), - ("Engine Initialization:", self.phases["engine_init"]), - ("Weights Download Start:", self.phases["weights_download"]), - ("Weights Download Complete:", self.phases["weights_download_complete"]), - ("Weights Loaded:", self.phases["weights_loaded"]), - ("Graph Capture Complete:", self.phases["graph_capture"]), - ("Server Log Ready:", self.phases["server_log_ready"]), - ("Server Ready:", self.phases["server_ready"]), - ("Total Test Time:", total_time) - ] - - # Add breakdown section - items.append(("", None)) # Empty line separator - items.append(("BREAKDOWN:", None)) - - # Calculate breakdowns - if self.phases["first_log"] is not None: - items.append(("Container to 
First Log:", self.phases["first_log"])) - - if self.phases["first_log"] is not None and self.phases["weights_download"] is not None: - first_to_download = self.phases["weights_download"] - self.phases["first_log"] - items.append(("First Log to Weight Download Start:", first_to_download)) - - if self.phases["weights_download"] is not None and self.phases["weights_download_complete"] is not None: - download_duration = self.phases["weights_download_complete"] - self.phases["weights_download"] - items.append(("Weight Download Start to Complete:", download_duration)) - - if self.phases["weights_download_complete"] is not None and self.phases["weights_loaded"] is not None: - download_to_loaded = self.phases["weights_loaded"] - self.phases["weights_download_complete"] - items.append(("Weight Download Complete to Weights Loaded:", download_to_loaded)) - - if self.phases["weights_loaded"] is not None and self.phases["server_ready"] is not None: - loaded_to_ready = self.phases["server_ready"] - self.phases["weights_loaded"] - items.append(("Weights Loaded to Server Ready:", loaded_to_ready)) - - return items - - -if __name__ == "__main__": - import sys - - benchmark = VLLMBenchmark() - sys.exit(benchmark.main("vLLM Container Startup Benchmark")) \ No newline at end of file diff --git a/scripts/build_push.py b/scripts/build_push.py deleted file mode 100755 index 65afabf..0000000 --- a/scripts/build_push.py +++ /dev/null @@ -1,547 +0,0 @@ -#!/usr/bin/env python3 -""" -Build and push container images with different snapshotter formats. -Supports ECR (AWS) and GAR (Google Artifact Registry). 
-""" - -import argparse -import os -import subprocess -import sys -import json -from pathlib import Path -from abc import ABC, abstractmethod - - -def run_command(cmd, check=True, capture_output=False): - """Run a shell command and handle errors.""" - import time - - print(f"Running: {cmd}") - start_time = time.time() - - try: - if capture_output: - result = subprocess.run(cmd, shell=True, check=check, capture_output=True, text=True) - elapsed = time.time() - start_time - print(f"✓ Completed in {elapsed:.2f}s") - return result.stdout.strip() - else: - subprocess.run(cmd, shell=True, check=check) - elapsed = time.time() - start_time - print(f"✓ Completed in {elapsed:.2f}s") - except subprocess.CalledProcessError as e: - elapsed = time.time() - start_time - print(f"❌ Failed after {elapsed:.2f}s") - print(f"Error running command: {cmd}") - print(f"Error: {e}") - if capture_output and e.stdout: - print(f"Stdout: {e.stdout}") - if capture_output and e.stderr: - print(f"Stderr: {e.stderr}") - sys.exit(1) - - -class Registry(ABC): - """Abstract base class for container registries.""" - - @abstractmethod - def check_credentials(self): - """Check if credentials are configured.""" - pass - - @abstractmethod - def create_repository(self, image_name): - """Create repository if it doesn't exist.""" - pass - - @abstractmethod - def login(self): - """Login to registry with docker and nerdctl.""" - pass - - @abstractmethod - def get_registry_url(self): - """Return the registry URL.""" - pass - - def get_full_image_name(self, image_name, tag="latest"): - """Construct full image reference.""" - return f"{self.get_registry_url()}/{image_name}:{tag}" - - -class ECRRegistry(Registry): - """AWS Elastic Container Registry implementation.""" - - def __init__(self, account, region): - self.account = account - self.region = region - self.registry_url = f"{account}.dkr.ecr.{region}.amazonaws.com" - - def check_credentials(self): - """Check if AWS credentials are configured.""" - try: - 
run_command("aws sts get-caller-identity", capture_output=True) - print("✓ AWS credentials are configured") - except: - print("Error: AWS credentials not configured. Please run 'aws configure' first.") - sys.exit(1) - - def create_repository(self, image_name): - """Create ECR repository if it doesn't exist.""" - print(f"Checking/creating ECR repository: {image_name}") - - # Check if repository exists - check_cmd = f"aws ecr describe-repositories --repository-names {image_name} --region {self.region}" - try: - run_command(check_cmd, capture_output=True) - print(f"✓ Repository {image_name} already exists") - except: - # Repository doesn't exist, create it - create_cmd = f"aws ecr create-repository --repository-name {image_name} --region {self.region}" - run_command(create_cmd) - print(f"✓ Created repository {image_name}") - - def login(self): - """Login to ECR using both docker and nerdctl.""" - print("Logging into ECR...") - - password = run_command(f"aws ecr get-login-password --region {self.region}", capture_output=True) - - # Login with docker - login_cmd = f"echo '{password}' | docker login -u AWS --password-stdin {self.registry_url}" - run_command(login_cmd) - - # Login with nerdctl - login_cmd = f"echo '{password}' | nerdctl login -u AWS --password-stdin {self.registry_url}" - run_command(login_cmd) - - # Login with sudo nerdctl - login_cmd = f"echo '{password}' | sudo nerdctl login -u AWS --password-stdin {self.registry_url}" - run_command(login_cmd) - - print("✓ Successfully logged into ECR") - - def get_registry_url(self): - """Return the ECR registry URL.""" - return self.registry_url - - -class GARRegistry(Registry): - """Google Artifact Registry implementation.""" - - def __init__(self, project_id, repository, location): - self.project_id = project_id - self.repository = repository - self.location = location - self.registry_url = f"{location}-docker.pkg.dev/{project_id}/{repository}" - - def check_credentials(self): - """Check if GCP credentials are 
configured.""" - try: - run_command("gcloud auth application-default print-access-token", capture_output=True) - print("✓ GCP credentials are configured") - except: - print("Error: GCP credentials not configured.") - print("Please run 'gcloud auth application-default login' or 'gcloud auth login'") - sys.exit(1) - - def create_repository(self, image_name): - """Create GAR repository if it doesn't exist.""" - print(f"Checking/creating GAR repository: {self.repository}") - - # Check if repository exists - check_cmd = f"gcloud artifacts repositories describe {self.repository} --location={self.location} --project={self.project_id}" - try: - run_command(check_cmd, capture_output=True) - print(f"✓ Repository {self.repository} already exists") - except: - # Repository doesn't exist, create it - create_cmd = f"gcloud artifacts repositories create {self.repository} --repository-format=docker --location={self.location} --project={self.project_id}" - run_command(create_cmd) - print(f"✓ Created repository {self.repository}") - - def login(self): - """Login to GAR using both docker and nerdctl.""" - print("Logging into Google Artifact Registry...") - - # Configure Docker authentication helper for GAR - auth_cmd = f"gcloud auth configure-docker {self.location}-docker.pkg.dev" - run_command(auth_cmd) - - # Get access token for nerdctl login - token = run_command("gcloud auth print-access-token", capture_output=True) - - # Login with nerdctl - login_cmd = f"echo '{token}' | nerdctl login -u oauth2accesstoken --password-stdin {self.location}-docker.pkg.dev" - run_command(login_cmd) - - # Login with sudo nerdctl - login_cmd = f"echo '{token}' | sudo nerdctl login -u oauth2accesstoken --password-stdin {self.location}-docker.pkg.dev" - run_command(login_cmd) - - print("✓ Successfully logged into GAR") - - def get_registry_url(self): - """Return the GAR registry URL.""" - return self.registry_url - - -def build_and_push_image(image_dir, image_name, registry): - """Build and push the 
base Docker image.""" - print(f"Building image from {image_dir}...") - - # Change to image directory for build context - original_dir = os.getcwd() - os.chdir(image_dir) - - try: - # Build the image - build_cmd = f"docker build -t {image_name} ." - run_command(build_cmd) - - # Tag for registry - full_image = registry.get_full_image_name(image_name, "latest") - tag_cmd = f"docker tag {image_name} {full_image}" - run_command(tag_cmd) - - # Push the image - push_cmd = f"docker push {full_image}" - run_command(push_cmd) - - print(f"✓ Successfully built and pushed {full_image}") - - finally: - os.chdir(original_dir) - - -def convert_to_nydus(image_name, registry): - """Convert and push Nydus image.""" - print("Converting to Nydus format...") - - source_image = registry.get_full_image_name(image_name, "latest") - target_image = registry.get_full_image_name(image_name, "latest-nydus") - - nydus_cmd = f"""nydusify convert \\ - --source {source_image} \\ - --source-backend-config ~/.docker/config.json \\ - --target {target_image}""" - - run_command(nydus_cmd) - print(f"✓ Successfully converted and pushed {target_image}") - - -def convert_to_soci(image_name, registry): - """Convert and push SOCI image.""" - print("Converting to SOCI format...") - - source_image = registry.get_full_image_name(image_name, "latest") - target_image = registry.get_full_image_name(image_name, "latest-soci") - - # Pull the image with nerdctl first - pull_cmd = f"sudo nerdctl pull {source_image}" - run_command(pull_cmd) - - # Convert to SOCI - soci_cmd = f"sudo soci convert {source_image} {target_image}" - run_command(soci_cmd) - - # Push SOCI image - push_cmd = f"sudo nerdctl push {target_image}" - run_command(push_cmd) - - print(f"✓ Successfully converted and pushed {target_image}") - - -def convert_to_estargz(image_name, registry): - """Convert and push eStargz image.""" - print("Converting to eStargz format...") - - source_image = registry.get_full_image_name(image_name, "latest") - target_image 
= registry.get_full_image_name(image_name, "latest-estargz") - - # Pull the image with nerdctl first - pull_cmd = f"sudo nerdctl pull {source_image}" - run_command(pull_cmd) - - estargz_cmd = f"sudo nerdctl image convert --estargz --oci {source_image} {target_image}" - run_command(estargz_cmd) - - # Push eStargz image - push_cmd = f"sudo nerdctl push {target_image}" - run_command(push_cmd) - - print(f"✓ Successfully converted and pushed {target_image}") - - -def cleanup_built_images(image_name, registry, formats): - """Remove only the images that were built in this run.""" - import time - - print("\n" + "="*60) - print("🧹 CLEANUP: Removing built images...") - print("="*60) - - cleanup_start = time.time() - images_to_remove = [] - - # Collect all image references that were built - if "normal" in formats: - images_to_remove.append(image_name) # Local tag - images_to_remove.append(registry.get_full_image_name(image_name, "latest")) - if "nydus" in formats: - images_to_remove.append(registry.get_full_image_name(image_name, "latest-nydus")) - if "soci" in formats: - images_to_remove.append(registry.get_full_image_name(image_name, "latest-soci")) - if "estargz" in formats: - images_to_remove.append(registry.get_full_image_name(image_name, "latest-estargz")) - - # Cleanup Docker images - print("\n📦 Docker Cleanup:") - for image in images_to_remove: - try: - print(f" Removing: {image}") - run_command(f"docker rmi -f {image}", check=False, capture_output=True) - except Exception as e: - print(f" ⚠️ Warning: Could not remove {image}: {e}") - - # Cleanup nerdctl images for relevant snapshotters - snapshotter_map = { - "normal": "overlayfs", - "nydus": "nydus", - "soci": "soci", - "estargz": "stargz" - } - - print(f"\n🔧 nerdctl Cleanup:") - for format_type in formats: - snapshotter = snapshotter_map.get(format_type) - if not snapshotter: - continue - - print(f" Processing {snapshotter} snapshotter...") - try: - # Determine the correct tag - if format_type == "normal": - tag = 
"latest" - else: - tag = f"latest-{format_type}" - - image_ref = registry.get_full_image_name(image_name, tag) - print(f" Removing: {image_ref}") - run_command(f"sudo nerdctl --snapshotter {snapshotter} rmi -f {image_ref}", check=False, capture_output=True) - - except Exception as e: - print(f" ⚠️ Warning: Could not cleanup {snapshotter} images: {e}") - - total_cleanup_time = time.time() - cleanup_start - print(f"\n✅ Cleanup completed in {total_cleanup_time:.2f}s") - print("="*60) - - -def list_available_images(base_path="snapshotters/images"): - """List available image directories.""" - images_dir = Path(base_path) - if not images_dir.exists(): - print(f"Error: {base_path} directory not found") - return [] - - image_dirs = [] - for item in images_dir.iterdir(): - if item.is_dir() and (item / "Dockerfile").exists(): - image_dirs.append(item.name) - - return sorted(image_dirs) - - -def main(): - parser = argparse.ArgumentParser( - description="Build and push container images with different snapshotter formats. 
Supports ECR (AWS) and GAR (Google Artifact Registry).", - formatter_class=argparse.RawDescriptionHelpFormatter, - epilog=""" -Examples: - # ECR (AWS) - Build image from custom path - python3 build_push.py --registry-type ecr --account 123456789 --image-path /path/to/my/image --image-name my-image --region us-east-1 - - # ECR - Build with specific formats - python3 build_push.py --registry-type ecr --account 123456789 --image-path ./images/cuda --image-name cuda-test --formats normal,nydus - - # GAR (Google) - Build and push all formats - python3 build_push.py --registry-type gar --project-id my-gcp-project --repository my-repo --image-path ./images/vllm --image-name vllm-app --location us-central1 - - # GAR - Build with specific formats - python3 build_push.py --registry-type gar --project-id my-project --repository ai-models --image-path ./images/sglang --image-name sglang --location us-east1 --formats normal,nydus,soci - - # List available images in default directory - python3 build_push.py --list-images - """) - - # Registry selection - parser.add_argument("--registry-type", choices=["ecr", "gar"], default="ecr", - help="Registry type: ecr (AWS) or gar (Google Artifact Registry). 
Default: ecr") - - # Common arguments - parser.add_argument("--image-path", required=False, help="Full path to image directory") - parser.add_argument("--image-name", required=False, help="Image name for the container") - parser.add_argument("--formats", default="normal,nydus,soci,estargz", - help="Comma-separated list of formats to build (normal,nydus,soci,estargz)") - parser.add_argument("--list-images", action="store_true", help="List available image directories") - parser.add_argument("--no-cleanup", action="store_true", help="Skip cleanup of local images after build") - - # ECR-specific arguments - parser.add_argument("--account", required=False, help="AWS account ID (required for ECR)") - parser.add_argument("--region", required=False, default="us-east-1", - help="AWS region for ECR (default: us-east-1)") - - # GAR-specific arguments - parser.add_argument("--project-id", required=False, help="GCP project ID (optional for GAR, defaults to gcloud config)") - parser.add_argument("--repository", required=False, help="GAR repository name (required for GAR)") - parser.add_argument("--location", required=False, default="us-central1", - help="GCP location for GAR (default: us-central1)") - - args = parser.parse_args() - - # List available images - if args.list_images: - available_images = list_available_images() - if available_images: - print("Available image directories:") - for img in available_images: - print(f" - {img}") - else: - print("No image directories found with Dockerfiles") - return - - # Validate registry-specific arguments - if args.registry_type == "ecr": - if not args.account: - parser.error("--account is required for ECR") - elif args.registry_type == "gar": - # Get project ID from gcloud config if not provided - if not args.project_id: - try: - args.project_id = run_command("gcloud config get project", capture_output=True) - if not args.project_id: - parser.error("--project-id is required for GAR (or set default project with 'gcloud config set 
project PROJECT_ID')") - print(f"Using project ID from gcloud config: {args.project_id}") - except: - parser.error("--project-id is required for GAR (or set default project with 'gcloud config set project PROJECT_ID')") - if not args.repository: - parser.error("--repository is required for GAR") - - # Validate common required arguments - if not args.image_path: - parser.error("--image-path is required") - if not args.image_name: - parser.error("--image-name is required") - - # Validate image directory exists - image_dir = Path(args.image_path) - if not image_dir.exists(): - print(f"Error: Image directory '{args.image_path}' not found") - sys.exit(1) - - dockerfile_path = image_dir / "Dockerfile" - if not dockerfile_path.exists(): - print(f"Error: No Dockerfile found in {image_dir}") - sys.exit(1) - - # Parse formats - formats = [f.strip() for f in args.formats.split(",")] - valid_formats = {"normal", "nydus", "soci", "estargz"} - invalid_formats = set(formats) - valid_formats - if invalid_formats: - print(f"Error: Invalid formats: {invalid_formats}") - print(f"Valid formats: {valid_formats}") - sys.exit(1) - - # Set image name - image_name = args.image_name - - # Create registry instance based on type - if args.registry_type == "ecr": - registry = ECRRegistry(args.account, args.region) - registry_info = f"Account: {args.account}, Region: {args.region}" - else: # gar - registry = GARRegistry(args.project_id, args.repository, args.location) - registry_info = f"Project: {args.project_id}, Repository: {args.repository}, Location: {args.location}" - - print("="*70) - print("🚀 STARTING CONTAINER IMAGE BUILD AND PUSH") - print("="*70) - print(f"Registry Type: {args.registry_type.upper()}") - print(f"Building image: {image_name}") - print(f"From directory: {image_dir}") - print(f"{registry_info}") - print(f"Formats: {formats}") - print() - - import time - total_start_time = time.time() - - # Check credentials - print(f"🔐 Checking {args.registry_type.upper()} 
credentials...") - registry.check_credentials() - - # Login to registry - print(f"\n🔑 Logging into {args.registry_type.upper()}...") - registry.login() - - # Create repository - print(f"\n📦 Setting up repository...") - registry.create_repository(image_name) - - # Build and push base image - if "normal" in formats: - print(f"\n🏗️ Building and pushing base image...") - build_start = time.time() - build_and_push_image(str(image_dir), image_name, registry) - build_time = time.time() - build_start - print(f"✅ Base image build completed in {build_time:.2f}s") - - # Convert to different formats - if "nydus" in formats: - print(f"\n🔄 Converting to Nydus format...") - nydus_start = time.time() - convert_to_nydus(image_name, registry) - nydus_time = time.time() - nydus_start - print(f"✅ Nydus conversion completed in {nydus_time:.2f}s") - - if "soci" in formats: - print(f"\n🔄 Converting to SOCI format...") - soci_start = time.time() - convert_to_soci(image_name, registry) - soci_time = time.time() - soci_start - print(f"✅ SOCI conversion completed in {soci_time:.2f}s") - - if "estargz" in formats: - print(f"\n🔄 Converting to eStargz format...") - estargz_start = time.time() - convert_to_estargz(image_name, registry) - estargz_time = time.time() - estargz_start - print(f"✅ eStargz conversion completed in {estargz_time:.2f}s") - - total_time = time.time() - total_start_time - - print("\n" + "="*70) - print("🎉 ALL FORMATS BUILT AND PUSHED SUCCESSFULLY!") - print("="*70) - print(f"Registry: {registry.get_registry_url()}") - print(f"Base image: {registry.get_full_image_name(image_name, 'latest')}") - if "nydus" in formats: - print(f"Nydus image: {registry.get_full_image_name(image_name, 'latest-nydus')}") - if "soci" in formats: - print(f"SOCI image: {registry.get_full_image_name(image_name, 'latest-soci')}") - if "estargz" in formats: - print(f"eStargz image: {registry.get_full_image_name(image_name, 'latest-estargz')}") - - print(f"\n⏱️ Total build and push time: 
{total_time:.2f}s ({total_time/60:.1f} minutes)") - print("="*70) - - # Cleanup built images by default (unless --no-cleanup is specified) - if not args.no_cleanup: - cleanup_built_images(image_name, registry, formats) - - -if __name__ == "__main__": - main() diff --git a/scripts/builder/Dockerfile b/scripts/builder/Dockerfile new file mode 100644 index 0000000..f90e9d5 --- /dev/null +++ b/scripts/builder/Dockerfile @@ -0,0 +1,56 @@ +# Build stage: Compile buildkit with Nydus support +FROM golang:1.21-alpine AS buildkit-builder + +# Install build dependencies +RUN apk add --no-cache git make + +# Clone nydusaccelerator/buildkit fork +ARG BUILDKIT_VERSION=nydus-compression-type-enhance +RUN git clone --depth 1 --branch ${BUILDKIT_VERSION} \ + https://github.com/nydusaccelerator/buildkit.git /buildkit + +WORKDIR /buildkit + +# Build buildkitd and buildctl with Nydus support +RUN go build -tags=nydus -o ./bin/buildkitd ./cmd/buildkitd && \ + go build -o ./bin/buildctl ./cmd/buildctl + +# Runtime stage +FROM alpine:latest + +# Copy buildkit binaries with Nydus support +COPY --from=buildkit-builder /buildkit/bin/buildctl /usr/bin/buildctl +COPY --from=buildkit-builder /buildkit/bin/buildkitd /usr/bin/buildkitd + +# Copy buildctl-daemonless.sh wrapper from moby/buildkit repo +ADD https://raw.githubusercontent.com/moby/buildkit/master/examples/buildctl-daemonless/buildctl-daemonless.sh /usr/bin/buildctl-daemonless.sh +RUN chmod +x /usr/bin/buildctl-daemonless.sh + +# Install runtime dependencies +RUN apk add --no-cache \ + ca-certificates \ + curl \ + wget \ + iptables \ + fuse-overlayfs \ + containerd + +# Install nydus-image binary (v2.3.6) +ARG NYDUS_VERSION=v2.3.6 +RUN wget -O /tmp/nydus.tgz \ + "https://github.com/dragonflyoss/nydus/releases/download/${NYDUS_VERSION}/nydus-static-${NYDUS_VERSION}-linux-amd64.tgz" \ + && tar -xzf /tmp/nydus.tgz -C /tmp \ + && mv /tmp/nydus-static/nydus-image /usr/bin/nydus-image \ + && chmod +x /usr/bin/nydus-image \ + && rm -rf 
/tmp/nydus.tgz /tmp/nydus-static + +# Set NYDUS_BUILDER environment variable (required for buildkit) +ENV NYDUS_BUILDER=/usr/bin/nydus-image + +# Copy build script +COPY build.sh /usr/local/bin/build.sh +RUN chmod +x /usr/local/bin/build.sh + +WORKDIR /workspace + +ENTRYPOINT ["/usr/local/bin/build.sh"] diff --git a/scripts/builder/README.md b/scripts/builder/README.md new file mode 100644 index 0000000..450d4f0 --- /dev/null +++ b/scripts/builder/README.md @@ -0,0 +1,156 @@ +# Container-Based Image Builder + +Builds container images using `buildctl` in a containerized environment. Produces both normal OCI and Nydus-optimized images. + +## Features + +- **Registry-agnostic**: Works with AWS ECR, Google Artifact Registry, Docker Hub, or any OCI registry +- **No local dependencies**: All build tools run inside a container +- **Two image formats**: Builds both normal OCI and Nydus images in one go +- **Direct push**: Images pushed directly to registry via buildctl + +## Architecture + +``` +Host (authenticated) → Builder Container (buildctl + nydus-image) → Registry +``` + +- **Host**: Authenticates to registry, mounts build context and docker config +- **Builder Container**: Runs buildctl to build and push images +- **No Docker daemon dependency**: buildctl pushes directly to registries + +## Prerequisites + +1. **Docker** installed on host (no other dependencies needed!) +2. 
**Authenticated to your registry** before running: + +```bash +# AWS ECR +aws ecr get-login-password --region us-east-1 | \ + docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com + +# Google Artifact Registry +gcloud auth configure-docker us-central1-docker.pkg.dev + +# Docker Hub +docker login +``` + +## Usage + +```bash +docker run --rm --privileged \ + -v /path/to/build-context:/workspace:ro \ + -v ~/.docker/config.json:/root/.docker/config.json:ro \ + tensorfuse/fastpull-builder:latest \ + <registry/image[:tag]> +``` + +### Examples + +**AWS ECR:** +```bash +docker run --rm --privileged \ + -v ./my-app:/workspace:ro \ + -v ~/.docker/config.json:/root/.docker/config.json:ro \ + tensorfuse/fastpull-builder:latest \ + 123456789.dkr.ecr.us-east-1.amazonaws.com/my-app:latest +``` + +**Google Artifact Registry:** +```bash +docker run --rm --privileged \ + -v ./my-app:/workspace:ro \ + -v ~/.docker/config.json:/root/.docker/config.json:ro \ + tensorfuse/fastpull-builder:latest \ + us-central1-docker.pkg.dev/my-project/my-repo/my-app:v1.0 +``` + +**Docker Hub:** +```bash +docker run --rm --privileged \ + -v ./my-app:/workspace:ro \ + -v ~/.docker/config.json:/root/.docker/config.json:ro \ + tensorfuse/fastpull-builder:latest \ + docker.io/username/my-app:latest +``` + +**No tag (defaults to :latest):** +```bash +docker run --rm --privileged \ + -v ./my-app:/workspace:ro \ + -v ~/.docker/config.json:/root/.docker/config.json:ro \ + tensorfuse/fastpull-builder:latest \ + my-registry.com/my-app +``` + +**Custom Dockerfile:** +```bash +docker run --rm --privileged \ + -v ./my-app:/workspace:ro \ + -v ~/.docker/config.json:/root/.docker/config.json:ro \ + -e DOCKERFILE=Dockerfile.custom \ + tensorfuse/fastpull-builder:latest \ + my-registry.com/my-app:latest +``` + +## Output + +The script builds and pushes two images: +- `<image>:<tag>` - Normal OCI image +- `<image>:<tag>-fastpull` - Fastpull-optimized image + +## Files + +- `Dockerfile` - Builder container definition (builds from
nydusaccelerator/buildkit fork) +- `build.sh` - Build script that runs inside container (entrypoint) +- `README.md` - This file + +## Technical Details + +### Buildkit with Nydus Support +The Dockerfile builds `buildkitd` and `buildctl` from the [nydusaccelerator/buildkit](https://github.com/nydusaccelerator/buildkit) fork with the `-tags=nydus` flag, which enables Nydus compression support. The standard moby/buildkit does not include this functionality. + +### Components +- **buildkitd/buildctl**: Compiled from nydusaccelerator/buildkit fork +- **nydus-image**: v2.3.6 binary (set via `NYDUS_BUILDER` env var) +- **buildctl-daemonless.sh**: Wrapper that runs buildkitd in rootless mode + +## How It Works + +1. **Pull builder image**: Downloads `tensorfuse/fastpull-builder:latest` from Docker Hub +2. **Mount context**: Your build context is mounted read-only into `/workspace` +3. **Mount auth**: `~/.docker/config.json` is mounted for registry authentication +4. **Run buildctl**: Builds normal OCI image with `buildctl-daemonless.sh` +5. **Run buildctl again**: Builds Fastpull image with Nydus compression +6. 
**Direct push**: Both images pushed directly to registry + +## Troubleshooting + +**"Error: Docker config not found"** +- Run the registry authentication command first (see Prerequisites) + +**"Error: Build context path does not exist"** +- Check that the directory mounted at `/workspace` exists and is readable + +**"Error: Dockerfile not found"** +- Ensure the Dockerfile exists in the context directory +- Or specify a custom name with the `DOCKERFILE` environment variable (`-e DOCKERFILE=Dockerfile.custom`) + +**Build fails with authentication error:** +- Re-authenticate to your registry +- Check that `~/.docker/config.json` contains valid credentials + +**"permission denied" errors:** +- Builder container runs with `--privileged` flag (required for buildkit) +- Ensure Docker is running with appropriate permissions + +## Comparison with Original build_push.py + +| Feature | Original | Container-Based | +|---------|----------|-----------------| +| Dependencies | Requires nerdctl, nydusify, soci, stargz locally | All tools in container | +| Registry | AWS ECR or GAR | Any OCI registry | +| Formats | normal, nydus, soci, estargz | normal, nydus | +| Push method | nerdctl/docker | buildctl (direct) | +| Portability | Requires snapshotter setup | Runs anywhere Docker runs | diff --git a/scripts/builder/build.sh b/scripts/builder/build.sh new file mode 100644 index 0000000..8858ccf --- /dev/null +++ b/scripts/builder/build.sh @@ -0,0 +1,72 @@ +#!/bin/sh +set -e + +# Usage: build.sh <registry/image[:tag]> +# Example: build.sh my-registry.com/my-app:latest +# Example: build.sh my-registry.com/my-app (defaults to :latest) + +if [ $# -lt 1 ]; then + echo "Usage: $0 <registry/image[:tag]>" + echo "Example: $0 123456789.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.0" + echo "Example: $0 123456789.dkr.ecr.us-east-1.amazonaws.com/my-app (defaults to :latest)" + exit 1 +fi + +IMAGE_WITH_TAG="$1" +DOCKERFILE="${DOCKERFILE:-Dockerfile}" +CONTEXT_PATH="${CONTEXT_PATH:-/workspace}" + +# Parse image and tag (default to :latest if no tag provided) +if echo "$IMAGE_WITH_TAG" | grep -q ":"; then +
IMAGE_NAME="${IMAGE_WITH_TAG%:*}" + TAG="${IMAGE_WITH_TAG##*:}" +else + IMAGE_NAME="$IMAGE_WITH_TAG" + TAG="latest" +fi + +FULL_IMAGE="${IMAGE_NAME}:${TAG}" +FULL_IMAGE_FASTPULL="${IMAGE_NAME}:${TAG}-fastpull" + +echo "==========================================" +echo "Building images for: ${IMAGE_NAME}" +echo "Tag: ${TAG}" +echo "Context: ${CONTEXT_PATH}" +echo "Dockerfile: ${DOCKERFILE}" +echo "==========================================" + +# Build normal OCI image +echo "" +echo ">>> Building normal OCI image: ${FULL_IMAGE}" +echo "" +time buildctl-daemonless.sh build \ + --frontend dockerfile.v0 \ + --local context="${CONTEXT_PATH}" \ + --local dockerfile="${CONTEXT_PATH}" \ + --opt filename="${DOCKERFILE}" \ + --output type=image,name="${FULL_IMAGE}",push=true + +echo "" +echo "✓ Normal OCI image built and pushed: ${FULL_IMAGE}" +echo "" + +# Build Fastpull image +echo "" +echo ">>> Building Fastpull image: ${FULL_IMAGE_FASTPULL}" +echo "" +time buildctl-daemonless.sh build \ + --frontend dockerfile.v0 \ + --local context="${CONTEXT_PATH}" \ + --local dockerfile="${CONTEXT_PATH}" \ + --opt filename="${DOCKERFILE}" \ + --output type=image,name="${FULL_IMAGE_FASTPULL}",push=true,compression=nydus,force-compression=true,oci-mediatypes=true + +echo "" +echo "✓ Fastpull image built and pushed: ${FULL_IMAGE_FASTPULL}" +echo "" + +echo "==========================================" +echo "✓ Build complete!" +echo " Normal: ${FULL_IMAGE}" +echo " Fastpull: ${FULL_IMAGE_FASTPULL}" +echo "==========================================" diff --git a/scripts/fastpull-cli.py b/scripts/fastpull-cli.py new file mode 100755 index 0000000..064a501 --- /dev/null +++ b/scripts/fastpull-cli.py @@ -0,0 +1,81 @@ +#!/usr/bin/env python3 +""" +FastPull - Accelerate AI/ML container startup with lazy-loading snapshotters. + +Main CLI entry point for the unified fastpull command. 
+""" + +import argparse +import sys +import os + +# Add the library directory to the path to import fastpull module +# When installed, fastpull module is at /usr/local/lib/fastpull +# When running from source, it's in the same directory as this script +script_dir = os.path.dirname(os.path.abspath(__file__)) +sys.path.insert(0, script_dir) # For running from source +sys.path.insert(0, '/usr/local/lib') # For installed version + +from fastpull import __version__ +from fastpull import run, build, quickstart + + +def main(): + """Main CLI entry point.""" + parser = argparse.ArgumentParser( + prog='fastpull', + description='FastPull - Accelerate AI/ML container startup with lazy-loading snapshotters', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +Examples: + # Run container with benchmarking + fastpull run --snapshotter nydus --image myapp:latest-nydus \\ + --benchmark-mode readiness --readiness-endpoint http://localhost:8080/health -p 8080:8080 + + # Build and push Docker and Nydus images + fastpull build --image-path ./app --image myapp:v1 --format docker,nydus + +For more information, visit: https://github.com/tensorfuse/fastpull + """ + ) + + parser.add_argument( + '--version', + action='version', + version=f'%(prog)s {__version__}' + ) + + # Create subparsers for commands + subparsers = parser.add_subparsers( + dest='command', + title='commands', + description='Available fastpull commands', + help='Command to execute' + ) + + # Add subcommands + run.add_parser(subparsers) + build.add_parser(subparsers) + quickstart.add_parser(subparsers) + + # Parse arguments + args = parser.parse_args() + + # If no command specified, print help + if not args.command: + parser.print_help() + sys.exit(1) + + # Execute the command + try: + args.func(args) + except KeyboardInterrupt: + print("\n\nInterrupted by user") + sys.exit(130) + except Exception as e: + print(f"Error: {e}") + sys.exit(1) + + +if __name__ == '__main__': + main() diff --git 
a/scripts/fastpull/__init__.py b/scripts/fastpull/__init__.py new file mode 100644 index 0000000..23d4405 --- /dev/null +++ b/scripts/fastpull/__init__.py @@ -0,0 +1,8 @@ +""" +FastPull - Accelerate AI/ML container startup with lazy-loading snapshotters. + +A unified CLI for building, pushing, and running containers with Nydus, SOCI, +and eStarGZ snapshotters. +""" + +__version__ = "0.1.0" diff --git a/scripts/fastpull/benchmark.py b/scripts/fastpull/benchmark.py new file mode 100644 index 0000000..f79d228 --- /dev/null +++ b/scripts/fastpull/benchmark.py @@ -0,0 +1,193 @@ +""" +Benchmarking utilities for fastpull run command. + +Tracks container lifecycle events and readiness checks. +""" + +import json +import subprocess +import threading +import time +from datetime import datetime +from typing import Optional, Dict +from urllib.request import urlopen +from urllib.error import URLError, HTTPError + + +class ContainerBenchmark: + """Track container startup and readiness metrics.""" + + def __init__(self, container_id: str, benchmark_mode: str = 'none', + readiness_endpoint: Optional[str] = None, mode: str = 'normal'): + """ + Initialize benchmark tracker. 
+ + Args: + container_id: Container ID to track + benchmark_mode: 'none', 'completion', or 'readiness' + readiness_endpoint: HTTP endpoint for readiness checks + mode: 'nydus' or 'normal' (for display purposes) + """ + self.container_id = container_id + self.benchmark_mode = benchmark_mode + self.readiness_endpoint = readiness_endpoint + self.mode = mode + self.metrics: Dict[str, float] = {} + self.start_time = time.time() + self._event_thread: Optional[threading.Thread] = None + self._container_started = False + + def start_event_monitoring(self): + """Start monitoring containerd events in background thread.""" + if self.benchmark_mode == 'none': + return + + def monitor_events(): + """Monitor ctr events for container lifecycle.""" + try: + # Run sudo ctr events and parse for our container + proc = subprocess.Popen( + ['sudo', 'ctr', 'events'], + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + text=True, + bufsize=1 + ) + + for line in proc.stdout: + # Look for /tasks/start event (check any task since we're the only one running) + if '/tasks/start' in line and self.metrics.get('container_start_time') is None: + elapsed = time.time() - self.start_time + self.metrics['container_start_time'] = elapsed + self._container_started = True + print(f"[{elapsed:.3f}s] ✓ CONTAINER START") + + # Look for our specific container's exit event + if self.container_id in line and '/tasks/exit' in line and self.benchmark_mode == 'completion': + elapsed = time.time() - self.start_time + self.metrics['completion_time'] = elapsed + print(f"[{elapsed:.3f}s] ✓ CONTAINER EXIT") + break + + except Exception as e: + print(f"Event monitoring error: {e}") + + self._event_thread = threading.Thread(target=monitor_events, daemon=True) + self._event_thread.start() + + def wait_for_readiness(self, timeout: int = 600, poll_interval: int = 2): + """ + Poll readiness endpoint until HTTP 200 response. 
+ + Args: + timeout: Maximum time to wait in seconds + poll_interval: Time between polls in seconds + + Returns: + True if endpoint became ready, False if timeout + """ + if self.benchmark_mode != 'readiness' or not self.readiness_endpoint: + return True + + # Ensure endpoint has protocol prefix + endpoint = self.readiness_endpoint + if not endpoint.startswith(('http://', 'https://')): + endpoint = f'http://{endpoint}' + + print(f"Polling {endpoint} for readiness...") + end_time = time.time() + timeout + + while time.time() < end_time: + try: + response = urlopen(endpoint, timeout=5) + if response.getcode() == 200: + elapsed = time.time() - self.start_time + self.metrics['readiness_time'] = elapsed + print(f"Container ready (HTTP 200): {elapsed:.2f}s") + return True + except (URLError, HTTPError): + pass + + time.sleep(poll_interval) + + print(f"Readiness check timeout after {timeout}s") + return False + + def wait_for_completion(self, timeout: int = 3600): + """ + Wait for container to exit. 
+ + Args: + timeout: Maximum time to wait in seconds + + Returns: + True if container exited, False if timeout + """ + if self.benchmark_mode != 'completion': + return True + + print(f"Waiting for container completion...") + end_time = time.time() + timeout + + while time.time() < end_time: + # Check if container is still running + result = subprocess.run( + ['nerdctl', 'ps', '-q', '-f', f'id={self.container_id}'], + capture_output=True, + text=True + ) + + if not result.stdout.strip(): + # Container has exited + if 'completion_time' not in self.metrics: + elapsed = time.time() - self.start_time + self.metrics['completion_time'] = elapsed + print(f"Container completed") + return True + + time.sleep(1) + + print(f"Completion timeout after {timeout}s") + return False + + def print_summary(self): + """Print benchmark results summary.""" + if self.benchmark_mode == 'none': + return + + mode_label = "FASTPULL" if self.mode == 'nydus' else "NORMAL" + print("\n" + "="*50) + print(f"{mode_label} BENCHMARK SUMMARY") + print("="*50) + + if 'container_start_time' in self.metrics: + print(f"Time to Container Start: {self.metrics['container_start_time']:.3f}s") + + if 'readiness_time' in self.metrics: + print(f"Time to Readiness: {self.metrics['readiness_time']:.3f}s") + + if 'completion_time' in self.metrics: + print(f"Time to Completion: {self.metrics['completion_time']:.3f}s") + + total_time = time.time() - self.start_time + print(f"Total Elapsed Time: {total_time:.3f}s") + print("="*50 + "\n") + + def export_json(self, filepath: str): + """ + Export metrics to JSON file. 
+ + Args: + filepath: Path to output JSON file + """ + output = { + 'container_id': self.container_id, + 'benchmark_mode': self.benchmark_mode, + 'metrics': self.metrics, + 'timestamp': datetime.now().isoformat() + } + + with open(filepath, 'w') as f: + json.dump(output, f, indent=2) + + print(f"Metrics exported to {filepath}") diff --git a/scripts/fastpull/build.py b/scripts/fastpull/build.py new file mode 100644 index 0000000..7418d17 --- /dev/null +++ b/scripts/fastpull/build.py @@ -0,0 +1,428 @@ +""" +FastPull build command - Build and convert container images. + +Supports two modes: +1. Build from Dockerfile: docker build → push → convert +2. Convert existing image: pull (if needed) → push → convert +""" + +import argparse +import os +import subprocess +import sys +from typing import List + +from . import common + + +def add_parser(subparsers): + """Add build subcommand parser.""" + parser = subparsers.add_parser( + 'build', + help='Build and convert container images', + description='Build Docker images and convert to Nydus/SOCI/eStarGZ formats' + ) + + # Image specification + parser.add_argument( + '--repository-url', + required=True, + help='Full image reference (e.g., account.dkr.ecr.region.amazonaws.com/myapp:v1)' + ) + parser.add_argument( + '--dockerfile-path', + help='Path to Dockerfile directory (optional - if not provided, assumes image exists)' + ) + + # Registry configuration + parser.add_argument( + '--registry', + choices=['ecr', 'gar', 'dockerhub', 'auto'], + default='auto', + help='Registry type (default: auto-detect from image URL)' + ) + + # Google GAR parameters + parser.add_argument( + '--project-id', + help='GCP project ID (for GAR)' + ) + parser.add_argument( + '--location', + default='us-central1', + help='GCP location (default: us-central1)' + ) + parser.add_argument( + '--repository', + help='GAR repository name (for GAR)' + ) + + # Build options + parser.add_argument( + '--format', + default='docker,nydus', + help='Comma-separated 
formats: docker, nydus, soci, estargz (default: docker,nydus)' + ) + parser.add_argument( + '--no-cache', + action='store_true', + help='Build without cache' + ) + parser.add_argument( + '--build-arg', + action='append', + help='Build arguments (can be used multiple times)' + ) + parser.add_argument( + '--dockerfile', + default='Dockerfile', + help='Dockerfile name (default: Dockerfile)' + ) + + parser.set_defaults(func=build_command) + return parser + + +def build_command(args): + """Execute the build command.""" + # Auto-detect registry + if args.registry == 'auto': + args.registry = common.detect_registry_type(args.repository_url) + if args.registry == 'unknown': + print(f"Error: Could not auto-detect registry from image: {args.repository_url}") + print("Please specify --registry explicitly") + sys.exit(1) + print(f"Auto-detected registry: {args.registry}") + + # Validate registry-specific parameters + if args.registry == 'ecr': + # Get account and region from AWS CLI + args.account = common.get_aws_account_id() + args.region = common.get_aws_region() + + if not args.account: + print("Error: Could not detect AWS account ID. Please configure AWS CLI (aws configure)") + sys.exit(1) + + if not args.region: + args.region = 'us-east-1' # Fallback to default + + print(f"Using AWS account: {args.account}, region: {args.region}") + + if args.registry == 'gar' and not args.repository: + parsed = common.parse_gar_url(args.repository_url) + if parsed: + args.location, args.project_id, args.repository = parsed + else: + print("Error: --repository required for GAR") + sys.exit(1) + + # Parse formats + formats = [f.strip().lower() for f in args.format.split(',')] + valid_formats = ['docker', 'nydus', 'soci', 'estargz'] + for fmt in formats: + if fmt not in valid_formats: + print(f"Error: Invalid format '{fmt}'. 
Valid: {', '.join(valid_formats)}") + sys.exit(1) + + # Determine build mode + if args.dockerfile_path: + # Mode 1: Build from Dockerfile + build_from_dockerfile(args, formats) + else: + # Mode 2: Convert existing image + if 'docker' in formats: + print("Warning: --dockerfile-path not provided, skipping docker build") + formats.remove('docker') + + if not formats: + print("Error: No formats to build (docker requires --dockerfile-path)") + sys.exit(1) + + convert_existing_image(args, formats) + + print("\n" + "="*60) + print("BUILD COMPLETE") + print("="*60) + + +def authenticate_registry(args) -> bool: + """Authenticate with the registry.""" + if args.registry == 'ecr': + return authenticate_ecr(args) + elif args.registry == 'gar': + return authenticate_gar(args) + elif args.registry == 'dockerhub': + print("Assuming Docker Hub authentication already configured") + return True + return False + + +def authenticate_ecr(args) -> bool: + """Authenticate with AWS ECR.""" + try: + # Get login password + result = subprocess.run( + ['aws', 'ecr', 'get-login-password', '--region', args.region], + check=True, + capture_output=True, + text=True + ) + password = result.stdout.strip() + + # Login with docker + registry_url = f"{args.account}.dkr.ecr.{args.region}.amazonaws.com" + subprocess.run( + ['docker', 'login', '--username', 'AWS', '--password-stdin', registry_url], + input=password, + check=True, + capture_output=True, + text=True + ) + + # Login with nerdctl + subprocess.run( + ['sudo', 'nerdctl', 'login', '--username', 'AWS', '--password-stdin', registry_url], + input=password, + check=True, + capture_output=True, + text=True + ) + + print(f"✓ Authenticated with ECR") + return True + except subprocess.CalledProcessError as e: + print(f"✗ ECR authentication failed: {e}") + return False + + +def authenticate_gar(args) -> bool: + """Authenticate with Google Artifact Registry.""" + try: + if not args.project_id: + result = subprocess.run( + ['gcloud', 'config', 'get', 'project'], +
check=True, + capture_output=True, + text=True + ) + args.project_id = result.stdout.strip() + + registry_url = f"{args.location}-docker.pkg.dev" + subprocess.run( + ['gcloud', 'auth', 'configure-docker', registry_url, '--quiet'], + check=True, + capture_output=True + ) + + print(f"✓ Authenticated with GAR") + return True + except subprocess.CalledProcessError as e: + print(f"✗ GAR authentication failed: {e}") + return False + + +def build_from_dockerfile(args, formats: List[str]): + """Mode 1: Build from Dockerfile, push, and convert.""" + print("\n" + "="*60) + print("MODE: Build from Dockerfile") + print("="*60) + + # Auto-detect if dockerfile_path is a file or directory + if os.path.isfile(args.dockerfile_path): + # User provided a file path, extract directory and filename + dockerfile_dir = os.path.dirname(args.dockerfile_path) + dockerfile_name = os.path.basename(args.dockerfile_path) + + # Use current directory if no directory in path + if not dockerfile_dir: + dockerfile_dir = '.' 
+ + # Override the dockerfile argument with detected filename + args.dockerfile = dockerfile_name + args.dockerfile_path = dockerfile_dir + + print(f"Detected Dockerfile: {dockerfile_name} in {dockerfile_dir}") + + # Validate directory exists + if not os.path.isdir(args.dockerfile_path): + print(f"Error: Directory not found: {args.dockerfile_path}") + sys.exit(1) + + # Construct full Dockerfile path + dockerfile_path = os.path.join(args.dockerfile_path, args.dockerfile) + if not os.path.isfile(dockerfile_path): + print(f"Error: Dockerfile not found: {dockerfile_path}") + sys.exit(1) + + built_images = [] + + # Build and push Docker image + if 'docker' in formats: + if build_and_push_docker(args): + built_images.append(args.repository_url) + + # Convert to other formats + if 'nydus' in formats: + nydus_image = f"{args.repository_url.rsplit(':', 1)[0]}:{args.repository_url.rsplit(':', 1)[1]}-fastpull" + if convert_to_nydus(args.repository_url, nydus_image): + built_images.append(nydus_image) + + if 'soci' in formats: + soci_image = f"{args.repository_url.rsplit(':', 1)[0]}:{args.repository_url.rsplit(':', 1)[1]}-soci" + if convert_to_soci(args.repository_url, soci_image): + built_images.append(soci_image) + + if 'estargz' in formats: + estargz_image = f"{args.repository_url.rsplit(':', 1)[0]}:{args.repository_url.rsplit(':', 1)[1]}-estargz" + if convert_to_estargz(args.repository_url, estargz_image): + built_images.append(estargz_image) + + # Summary + print_summary(built_images) + + +def convert_existing_image(args, formats: List[str]): + """Mode 2: Convert existing image (no docker build).""" + print("\n" + "="*60) + print("MODE: Convert Existing Image") + print("="*60) + + built_images = [] + + # Convert to requested formats + if 'nydus' in formats: + nydus_image = f"{args.repository_url.rsplit(':', 1)[0]}:{args.repository_url.rsplit(':', 1)[1]}-fastpull" + if convert_to_nydus(args.repository_url, nydus_image): + built_images.append(nydus_image) + + if 'soci' in 
formats: + soci_image = f"{args.repository_url.rsplit(':', 1)[0]}:{args.repository_url.rsplit(':', 1)[1]}-soci" + if convert_to_soci(args.repository_url, soci_image): + built_images.append(soci_image) + + if 'estargz' in formats: + estargz_image = f"{args.repository_url.rsplit(':', 1)[0]}:{args.repository_url.rsplit(':', 1)[1]}-estargz" + if convert_to_estargz(args.repository_url, estargz_image): + built_images.append(estargz_image) + + # Summary + print_summary(built_images) + + +def build_and_push_docker(args) -> bool: + """Build and push Docker image.""" + print(f"\n[Docker] Building {args.repository_url}...") + + # Build + cmd = [ + 'sudo', 'docker', 'build', + '-t', args.repository_url, + '-f', os.path.join(args.dockerfile_path, args.dockerfile) + ] + + if args.no_cache: + cmd.append('--no-cache') + + if args.build_arg: + for build_arg in args.build_arg: + cmd.extend(['--build-arg', build_arg]) + + cmd.append(args.dockerfile_path) + + try: + subprocess.run(cmd, check=True) + print(f"[Docker] ✓ Built {args.repository_url}") + except subprocess.CalledProcessError: + print(f"[Docker] ✗ Build failed") + return False + + # Push + print(f"[Docker] Pushing {args.repository_url}...") + try: + subprocess.run(['sudo', 'docker', 'push', args.repository_url], check=True) + print(f"[Docker] ✓ Pushed {args.repository_url}") + return True + except subprocess.CalledProcessError: + print(f"[Docker] ✗ Push failed") + return False + + +def convert_to_nydus(source_image: str, target_image: str) -> bool: + """Convert to Nydus format.""" + print(f"\n[Nydus] Converting {source_image} → {target_image}...") + + cmd = [ + 'nydusify', 'convert', + '--source', source_image, + '--target', target_image + ] + + try: + subprocess.run(cmd, check=True) + print(f"[Nydus] ✓ Converted and pushed {target_image}") + return True + except subprocess.CalledProcessError: + print(f"[Nydus] ✗ Conversion failed") + return False + + +def convert_to_soci(source_image: str, target_image: str) -> bool: + 
"""Convert to SOCI format.""" + print(f"\n[SOCI] Converting {source_image} → {target_image}...") + + # Pull with nerdctl + try: + subprocess.run(['sudo', 'nerdctl', 'pull', source_image], check=True, capture_output=True) + except subprocess.CalledProcessError: + print(f"[SOCI] ✗ Pull failed") + return False + + # Convert + try: + subprocess.run(['sudo', 'soci', 'create', source_image], check=True) + except subprocess.CalledProcessError: + print(f"[SOCI] ✗ Conversion failed") + return False + + # Tag and push + try: + subprocess.run(['sudo', 'nerdctl', 'tag', source_image, target_image], check=True) + subprocess.run(['sudo', 'nerdctl', 'push', target_image], check=True) + print(f"[SOCI] ✓ Converted and pushed {target_image}") + return True + except subprocess.CalledProcessError: + print(f"[SOCI] ✗ Push failed") + return False + + +def convert_to_estargz(source_image: str, target_image: str) -> bool: + """Convert to eStarGZ format.""" + print(f"\n[eStarGZ] Converting {source_image} → {target_image}...") + + try: + subprocess.run(['sudo', 'nerdctl', '--snapshotter', 'stargz', 'pull', source_image], + check=True, capture_output=True) + subprocess.run(['sudo', 'nerdctl', '--snapshotter', 'stargz', 'tag', source_image, target_image], + check=True) + subprocess.run(['sudo', 'nerdctl', '--snapshotter', 'stargz', 'push', target_image], + check=True) + print(f"[eStarGZ] ✓ Converted and pushed {target_image}") + return True + except subprocess.CalledProcessError: + print(f"[eStarGZ] ✗ Conversion failed") + return False + + +def print_summary(images: List[str]): + """Print build summary.""" + print("\n" + "="*60) + print("SUMMARY") + print("="*60) + if images: + print("Successfully built and pushed:") + for img in images: + print(f" ✓ {img}") + else: + print("No images were built successfully") + print("="*60) diff --git a/scripts/fastpull/clean.py b/scripts/fastpull/clean.py new file mode 100644 index 0000000..85a53f1 --- /dev/null +++ b/scripts/fastpull/clean.py @@ -0,0 
+1,181 @@ +""" +FastPull clean command - Remove local images and artifacts. +""" + +import argparse +import subprocess +import sys +from typing import List + + +def add_parser(subparsers): + """Add clean subcommand parser.""" + parser = subparsers.add_parser( + 'clean', + help='Remove local images and artifacts', + description='Clean up fastpull images and containers' + ) + + parser.add_argument( + '--images', + action='store_true', + help='Remove all fastpull images' + ) + parser.add_argument( + '--containers', + action='store_true', + help='Remove stopped containers' + ) + parser.add_argument( + '--all', + action='store_true', + help='Remove all images and containers' + ) + parser.add_argument( + '--snapshotter', + choices=['nydus', 'overlayfs', 'all'], + default='all', + help='Target specific snapshotter (default: all)' + ) + parser.add_argument( + '--dry-run', + action='store_true', + help='Show what would be removed without removing' + ) + parser.add_argument( + '--force', + action='store_true', + help='Force removal without confirmation' + ) + + parser.set_defaults(func=clean_command) + return parser + + +def clean_command(args): + """Execute the clean command.""" + # If no specific target, clean all + if not args.images and not args.containers and not args.all: + print("Please specify what to clean: --images, --containers, or --all") + sys.exit(1) + + if args.all: + args.images = True + args.containers = True + + # Determine which snapshotters to clean + snapshotters = ['nydus', 'overlayfs'] if args.snapshotter == 'all' else [args.snapshotter] + + # Clean containers first + if args.containers: + clean_containers(snapshotters, args.dry_run, args.force) + + # Clean images + if args.images: + clean_images(snapshotters, args.dry_run, args.force) + + +def clean_containers(snapshotters: List[str], dry_run: bool = False, force: bool = False): + """ + Remove stopped containers. 
+ + Args: + snapshotters: List of snapshotters to target + dry_run: If True, only show what would be removed + force: If True, skip confirmation + """ + print("\n=== Cleaning Containers ===") + + for snapshotter in snapshotters: + # Get all containers (including stopped ones) + result = subprocess.run( + ['sudo', 'nerdctl', '--snapshotter', snapshotter, 'ps', '-a', '-q'], + capture_output=True, + text=True + ) + + container_ids = result.stdout.strip().split('\n') if result.stdout.strip() else [] + + if not container_ids: + print(f"[{snapshotter}] No containers to clean") + continue + + print(f"[{snapshotter}] Found {len(container_ids)} container(s)") + + if dry_run: + print(f"[{snapshotter}] Would remove {len(container_ids)} container(s)") + for cid in container_ids: + print(f" - {cid}") + continue + + # Confirm removal + if not force: + response = input(f"Remove {len(container_ids)} container(s) for {snapshotter}? [y/N]: ") + if response.lower() != 'y': + print(f"[{snapshotter}] Skipped") + continue + + # Remove containers + for cid in container_ids: + subprocess.run( + ['sudo', 'nerdctl', '--snapshotter', snapshotter, 'rm', '-f', cid], + capture_output=True + ) + + print(f"[{snapshotter}] Removed {len(container_ids)} container(s)") + + +def clean_images(snapshotters: List[str], dry_run: bool = False, force: bool = False): + """ + Remove all images. 
+ + Args: + snapshotters: List of snapshotters to target + dry_run: If True, only show what would be removed + force: If True, skip confirmation + """ + print("\n=== Cleaning Images ===") + + for snapshotter in snapshotters: + # Get all images + result = subprocess.run( + ['sudo', 'nerdctl', '--snapshotter', snapshotter, 'images', '-q'], + capture_output=True, + text=True + ) + + image_ids = result.stdout.strip().split('\n') if result.stdout.strip() else [] + + if not image_ids: + print(f"[{snapshotter}] No images to clean") + continue + + print(f"[{snapshotter}] Found {len(image_ids)} image(s)") + + if dry_run: + print(f"[{snapshotter}] Would remove {len(image_ids)} image(s)") + # Show image details + result = subprocess.run( + ['sudo', 'nerdctl', '--snapshotter', snapshotter, 'images'], + capture_output=True, + text=True + ) + print(result.stdout) + continue + + # Confirm removal + if not force: + response = input(f"Remove {len(image_ids)} image(s) for {snapshotter}? [y/N]: ") + if response.lower() != 'y': + print(f"[{snapshotter}] Skipped") + continue + + # Remove images + subprocess.run( + ['sudo', 'nerdctl', '--snapshotter', snapshotter, 'rmi', '-f'] + image_ids, + capture_output=True + ) + + print(f"[{snapshotter}] Removed {len(image_ids)} image(s)") + + print("\n=== Cleanup Complete ===\n") diff --git a/scripts/fastpull/cli.py b/scripts/fastpull/cli.py new file mode 100644 index 0000000..1a2f4cb --- /dev/null +++ b/scripts/fastpull/cli.py @@ -0,0 +1,73 @@ +#!/usr/bin/env python3 +""" +FastPull - Accelerate AI/ML container startup with lazy-loading snapshotters. + +Main CLI entry point for the unified fastpull command. +""" + +import argparse +import sys + +from . 
import __version__, run, build, quickstart, clean + + +def main(): + """Main CLI entry point.""" + parser = argparse.ArgumentParser( + prog='fastpull', + description='FastPull - Accelerate AI/ML container startup with lazy-loading snapshotters', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +Examples: + # Run container with benchmarking + fastpull run --snapshotter nydus --image myapp:latest-nydus \\ + --benchmark-mode readiness --readiness-endpoint http://localhost:8080/health -p 8080:8080 + + # Build and push Docker and Nydus images + fastpull build --dockerfile-path ./app --repository-url myapp:v1 --format docker,nydus + +For more information, visit: https://github.com/tensorfuse/fastpull + """ + ) + + parser.add_argument( + '--version', + action='version', + version=f'%(prog)s {__version__}' + ) + + # Create subparsers for commands + subparsers = parser.add_subparsers( + dest='command', + title='commands', + description='Available fastpull commands', + help='Command to execute' + ) + + # Add subcommands + run.add_parser(subparsers) + build.add_parser(subparsers) + quickstart.add_parser(subparsers) + clean.add_parser(subparsers) + + # Parse arguments + args = parser.parse_args() + + # If no command specified, print help + if not args.command: + parser.print_help() + sys.exit(1) + + # Execute the command + try: + args.func(args) + except KeyboardInterrupt: + print("\n\nInterrupted by user") + sys.exit(130) + except Exception as e: + print(f"Error: {e}") + sys.exit(1) + + +if __name__ == '__main__': + main() diff --git a/scripts/fastpull/common.py b/scripts/fastpull/common.py new file mode 100644 index 0000000..bb01a07 --- /dev/null +++ b/scripts/fastpull/common.py @@ -0,0 +1,139 @@ +""" +Common utilities for fastpull commands. + +Includes registry detection, authentication helpers, and shared functions.
+""" + +import re +import subprocess +from typing import Optional, Tuple + + +def detect_registry_type(image: str) -> str: + """ + Auto-detect registry type from image URL. + + Args: + image: Container image URL + + Returns: + Registry type: 'ecr', 'gar', 'dockerhub', or 'unknown' + """ + if 'dkr.ecr' in image or 'ecr.aws' in image: + return 'ecr' + elif 'pkg.dev' in image: + return 'gar' + elif 'docker.io' in image or '/' not in image or image.count('/') == 1: + return 'dockerhub' + return 'unknown' + + +def parse_ecr_url(image: str) -> Optional[Tuple[str, str, str]]: + """ + Parse ECR image URL to extract account, region, and repository. + + Args: + image: ECR image URL + + Returns: + Tuple of (account_id, region, repository) or None if invalid + """ + pattern = r'(\d+)\.dkr\.ecr\.([^.]+)\.amazonaws\.com/(.+)' + match = re.match(pattern, image) + if match: + return match.group(1), match.group(2), match.group(3) + return None + + +def parse_gar_url(image: str) -> Optional[Tuple[str, str, str]]: + """ + Parse GAR image URL to extract location, project, and repository. + + Args: + image: GAR image URL (e.g., us-central1-docker.pkg.dev/project/repo/image:tag) + + Returns: + Tuple of (location, project_id, repository) or None if invalid + """ + # Pattern: location-docker.pkg.dev/project/repository/image:tag + # Use .+? for location to handle hyphens (e.g., us-central1) + pattern = r'(.+?)-docker\.pkg\.dev/([^/]+)/([^/]+)' + match = re.match(pattern, image) + if match: + return match.group(1), match.group(2), match.group(3) + return None + + +def run_command(cmd: list, check: bool = True, capture_output: bool = True) -> subprocess.CompletedProcess: + """ + Run a shell command with consistent error handling. 
+
+    Args:
+        cmd: Command to run as list of strings
+        check: Raise exception on non-zero exit code
+        capture_output: Capture stdout/stderr
+
+    Returns:
+        CompletedProcess instance
+    """
+    return subprocess.run(
+        cmd,
+        check=check,
+        capture_output=capture_output,
+        text=True
+    )
+
+
+def get_snapshotter_binary(snapshotter: str) -> str:
+    """
+    Get the appropriate binary for the snapshotter.
+
+    Args:
+        snapshotter: Snapshotter type
+
+    Returns:
+        Binary name ('nerdctl' or 'docker')
+    """
+    # All snapshotters use nerdctl except for plain docker
+    if snapshotter in ['docker', 'overlayfs']:
+        return 'docker'
+    return 'nerdctl'
+
+
+def get_aws_account_id() -> Optional[str]:
+    """
+    Get AWS account ID from AWS CLI.
+
+    Returns:
+        Account ID or None if failed
+    """
+    try:
+        result = subprocess.run(
+            ['aws', 'sts', 'get-caller-identity', '--query', 'Account', '--output', 'text'],
+            check=True,
+            capture_output=True,
+            text=True
+        )
+        return result.stdout.strip()
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
+
+
+def get_aws_region() -> Optional[str]:
+    """
+    Get AWS region from AWS CLI configuration.
+
+    Returns:
+        Region or None if failed
+    """
+    try:
+        result = subprocess.run(
+            ['aws', 'configure', 'get', 'region'],
+            check=True,
+            capture_output=True,
+            text=True
+        )
+        region = result.stdout.strip()
+        return region if region else None
+    except (subprocess.CalledProcessError, FileNotFoundError):
+        return None
diff --git a/scripts/fastpull/quickstart.py b/scripts/fastpull/quickstart.py
new file mode 100644
index 0000000..1795b8e
--- /dev/null
+++ b/scripts/fastpull/quickstart.py
@@ -0,0 +1,81 @@
+"""
+FastPull quickstart command - Quick benchmarking comparisons.
+"""
+
+import argparse
+import subprocess
+import sys
+import os
+
+
+# Workload configurations: (name, base_image, endpoint)
+WORKLOADS = {
+    'tensorrt': ('TensorRT', 'tensorrt', '/health'),
+    'vllm': ('vLLM', 'vllm', '/health'),
+    'sglang': ('SGLang', 'sglang', '/health_generate'),
+}
+
+
+def add_parser(subparsers):
+    """Add quickstart subcommand parser."""
+    parser = subparsers.add_parser(
+        'quickstart',
+        help='Quick benchmark comparisons',
+        description='Run pre-configured benchmarks'
+    )
+
+    subparsers_qs = parser.add_subparsers(dest='workload', help='Workload to benchmark')
+
+    for workload in WORKLOADS:
+        wp = subparsers_qs.add_parser(workload, help=f'Benchmark {WORKLOADS[workload][0]} (nydus vs overlayfs)')
+        wp.add_argument('--output-dir', help='Directory to save results')
+        wp.set_defaults(func=run_quickstart)
+
+    parser.set_defaults(func=lambda args: parser.print_help() if not args.workload else None)
+    return parser
+
+
+def run_quickstart(args):
+    """Run benchmark comparison for a workload."""
+    name, image_name, endpoint = WORKLOADS[args.workload]
+
+    print(f"\n{'='*60}\n{name} Benchmark: FastPull vs Normal\n{'='*60}\n")
+
+    base = f"public.ecr.aws/s6z9f6e5/tensorfuse/fastpull/{image_name}:latest"
+
+    for mode in ['nydus', 'normal']:
+        print(f"\n[{mode.upper()}] Starting benchmark...")
+
+        # Use fastpull command directly (works when installed via pip)
+        cmd = [
+            'fastpull', 'run',
+            '--mode', mode,
+            '--benchmark-mode', 'readiness',
+            '--readiness-endpoint', f'http://localhost:8080{endpoint}',
+            '-p', '8080:8000',
+            '--gpus', 'all',
+            base  # Image as positional argument (tag suffix added automatically by run command)
+        ]
+
+        if args.output_dir:
+            os.makedirs(args.output_dir, exist_ok=True)
+            cmd.extend(['--output-json', f'{args.output_dir}/{image_name}-{mode}.json'])
+
+        try:
+            subprocess.run(cmd, check=True)
+        except (subprocess.CalledProcessError, KeyboardInterrupt):
+            sys.exit(1)
+
+    print(f"\n{'='*60}\nBenchmark complete!")
+    if args.output_dir:
+        print(f"Results: {args.output_dir}/")
+    print(f"{'='*60}\n")
+
+    # Auto cleanup after benchmarks complete
+    print("\nCleaning up containers and images...")
+    cleanup_cmd = ['fastpull', 'clean', '--all', '--force']
+    try:
+        subprocess.run(cleanup_cmd, check=False)  # Don't fail if cleanup has issues
+    except Exception as e:
+        print(f"Warning: Cleanup had issues: {e}")
+    print("Cleanup complete!\n")
diff --git a/scripts/fastpull/run.py b/scripts/fastpull/run.py
new file mode 100644
index 0000000..3cfb0cb
--- /dev/null
+++ b/scripts/fastpull/run.py
@@ -0,0 +1,325 @@
+"""
+FastPull run command - Run containers with specified snapshotters and benchmarking.
+"""
+
+import argparse
+import subprocess
+import sys
+import threading
+import time
+from typing import List, Optional
+
+from . import benchmark
+from . import common
+
+
+def add_parser(subparsers):
+    """Add run subcommand parser."""
+    parser = subparsers.add_parser(
+        'run',
+        help='Run container with specified snapshotter',
+        description='Run containers with Nydus or OverlayFS snapshotter'
+    )
+
+    # Mode selection (replaces --snapshotter)
+    parser.add_argument(
+        '--mode',
+        choices=['nydus', 'normal'],
+        default='nydus',
+        help='Run mode: nydus (default, adds -fastpull suffix) or normal (overlayfs, no suffix)'
+    )
+
+    # Benchmarking arguments
+    parser.add_argument(
+        '--benchmark-mode',
+        choices=['none', 'completion', 'readiness'],
+        default='none',
+        help='Benchmarking mode (default: none)'
+    )
+    parser.add_argument(
+        '--readiness-endpoint',
+        help='HTTP endpoint to poll for readiness (required if benchmark-mode=readiness)'
+    )
+    parser.add_argument(
+        '--output-json',
+        help='Export benchmark metrics to JSON file'
+    )
+
+    # Common container flags
+    parser.add_argument('--name', help='Container name')
+    parser.add_argument('-p', '--publish', action='append', help='Publish ports (can be used multiple times)')
+    parser.add_argument('-e', '--env', action='append', help='Set environment variables')
+    parser.add_argument('-v', '--volume', action='append', help='Bind mount volumes')
+    parser.add_argument('--gpus', help='GPU devices to use (e.g., "all")')
+    parser.add_argument('--rm', action='store_true', help='Automatically remove container when it exits')
+    parser.add_argument('-d', '--detach', action='store_true', help='Run container in background')
+
+    # Image as positional argument (like docker/nerdctl run)
+    parser.add_argument(
+        'image',
+        help='Container image to run'
+    )
+
+    # Pass-through for additional nerdctl flags (optional trailing args)
+    parser.add_argument(
+        'nerdctl_args',
+        nargs='*',
+        help='Additional arguments to pass to nerdctl/docker (e.g., command to run in container)'
+    )
+
+    parser.set_defaults(func=run_command)
+    return parser
+
+
+def run_command(args):
+    """Execute the run command."""
+    # Validate benchmark mode
+    if args.benchmark_mode == 'readiness' and not args.readiness_endpoint:
+        print("Error: --readiness-endpoint is required when --benchmark-mode=readiness")
+        sys.exit(1)
+
+    # Determine snapshotter and modify image tag based on mode
+    if args.mode == 'nydus':
+        args.snapshotter = 'nydus'
+        # Add -fastpull suffix to image tag if not already present
+        if ':' in args.image:
+            base, tag = args.image.rsplit(':', 1)
+            if not tag.endswith('-fastpull'):
+                args.image = f"{base}:{tag}-fastpull"
+        else:
+            args.image = f"{args.image}:latest-fastpull"
+    else:  # normal mode
+        # Use image as-is for normal mode
+        args.snapshotter = 'overlayfs'
+
+    # Build the nerdctl/docker command
+    cmd = build_run_command(args)
+
+    print(f"Running container with {args.snapshotter} snapshotter...")
+    print(f"Image: {args.image}")
+    print(f"Command: {' '.join(cmd)}\n")
+
+    # For benchmarking, we need to track the container
+    if args.benchmark_mode != 'none':
+        run_with_benchmark(cmd, args)
+    else:
+        run_without_benchmark(cmd)
+
+
+def build_run_command(args) -> List[str]:
+    """
+    Build the nerdctl/docker run command from arguments.
+
+    Args:
+        args: Parsed command-line arguments
+
+    Returns:
+        Command as list of strings
+    """
+    # Determine binary (use sudo)
+    if args.snapshotter == 'overlayfs':
+        cmd = ['sudo', 'nerdctl', '--snapshotter', 'overlayfs', 'run']
+    else:
+        cmd = ['sudo', 'nerdctl', '--snapshotter', args.snapshotter, 'run']
+
+    # Add common flags
+    if args.name:
+        cmd.extend(['--name', args.name])
+
+    if args.rm:
+        cmd.append('--rm')
+
+    if args.detach:
+        cmd.append('-d')
+
+    # Add ports
+    if args.publish:
+        for port in args.publish:
+            cmd.extend(['-p', port])
+
+    # Add environment variables
+    if args.env:
+        for env in args.env:
+            cmd.extend(['-e', env])
+
+    # Add volumes
+    if args.volume:
+        for vol in args.volume:
+            cmd.extend(['-v', vol])
+
+    # Add GPU support
+    if args.gpus:
+        cmd.extend(['--gpus', args.gpus])
+
+    # Add image (all flags must come before it)
+    cmd.append(args.image)
+
+    # Additional pass-through arguments (e.g., the command to run in the
+    # container) come after the image, matching docker/nerdctl run semantics
+    if args.nerdctl_args:
+        cmd.extend(args.nerdctl_args)
+
+    return cmd
+
+
+def run_without_benchmark(cmd: List[str]):
+    """
+    Run container without benchmarking.
+
+    Args:
+        cmd: Command to execute
+    """
+    try:
+        subprocess.run(cmd, check=True)
+    except subprocess.CalledProcessError as e:
+        print(f"Error running container: {e}")
+        sys.exit(1)
+
+
+def run_with_benchmark(cmd: List[str], args):
+    """
+    Run container with benchmarking enabled.
+
+    Args:
+        cmd: Command to execute
+        args: Parsed arguments
+    """
+    # Force detached mode for benchmarking
+    if '-d' not in cmd and '--detach' not in cmd:
+        cmd.insert(cmd.index('run') + 1, '-d')
+
+    # Initialize benchmark tracker early (before starting container)
+    # We'll set container_id later, but we need to start event monitoring first
+    bench = benchmark.ContainerBenchmark(
+        container_id='',  # Will be set after container starts
+        benchmark_mode=args.benchmark_mode,
+        readiness_endpoint=args.readiness_endpoint,
+        mode=args.mode
+    )
+
+    # Start event monitoring BEFORE starting the container
+    print("Starting containerd events monitoring...")
+    bench.start_event_monitoring()
+
+    # Small delay to ensure event monitoring is ready
+    time.sleep(0.5)
+
+    # Start the container
+    try:
+        print("Running container...")
+        result = subprocess.run(
+            cmd,
+            check=True,
+            capture_output=True,
+            text=True
+        )
+        container_id = result.stdout.strip()
+
+        if not container_id:
+            print("Error: Failed to get container ID")
+            sys.exit(1)
+
+        print(f"Container started: {container_id[:12]}")
+
+        # Update benchmark tracker with container ID
+        bench.container_id = container_id
+
+    except subprocess.CalledProcessError as e:
+        print(f"Error starting container: {e}")
+        if e.stderr:
+            print(f"stderr: {e.stderr}")
+        sys.exit(1)
+
+    # Start monitoring logs in background
+    print("Monitoring container logs...")
+    stop_logs_event = threading.Event()
+    log_thread = start_log_monitoring(container_id, args.snapshotter, bench.start_time, stop_logs_event)
+
+    # Wait for completion or readiness
+    try:
+        if args.benchmark_mode == 'completion':
+            success = bench.wait_for_completion()
+        elif args.benchmark_mode == 'readiness':
+            success = bench.wait_for_readiness()
+        else:
+            success = True
+
+        # Stop log monitoring after benchmark completes
+        stop_logs_event.set()
+
+        if not success:
+            print("Benchmark failed (timeout)")
+            # Cleanup on failure
+            cleanup_container(container_id, args.snapshotter)
+            sys.exit(1)
+
+        # Print summary
+        bench.print_summary()
+
+        # Export JSON if requested
+        if args.output_json:
+            bench.export_json(args.output_json)
+
+        # Cleanup container after successful benchmark
+        print("\nBenchmark complete, cleaning up container...")
+        cleanup_container(container_id, args.snapshotter)
+
+    except KeyboardInterrupt:
+        print("\nInterrupted by user")
+        # Stop and remove container
+        cleanup_container(container_id, args.snapshotter)
+        sys.exit(1)
+
+
+def start_log_monitoring(container_id: str, snapshotter: str, start_time: float, stop_event: threading.Event) -> threading.Thread:
+    """
+    Start monitoring container logs in background thread.
+
+    Args:
+        container_id: Container ID
+        snapshotter: Snapshotter type
+        start_time: Benchmark start time
+        stop_event: Event to signal when to stop monitoring
+
+    Returns:
+        Log monitoring thread
+    """
+    def log_reader():
+        try:
+            cmd = ['sudo', 'nerdctl', 'logs', '-f', container_id]
+
+            process = subprocess.Popen(
+                cmd,
+                stdout=subprocess.PIPE,
+                stderr=subprocess.STDOUT,
+                text=True,
+                bufsize=1,
+                universal_newlines=True
+            )
+
+            for line in process.stdout:
+                if stop_event.is_set():
+                    process.terminate()
+                    break
+                if line:
+                    elapsed = time.time() - start_time
+                    print(f"[{elapsed:.3f}s] {line.rstrip()}")
+
+        except Exception:
+            pass  # Silently handle errors (container might be stopped)
+
+    thread = threading.Thread(target=log_reader, daemon=True)
+    thread.start()
+    return thread
+
+
+def cleanup_container(container_id: str, snapshotter: str):
+    """
+    Stop and remove container.
+
+    Args:
+        container_id: Container ID
+        snapshotter: Snapshotter type
+    """
+    print(f"Cleaning up container {container_id[:12]}...")
+    subprocess.run(['sudo', 'nerdctl', 'stop', container_id], capture_output=True)
+    subprocess.run(['sudo', 'nerdctl', 'rm', container_id], capture_output=True)
diff --git a/scripts/install_snapshotters.py b/scripts/install_snapshotters.py
deleted file mode 100755
index ec959b7..0000000
--- a/scripts/install_snapshotters.py
+++ /dev/null
@@ -1,523 +0,0 @@
-#!/usr/bin/env python3
-"""
-Container Snapshotter Installation Script
-
-This script installs and configures multiple container snapshotters:
-- Nydus: Efficient container image storage with lazy loading
-- SOCI (Seekable OCI): AWS-developed snapshotter for faster container startup
-- StarGZ: Google-developed snapshotter with eStargz format support
-
-The script also installs supporting tools like nerdctl and CNI plugins,
-configures systemd services, and sets up containerd integration.
-
-Requirements:
-- Must be run as root
-- Linux system with systemd
-- Internet access for downloading binaries
-"""
-
-import os
-import sys
-import subprocess
-import shutil
-import tempfile
-from pathlib import Path
-
-# Configuration constants for component versions
-NYDUS_VERSION = "2.3.6"
-NYDUS_SNAPSHOTTER_VERSION = "0.15.3"
-NERDCTL_VERSION = "2.1.4"
-CNI_VERSION = "v1.8.0"
-SOCI_VERSION = "0.11.1"
-STARGZ_VERSION = "0.17.0"
-
-def run_command(cmd, check=True, shell=False):
-    """
-    Execute a shell command with error handling.
-
-    Args:
-        cmd: Command to execute (list or string)
-        check: Whether to raise exception on non-zero exit code
-        shell: Whether to use shell execution
-
-    Returns:
-        subprocess.CompletedProcess: Command execution result
-    """
-    if shell:
-        result = subprocess.run(cmd, shell=True, check=check, capture_output=True, text=True)
-    else:
-        result = subprocess.run(cmd, check=check, capture_output=True, text=True)
-    return result
-
-def check_root():
-    """
-    Verify that the script is running with root privileges.
-    Exits with error code 1 if not running as root.
-    """
-    if os.geteuid() != 0:
-        print("This script must be run as root")
-        sys.exit(1)
-
-def download_and_extract(url, extract_to=None):
-    """
-    Download and extract a tar.gz archive from a URL.
-
-    Args:
-        url: URL to download the archive from
-        extract_to: Optional directory to extract to (current dir if None)
-
-    Returns:
-        str: Filename of the downloaded archive
-    """
-    filename = url.split('/')[-1]
-
-    # Download the archive
-    print(f"  Downloading {filename}...")
-    run_command(['wget', url])
-
-    # Extract the archive
-    print(f"  Extracting {filename}...")
-    if extract_to:
-        run_command(['tar', '-xzf', filename, '-C', extract_to])
-    else:
-        run_command(['tar', '-xzf', filename])
-
-    # Clean up the downloaded archive
-    os.remove(filename)
-    return filename
-
-def install_nydus():
-    """
-    Install Nydus container image acceleration toolkit.
-
-    Nydus provides lazy loading capabilities for container images,
-    reducing startup time and bandwidth usage.
-    """
-    print("------------------ Installing Nydus -------------------------------")
-    print(f"Installing Nydus v{NYDUS_VERSION}...")
-
-    # Download and extract Nydus static binaries
-    url = f"https://github.com/dragonflyoss/nydus/releases/download/v{NYDUS_VERSION}/nydus-static-v{NYDUS_VERSION}-linux-amd64.tgz"
-    download_and_extract(url)
-
-    # Install binaries to system path
-    print("  Installing Nydus binaries...")
-    nydus_binaries = list(Path('nydus-static').glob('*'))
-    run_command(['cp', '-r'] + [str(b) for b in nydus_binaries] + ['/usr/local/bin/'])
-
-    # Make binaries executable
-    nydus_installed = list(Path('/usr/local/bin').glob('nydus*'))
-    run_command(['chmod', '+x'] + [str(p) for p in nydus_installed])
-
-    # Clean up temporary files
-    shutil.rmtree('nydus-static', ignore_errors=True)
-
-def install_nydus_snapshotter():
-    """
-    Install Nydus Snapshotter for containerd integration.
-
-    This component bridges Nydus with containerd, enabling
-    container runtime to use Nydus-optimized images.
-    """
-    print(f"Installing Nydus Snapshotter v{NYDUS_SNAPSHOTTER_VERSION}...")
-
-    # Download Nydus Snapshotter
-    url = f"https://github.com/containerd/nydus-snapshotter/releases/download/v{NYDUS_SNAPSHOTTER_VERSION}/nydus-snapshotter-v{NYDUS_SNAPSHOTTER_VERSION}-linux-amd64.tar.gz"
-    download_and_extract(url)
-
-    # Install the containerd-nydus-grpc binary
-    print("  Installing Nydus Snapshotter binary...")
-    run_command(['cp', 'bin/containerd-nydus-grpc', '/usr/local/bin/'])
-    run_command(['chmod', '+x', '/usr/local/bin/containerd-nydus-grpc'])
-
-    # Clean up temporary files
-    shutil.rmtree('bin', ignore_errors=True)
-
-def install_nerdctl():
-    """
-    Install nerdctl - containerd-compatible Docker CLI.
-
-    nerdctl provides a Docker-compatible command line interface
-    for containerd, enabling easy container management.
-    """
-    print(f"Installing nerdctl v{NERDCTL_VERSION}...")
-
-    # Download nerdctl
-    url = f"https://github.com/containerd/nerdctl/releases/download/v{NERDCTL_VERSION}/nerdctl-{NERDCTL_VERSION}-linux-amd64.tar.gz"
-    download_and_extract(url)
-
-    # Install nerdctl binary
-    print("  Installing nerdctl binary...")
-    run_command(['cp', 'nerdctl', '/usr/local/bin/'])
-
-    # Clean up temporary files
-    os.remove('nerdctl')
-
-def install_cni_plugins():
-    """
-    Install Container Network Interface (CNI) plugins.
-
-    CNI plugins provide networking capabilities for containers,
-    enabling network isolation and communication.
-    """
-    print("Installing CNI plugins...")
-
-    # Create CNI plugin directory
-    print("  Creating CNI plugin directory...")
-    os.makedirs('/opt/cni/bin', exist_ok=True)
-
-    # Download and install CNI plugins
-    url = f"https://github.com/containernetworking/plugins/releases/download/{CNI_VERSION}/cni-plugins-linux-amd64-{CNI_VERSION}.tgz"
-    filename = url.split('/')[-1]
-
-    print(f"  Downloading CNI plugins {CNI_VERSION}...")
-    run_command(['wget', url])
-
-    print("  Installing CNI plugins...")
-    run_command(['tar', '-xzf', filename, '-C', '/opt/cni/bin'])
-    os.remove(filename)
-
-def test_nydus_installation():
-    """
-    Verify that Nydus components are properly installed.
-
-    Tests the installation by checking version information
-    for core Nydus tools.
-    """
-    print("Testing Nydus installation...")
-
-    # List of Nydus tools to test
-    commands = [
-        ['nydus-image', '--version'],  # Image conversion tool
-        ['nydusd', '--version'],       # Nydus daemon
-        ['nydusify', '--version']      # Image format converter
-    ]
-
-    # Test each tool and report any failures
-    for cmd in commands:
-        try:
-            result = run_command(cmd)
-            print(f"  ✓ {cmd[0]} is working")
-        except subprocess.CalledProcessError as e:
-            print(f"  ✗ Warning: {' '.join(cmd)} failed: {e}")
-
-def configure_nydus_snapshotter():
-    """
-    Create configuration files for Nydus Snapshotter.
-
-    Sets up the nydusd daemon configuration with optimized
-    settings for registry backend and filesystem prefetching.
-    """
-    print("=== Nydus Snapshotter Configuration Deployment ===")
-
-    # Create Nydus configuration directory
-    print("  Creating Nydus configuration directory...")
-    os.makedirs('/etc/nydus', exist_ok=True)
-
-    # Nydus daemon configuration for FUSE mode
-    config_content = """{
-  "device": {
-    "backend": {
-      "type": "registry",
-      "config": {
-        "timeout": 5,
-        "connect_timeout": 5,
-        "retry_limit": 2
-      }
-    },
-    "cache": {
-      "type": "blobcache"
-    }
-  },
-  "mode": "direct",
-  "digest_validate": false,
-  "iostats_files": false,
-  "enable_xattr": true,
-  "amplify_io": 1048576,
-  "fs_prefetch": {
-    "enable": true,
-    "threads_count": 64,
-    "merging_size": 1048576,
-    "prefetch_all": true
-  }
-}"""
-
-    # Write configuration file
-    print("  Writing Nydus daemon configuration...")
-    with open('/etc/nydus/nydusd-config.fusedev.json', 'w') as f:
-        f.write(config_content)
-
-def install_soci():
-    """
-    Install SOCI (Seekable OCI) snapshotter.
-
-    SOCI is AWS's container image format that enables
-    faster container startup through lazy loading.
-    """
-    print("------------------ Installing Soci -------------------------------")
-    print(f"Installing SOCI v{SOCI_VERSION}...")
-
-    # Download SOCI snapshotter
-    url = f"https://github.com/awslabs/soci-snapshotter/releases/download/v{SOCI_VERSION}/soci-snapshotter-{SOCI_VERSION}-linux-amd64.tar.gz"
-    filename = url.split('/')[-1]
-
-    print("  Downloading SOCI snapshotter...")
-    run_command(['wget', url])
-
-    # Extract specific binaries directly to system path
-    print("  Installing SOCI binaries...")
-    run_command(['tar', '-C', '/usr/local/bin', '-xvf', filename, 'soci', 'soci-snapshotter-grpc'])
-    os.remove(filename)
-
-def install_stargz():
-    """
-    Install StarGZ snapshotter.
-
-    StarGZ (Stargz/eStargz) is Google's container image format
-    that provides lazy loading capabilities similar to Nydus.
-    """
-    print("------------------ Installing (e)StarGZ -------------------------------")
-    print(f"Installing StarGZ v{STARGZ_VERSION}...")
-
-    # Download StarGZ snapshotter
-    url = f"https://github.com/containerd/stargz-snapshotter/releases/download/v{STARGZ_VERSION}/stargz-snapshotter-v{STARGZ_VERSION}-linux-amd64.tar.gz"
-    filename = url.split('/')[-1]
-
-    print("  Downloading StarGZ snapshotter...")
-    run_command(['wget', url])
-
-    # Extract specific binaries directly to system path
-    print("  Installing StarGZ binaries...")
-    run_command(['tar', '-C', '/usr/local/bin', '-xvf', filename, 'containerd-stargz-grpc', 'ctr-remote'])
-    os.remove(filename)
-
-def setup_systemd_services(snapshotters):
-    """
-    Create and start systemd services for specified snapshotters.
-
-    Creates service files for each snapshotter daemon and starts them.
-    This enables automatic startup and management via systemctl.
-
-    Args:
-        snapshotters: List of snapshotters to set up ('nydus', 'soci', 'stargz')
-    """
-    print("------------------ Setting up Snapshotter Services -------------------------------")
-
-    services_to_start = []
-
-    if 'nydus' in snapshotters:
-        # Nydus Snapshotter service configuration
-        print("  Creating Nydus Snapshotter service...")
-        nydus_service = """[Unit]
-Description=nydus snapshotter (fuse mode)
-After=network.target
-
-[Service]
-Type=simple
-ExecStart=/usr/local/bin/containerd-nydus-grpc --nydusd-config /etc/nydus/nydusd-config.fusedev.json
-Restart=always
-StandardOutput=journal
-StandardError=journal
-
-[Install]
-WantedBy=multi-user.target
-"""
-
-        with open('/etc/systemd/system/nydus-snapshotter-fuse.service', 'w') as f:
-            f.write(nydus_service)
-        services_to_start.append('nydus-snapshotter-fuse.service')
-
-    if 'soci' in snapshotters:
-        # SOCI Snapshotter service configuration
-        print("  Creating SOCI Snapshotter service...")
-        soci_service = """[Unit]
-Description=SOCI Snapshotter GRPC daemon
-After=network.target
-
-[Service]
-Type=simple
-ExecStart=/usr/local/bin/soci-snapshotter-grpc
-Restart=on-failure
-
-[Install]
-WantedBy=multi-user.target
-"""
-
-        with open('/etc/systemd/system/soci-snapshotter-grpc.service', 'w') as f:
-            f.write(soci_service)
-        services_to_start.append('soci-snapshotter-grpc.service')
-
-    if 'stargz' in snapshotters:
-        # StarGZ Snapshotter service configuration
-        print("  Creating StarGZ Snapshotter service...")
-        stargz_service = """[Unit]
-Description=Stargz Snapshotter daemon
-After=network.target
-
-[Service]
-Type=simple
-ExecStart=/usr/local/bin/containerd-stargz-grpc
-Restart=on-failure
-
-[Install]
-WantedBy=multi-user.target
-"""
-
-        with open('/etc/systemd/system/stargz-snapshotter.service', 'w') as f:
-            f.write(stargz_service)
-        services_to_start.append('stargz-snapshotter.service')
-
-    # Start all snapshotter services
-    if services_to_start:
-        print("  Starting snapshotter services...")
-        for service in services_to_start:
-            print(f"    Starting {service}...")
-            run_command(['systemctl', 'start', service])
-
-def setup_containerd(snapshotters):
-    """
-    Configure containerd to use the installed snapshotters.
-
-    Creates containerd configuration that registers specified
-    snapshotters as proxy plugins, then restarts containerd.
-
-    Args:
-        snapshotters: List of snapshotters to configure ('nydus', 'soci', 'stargz')
-    """
-    print("------------------ Setting up Containerd -------------------------------")
-
-    # Ensure containerd configuration directory exists
-    print("  Creating containerd configuration directory...")
-    os.makedirs('/etc/containerd', exist_ok=True)
-
-    # Build containerd configuration with proxy plugins for specified snapshotters
-    containerd_config = "version = 2\n\n[proxy_plugins]\n"
-
-    if 'soci' in snapshotters:
-        containerd_config += """  [proxy_plugins.soci]
-    type = "snapshot"
-    address = "/run/soci-snapshotter-grpc/soci-snapshotter-grpc.sock"
-"""
-
-    if 'nydus' in snapshotters:
-        containerd_config += """  [proxy_plugins.nydus]
-    type = "snapshot"
-    address = "/run/containerd-nydus/containerd-nydus-grpc.sock"
-"""
-
-    if 'stargz' in snapshotters:
-        containerd_config += """  [proxy_plugins.stargz]
-    type = "snapshot"
-    address = "/run/containerd-stargz-grpc/containerd-stargz-grpc.sock"
-    [proxy_plugins.stargz.exports]
-      root = "/var/lib/containerd-stargz-grpc/"
-"""
-
-    # Write containerd configuration
-    print("  Writing containerd configuration...")
-    with open('/etc/containerd/config.toml', 'w') as f:
-        f.write(containerd_config)
-
-    # Restart containerd to apply new configuration
-    print("  Restarting containerd service...")
-    run_command(['systemctl', 'restart', 'containerd'])
-
-def main():
-    """
-    Main installation orchestrator.
-
-    Performs the complete installation sequence:
-    1. Verify root privileges
-    2. Install specified snapshotter components and dependencies
-    3. Configure services and containerd integration
-    4. Start all services
-
-    Uses a temporary directory for downloads to avoid cluttering
-    the current working directory.
-    """
-    import argparse
-
-    # Parse command line arguments
-    parser = argparse.ArgumentParser(
-        description="Install container snapshotters for lazy-loading container images.",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
-Examples:
-  # Install only Nydus (default)
-  sudo python3 install_snapshotters.py
-
-  # Install all snapshotters
-  sudo python3 install_snapshotters.py --snapshotters nydus,soci,stargz
-
-  # Install Nydus and SOCI
-  sudo python3 install_snapshotters.py --snapshotters nydus,soci
-        """)
-
-    parser.add_argument(
-        "--snapshotters",
-        default="nydus",
-        help="Comma-separated list of snapshotters to install (nydus,soci,stargz). Default: nydus"
-    )
-
-    args = parser.parse_args()
-
-    # Parse and validate snapshotters
-    requested_snapshotters = [s.strip() for s in args.snapshotters.split(",")]
-    valid_snapshotters = {"nydus", "soci", "stargz"}
-    invalid_snapshotters = set(requested_snapshotters) - valid_snapshotters
-
-    if invalid_snapshotters:
-        print(f"Error: Invalid snapshotters: {invalid_snapshotters}")
-        print(f"Valid options: {valid_snapshotters}")
-        sys.exit(1)
-
-    # Ensure script is run with root privileges
-    check_root()
-
-    snapshotter_names = ", ".join(requested_snapshotters)
-    print("Starting container snapshotter installation...")
-    print(f"Installing: {snapshotter_names}, nerdctl, and CNI plugins")
-    print()
-
-    # Use temporary directory for all downloads and extraction
-    with tempfile.TemporaryDirectory() as tmpdir:
-        original_dir = os.getcwd()
-        os.chdir(tmpdir)
-
-        try:
-            # Install core container runtime tools first
-            install_nerdctl()
-            install_cni_plugins()
-
-            # Install Nydus components if requested
-            if 'nydus' in requested_snapshotters:
-                install_nydus()
-                install_nydus_snapshotter()
-                test_nydus_installation()
-                configure_nydus_snapshotter()
-
-            # Install SOCI if requested
-            if 'soci' in requested_snapshotters:
-                install_soci()
-
-            # Install StarGZ if requested
-            if 'stargz' in requested_snapshotters:
-                install_stargz()
-
-            # Set up system integration for installed snapshotters
-            setup_systemd_services(requested_snapshotters)
-            setup_containerd(requested_snapshotters)
-
-        finally:
-            # Return to original directory
-            os.chdir(original_dir)
-
-    print()
-    print("------------------ INSTALLATION COMPLETE -------------------")
-    print(f"Installed snapshotters: {snapshotter_names}")
-    print("You can now use nerdctl with --snapshotter flag to specify:")
-    for snapshotter in requested_snapshotters:
-        print(f"  --snapshotter={snapshotter}")
-
-if __name__ == "__main__":
-    main()
diff --git a/scripts/setup.py b/scripts/setup.py
new file mode 100755
index 0000000..d1c31fd
--- /dev/null
+++ b/scripts/setup.py
@@ -0,0 +1,550 @@
+#!/usr/bin/env python3
+"""
+FastPull Setup Script
+
+Installs containerd, Nydus snapshotter, and FastPull CLI via pip.
+"""
+
+import argparse
+import os
+import subprocess
+import sys
+
+
+SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
+PROJECT_ROOT = os.path.dirname(SCRIPT_DIR)
+VENV_PATH = os.path.join(PROJECT_ROOT, '.venv')
+FASTPULL_BIN = '/usr/local/bin/fastpull'
+
+
+def run_command(cmd, check=True, capture_output=False, shell=False):
+    """Run a command and return result."""
+    try:
+        if shell:
+            result = subprocess.run(cmd, shell=True, check=check, capture_output=capture_output, text=True)
+        else:
+            result = subprocess.run(cmd, check=check, capture_output=capture_output, text=True)
+        return result
+    except subprocess.CalledProcessError as e:
+        if not check:
+            return e
+        raise
+
+
+def detect_package_manager():
+    """Detect the system package manager."""
+    # Check for apt (Debian/Ubuntu)
+    if os.path.exists('/usr/bin/apt-get') or os.path.exists('/usr/bin/apt'):
+        return 'apt'
+    # Check for yum (RHEL/CentOS 7)
+    elif os.path.exists('/usr/bin/yum'):
+        return 'yum'
+    # Check for dnf (RHEL/CentOS 8+/Fedora)
+    elif os.path.exists('/usr/bin/dnf'):
+        return 'dnf'
+    else:
+        return None
+
+
+def install_system_dependencies():
+    """Install required system packages (python3-venv, wget)."""
+    pkg_mgr = detect_package_manager()
+
+    if not pkg_mgr:
+        print("⚠ Warning: Could not detect package manager (apt/yum/dnf)")
+        print("Please manually install: python3-venv, wget")
+        return False
+
+    print(f"Detected package manager: {pkg_mgr}")
+    print("Installing system dependencies (python3-venv, wget)...")
+
+    try:
+        if pkg_mgr == 'apt':
+            # Update package list and install dependencies
+            run_command(['apt-get', 'update', '-qq'], check=True)
+            run_command(['apt-get', 'install', '-y', 'python3-venv', 'wget'], check=True)
+        elif pkg_mgr == 'yum':
+            run_command(['yum', 'install', '-y', 'python3-venv', 'wget'], check=True)
+        elif pkg_mgr == 'dnf':
+            run_command(['dnf', 'install', '-y', 'python3-venv', 'wget'], check=True)
+
+        print("✓ System dependencies installed")
+        return True
+    except subprocess.CalledProcessError as e:
+        print(f"✗ Failed to install system dependencies: {e}")
+        return False
+
+
+def check_root():
+    """Check if running as root."""
+    if os.geteuid() != 0:
+        print("Error: This script must be run as root (use sudo)")
+        sys.exit(1)
+
+
+def install_containerd_nerdctl():
+    """Install containerd and nerdctl."""
+    print("\n" + "="*60)
+    print("Installing Containerd & Nerdctl")
+    print("="*60)
+
+    # Check if already installed
+    nerdctl_path = "/usr/local/bin/nerdctl"
+    if os.path.exists(nerdctl_path):
+        print(f"✓ nerdctl already installed at {nerdctl_path}")
+        result = run_command([nerdctl_path, "--version"], capture_output=True)
+        print(f"  {result.stdout.strip()}")
+        return True
+
+    print("\nInstalling containerd and nerdctl...")
+
+    install_script = """
+set -e
+
+cd /tmp
+
+# Remove old download if exists
+rm -f /tmp/nerdctl-full.tar.gz
+
+# Download nerdctl-full
+NERDCTL_VERSION="1.7.3"
+echo "Downloading nerdctl-full ${NERDCTL_VERSION}..."
+wget -O /tmp/nerdctl-full.tar.gz https://github.com/containerd/nerdctl/releases/download/v${NERDCTL_VERSION}/nerdctl-full-${NERDCTL_VERSION}-linux-amd64.tar.gz + +# Extract to /usr/local +echo "Extracting to /usr/local..." +tar -C /usr/local -xzf /tmp/nerdctl-full.tar.gz + +# Enable and start containerd service +echo "Enabling containerd service..." +systemctl enable containerd +systemctl start containerd + +# Clean up +rm -f /tmp/nerdctl-full.tar.gz + +echo "✓ Containerd and nerdctl installed" +""" + + try: + result = run_command(install_script, shell=True, capture_output=True) + print("✓ Containerd and nerdctl installed successfully") + return True + except subprocess.CalledProcessError as e: + print(f"✗ Failed to install containerd: {e}") + if e.stdout: + print(f"stdout: {e.stdout}") + if e.stderr: + print(f"stderr: {e.stderr}") + return False + + +def install_nydus(): + """Install Nydus snapshotter.""" + print("\n" + "="*60) + print("Installing Nydus Snapshotter") + print("="*60) + + nydus_path = "/usr/local/bin/containerd-nydus-grpc" + service_path = "/etc/systemd/system/fastpull.service" + + # Check if binary exists + if os.path.exists(nydus_path): + print(f"✓ Nydus binary found at {nydus_path}") + # Always recreate service and config (to ensure latest settings) + print("Updating service and configuration...") + create_nydus_service() + return True + + install_script = """ +set -e + +NYDUS_SNAPSHOTTER_VERSION="0.15.3" +echo "Downloading Nydus Snapshotter v${NYDUS_SNAPSHOTTER_VERSION}..." 
+ +# Download Nydus Snapshotter +cd /tmp +wget https://github.com/containerd/nydus-snapshotter/releases/download/v${NYDUS_SNAPSHOTTER_VERSION}/nydus-snapshotter-v${NYDUS_SNAPSHOTTER_VERSION}-linux-amd64.tar.gz + +# Extract and install +tar -xzf nydus-snapshotter-v${NYDUS_SNAPSHOTTER_VERSION}-linux-amd64.tar.gz +cp bin/containerd-nydus-grpc /usr/local/bin/ +chmod +x /usr/local/bin/containerd-nydus-grpc + +# Also install nydusd (required by snapshotter) +NYDUS_VERSION="v2.3.6" +echo "Downloading Nydus tools ${NYDUS_VERSION}..." +wget -O nydus.tgz https://github.com/dragonflyoss/nydus/releases/download/${NYDUS_VERSION}/nydus-static-${NYDUS_VERSION}-linux-amd64.tgz +tar xzf nydus.tgz +cp nydus-static/nydusd /usr/local/bin/ +cp nydus-static/nydus-image /usr/local/bin/ +cp nydus-static/nydusify /usr/local/bin/ +chmod +x /usr/local/bin/nydusd /usr/local/bin/nydus-image /usr/local/bin/nydusify + +# Clean up +rm -rf bin nydus-snapshotter-v${NYDUS_SNAPSHOTTER_VERSION}-linux-amd64.tar.gz nydus-static nydus.tgz + +echo "✓ Nydus binaries installed" +""" + + try: + result = run_command(install_script, shell=True, capture_output=True) + print("✓ Nydus binaries installed successfully") + + # Now create the service (shared code) + create_nydus_service() + return True + except subprocess.CalledProcessError as e: + print(f"✗ Failed to install Nydus: {e}") + if e.stderr: + print(f"stderr: {e.stderr}") + return False + + +def create_nydus_service(): + """Create systemd service for Nydus snapshotter.""" + service_script = """ +# Create systemd service +cat > /etc/systemd/system/fastpull.service <<'EOF' +[Unit] +Description=nydus snapshotter (fuse mode) +After=network.target + +[Service] +Type=simple +ExecStart=/usr/local/bin/containerd-nydus-grpc --nydusd-config /etc/nydus/nydusd-config.fusedev.json +Restart=always +StandardOutput=journal +StandardError=journal + +[Install] +WantedBy=multi-user.target +EOF + +# Create necessary directories +mkdir -p /etc/nydus +mkdir -p 
/var/lib/nydus/cache + +# Create Nydus config if it doesn't exist +if [ ! -f /etc/nydus/nydusd-config.fusedev.json ]; then +cat > /etc/nydus/nydusd-config.fusedev.json <<'EOF' +{ + "device": { + "backend": { + "type": "registry", + "config": { + "timeout": 5, + "connect_timeout": 5, + "retry_limit": 2 + } + }, + "cache": { + "type": "blobcache" + } + }, + "mode": "direct", + "digest_validate": false, + "iostats_files": false, + "enable_xattr": true, + "amplify_io": 10485760, + "fs_prefetch": { + "enable": true, + "threads_count": 16, + "merging_size": 1048576, + "prefetch_all": true + } +} +EOF +fi + +# Enable and start service +systemctl daemon-reload +systemctl enable fastpull.service +systemctl start fastpull.service + +echo "✓ Nydus service created and started" +""" + + try: + run_command(service_script, shell=True, capture_output=True) + print("✓ Created and started fastpull.service") + return True + except subprocess.CalledProcessError as e: + print(f"✗ Failed to create service: {e}") + return False + + +def configure_containerd_for_nydus(): + """Configure containerd to use Nydus snapshotter.""" + print("\nConfiguring containerd for Nydus...") + + config_dir = "/etc/containerd" + config_file = os.path.join(config_dir, "config.toml") + + os.makedirs(config_dir, exist_ok=True) + + # Create containerd config with Nydus proxy plugin + config_content = """version = 2 + +[proxy_plugins] + [proxy_plugins.nydus] + type = "snapshot" + address = "/run/containerd-nydus/containerd-nydus-grpc.sock" + +[plugins."io.containerd.grpc.v1.cri".containerd] + snapshotter = "nydus" + disable_snapshot_annotations = false +""" + + with open(config_file, 'w') as f: + f.write(config_content) + + print(f"✓ Updated containerd config at {config_file}") + + # Restart fastpull service first + print("Restarting fastpull service...") + run_command(["systemctl", "restart", "fastpull.service"], check=False) + + # Then restart containerd service + print("Restarting containerd service...") + 
run_command(["systemctl", "restart", "containerd.service"], check=False) + + print("✓ Services restarted") + + return True + + +def install_cli(): + """Install fastpull CLI via pip in a venv.""" + print("\n" + "="*60) + print("Installing FastPull CLI") + print("="*60) + + try: + # Create venv if it doesn't exist + if not os.path.exists(VENV_PATH): + print(f"Creating virtual environment at {VENV_PATH}...") + result = run_command(['python3', '-m', 'venv', VENV_PATH], check=False, capture_output=True) + if result.returncode != 0: + print(f"✗ Failed to create venv: {result.stderr}") + return False + print(f"✓ Created virtual environment") + + # Get pip path in venv + venv_pip = os.path.join(VENV_PATH, 'bin', 'pip') + venv_python = os.path.join(VENV_PATH, 'bin', 'python3') + + # Install fastpull in venv + print("Installing fastpull in virtual environment...") + result = run_command([venv_pip, 'install', '-e', PROJECT_ROOT], check=False, capture_output=True) + if result.returncode != 0: + print(f"✗ Failed to install in venv: {result.stderr}") + return False + print("✓ Installed fastpull in virtual environment") + + # Create wrapper script in /usr/local/bin + wrapper_script = f"""#!/bin/bash +# FastPull CLI wrapper script +# Activates venv and runs fastpull + +exec {venv_python} -m scripts.fastpull.cli "$@" +""" + + print(f"Creating wrapper script at {FASTPULL_BIN}...") + with open(FASTPULL_BIN, 'w') as f: + f.write(wrapper_script) + os.chmod(FASTPULL_BIN, 0o755) + print(f"✓ Created fastpull command at {FASTPULL_BIN}") + + return True + + except Exception as e: + print(f"✗ Failed to install fastpull: {e}") + return False + + +def verify_installation(): + """Verify fastpull installation.""" + print("\n" + "="*60) + print("Verifying Installation") + print("="*60) + + # Test CLI + try: + result = run_command(['fastpull', '--version'], capture_output=True, check=False) + if result.returncode == 0: + print(f"✓ fastpull CLI: {result.stdout.strip()}") + else: + print(f"✗ 
fastpull CLI not found in PATH") + print("Try running: hash -r (or restart your shell)") + return False + except Exception as e: + print(f"✗ fastpull CLI test failed: {e}") + return False + + # Check nerdctl + nerdctl_path = "/usr/local/bin/nerdctl" + if os.path.exists(nerdctl_path): + try: + result = run_command([nerdctl_path, "--version"], capture_output=True) + print(f"✓ nerdctl: {result.stdout.strip().split()[2]}") + except: + print(f" nerdctl found but version check failed") + + # Check containerd service + try: + result = run_command(["systemctl", "is-active", "containerd.service"], capture_output=True) + if result.returncode == 0: + print(f"✓ containerd service: active") + else: + print(f" containerd service: {result.stdout.strip()}") + except: + print(f" Could not check containerd service") + + # Check FastPull service + try: + result = run_command(["systemctl", "is-active", "fastpull.service"], capture_output=True) + if result.returncode == 0: + print(f"✓ fastpull service: active") + else: + print(f" fastpull service: {result.stdout.strip()}") + except: + print(f" Could not check fastpull service") + + return True + + +def main(): + """Main setup function.""" + parser = argparse.ArgumentParser( + description='Install FastPull with containerd and Nydus snapshotter', + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=""" +Examples: + # Full installation (containerd + Nydus + CLI) + sudo python3 scripts/setup.py + + # Install only CLI (skip containerd/Nydus setup) + sudo python3 scripts/setup.py --cli-only + + # Uninstall fastpull CLI + sudo python3 scripts/setup.py --uninstall +""" + ) + parser.add_argument( + '--cli-only', + action='store_true', + help='Install only the fastpull CLI, skip containerd/Nydus setup' + ) + parser.add_argument( + '--uninstall', + action='store_true', + help='Uninstall fastpull CLI' + ) + + args = parser.parse_args() + + # Check root + check_root() + + if args.uninstall: + print("Uninstalling fastpull...") + removed 
= False + + # Remove wrapper script + if os.path.exists(FASTPULL_BIN): + os.remove(FASTPULL_BIN) + print(f"✓ Removed {FASTPULL_BIN}") + removed = True + + # Remove venv + if os.path.exists(VENV_PATH): + import shutil + shutil.rmtree(VENV_PATH) + print(f"✓ Removed virtual environment at {VENV_PATH}") + removed = True + + if removed: + print("✓ Uninstall complete") + else: + print("✗ fastpull not found or already uninstalled") + return + + print("="*60) + print("FastPull Setup") + print("="*60) + + if args.cli_only: + print("\nThis will install:") + print(" • FastPull CLI tool (via pip)") + print() + else: + print("\nThis will install:") + print(" • Containerd and nerdctl") + print(" • Nydus snapshotter") + print(" • FastPull CLI tool (via pip)") + print() + + # Install system dependencies first + print("\n" + "="*60) + print("Installing System Dependencies") + print("="*60) + if not install_system_dependencies(): + print("\n⚠ Warning: System dependencies installation had issues") + print("Continuing anyway, but you may encounter errors...") + + # Track installation status + success = True + warnings = [] + + if not args.cli_only: + # Install containerd and nerdctl + if not install_containerd_nerdctl(): + print("\n⚠ Warning: Containerd installation failed") + print("You can still install the CLI with --cli-only") + sys.exit(1) + + # Install Nydus snapshotter + if not install_nydus(): + print("\n⚠ Warning: Nydus installation failed") + success = False + warnings.append("Nydus snapshotter installation failed") + else: + # Only configure containerd if Nydus installed successfully + configure_containerd_for_nydus() + + # Install CLI + if not install_cli(): + print("\nSetup incomplete: CLI installation failed") + if not args.cli_only: + print("Note: Snapshotters may have been installed") + sys.exit(1) + + # Verify + verify_installation() + + print("\n" + "="*60) + if success: + print("✅ Fastpull installed successfully on your VM") + else: + print("⚠️ Fastpull installed 
with warnings") + print("\nWarnings:") + for warning in warnings: + print(f" • {warning}") + print("="*60) + print("\n📋 Usage:") + print(" fastpull --help") + print(" fastpull run --help") + print(" fastpull build --help") + print(" fastpull quickstart --help") + if not args.cli_only: + print("\n🔍 Check services:") + print(" systemctl status containerd") + print(" systemctl status fastpull") + print("\n📖 Example:") + print(" fastpull quickstart tensorrt") + print("="*60) + + +if __name__ == '__main__': + main()