PassNet is an AI system for compiler optimization: it uses LLM-driven agents to automatically generate compiler passes that optimize computation graphs, producing high-performance GPU kernels. PassNet includes a complete optimization toolchain, the PassBench evaluation benchmark, and the PassAgent agent evaluation framework.
- Project Structure
- Architecture Overview
- Core Components
- DataSet
- Quick Start
- PassBench Evaluation Pipeline
- PassAgent Evaluation
- License
PassNet/
├── pass_bench/ # PassBench compiler evaluation framework: kernel compilation, correctness verification, performance benchmarking
├── pass_agent/ # PassAgent evaluation framework
├── samples/ # PassBench sample data
├── sample_lists/ # PassBench sample list files (eval/train splits)
├── entry_scripts/ # Evaluation entry scripts
├── graphs/ # Subgraph data
├── graph_lists/ # Subgraph lists and grouping info
├── test/ # Unit tests
├── Dockerfile.nvidia # Docker image definition
└── requirements.txt # Python dependencies
┌─────────────────────────────────────────────────────────────────────────┐
│ PassAgent │
│ (LLM-driven Pass Generation) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │◄───┐
│ │ Multi-step Iterative Solving · k-attempts · R2E-Gym Framework │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │ │
└────────────────┬───────────────────────────────────────┬────────────────┘ │
read data │ generated pass │ │
▼ ▼ │
┌───────────────────────────────────┐ ┌───────────────────────────────┐ │
│ DataSet │ │ PassBench │ │
│ ┌─────────────────────────────┐ │ │ ┌──────────────────────────┐ │ │
│ │ graphs/ │ │ │ │ 1. Execution & Eval │ │ │
│ │ sole_op (5,939) │ │ │ │ Eager Execution │ │ │
│ │ fusible (22,870) │ │ │ │ pass_mgr Execution │ │ │
│ │ typical (25,151) │ │ │ └────────────┬─────────────┘ │ │
│ └─────────────────────────────┘ │ │ │ │ │
│ ┌─────────────────────────────┐ │ │ ▼ │ feedback
│ │ samples/ │ │ │ ┌──────────────────────────┐ │ │
│ │ sole_op (1,029) │ │ │ │ 2. Result Checking │ │ │
│ │ fusible (4,676) │ │ │ │ Correctness & Speedup │ │ │
│ │ typical (4,278) │ │ │ └────────────┬─────────────┘ │ │
│ └─────────────────────────────┘ │ │ │ │ │
│ ┌─────────────────────────────┐ │ │ ▼ │ │
│ │ sample_lists/ │ │ │ ┌──────────────────────────┐ │ │
│ │ train/ │ │ │ │ 3. Score Aggregation │ │ │
│ │ eval/ │ │ │ │ ES(t) & AS Met │ │ │
│ └─────────────────────────────┘ │ │ └──────────────────────────┘ │ │
└───────────────────────────────────┘ └───────────────────────────────┘ │
└─────────────────────┘
PassBench — Compiler Evaluation Framework
Provides kernel compilation, correctness verification, and performance benchmarking:
- Kernel Compilation: Executes pass matching and replacement via the `pass_mgr` compiler method
- Correctness Verification: Validates numerical correctness of optimized kernels using tolerance-based comparison
- Performance Benchmarking: Measures speedup and other metrics, outputs `aggregated_score.json`
- Score Aggregation: `aggregate_es_scores.py` aggregates results from multiple evaluation runs
PassAgent — R2E-Gym Agent Evaluation Framework
Evaluates agent capabilities for compiler optimization using the R2E-Gym framework. See pass_agent/README.md for details.
graphs/ (Subgraph Data)
Stores raw computation subgraphs extracted from deep learning models, serving as the source for PassBench samples:
- fusible_subgraphs/: A small set of example fusible subgraphs (1,456), containing computation graphs with multi-operator fusion opportunities
- hf_subgraphs/ (Legacy): Previous version subgraph data, containing sole op (1,410), fusible (4,167), and typical (6,157) categories
- hf_subgraphs_v2/: HuggingFace model subgraphs, organized into three categories:
  - sole_op_subgraphs: Single-operator subgraphs (5,939)
  - fusible_subgraphs: Fusible subgraphs (22,870)
  - typical_subgraphs: Typical subgraphs (25,151)
graph_lists/ (Subgraph Lists)
Stores subgraph path lists, UID groupings, and other information for sample filtering and group management:
Subgraph Path Lists (line format: subgraph_UID\tsubgraph_relative_path)
| File | Subgraphs | Description |
|---|---|---|
| `fusible_subgraphs.txt` | 1,455 | Example fusible subgraph paths |
| `hf_sole_op_subgraphs.txt` | 1,410 | Legacy sole op subgraph paths |
| `hf_fusible_subgraphs.txt` | 4,166 | Legacy fusible subgraph paths |
| `hf_typical_subgraphs.txt` | 6,157 | Legacy typical subgraph paths |
| `hf_sole_op_subgraphs_v2.txt` | 5,939 | v2 sole op subgraph paths |
| `hf_fusible_subgraphs_v2.txt` | 22,870 | v2 fusible subgraph paths |
| `hf_typical_subgraphs_v2.txt` | 25,151 | v2 typical subgraph paths |
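Each list file uses the two-column `subgraph_UID\tsubgraph_relative_path` format described above. A minimal parser sketch (the UIDs and paths in the example are made up for illustration):

```python
import io

def parse_subgraph_list(fh):
    """Parse lines of the form `subgraph_UID\\tsubgraph_relative_path`
    into a {uid: relative_path} mapping, skipping blank lines."""
    entries = {}
    for line in fh:
        line = line.strip()
        if not line:
            continue
        uid, rel_path = line.split("\t", 1)
        entries[uid] = rel_path
    return entries

# Example with two made-up entries in the documented format
sample = io.StringIO(
    "abc123\tmodels/foo/subgraph_0\n"
    "def456\tmodels/bar/subgraph_3\n"
)
print(parse_subgraph_list(sample))
```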
samples/ (Evaluation Samples)
Evaluation samples generated from graphs/, each serving as an independently executable evaluation unit:
- fusible_subgraphs/: A small set of example samples from TIMM models' fusible subgraphs, organized by `model_name/subgraph_index`
- hf_subgraphs/ (Legacy): Previous-version subgraph samples, containing sole op (590), fusible (2,489), and typical (3,382) categories
- hf_subgraphs_v2/: v2 subgraph samples with extended multi-dtype support, containing sole op (1,029), fusible (4,676), and typical (4,278) categories, organized by hash path `xx/yy/hash/`; dataset published at PassNet/PassNet
Each sample directory contains:
| File | Description |
|---|---|
| `entry.sh` | Evaluation entry script that executes compilation, verification, and performance statistics |
| `graph_list.txt` | List of computation graphs included in the sample |
| `graphs/` | Computation graph definitions (`model.py`, `weight_meta.py`, etc.) |
| `pass_dir/` | Output directory for generated optimization passes |
| `pass_bench/` | Copy of the evaluation framework (for standalone execution within Docker containers) |
| `sample_uids.txt` | Unique sample identifier (hf_subgraphs_v2 only) |
sample_lists/ (Sample Path Lists)
Stores sample path lists for evaluation and training, organized by purpose and subgraph type, available in both txt and csv formats:
train/ (Training Set)
| File | Samples | Description |
|---|---|---|
| `hf_sole_op_train_samples_v2.txt` | 1,028 | Sole op subgraph training samples |
| `hf_fusible_train_samples_v2.txt` | 4,476 | Fusible subgraph training samples |
| `hf_typical_train_samples_v2.txt` | 4,078 | Typical subgraph training samples |
| `hf_sole_op_train_samples.txt` (Legacy) | 589 | Legacy sole op subgraph training samples |
| `hf_fusible_train_samples.txt` (Legacy) | 2,289 | Legacy fusible subgraph training samples |
| `hf_typical_train_samples.txt` (Legacy) | 3,182 | Legacy typical subgraph training samples |
eval/ (Evaluation Set)
| File | Samples | Description |
|---|---|---|
| `hf_fusible_eval_samples_v2.txt` | 200 | Fusible subgraph evaluation samples |
| `hf_typical_eval_samples_v2.txt` | 200 | Typical subgraph evaluation samples |
| `hf_fusible_eval_samples.txt` (Legacy) | 200 | Legacy fusible subgraph evaluation samples |
| `hf_typical_eval_samples.txt` (Legacy) | 200 | Legacy typical subgraph evaluation samples |
- Python 3.12+
- PyTorch 2.9+ (CUDA 12.8)
- NVIDIA GPU (CUDA support)
- Docker (optional, for containerized evaluation)
Local setup:

```bash
cd /path/to/passnet

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export PYTHONPATH=$PYTHONPATH:/path/to/passnet

# Verify sample evaluation
bash samples/fusible_subgraphs/crossvit_15_dagger_240.in1k/crossvit_15_dagger_240.in1k_0_start14_end16_4/entry.sh
```

Docker-based evaluation:

```bash
docker build . -t passnet:latest -f Dockerfile.nvidia

docker run --gpus all --privileged \
  -v <path-to-passnet-project>:/workspace \
  -w /workspace \
  passnet:latest \
  bash samples/fusible_subgraphs/crossvit_15_dagger_240.in1k/crossvit_15_dagger_240.in1k_0_start14_end16_4/entry.sh
```

The PassNet evaluation pipeline works as follows:
- Analyze Computation Graph: Read the target subgraph's `model.py` and `weight_meta.py`
- Generate Optimization Pass: Agent creates pattern matching rules and replacement functions
- Pass Matching and Replacement: The `pass_mgr` compiler applies the generated pass
- Correctness Verification: Validate numerical consistency between the optimized and original kernels
- Performance Benchmarking: Measure speedup and output evaluation results
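The last two steps can be sketched as a small harness. Hedged: the function name, tolerances, and wall-clock timing below are illustrative; PassBench's real harness runs on GPU and would use proper warm-up and CUDA-event timing.

```python
import time

def verify_and_benchmark(original_fn, optimized_fn, inputs,
                         rtol=1e-5, atol=1e-8, iters=100):
    """Check numerical consistency of two kernels, then estimate speedup.

    Illustrative sketch, not PassBench's actual API: timing here is
    wall-clock over plain Python callables.
    """
    ref = original_fn(*inputs)
    out = optimized_fn(*inputs)
    ok = all(abs(a - b) <= atol + rtol * abs(b) for a, b in zip(out, ref))

    def timeit(fn):
        start = time.perf_counter()
        for _ in range(iters):
            fn(*inputs)
        return (time.perf_counter() - start) / iters

    speedup = timeit(original_fn) / timeit(optimized_fn)
    return ok, speedup

# Toy example: the "optimized" kernel fuses two passes over the data
def orig(v):
    tmp = [x for x in v]            # extra pass over the data
    return [x * 2 + 1 for x in tmp]

def opt(v):
    return [x * 2 + 1 for x in v]   # single "fused" pass

ok, speedup = verify_and_benchmark(orig, opt, (list(range(1000)),))
print(ok, round(speedup, 2))
```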
Evaluate agents using the PassAgent framework:
```bash
cd pass_agent
pip install -r requirements.txt

python examples/run_pass_agent_demo.py \
  --llm-name openai/glm-4.7 \
  --llm-base-url <your-llm-base-url> \
  --openai-api-key <your-api-key> \
  --dataset datasets/passbench_demo_dataset.jsonl \
  --max-steps 50 \
  --k 10
```

See pass_agent/README.md for details.
Please refer to the license file in the project root directory.