PaddlePaddle/PassNet

PassNet

Python 3.12 | PyTorch 2.9 | CUDA 12.8 | HuggingFace Dataset

PassNet is an AI system for compiler optimization. It uses LLM-driven agents to automatically generate high-performance GPU kernels, which are applied through compiler passes that optimize computation graphs. PassNet includes a complete optimization toolchain, the PassBench evaluation benchmark, and the PassAgent agent evaluation framework.

English | 中文

Project Structure

PassNet/
├── pass_bench/               # PassBench compiler evaluation framework: kernel compilation, correctness verification, performance benchmarking
├── pass_agent/               # PassAgent evaluation framework
├── samples/                  # PassBench sample data
├── sample_lists/             # PassBench sample list files (eval/train splits)
├── entry_scripts/            # Evaluation entry scripts
├── graphs/                   # Subgraph data
├── graph_lists/              # Subgraph lists and grouping info
├── test/                     # Unit tests
├── Dockerfile.nvidia         # Docker image definition
└── requirements.txt          # Python dependencies

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────┐
│                             PassAgent                                   │
│                    (LLM-driven Pass Generation)                         │
│ ┌─────────────────────────────────────────────────────────────────────┐ │◄───┐
│ │  Multi-step Iterative Solving  ·  k-attempts  ·  R2E-Gym Framework  │ │    │
│ └─────────────────────────────────────────────────────────────────────┘ │    │
└────────────────┬───────────────────────────────────────┬────────────────┘    │
      read data  │                        generated pass │                     │
                 ▼                                       ▼                     │
┌───────────────────────────────────┐    ┌───────────────────────────────┐     │
│             DataSet               │    │          PassBench            │     │
│  ┌─────────────────────────────┐  │    │  ┌──────────────────────────┐ │     │
│  │ graphs/                     │  │    │  │ 1. Execution & Eval      │ │     │
│  │  sole_op  (5,939)           │  │    │  │    Eager Execution       │ │     │
│  │  fusible  (22,870)          │  │    │  │    pass_mgr Execution    │ │     │
│  │  typical  (25,151)          │  │    │  └────────────┬─────────────┘ │     │
│  └─────────────────────────────┘  │    │               │               │     │
│  ┌─────────────────────────────┐  │    │               ▼               │  feedback
│  │ samples/                    │  │    │  ┌──────────────────────────┐ │     │
│  │  sole_op  (1,029)           │  │    │  │ 2. Result Checking       │ │     │
│  │  fusible  (4,676)           │  │    │  │    Correctness & Speedup │ │     │
│  │  typical  (4,278)           │  │    │  └────────────┬─────────────┘ │     │
│  └─────────────────────────────┘  │    │               │               │     │
│  ┌─────────────────────────────┐  │    │               ▼               │     │
│  │ sample_lists/               │  │    │  ┌──────────────────────────┐ │     │
│  │  train/                     │  │    │  │ 3. Score Aggregation     │ │     │
│  │  eval/                      │  │    │  │    ES(t) & AS Met        │ │     │
│  └─────────────────────────────┘  │    │  └──────────────────────────┘ │     │
└───────────────────────────────────┘    └───────────────────────────────┘     │
                                                         └─────────────────────┘

Core Components

PassBench — Compiler Evaluation Framework

Provides kernel compilation, correctness verification, and performance benchmarking:

  • Kernel Compilation: Executes pass matching and replacement via the pass_mgr compiler method
  • Correctness Verification: Validates numerical correctness of optimized kernels using tolerance-based comparison
  • Performance Benchmarking: Measures speedup and other metrics, outputs aggregated_score.json
  • Score Aggregation: aggregate_es_scores.py aggregates results from multiple evaluation runs
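
The correctness and speedup checks listed above can be illustrated with a minimal sketch. It is not PassBench's implementation: the function names, the tolerance values, and the elementwise rule `|a - e| <= atol + rtol * |e|` are assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
from typing import Sequence


def all_close(actual: Sequence[float], expected: Sequence[float],
              rtol: float = 1e-3, atol: float = 1e-5) -> bool:
    """Return True if every element satisfies |a - e| <= atol + rtol * |e|.

    The default tolerances are illustrative, not PassBench's settings.
    """
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol + rtol * abs(e)
               for a, e in zip(actual, expected))


def speedup(baseline_seconds: float, optimized_seconds: float) -> float:
    """Ratio of baseline (eager) time to optimized time; above 1.0 means faster."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return baseline_seconds / optimized_seconds


# Hypothetical outputs and timings, for illustration only.
eager_out = [1.0, 2.0, 3.0]
optimized_out = [1.0005, 1.9990, 3.0]
print("correct:", all_close(optimized_out, eager_out))
print("speedup:", speedup(2.0e-3, 1.25e-3))
```

Checking correctness before reporting speedup matters because a fast kernel with wrong numerics should not count as an improvement.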

PassAgent — R2E-Gym Agent Evaluation Framework

Evaluates agent capabilities for compiler optimization using the R2E-Gym framework. See pass_agent/README.md for details.

DataSet

graphs — Raw Subgraph Data

Stores raw computation subgraphs extracted from deep learning models, serving as the source for PassBench samples:

  • fusible_subgraphs/: A small set of example fusible subgraphs (1,456), containing computation graphs with multi-operator fusion opportunities
  • hf_subgraphs/ (Legacy): Previous version subgraph data, containing sole op (1,410), fusible (4,167), and typical (6,157) categories
  • hf_subgraphs_v2/: HuggingFace model subgraphs, organized into three categories:
    • sole_op_subgraphs: Single-operator subgraphs (5,939)
    • fusible_subgraphs: Fusible subgraphs (22,870)
    • typical_subgraphs: Typical subgraphs (25,151)

graph_lists — Subgraph Lists and Grouping

Stores subgraph path lists, UID groupings, and other information for sample filtering and group management:

Subgraph Path Lists (line format: subgraph_UID\tsubgraph_relative_path)

File                          Subgraphs  Description
fusible_subgraphs.txt         1,455      Example fusible subgraph paths
hf_sole_op_subgraphs.txt      1,410      Legacy sole op subgraph paths
hf_fusible_subgraphs.txt      4,166      Legacy fusible subgraph paths
hf_typical_subgraphs.txt      6,157      Legacy typical subgraph paths
hf_sole_op_subgraphs_v2.txt   5,939      v2 sole op subgraph paths
hf_fusible_subgraphs_v2.txt   22,870     v2 fusible subgraph paths
hf_typical_subgraphs_v2.txt   25,151     v2 typical subgraph paths
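
Since each line of these list files pairs a subgraph UID with a relative path, separated by a tab, a small parser is enough to load them. The sketch below is an illustration only: `parse_graph_list` is a hypothetical helper, not part of PassNet, and the example UIDs and paths are made up.

```python
from typing import Dict, Iterable


def parse_graph_list(lines: Iterable[str]) -> Dict[str, str]:
    """Map subgraph UID -> relative path from 'UID<TAB>path' lines."""
    mapping: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only, so a path containing tabs stays intact.
        uid, rel_path = line.split("\t", 1)
        mapping[uid] = rel_path
    return mapping


# Hypothetical UIDs and paths, for illustration only.
example = [
    "uid_0001\tgraphs/hf_subgraphs_v2/example_a\n",
    "uid_0002\tgraphs/hf_subgraphs_v2/example_b\n",
]
table = parse_graph_list(example)
print(len(table), "subgraphs loaded")
```

An open file handle can be passed directly, for example `parse_graph_list(open("graph_lists/hf_fusible_subgraphs_v2.txt"))`, because the function only iterates over lines.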

samples — PassBench Evaluation Samples

Evaluation samples generated from graphs/, each serving as an independently executable evaluation unit:

  • fusible_subgraphs/: A small set of example samples from TIMM models' fusible subgraphs, organized by model_name/subgraph_index
  • hf_subgraphs/ (Legacy): Previous version subgraph samples, containing sole op (590), fusible (2,489), and typical (3,382) categories
  • hf_subgraphs_v2/: v2 subgraph samples with extended multi-dtype support, containing sole op (1,029), fusible (4,676), and typical (4,278) categories. Samples are organized by hash path xx/yy/hash/, and the dataset is published at PassNet/PassNet

Each sample directory contains:

File             Description
entry.sh         Evaluation entry script that runs compilation, verification, and performance statistics collection
graph_list.txt   List of computation graphs included in the sample
graphs/          Computation graph definitions (model.py, weight_meta.py, etc.)
pass_dir/        Output directory for generated optimization passes
pass_bench/      Copy of the evaluation framework (for standalone execution within Docker containers)
sample_uids.txt  Unique sample identifier (hf_subgraphs_v2 only)
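
The layout above can be checked before launching an evaluation. The following is a minimal sketch, assuming only the entry names listed in the table; `missing_entries` is a hypothetical helper and not part of PassBench.

```python
import tempfile
from pathlib import Path
from typing import List

# Entry names taken from the table above.
REQUIRED_FILES = ("entry.sh", "graph_list.txt")
REQUIRED_DIRS = ("graphs", "pass_dir", "pass_bench")


def missing_entries(sample_dir: Path, expect_uids: bool = False) -> List[str]:
    """List expected files and directories that are absent from sample_dir.

    Set expect_uids=True for hf_subgraphs_v2 samples, which also carry
    sample_uids.txt.
    """
    missing = [name for name in REQUIRED_FILES
               if not (sample_dir / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS
                if not (sample_dir / name).is_dir()]
    if expect_uids and not (sample_dir / "sample_uids.txt").is_file():
        missing.append("sample_uids.txt")
    return missing


# Demonstrate on a throwaway directory that is only partially populated.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp)
    (sample / "entry.sh").write_text("#!/bin/bash\n")
    (sample / "graphs").mkdir()
    print(missing_entries(sample))
```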

sample_lists — Eval/Train Sample Splits

Stores sample path lists for evaluation and training, organized by purpose and subgraph type, available in both txt and csv formats:

train/ (Training Set)

File                                    Samples  Description
hf_sole_op_train_samples_v2.txt         1,028    Sole op subgraph training samples
hf_fusible_train_samples_v2.txt         4,476    Fusible subgraph training samples
hf_typical_train_samples_v2.txt         4,078    Typical subgraph training samples
hf_sole_op_train_samples.txt (Legacy)   589      Legacy sole op subgraph training samples
hf_fusible_train_samples.txt (Legacy)   2,289    Legacy fusible subgraph training samples
hf_typical_train_samples.txt (Legacy)   3,182    Legacy typical subgraph training samples

eval/ (Evaluation Set)

File                                    Samples  Description
hf_fusible_eval_samples_v2.txt          200      Fusible subgraph evaluation samples
hf_typical_eval_samples_v2.txt          200      Typical subgraph evaluation samples
hf_fusible_eval_samples.txt (Legacy)    200      Legacy fusible subgraph evaluation samples
hf_typical_eval_samples.txt (Legacy)    200      Legacy typical subgraph evaluation samples
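
For the fusible and typical categories, the train and eval lists appear to partition the samples: adding the two split sizes recovers the per-category counts quoted in the samples section. The short check below uses only numbers copied from this README; the dictionary keys are labels chosen for illustration.

```python
# Counts copied from the train/eval tables and the samples section.
SPLITS = {
    "fusible_v2": {"train": 4476, "eval": 200, "samples": 4676},
    "typical_v2": {"train": 4078, "eval": 200, "samples": 4278},
    "fusible_legacy": {"train": 2289, "eval": 200, "samples": 2489},
    "typical_legacy": {"train": 3182, "eval": 200, "samples": 3382},
}


def split_is_consistent(train: int, eval_: int, samples: int) -> bool:
    """True when the train and eval lists together cover every sample."""
    return train + eval_ == samples


for name, counts in SPLITS.items():
    ok = split_is_consistent(counts["train"], counts["eval"], counts["samples"])
    print(f"{name}: {counts['train']} + {counts['eval']} = {counts['samples']} -> {ok}")
```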

Quick Start

Requirements

  • Python 3.12+
  • PyTorch 2.9+ (CUDA 12.8)
  • NVIDIA GPU (CUDA support)
  • Docker (optional, for containerized evaluation)
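
The version requirements can be checked from Python before installing anything else. This is a minimal sketch: `major_minor` and `meets` are hypothetical helpers, and the PyTorch part runs only if `torch` is already importable.

```python
import re
import sys
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.9.0+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets(version: str, minimum: Tuple[int, int]) -> bool:
    """True if version is at least the (major, minor) minimum."""
    return major_minor(version) >= minimum


print("Python >= 3.12:", sys.version_info[:2] >= (3, 12))
try:
    import torch
    print("PyTorch >= 2.9:", meets(torch.__version__, (2, 9)))
    print("CUDA available:", torch.cuda.is_available())
except (ImportError, OSError):
    print("PyTorch is not installed or could not be loaded")
```

Comparing integer tuples rather than raw strings avoids ordering "2.10" before "2.9".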

Installation

cd /path/to/passnet

# Install dependencies
pip install -r requirements.txt

# Set environment variables
export PYTHONPATH=$PYTHONPATH:/path/to/passnet

Run Example

# Verify sample evaluation
bash samples/fusible_subgraphs/crossvit_15_dagger_240.in1k/crossvit_15_dagger_240.in1k_0_start14_end16_4/entry.sh

Docker Usage

Build Image

docker build . -t passnet:latest -f Dockerfile.nvidia

Verify Single Sample Execution in Container

docker run --gpus all --privileged \
    -v <path-to-passnet-project>:/workspace \
    -w /workspace \
    passnet:latest \
    bash samples/fusible_subgraphs/crossvit_15_dagger_240.in1k/crossvit_15_dagger_240.in1k_0_start14_end16_4/entry.sh
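
When evaluating many samples, the same container invocation can be assembled programmatically. The sketch below mirrors the `docker run` flags shown above; `docker_eval_command` is a hypothetical helper, and `/path/to/passnet` is a placeholder for the local checkout.

```python
import shlex
from typing import List


def docker_eval_command(project_dir: str, entry_script: str,
                        image: str = "passnet:latest") -> List[str]:
    """Build the argv for running one sample's entry.sh inside the container."""
    return [
        "docker", "run", "--gpus", "all", "--privileged",
        "-v", f"{project_dir}:/workspace",  # mount the project into the container
        "-w", "/workspace",                 # run from the project root
        image,
        "bash", entry_script,
    ]


cmd = docker_eval_command(
    "/path/to/passnet",
    "samples/fusible_subgraphs/crossvit_15_dagger_240.in1k/"
    "crossvit_15_dagger_240.in1k_0_start14_end16_4/entry.sh",
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems if a sample path ever contains spaces.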

PassBench Evaluation Pipeline

The PassNet evaluation pipeline works as follows:

  1. Analyze Computation Graph: Read the target subgraph's model.py and weight_meta.py
  2. Generate Optimization Pass: Agent creates pattern matching rules and replacement functions
  3. Pass Matching and Replacement: The pass_mgr compiler applies the generated pass
  4. Correctness Verification: Validate numerical consistency between the optimized and original kernels
  5. Performance Benchmarking: Measure speedup and output evaluation results
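
Steps 2 and 3 can be pictured with a toy example. The sketch below is not the pass_mgr API and does not operate on real computation graphs: it treats a graph as a flat list of operator names and rewrites every occurrence of a pattern, which is the simplest possible instance of match-and-replace. All names in it are hypothetical.

```python
from typing import List, Sequence


def apply_pass(ops: Sequence[str], pattern: Sequence[str],
               replacement: Sequence[str]) -> List[str]:
    """Replace each non-overlapping occurrence of pattern, scanning left to right."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    out: List[str] = []
    i = 0
    n = len(pattern)
    while i < len(ops):
        if list(ops[i:i + n]) == list(pattern):
            out.extend(replacement)  # matched: emit the replacement ops
            i += n
        else:
            out.append(ops[i])       # no match: keep the original op
            i += 1
    return out


# Toy "graph": a linear chain of operators.
graph = ["matmul", "add", "relu", "matmul", "add"]
optimized = apply_pass(graph, ["matmul", "add"], ["fused_matmul_add"])
print(optimized)  # ['fused_matmul_add', 'relu', 'fused_matmul_add']
```

A real pass matches on graph structure (operators and their data dependencies) rather than on adjacency in a list, and its replacement must be validated in steps 4 and 5.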

PassAgent Evaluation

Evaluate agents using the PassAgent framework:

cd pass_agent
pip install -r requirements.txt

python examples/run_pass_agent_demo.py \
    --llm-name openai/glm-4.7 \
    --llm-base-url <your-llm-base-url> \
    --openai-api-key <your-api-key> \
    --dataset datasets/passbench_demo_dataset.jsonl \
    --max-steps 50 \
    --k 10

See pass_agent/README.md for details.
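
The `--dataset` argument points at a JSONL file, that is, one JSON object per line. A minimal reader for that format is sketched below; the record fields in the example are hypothetical, since the actual schema is documented in pass_agent/README.md rather than here.

```python
import json
from typing import Any, Dict, Iterable, List


def load_jsonl(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse one JSON object per non-blank line, reporting the failing line number."""
    records: List[Dict[str, Any]] = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: invalid JSON ({exc.msg})") from exc
    return records


# Hypothetical records, for illustration only.
example = ['{"sample_id": "demo-0"}\n', "\n", '{"sample_id": "demo-1"}\n']
records = load_jsonl(example)
print(len(records), "records")
```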

License

Please refer to the license file in the project root directory.
