AV-ALOHA

This repository contains the code for the paper: "Active Vision Might Be All You Need: Exploring Active Vision in Bimanual Robotic Manipulation". You can visit the Project Page and check out the ArXiv Paper.

Overview

AV-ALOHA builds upon the ALOHA 2 system and introduces active vision for bimanual robotic manipulation. This repository includes:

  • Teleoperation and data collection
  • Training models with LeRobot
  • Evaluation on both simulated and real-world AV-ALOHA setups

For the VR teleoperation and stereo camera passthrough functionality, refer to the Unity App Repo.

Note: The code is under active development, and a more organized codebase will be available in future updates.

Hardware Setup

AV-ALOHA extends ALOHA 2 by adding another ViperX 300 S robot arm for active vision. To mount the additional arm, we used two 840 mm 2020 extrusions and four L-brackets. A ZED Mini serves as the active vision camera, attached using custom 3D-printed parts available in assets/3D_printed_parts.

Software Installation

  1. Install ROS Noetic and follow the ALOHA Setup Instructions for software and hardware setup, but skip cloning the ALOHA repository (this repository is used instead).

  2. Bind the active vision robot arm to /dev/ttyDXL_puppet_middle.
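
    As a sketch, this can be done with a udev rule in the same style the ALOHA setup uses for the other arms. The FTDI serial below is a placeholder; find your arm's actual serial with udevadm info --attribute-walk.

    # e.g. /etc/udev/rules.d/99-interbotix-udev.rules (serial is a placeholder)
    SUBSYSTEM=="tty", ATTRS{serial}=="FT000000", SYMLINK+="ttyDXL_puppet_middle"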

  3. Clone this repository:

    cd ~/interbotix_ws/src
    git clone https://github.com/Soltanilara/av-aloha
    
    # initialize submodules from inside the cloned repo
    cd av-aloha
    git submodule init
    git submodule update
    
    # build ROS packages
    cd ~/interbotix_ws
    catkin_make
  4. Set up the Conda environment:

    conda create -y -n lerobot python=3.10
    conda activate lerobot
    conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia
  5. Install the ZED Python API by following these instructions.
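
    After installing, a minimal sanity check in Python (a sketch assuming the SDK's pyzed package installed correctly and a ZED is connected):

    # quick check that the ZED Python API is importable and a camera opens
    import pyzed.sl as sl

    cam = sl.Camera()
    status = cam.open(sl.InitParameters())  # open the first connected ZED
    print("ZED open:", status)              # expect SUCCESS
    cam.close()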

  6. Install additional dependencies:

    pip install -e gym_guided_vision
    pip install -e lerobot
    pip install -r requirements.txt

WebRTC Setup

  1. Create a Firebase project and set up a Firestore database at Firebase Console.

  2. In your Firestore database, set the rules as follows:

    rules_version = '2';
    
    service cloud.firestore {
      match /databases/{database}/documents {
        match /<your_password_for_webrtc>/{document=**} {
          allow read, write: if true;
        }
      }
    }
  3. In Project Settings -> Service Accounts, generate a new private key and name it serviceAccountKey.json. Place this file in the data_collection_scripts directory.

  4. Create a file named signalingSettings.json in data_collection_scripts and paste the following:

    {
        "robotID": "<robot id for your robot (e.g. robot_1)>",
        "password": "<your password same as in firestore rules>",
        "turn_server_url": "<turn url>",
        "turn_server_username": "<turn username>",
        "turn_server_password": "<turn password>"
    }
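
To check that the key and rules work before running anything on the robot, here is a minimal connectivity sketch (assuming the firebase-admin Python package; the collection name must match the one in your Firestore rules, and connectivity_test is an arbitrary document name):

# sketch: verify Firestore access with the service account key
import firebase_admin
from firebase_admin import credentials, firestore

cred = credentials.Certificate("serviceAccountKey.json")
firebase_admin.initialize_app(cred)
db = firestore.client()

# write and read back a test document in your password-named collection
doc = db.collection("<your_password_for_webrtc>").document("connectivity_test")
doc.set({"ok": True})
print(doc.get().to_dict())  # expect {'ok': True}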

Data Collection

Simulation:

# in data_collection_scripts/
python record_sim_episodes.py --task_name sim_insert_peg --episode_idx 0
python replay_sim_episode.py --task_name sim_insert_peg --num_arms <2 or 3>
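
The same simulated environments are exposed through the gym_guided_vision package installed earlier, so they can also be driven directly from Python. A minimal rollout sketch, assuming the package registers its environments with gymnasium on import (the environment ID is copied from the evaluation section below):

# sketch: roll out random actions in a simulated AV-ALOHA environment
import gymnasium as gym
import gym_guided_vision  # assumed to register environments on import

env = gym.make("gym_guided_vision/SlotInsertion-3Arms-v0")
obs, info = env.reset(seed=0)
for _ in range(10):
    action = env.action_space.sample()  # random placeholder policy
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
env.close()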

Real Robot:

  1. In one terminal, launch the robot:

    # in data_collection_scripts/
    source launch_robot.sh
  2. In another terminal, activate the environment:

    # in data_collection_scripts/
    source activate.sh
    python record_episodes.py --task_name occluded_insertion --episode_idx 0

Visualize an Episode:

# in data_collection_scripts/
python visualize_episodes.py --hdf5_path path/to/your/hdf5
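
To inspect what a recorded episode actually contains, a short h5py sketch (it only walks the file tree, so it works regardless of the exact dataset layout):

# sketch: list the groups and datasets stored in a recorded episode
import h5py

with h5py.File("path/to/your/hdf5", "r") as f:
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))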

Push Dataset to Hugging Face:

# in repo root
huggingface-cli login
python lerobot/lerobot/scripts/push_dataset_to_hub.py \
    --raw-dir path/to/your/dataset \
    --repo-id <hf_id>/<dataset_name> \
    --raw-format aloha_hdf5

Visualize Data from Hugging Face:

# in repo root
python lerobot/lerobot/scripts/visualize_dataset.py \
    --repo-id <hf_id>/<dataset_name>  \
    --episode-index 0

Training

Make sure the env and policy names you pass on the command line match config files under lerobot/lerobot/configs (adding or editing the YAML files there as needed). Start training with:

# in repo root
python lerobot/lerobot/scripts/train.py \
    hydra.run.dir=outputs/train/sim_sew_needle_3arms_zed_static_wrist_act \
    hydra.job.name=sim_sew_needle_3arms_zed_static_wrist_act \
    device=cuda \
    env=sim_sew_needle_3arms \
    policy=zed_static_wrist_act \
    wandb.enable=true

Evaluation

Simulation Evaluation (as done in the paper):

# in repo root
python lerobot/lerobot/scripts/eval.py \
    -p outputs/train/sim_hook_package_2arms_wrist_act/checkpoints \
    --out-dir outputs/eval/sim_hook_package_2arms_wrist_act \
    eval.n_episodes=50 \
    eval.batch_size=10 \
    --save-video

Single Checkpoint Evaluation:

  1. Save your model to Hugging Face:

    # in eval_scripts/
    python save_policy.py \
        --repo_id iantc104/sim_slot_insertion_3arms_zed_wrist_act \
        --checkpoint_dir outputs/train/sim_slot_insertion_3arms_zed_wrist_act/checkpoints/014000/pretrained_model
  2. Evaluate using the script in eval_scripts:

    Simulated Policy:

    # in eval_scripts/
    python eval.py \
        --policy iantc104/sim_slot_insertion_3arms_zed_wrist_act \
        --episode_len 300 \
        --num_episodes 50 \
        --sim_env gym_guided_vision/SlotInsertion-3Arms-v0

    Real Policy:

    # in eval_scripts/
    python eval.py \
        --policy iantc104/real_occluded_key_insertion_3arms_zed_act \
        --episode_len 700 \
        --num_episodes 50
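
A policy pushed with save_policy.py can also be reloaded directly in Python. A sketch, assuming LeRobot's ACT policy class and its from_pretrained support (the class and import path are assumptions based on the act policies used above):

# sketch: reload a policy that was pushed to the Hugging Face Hub
from lerobot.common.policies.act.modeling_act import ACTPolicy  # assumed path

policy = ACTPolicy.from_pretrained("iantc104/sim_slot_insertion_3arms_zed_wrist_act")
policy.to("cuda").eval()
# policy.select_action(observation_batch) can then be called in an env loop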

Simulation Datasets

  • Insert Peg
  • Slot Insertion
  • Sew Needle
  • Hook Package
  • Tube Transfer

Real-World Datasets

  • Occluded Insertion
  • Open Box

Note: The Open Box dataset is in a different format because it was collected with a newer version of the robot.

Citation

@misc{chuang2024activevisionneedexploring,
    title={Active Vision Might Be All You Need: Exploring Active Vision in Bimanual Robotic Manipulation}, 
    author={Ian Chuang and Andrew Lee and Dechen Gao and Iman Soltani},
    year={2024},
    eprint={2409.17435},
    archivePrefix={arXiv},
    primaryClass={cs.RO},
    url={https://arxiv.org/abs/2409.17435}, 
}