Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

chtmp223/suri

Open more actions menu

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🦙 Suri: Multi-constraint Instruction Following for Long-form Text Generation

arXiV Website Dataset Suri-I-ORPO Suri-SFT

TopicGPT Pipeline Overview

This repository hosts the code and data for our paper, Suri: Multi-constraint instruction following for long-form text generation.

We release 🦙 Suri, a single-turn instruction following dataset with multi-constraint instructions and long-form gold responses (2k-5k words). We also introduce I-ORPO, a variant of Odds Ratio Preference Optimization (ORPO) that accepts (x+, x-, y) as inputs rather than (x, y+, y-). We demonstrate the effectiveness of the dataset by fine-tuning Mistral-7B-Instruct with the SFT and I-ORPO method.

📣 Updates

  • [2024-06-25]: The code and data for Suri are now available.

📦 Using Suri

Getting Started

  1. Install the requirements for Suri:
    conda create -n suri python=3.10 
    conda activate suri
    pip install -r requirements.txt
    python -m pip install flash-attn --no-build-isolation
    huggingface-cli login       # Log in to Huggingface using your access token 
    sudo apt-get install git-lfs
    
  2. Set up Huggingface cache directory:
    • Open your shell configuration file, which is typically ~/.bashrc or ~/.bash_profile for Bash, or ~/.zshrc for Zsh.
    • Add HF_HOME huggingface cache directory path to your configuration file: HF_HOME=/path/to/huggingface_cache.
    • Add HF_TOKEN huggingface access token to your configuration file: HF_TOKEN=<your_token>.
    • Save and close the file. Source the file to apply the changes: source ~/.bashrc or source ~/.bash_profile or source ~/.zshrc.
    • Double-check that the environment variable is set correctly: echo $HF_HOME.

Project Structure

.
├── README.md
├── assets
│   ├── img
│   └── styles
├── data
├── eval
│   ├── automatic
│   ├── human
│   └── inference
├── ft
│   ├── README.md
│   ├── deepspeed_zero3.yaml
│   ├── i-orpo
│   ├── lib
│   │   ├── alignment_mod
│   │   └── trl_mod
│   └── sft
├── index.html
├── prompts
├── requirements.txt
└── utils.py
  • data contains b3.py, which can be used to reconstruct the gold responses of the books3 subset.
  • eval contains:
    • automatic, which includes code to compute the ranking accuracy metric.
    • human, which includes the XML code for the human evaluation interfaces.
    • inference, which includes code to do inference with the fine-tuned models using either Transformers Huggingface or vLLM.
  • ft contains code to fine-tune the models using I-ORPO or SFT:
    • i-orpo directory includes orpo.yaml, which defines the training hyperparameters; run_orpo.py, which contains the training code; and run_orpo.sh, which consolidates the training process into a single executable command.
    • sft directory includes sft.yaml, which defines the training hyperparameters; run_sft.py, which contains the training code; and run_sft.sh, which consolidates the training process into a single executable command.
    • deepspeed_zero3.yaml contains the hyperparameters for deepspeed zero3.
  • prompts contains all prompts used in the paper.

Dataset

  • The dataset is available on Huggingface: https://huggingface.co/datasets/chtmp223/suri/.
  • Due to copyright concerns, we do not release the gold responses that are sampled from the Books3 subset. For users with local access to the Books3 dataset, we include a script (data/b3.py) to reconstruct this portion of the dataset.
    • First, make sure to set the DATA_DIR variable to the path of the books3 dataset on your local machine.
    • Next, modify the code to either save the reconstructed dataset to a csv file or push to a new Huggingface repository.
    • Finally, run the code using python b3.py.

I-ORPO & SFT implementation

  • Suri-I-ORPO is available on Huggingface: https://huggingface.co/chtmp223/suri-i-orpo. Suri-SFT is also available on Huggingface: https://huggingface.co/chtmp223/suri-sft.
  • We include the code for training in the ft/ directory. See the README.md file in that folder for more information.
  • We recommend inference with Huggingface Transformers library. See the model card and eval/ folder for more details on inference.

📜 Citation

@misc{pham2024surimulticonstraintinstructionfollowing,
      title={Suri: Multi-constraint Instruction Following for Long-form Text Generation}, 
      author={Chau Minh Pham and Simeng Sun and Mohit Iyyer},
      year={2024},
      eprint={2406.19371},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.19371}, 
}

About

Suri: Multi-constraint instruction following for long-form text generation [EMNLP’24]

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

Morty Proxy This is a proxified and sanitized view of the page, visit original site.