🦙 Suri: Multi-constraint Instruction Following for Long-form Text Generation

This repository hosts the code and data for our paper, Suri: Multi-constraint instruction following for long-form text generation.

We release 🦙 Suri, a single-turn instruction-following dataset with multi-constraint instructions and long-form gold responses (2k-5k words). We also introduce I-ORPO, a variant of Odds Ratio Preference Optimization (ORPO) that takes (x+, x-, y) as input rather than (x, y+, y-). We demonstrate the effectiveness of the dataset by fine-tuning Mistral-7B-Instruct with both SFT and I-ORPO.
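To make the change of inputs concrete, here is a minimal sketch of an odds ratio term computed over one response y under two instructions. It assumes that I-ORPO keeps the shape of the ORPO odds ratio term and only swaps what is contrasted, and that x+ and x- denote a preferred and a dispreferred instruction; neither detail is spelled out in this README. The function names and the use of length-averaged log-probabilities are illustrative, and any other terms of the training objective are omitted. The actual training code lives in ft/i-orpo/run_orpo.py.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), with avg_logp < 0.

    avg_logp is assumed to be the length-averaged log-probability of the
    response y under one instruction (an assumption of this sketch).
    """
    # log1p(-exp(avg_logp)) is log(1 - p), computed without forming 1 - p.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def instruction_odds_ratio_loss(avg_logp_pos: float, avg_logp_neg: float) -> float:
    """Return -log(sigmoid(log_odds(y | x+) - log_odds(y | x-))).

    Hypothetical helper: same y, two different instructions.
    """
    margin = log_odds(avg_logp_pos) - log_odds(avg_logp_neg)
    z = -margin
    # -log(sigmoid(margin)) equals softplus(-margin); this form avoids overflow.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))
```

Under these assumptions the term is small when y is more likely under x+ than under x-, and grows as that ordering reverses.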

📣 Updates

[2024-06-25]: The code and data for Suri are now available.

📦 Using Suri

Getting Started

Install the requirements for Suri:

conda create -n suri python=3.10 
conda activate suri
pip install -r requirements.txt
python -m pip install flash-attn --no-build-isolation
huggingface-cli login       # Log in to Huggingface using your access token 
sudo apt-get install git-lfs

Set up the Huggingface cache directory:
- Open your shell configuration file, typically ~/.bashrc or ~/.bash_profile for Bash, or ~/.zshrc for Zsh.
- Add the Huggingface cache directory path as HF_HOME: export HF_HOME=/path/to/huggingface_cache.
- Add your Huggingface access token as HF_TOKEN: export HF_TOKEN=<your_token>.
- Save and close the file, then source it to apply the changes: source ~/.bashrc, source ~/.bash_profile, or source ~/.zshrc.
- Double-check that the environment variables are set correctly: echo $HF_HOME.
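The same check can be done from Python before launching training or inference. The sketch below is not part of the repository; check_hf_env is a hypothetical helper that only verifies that the two variables are visible to the Python process.

```python
import os
from typing import Mapping, Optional


def check_hf_env(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return a list of problems with the HF_HOME and HF_TOKEN variables.

    Hypothetical helper: pass a mapping to test it, or nothing to inspect
    the environment of the current process.
    """
    env = os.environ if env is None else env
    problems = []
    for name in ("HF_HOME", "HF_TOKEN"):
        # An unset or empty variable is reported as a problem.
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_hf_env():
        print(problem)
```

An empty list means both variables reached the process. If a variable shows up in echo but not here, it was probably assigned in the shell without export.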

Project Structure

.
├── README.md
├── assets
│   ├── img
│   └── styles
├── data
├── eval
│   ├── automatic
│   ├── human
│   └── inference
├── ft
│   ├── README.md
│   ├── deepspeed_zero3.yaml
│   ├── i-orpo
│   ├── lib
│   │   ├── alignment_mod
│   │   └── trl_mod
│   └── sft
├── index.html
├── prompts
├── requirements.txt
└── utils.py

data contains b3.py, which can be used to reconstruct the gold responses of the Books3 subset.
eval contains:
- automatic, which includes code to compute the ranking accuracy metric.
- human, which includes the XML code for the human evaluation interfaces.
- inference, which includes code to run inference with the fine-tuned models using either Huggingface Transformers or vLLM.
ft contains code to fine-tune the models using I-ORPO or SFT:
- i-orpo includes orpo.yaml, which defines the training hyperparameters; run_orpo.py, which contains the training code; and run_orpo.sh, which wraps the training process in a single executable command.
- sft includes sft.yaml, which defines the training hyperparameters; run_sft.py, which contains the training code; and run_sft.sh, which wraps the training process in a single executable command.
- deepspeed_zero3.yaml contains the DeepSpeed ZeRO-3 configuration.
prompts contains all prompts used in the paper.

Dataset

The dataset is available on Huggingface: https://huggingface.co/datasets/chtmp223/suri/.
Due to copyright concerns, we do not release the gold responses sampled from the Books3 subset. For users with local access to the Books3 dataset, we include a script (data/b3.py) to reconstruct this portion of the dataset:
- First, set the DATA_DIR variable to the path of the Books3 dataset on your local machine.
- Next, modify the code to either save the reconstructed dataset to a CSV file or push it to a new Huggingface repository.
- Finally, run the script from the data directory: python b3.py.
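For the CSV option in the second step, one possible shape is sketched below using only the standard library. It is not taken from b3.py: save_rows_to_csv is a hypothetical helper, and the column names in the usage note are placeholders, so adapt both to whatever the script actually produces.

```python
import csv
from typing import Iterable, Mapping


def save_rows_to_csv(
    rows: Iterable[Mapping[str, str]], path: str, fieldnames: list[str]
) -> int:
    """Write rows to a CSV file at path and return the number of rows written.

    Hypothetical helper: each row is a mapping whose keys match fieldnames.
    """
    count = 0
    # newline="" lets the csv module control line endings, so multi-line
    # responses are quoted and round-trip correctly.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
            count += 1
    return count
```

A call might look like save_rows_to_csv(rows, "b3_gold.csv", ["id", "response"]), where both the file name and the column names are illustrative. Long responses containing commas or line breaks are quoted automatically by the csv module.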

I-ORPO & SFT implementation

Suri-I-ORPO is available on Huggingface: https://huggingface.co/chtmp223/suri-i-orpo. Suri-SFT is also available on Huggingface: https://huggingface.co/chtmp223/suri-sft.
The training code is in the ft/ directory; see the README.md file in that folder for more information.
We recommend running inference with the Huggingface Transformers library. See the model card and the eval/ folder for more details on inference.

📜 Citation

@misc{pham2024surimulticonstraintinstructionfollowing,
      title={Suri: Multi-constraint Instruction Following for Long-form Text Generation}, 
      author={Chau Minh Pham and Simeng Sun and Mohit Iyyer},
      year={2024},
      eprint={2406.19371},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.19371}, 
}
