LGAR: Zero-Shot LLM-Guided Neural Ranking for Abstract Screening in Systematic Literature Reviews

This repository contains the code used for the paper [TODO: link].

Table of Contents

  • Requirements
  • Required Data
  • First Stage of Ranking (LLM Ranker)
  • Re-Ranking results of LLMs
  • Evaluation
  • Dense Ranker Only

Requirements

Install the requirements listed in requirements.txt, e.g., in a conda environment or venv with Python 3.12.

Required Data

The directory structure of the data folder should look like this:

data
├── synergy
│   ├── info.json
│   └── csv files of dataset
└── tar2019
    ├── dta
    │   ├── info.json
    │   └── csv files of dataset
    ├── intervention
    │   ├── info.json
    │   └── csv files of dataset
    ├── prognosis
    │   ├── info.json
    │   └── csv files of dataset
    └── qualitative
        ├── info.json
        └── csv files of dataset
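As a quick sanity check, the following minimal Python sketch (not part of the repository; the data root path is an assumption) verifies that every expected subfolder contains an info.json and at least one csv file:

    # Minimal sketch (not part of the repo): verify the data layout shown above.
    from pathlib import Path

    DATA_ROOT = Path("data")  # assumption: adjust to where your data folder lives
    EXPECTED_FOLDERS = [
        "synergy",
        "tar2019/dta",
        "tar2019/intervention",
        "tar2019/prognosis",
        "tar2019/qualitative",
    ]

    for sub in EXPECTED_FOLDERS:
        folder = DATA_ROOT / sub
        has_info = (folder / "info.json").exists()
        has_csvs = any(folder.glob("*.csv"))
        if not (has_info and has_csvs):
            print(f"Incomplete dataset folder: {folder}")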

First Stage of Ranking (LLM Ranker)

The scripts to reproduce the results of this paper are located in ./implementation/src/scripts/:

  • Experiment with different scales: run_scale_experiment.py
  • Experiment with different prompts: run_prompt_experiment.py
  • Experiment with different LLMs: run_llm_experiment.py
  • Experiment with LLM (title only): run_llm_ti_experiment.py

The scripts can be started via a batch file (e.g., when using SLURM on a cluster). All scripts except the one for the LLM with title only expect to receive the index of the array job, since each script runs only the part of the code corresponding to the given index. In all files, you need to set the path to the directory where the model is stored, or the Hugging Face tag of the model (see the sketch below).
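As an illustration only (the actual argument handling and variable names live in the scripts themselves), an experiment script might look roughly like this: it receives the SLURM array index on the command line and a model location configured in the file.

    # Illustrative sketch only, hypothetical names: each experiment script gets the
    # SLURM array index as a command-line argument and needs a model location.
    import sys

    MODEL_PATH_OR_TAG = "/path/to/model"  # local checkpoint directory or a Hugging Face tag

    def main(array_index: int) -> None:
        # Each index runs only its slice of the experiment (see the per-script mapping).
        print(f"Running slice {array_index} with model {MODEL_PATH_OR_TAG}")

    if __name__ == "__main__":
        main(int(sys.argv[1]))  # e.g., pass $SLURM_ARRAY_TASK_ID from the batch file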

Example usage for run_prompt_experiment.py:

  • generate_examples=True: The script creates few-shot examples for zero-shot (index=0) and CoT prompting (index=1) for all SLRs of the current dataset (the current dataset is selected by setting the respective paths in the config.json file) at the specified location (config.folder_path_few_shot_examples).
  • generate_examples=False: Runs the experiments for the different prompting techniques. The provided index selects the technique (index = 0: 2-shot, index = 1: CoT, index = 2: CoT (n=3), index = 3: 2-shot CoT, index = 4: 2-shot CoT (n=3)); this mapping is sketched after this list.
    • The script saves the results to config.llm_client_output_directory_path with the following folder structure:
    ├── Prompting Technique 1
    │   ├── Dataset 1
    │   │   ├── Results of SLR 1
    │   │   │   └── log_file_0.json
    │   │   ├── Results of SLR 2
    │   │   │   └── log_file_0.json
    │   │   ...
    │   └── Dataset 2
    │   ...
    ├── Prompting Technique 2
    ...
    
    • The results of the LLM ranker are stored in the folder of the respective SLR, in a JSON log file with an index corresponding to the run (i.e., for runs without self-consistency: log_file_0.json).
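For reference, the index-to-technique mapping described above can be written out as follows (an illustration; the authoritative mapping is implemented in run_prompt_experiment.py):

    # Illustration of the array-index -> prompting-technique mapping described above
    # (the authoritative mapping is implemented in run_prompt_experiment.py).
    PROMPTING_TECHNIQUES = {
        0: "2-shot",
        1: "CoT",
        2: "CoT (n=3)",
        3: "2-shot CoT",
        4: "2-shot CoT (n=3)",
    }

    index = 1  # e.g., the SLURM array index
    print(PROMPTING_TECHNIQUES[index])  # -> "CoT"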

Re-Ranking results of LLMs

This section describes how to re-rank the papers with a secondary (dense) ranker after the first stage of our ranking pipeline has been completed. There are two options for performing this step: using the script evaluate_experiments_single.py or using evaluate_experiments.py. Both scripts expect config.folder_path_slrs and config.file_path_slr_infos to be set correctly to the dataset that should be evaluated.

Option 1: evaluate_experiments_single.py

  • Evaluates all experiments located in the subfolders of the experiment folder that is to be evaluated.
  • The experiment can be specified by setting config.llm_client_output_directory_path to the desired folder.
  • The specified folder is expected to be one level above the folders of the individual SLRs of a dataset, e.g., "./implementation/data/paper/prompts/synergy/CoT/"
  • To avoid manually adapting the experiment path for every experiment in an SLR dataset folder, you can instead provide the path to the parent folder, e.g., "./implementation/data/paper/prompts/synergy/", and uncomment the respective lines, which split up the folders by the index provided to the main function (see the sketch below).
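As an illustration only (hypothetical names, not the repository's code), selecting an experiment subfolder of the parent dataset folder by index amounts to something like this:

    # Illustrative only: select one experiment subfolder of a dataset folder by index.
    from pathlib import Path

    parent = Path("./implementation/data/paper/prompts/synergy/")
    experiment_folders = sorted(p for p in parent.iterdir() if p.is_dir())

    index = 0  # e.g., the index provided to the main function
    print(experiment_folders[index])  # the experiment folder to evaluate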

Option 2: evaluate_experiments.py

  • Expects that config.llm_client_output_directory_path is at the level of a dataset folder, e.g., "./implementation/data/paper/prompts/synergy/"
  • To use this script for re-ranking the papers, set is_lm_only=False and rerank=True.
  • The script re-ranks (sequentially) the papers for all SLRs of all configurations stored in config.llm_client_output_directory_path.

If you want to use a different re-ranker, you need to change config.llm_client_config.path_to_reranker to a different dense ranker (if none is provided, BM25 is used as a fallback). To distinguish between different re-ranking queries, set config.llm_client_config.system_message_type accordingly: if you specify system_message_basic, only the title of the respective SLR is used as the query; if you specify system_message_rq, the title and research questions are used as the query (a sketch of the two query variants follows below).
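As an illustration only (build_query is a hypothetical helper, not the repository's actual code), the two query variants amount to the following:

    # Illustrative only: how the re-ranking query differs by system_message_type.
    # build_query is a hypothetical helper, not the repository's actual code.
    def build_query(system_message_type: str, title: str, research_questions: list[str]) -> str:
        if system_message_type == "system_message_basic":
            return title  # query = SLR title only
        if system_message_type == "system_message_rq":
            return title + " " + " ".join(research_questions)  # title + research questions
        raise ValueError(f"Unknown system_message_type: {system_message_type}")

    print(build_query("system_message_rq", "Example SLR title", ["RQ1?", "RQ2?"]))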

Evaluation

This section describes how the experiments can be evaluated using the script evaluate_experiments.py.

  • To generate only the result csv file and not re-rank the documents again, set rerank=False; the script then directly accesses the already ranked list stored in ranked_df.json for each SLR.
  • Depending on which experiment you want to evaluate, you can provide a function for name_label that renames the experiment abbreviations accordingly (either create_run_label_exp1 (scale experiment), create_run_label_exp2 (prompt experiment), or create_run_label_exp3 (all other experiments)).
  • For the scale experiments, sort=True ensures an ascending order of the scales; for other experiments it should be set to False.
  • To evaluate only the performance of a dense ranker (without LLM), set is_lm_only=True.
  • As described above, the script expects config.folder_path_slrs and config.file_path_slr_infos to be set correctly to the dataset you want to evaluate. Furthermore, you need to specify which results should be evaluated by setting config.llm_client_output_directory_path to the respective results folder (an illustrative combination of these settings is sketched after this list).
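As an illustration only (the actual entry point and parameter handling are defined in evaluate_experiments.py), the settings described above combine roughly as follows for evaluating a prompt experiment:

    # Illustrative settings only; the actual parameters are set in evaluate_experiments.py.
    evaluation_settings = {
        "rerank": False,                        # reuse the stored ranked_df.json rankings
        "is_lm_only": False,                    # evaluate the LLM + re-ranker pipeline
        "name_label": "create_run_label_exp2",  # label function for the prompt experiment
        "sort": False,                          # True only for the scale experiments
    }
    print(evaluation_settings)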

Dense Ranker Only

  • To evaluate only the performance of a dense ranker, you can use the script /src/scripts/run_lm_experiment.py
  • The model paths need to be set according to the models' location.
  • If the script is provided with an index which is not equal to 1, the query for the ranker is the title of the SLR; otherwise the query is the title and research questions of the SLR.
  • The evaluation of the dense ranker can be performed as described in the previous section, but it is necessary to set is_lm_only=True
