
C-VQA: Counterfactual Reasoning VQA Dataset

This repository contains the code and data for the paper What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models.

Dataset

The dataset directory is C-VQA. You can find the questions in .csv files.

Download Images

After cloning:

pip install gdown
bash download_images.sh
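
The download script is not reproduced here; for reference, a minimal Python sketch of the same step, assuming gdown's Python API and a placeholder Google Drive file ID (the real ID lives in download_images.sh):

import gdown
import zipfile

DRIVE_FILE_ID = "DRIVE_FILE_ID"   # placeholder: copy the real ID from download_images.sh
ARCHIVE = "cvqa_images.zip"       # placeholder archive name

# Fetch the archive from Google Drive and unpack it.
gdown.download(id=DRIVE_FILE_ID, output=ARCHIVE, quiet=False)
with zipfile.ZipFile(ARCHIVE) as zf:
    zf.extractall("images")       # this directory becomes PATH_TO_IMAGES below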

Scripts

The scripts directory contains all the scripts required to run the models evaluated in the paper.

Before running a script, install the corresponding model and download its weights, then place the script in the model's root directory.

Change PATH_TO_IMAGES in the scripts to the actual image directory.

For the ViperGPT scripts with different code generators, also change PATH_TO_MODEL to the actual model directory.

For example, to run BLIP-2 on C-VQA, run this command in the root directory of LAVIS:

python run_eval_lavis.py --model-name blip2_t5 --model-type pretrain_flant5xxl --query PATH_TO_CSV_FILE

You can find more commands in scripts/README.

After you get the results, run format_response.py to convert the raw responses into formatted responses (a single number, or a single yes or no). Then run calc_acc.py to compute quantitative results from the formatted responses. Remember to fill in the file names in these two scripts.
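
As a rough illustration of what these two steps do (the actual scripts may differ), here is a minimal sketch assuming hypothetical file names and CSV columns named response and answer:

import csv
import re

def format_response(raw):
    # Reduce a raw model response to a single yes/no if present,
    # otherwise to the first number it contains.
    text = raw.strip().lower()
    m = re.search(r"\b(yes|no)\b", text)
    if m:
        return m.group(1)
    m = re.search(r"-?\d+", text)
    return m.group(0) if m else ""

with open("raw_responses.csv") as f:             # hypothetical file name
    rows = list(csv.DictReader(f))

formatted = [format_response(r["response"]) for r in rows]
correct = sum(p == r["answer"].strip().lower()   # hypothetical column name
              for p, r in zip(formatted, rows))
print(f"accuracy: {correct / len(rows):.4f}")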

Download Code Generator Models

Replace YOUR_HUGGINGFACE_TOKEN in download_model.py with your Hugging Face token, then run:

pip install huggingface_hub
python download_model.py

You can add more code generators in download_model.py by appending entries to repo_ids and local_dirs.
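
For reference, a minimal sketch of how such a download loop could look with huggingface_hub; the example repo ID and local directory are illustrative, not the actual entries in download_model.py:

from huggingface_hub import snapshot_download

token = "YOUR_HUGGINGFACE_TOKEN"                 # as in the README above

repo_ids = ["codellama/CodeLlama-7b-hf"]         # example entry; add more code generators here
local_dirs = ["./models/codellama-7b"]           # matching local target directories

# Download each code generator checkpoint to its local directory.
for repo_id, local_dir in zip(repo_ids, local_dirs):
    snapshot_download(repo_id=repo_id, local_dir=local_dir, token=token)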

Citation

If this code is useful for your research, please consider citing our work.

@InProceedings{zhang2023cvqa,
    author    = {Zhang, Letian and Zhai, Xiaotong and Zhao, Zhongkai and Wen, Xin and Zhao, Bingchen},
    title     = {What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-Modal Language Models},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    year      = {2023}
}
