This repo contains the code for the paper Modular Visual Question Answering via Code Generation, published at ACL 2023.
Follow these steps to set up an environment for running this code. First, create a fresh conda environment based on Python 3.8. Then:
- Run
  ```
  pip install -e .
  ```
  inside this repo.
- Clone the Grounding DINO repo (https://github.com/IDEA-Research/GroundingDINO) and run
  ```
  python -m pip install -e GroundingDINO
  ```
  inside it to install it.
- Install the remaining Python dependencies:
  ```
  pip install transformers==4.25 openai sentence-transformers
  ```
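Equivalently, the setup can be run end to end as below. This is only a sketch: it assumes conda is available, and the environment name `codevqa` is arbitrary.

```bash
# Consolidated setup sketch; the environment name "codevqa" is arbitrary.
conda create -n codevqa python=3.8 -y
conda activate codevqa
pip install -e .                          # install this repo (run from the repo root)
git clone https://github.com/IDEA-Research/GroundingDINO
python -m pip install -e GroundingDINO    # install Grounding DINO
pip install transformers==4.25 openai sentence-transformers
```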
Though the annotations for all 5 datasets used in our paper's evaluations are available online, we have collected these annotations (and, where applicable, the dataset samples used in our evaluations) in a single zip file for convenience: https://drive.google.com/file/d/1FrGEpgcGi9SjLPbQ-bGLlGZrdOAqA79j/view?usp=sharing
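If you prefer to fetch the zip from the command line, a minimal sketch follows; the use of `gdown` and the chosen file/directory names are assumptions (downloading through the browser works just as well).

```bash
# Minimal sketch for fetching and unpacking the annotations zip.
# gdown and the output/extraction paths are assumptions, not part of this repo.
pip install gdown
gdown --fuzzy "https://drive.google.com/file/d/1FrGEpgcGi9SjLPbQ-bGLlGZrdOAqA79j/view?usp=sharing" -O annotations.zip
unzip annotations.zip -d data/
```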
Run these scripts to reproduce the results of CodeVQA and Few-shot PnP-VQA on the GQA, COVR, and NLVR2 test sets, as well as on VQAv2 and OK-VQA:
```
bash run_scripts/pnp-vqa/eval/gqa_eval_gpt3.sh
bash run_scripts/pnp-vqa/eval/covr_eval_gpt3.sh
bash run_scripts/pnp-vqa/eval/nlvr2_eval_gpt3.sh
bash run_scripts/pnp-vqa/eval/vqav2_eval_gpt3.sh
bash run_scripts/pnp-vqa/eval/okvqa_eval_gpt3.sh
```
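Since these scripts call the OpenAI API, an API key needs to be available to the process; how the key is read is not specified here, so treat the export below as an assumption. Each run script is roughly a thin wrapper around the LAVIS evaluation entry point; the following is only a sketch, and the exact flags and paths in the actual scripts may differ.

```bash
# Rough sketch of what a run script does; the actual scripts under
# run_scripts/pnp-vqa/eval/ may set different flags or extra variables.
export OPENAI_API_KEY="sk-..."   # assumption: the openai client reads the key from the environment
python -m torch.distributed.run --nproc_per_node=1 evaluate.py \
    --cfg-path lavis/projects/pnp-vqa/eval/gqa_eval_gpt3_codevqa.yaml
```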
The config files are stored at `lavis/projects/pnp-vqa/eval/{gqa/covr/nlvr2/vqav2/okvqa}_eval_gpt3{_codevqa}.yaml`. We provide a few commented-out options for (1) evaluating on the validation set (or a sample thereof) instead of the test set, (2) retrieving in-context examples randomly instead of via question embeddings, and (3) using the `find_object` primitive for counting objects (for this option, which is provided in the COVR and NLVR2 configs, use both the commented-out `programs_path` and the commented-out `grounding_dino_path`).
Note: The preambles (API documentation) in the prompts for VQAv2 and OK-VQA may be suboptimal due to either misspecified functions (OK-VQA) or lack of function descriptions (VQAv2). The in-context examples, however, are valid.
This repo is based on the original LAVIS repo: https://github.com/salesforce/LAVIS .
If you find our paper or this repository useful in your work, please cite our paper:
```
@inproceedings{subramanian-etal-2023-modular,
    title = "Modular Visual Question Answering via Code Generation",
    author = "Subramanian, Sanjay and
      Narasimhan, Medhini and
      Khangaonkar, Kushal and
      Yang, Kevin and
      Nagrani, Arsha and
      Schmid, Cordelia and
      Zeng, Andy and
      Darrell, Trevor and
      Klein, Dan",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics"
}
```