Deep Cross-Modal Projection Learning for Image-Text Matching

This is a PyTorch implementation of the paper Deep Cross-Modal Projection Learning for Image-Text Matching.
The official implementation in TensorFlow can be found here.

Requirements

Download the pre-computed/pre-extracted data from Google Drive and move it to the data/processed folder. Alternatively, use dataset/preprocess.py to prepare your own data.
[Optional] Download the pre-trained model weights from Google Drive and move them to the pretrained_models folder.
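
Before launching any script, it can help to confirm that the downloaded files ended up where the steps above say they should. The following is a minimal sketch of such a check. The folder names data/processed and pretrained_models come from the steps above; the helper check_layout and its messages are hypothetical and not part of this repository.

```python
from pathlib import Path

# Folder names are taken from the setup steps; the check itself is a
# hypothetical helper, not something this repository provides.
EXPECTED_DIRS = {
    "data/processed": True,      # required: pre-extracted data
    "pretrained_models": False,  # optional: pre-trained weights
}


def check_layout(root="."):
    """Return a list of problems; an empty list means the layout looks fine."""
    problems = []
    for rel, required in EXPECTED_DIRS.items():
        folder = Path(root) / rel
        # A folder counts as ready only if it exists and holds at least one entry.
        has_files = folder.is_dir() and any(folder.iterdir())
        if required and not has_files:
            problems.append(f"missing or empty: {rel}")
    return problems
```

Calling check_layout() from the repository root returns an empty list once data/processed contains files. The optional pretrained_models folder is never reported, since training from scratch does not need it.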

First, change the model_path parameter to your current directory.

sh scripts/run.sh

The script above runs training and testing in one go, so you do not need to perform them separately.
To run training on its own:

sh scripts/train.sh

To run testing on its own:

sh scripts/test.sh
