Author: Talip Ucar (ucabtuc@gmail.com)
The official implementation of Improving Antibody Humanness Prediction using Patent Data
- Model
- Environment
- Configuration
- Training and Evaluation
- Structure of the repo
- Results
- Experiment tracking
- Citing the paper
- Citing this repo
| Pre-training | Fine-tuning |
|---|---|
| ![]() | ![]() |
We used Python 3.7 for our experiments. The environment can be set up in three steps:
pip install pipenv # To install pipenv if you don't have it already
pipenv install --skip-lock # To install required packages.
pipenv shell # To activate virtual env
If the second step fails, you can install the packages listed in the Pipfile individually with pip, e.g. "pip install package_name".
There are two types of configuration files:
1. pad.yaml # Defines parameters and options for pre-training
2. humanness.yaml # Defines parameters and options for fine-tuning
You can train and evaluate the model by using:
python selfpad_pretrain.py # For pre-training
python selfpad_finetune.py # For fine-tuning it for humanness
python selfpad_eval.py -ev test # To compute humanness scores for a custom dataset, in this case test.csv. The CSV file should have "VH", "VL" and/or "Label" columns
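Before running the evaluation script on a custom file, it can help to confirm that the CSV header has the expected columns. The following is a minimal sketch using only the Python standard library. The helper `missing_columns` is hypothetical and not part of this repo, the sequences are placeholder fragments rather than real data, and the sketch assumes that "VH" and "VL" are required while "Label" is optional, which is only one reading of the column requirement.

```python
import csv
import io

# Assumption: "VH" and "VL" are treated as required; "Label" is treated as optional.
REQUIRED_COLUMNS = ("VH", "VL")


def missing_columns(csv_text):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


# Placeholder rows; the sequences are illustrative fragments, not real data.
example_csv = "VH,VL,Label\nEVQLVESGG,DIQMTQSPS,1\nQVQLQQSGA,EIVLTQSPA,0\n"

print(missing_columns(example_csv))            # required columns missing from a complete file
print(missing_columns("VH,Label\nEVQ,1\n"))    # required columns missing when VL is absent
```

To check a file on disk, the same helper can be given the file contents, for example the text read from data/test.csv.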
- selfpad_pretrain.py
- selfpad_finetune.py
- selfpad_eval.py
- src
|-selfpad.py
|-selfpad_humanness.py
- config
|-pad.yaml
|-humanness.yaml
- utils_common
|-arguments.py
|-utils.py
|-tokenizer.py
...
- utils_pretrain
|-load_data.py
|-model_utils.py
|-loss_functions.py
...
- utils_finetune
|-load_data.py
|-model_utils.py
|-loss_functions.py
...
- data
|-test.csv
...
- results
|-pretraining
|-humanness
...
Results at the end of training are saved under the ./results directory. The results directory is structured as follows:
- results
  |-task e.g. humanness, or pretraining
    |-evaluation
      |-clusters (for plotting t-SNE and PCA plots of embeddings)
    |-training
      |-model
      |-plots
      |-loss
You can save evaluation results under the "evaluation" folder.
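If you want to prepare this layout ahead of time, for example before writing your own evaluation outputs, the sketch below creates it with pathlib. It is an illustration under assumptions rather than the repo's own code: the helper `make_results_dirs` is hypothetical, and the nesting (clusters under evaluation; model, plots and loss under training) is inferred from the listing above.

```python
import tempfile
from pathlib import Path

# Assumed nesting: results/<task>/evaluation/clusters and
# results/<task>/training/{model, plots, loss}.
SUBDIRS = (
    "evaluation/clusters",
    "training/model",
    "training/plots",
    "training/loss",
)


def make_results_dirs(root, task):
    """Create the results tree for one task and return the created paths."""
    created = []
    for sub in SUBDIRS:
        path = Path(root) / "results" / task / sub
        path.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
        created.append(path)
    return created


# Build the tree in a temporary directory so nothing is written to the repo.
with tempfile.TemporaryDirectory() as tmp:
    for p in make_results_dirs(tmp, "humanness"):
        print(p.relative_to(tmp))
```

Passing "pretraining" instead of "humanness" as the task name produces the same layout for the pre-training task.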
You can turn on Weights & Biases (W&B) logging in the config file.
@article{ucar2024SelfPAD,
title={Improving Antibody Humanness Prediction using Patent Data},
author={Ucar, Talip and
Ramon, Aubin and
Oglic, Dino and
Croasdale-Wood, Rebecca and
Diethe, Tom and
Sormanni, Pietro},
journal={arXiv preprint arXiv:2110.04361},
year={2024}
}
If you use the SelfPAD framework in your own studies or work, please cite it as follows:
@Misc{talip_ucar_2024_SelfPAD,
author = {Talip Ucar},
title = {{Improving Antibody Humanness Prediction using Patent Data}},
howpublished = {\url{https://github.com/AstraZeneca/SelfPAD}},
month = January,
year = {since 2024}
}

