Task Hardness Estimation for Molecular Activity Prediction
THEMAP can be installed with pip. First, create a new conda environment with the required packages. Then clone this repository and install it using pip.
conda env create -f environment.yml
conda activate themap
git clone https://github.com/HFooladi/THEMAP.git
cd THEMAP
pip install -e .
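After the editable install, a quick check that the package is visible to the active interpreter can save debugging time later. The sketch below is not part of THEMAP; it assumes the install exposes a top-level module named themap (an assumption based on the environment and repository names), and the helper is_installed is a hypothetical name used only for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the editable install exposes a top-level module named "themap".
    name = "themap"
    status = "found" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```

If the module is reported as not found, the usual cause is that the themap conda environment was not activated before running pip.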
For the FS-Mol dataset, molecular embeddings for each assay (ChEMBL ID), along with chemical and protein distances, have been calculated and deposited on Zenodo.
- Download it from Zenodo
- Unzip the directory and place it into datasets such that you have the path datasets/fsmol_hardness
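The unzip step above can also be scripted. The following is a minimal sketch, assuming the downloaded archive is a zip file whose top level contains a fsmol_hardness directory; the function name extract_dataset and the archive file name in the usage note are hypothetical and should be adapted to the actual Zenodo download.

```python
import zipfile
from pathlib import Path


def extract_dataset(archive: Path, datasets_dir: Path = Path("datasets")) -> Path:
    """Unzip `archive` into `datasets_dir` and return the fsmol_hardness path."""
    datasets_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(datasets_dir)

    # Assumption: the archive unpacks to a top-level "fsmol_hardness" directory.
    target = datasets_dir / "fsmol_hardness"
    if not target.is_dir():
        raise FileNotFoundError(
            f"Expected {target} after extraction; check the archive layout."
        )
    return target
```

Run from the repository root, a call such as extract_dataset(Path("fsmol_hardness.zip")) (hypothetical file name) would return datasets/fsmol_hardness on success and raise an error if the archive layout differs from the assumed one.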
Then go to the notebooks folder and run the notebooks.
If you find the models useful in your research, we ask that you cite the following paper:
@article{fooladi2024qth,
author = {Fooladi, Hosein and Hirte, Steffen and Kirchmair, Johannes},
title={Quantifying the hardness of bioactivity prediction tasks for transfer learning},
year={2024},
doi={10.26434/chemrxiv-2024-871mt},
url={https://chemrxiv.org/engage/chemrxiv/article-details/65b3cafd9138d23161cc5ea4},
journal={ChemRxiv}
}