Structured Multi-task Learning for Molecular Property Prediction, AISTATS'22 (https://proceedings.mlr.press/v151/liu22e.html)

Home Page: https://chao1224.github.io/SGNN-EBM

License: MIT License

Structured Multi-task Learning for Molecular Property Prediction

AISTATS 2022

Authors: Shengchao Liu, Meng Qu, Zuobai Zhang, Huiyu Cai, Jian Tang

[Project Page] [Paper] [ArXiv] [Code] [NeurIPS AI4Science Workshop 2021]

This repository provides the source code for the AISTATS'22 paper Structured Multi-task Learning for Molecular Property Prediction, with the following contributions:

  1. To the best of our knowledge, we are the first to propose a new research problem: multi-task learning with an explicit task relation graph;
  2. We construct a domain-specific multi-task dataset with a task relation graph for drug discovery;
  3. We propose the state graph neural network-energy based model (SGNN-EBM) for task-structured modeling in both the latent and the output space.
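
As a rough illustration of what structured modeling in the output space can mean, the sketch below defines a generic pairwise energy over joint binary task labels and enumerates all assignments by brute force. It is a toy under stated assumptions, not the SGNN-EBM parameterization from the paper: the `energy` and `joint_distribution` functions, the node scores, and the edge weights are all hypothetical.

```python
from itertools import product
import math


def energy(labels, node_scores, edge_weights):
    """Energy of a joint binary label assignment; lower means more compatible."""
    # Node term: reward label 1 when the task score is positive, label 0 when negative.
    e = -sum(s if y == 1 else -s for y, s in zip(labels, node_scores))
    # Edge term: a positive weight rewards two related tasks for agreeing.
    for (i, j), w in edge_weights.items():
        e -= w if labels[i] == labels[j] else -w
    return e


def joint_distribution(node_scores, edge_weights):
    """Brute-force p(y) proportional to exp(-E(y)) over all 2^T assignments."""
    assignments = list(product([0, 1], repeat=len(node_scores)))
    weights = [math.exp(-energy(y, node_scores, edge_weights)) for y in assignments]
    z = sum(weights)  # partition function
    return {y: w / z for y, w in zip(assignments, weights)}


# Hypothetical toy values: task 0 has strong positive evidence, task 1 has none,
# task 2 has weak negative evidence; tasks 0 and 1 are related in the task graph.
node_scores = [2.0, 0.0, -1.0]
edge_weights = {(0, 1): 1.5}
dist = joint_distribution(node_scores, edge_weights)
best = max(dist, key=dist.get)
print(best)
```

Here the most likely joint assignment is `(1, 1, 0)`: task 1 has no evidence of its own, but its edge to task 0 pulls it toward the same label. Enumeration is exponential in the number of tasks, so this only works for tiny examples.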

In the future, we will merge it into the TorchDrug package.

Baselines

For implementation, this repository also provides the following multi-task learning baselines:

Environments

The environment below is built with PyTorch Geometric.

conda create -n SGNN_EBM python=3.7
conda activate SGNN_EBM
conda install -y -c pytorch pytorch=1.6.0 torchvision
conda install -y matplotlib
conda install -y scikit-learn
conda install -y -c rdkit rdkit=2019.03.1.0
conda install -y -c anaconda beautifulsoup4
conda install -y -c anaconda lxml

wget https://data.pyg.org/whl/torch-1.6.0%2Bcu102/torch_sparse-0.6.9-cp37-cp37m-linux_x86_64.whl
wget https://data.pyg.org/whl/torch-1.6.0%2Bcu102/torch_scatter-2.0.6-cp37-cp37m-linux_x86_64.whl
wget https://data.pyg.org/whl/torch-1.6.0%2Bcu102/torch_spline_conv-1.2.1-cp37-cp37m-linux_x86_64.whl
wget https://data.pyg.org/whl/torch-1.6.0%2Bcu102/torch_cluster-1.5.9-cp37-cp37m-linux_x86_64.whl
pip install torch_sparse-0.6.9-cp37-cp37m-linux_x86_64.whl
pip install torch_scatter-2.0.6-cp37-cp37m-linux_x86_64.whl
pip install torch_spline_conv-1.2.1-cp37-cp37m-linux_x86_64.whl
pip install torch_cluster-1.5.9-cp37-cp37m-linux_x86_64.whl
pip install torch-geometric==1.6.*
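
After installation, a quick import check can confirm that the packages are visible inside the activated environment. The snippet below is a minimal sketch that is not part of the repository; the `REQUIRED` list uses the usual top-level module names for the packages installed above, which is an assumption you may need to adjust.

```python
import importlib

# Assumed top-level module names for the packages installed above.
REQUIRED = [
    "torch", "torch_geometric", "torch_scatter", "torch_sparse",
    "torch_cluster", "torch_spline_conv", "rdkit", "sklearn",
]


def find_missing(modules):
    """Return the names in `modules` that fail to import."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except Exception:
            # Covers a missing package as well as a binary or CUDA mismatch at import time.
            missing.append(name)
    return missing
```

Calling `find_missing(REQUIRED)` in the `SGNN_EBM` environment should return an empty list if every package imports cleanly.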

Dataset

In this work, we propose a novel dataset with explicit task relations. It is a molecular property prediction dataset in which each task is a binary classification problem on a ChEMBL assay. Each task measures certain biological effects of molecules, e.g., toxicity, or inhibition or activation of proteins or whole cellular processes. We focus on tasks that target proteins, and extract the task relations by aggregating the corresponding protein-protein interaction (PPI) data, e.g., from the STRING database.
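
The aggregation from protein-level interactions to task-level relations could look roughly like the sketch below. It is only an illustration: the `build_task_graph` function, the data layout, and the max-score aggregation rule are assumptions, and the actual pre-processing in this repository may differ.

```python
from itertools import combinations


def build_task_graph(task_proteins, ppi_scores, threshold=0.0):
    """Aggregate protein-level interaction scores into task-level edges.

    task_proteins: dict mapping task id -> set of protein ids the task targets.
    ppi_scores: dict mapping frozenset({protein_a, protein_b}) -> interaction score.
    Returns a dict mapping (task_a, task_b) -> aggregated score, with task_a < task_b.
    """
    edges = {}
    for a, b in combinations(sorted(task_proteins), 2):
        # Collect the scores of every known interaction between the two protein sets.
        scores = [
            ppi_scores[frozenset((p, q))]
            for p in task_proteins[a]
            for q in task_proteins[b]
            if frozenset((p, q)) in ppi_scores
        ]
        # Assumed aggregation rule: keep the strongest interaction above the threshold.
        if scores and max(scores) > threshold:
            edges[(a, b)] = max(scores)
    return edges


# Toy data invented for illustration, not taken from ChEMBL or STRING.
task_proteins = {"assay_1": {"P1"}, "assay_2": {"P2", "P3"}, "assay_3": {"P4"}}
ppi_scores = {frozenset(("P1", "P2")): 0.9, frozenset(("P1", "P3")): 0.4}
task_graph = build_task_graph(task_proteins, ppi_scores)
print(task_graph)
```

In this toy example only `assay_1` and `assay_2` are connected, because their target proteins interact; `assay_3` stays isolated. Tasks that share the same protein are not linked by this sketch unless a self-interaction entry exists.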

For the detailed pre-processing steps, please see these instructions. The pre-processed datasets can be downloaded here.

Structured Multi-Task Learning: SGNN-EBM

Evaluation on the pre-trained models

We also provide the pre-trained model weights and the corresponding evaluation scripts. First, download the checkpoints here. All the optimal hyper-parameters are provided in the bash scripts.

cd checkpoint

bash eval_SGNN.sh > eval_SGNN.out
bash eval_SGNN_EBM.sh > eval_SGNN_EBM.out

Training from scratch

Here we provide the scripts for training SGNN and SGNN-EBM (the latter is adapted from a pre-trained SGNN). Note that a pre-trained SGNN model is required for SGNN-EBM, obtained either by running the SGNN script or from the pre-trained weights.

bash submit_SGNN.sh
bash submit_SGNN_EBM.sh

Baselines

We also provide the scripts for all six STL and MTL baselines under the scripts folder.

Cite Us

Feel free to cite this work if you find it useful!

@inproceedings{liu2022multi,
    title={Structured Multi-task Learning for Molecular Property Prediction},
    author={Liu, Shengchao and Qu, Meng and Zhang, Zuobai and Cai, Huiyu and Tang, Jian},
    booktitle={AISTATS},
    year={2022}
}

sgnn-ebm's Issues

Help for checkpoint

Hello, there are only posters and slides at the given link. Where can I download the pre-trained checkpoint?

Application for regression tasks

Hello! Is there a way to train a model for multi-task regression? And will it be desirable to do so with SGNN-EBM?

I'd like to train a property predictor that predicts two or so properties for molecules, where those properties are presumably highly correlated, considering their similar physics.

I'd be grateful to have some advice about the two questions.