
masashitsubaki / cpi_prediction


This is code for compound-protein interaction (CPI) prediction based on a graph neural network (GNN) for compounds and a convolutional neural network (CNN) for proteins.

License: Apache License 2.0

Languages: Python 93.58%, Shell 6.42%

cpi_prediction's People

Contributors

masashitsubaki


cpi_prediction's Issues

AUPR

I ran your code on the "human" dataset from your paper and calculated the AUPR. The result is almost 1, which seems too high and a little odd to me. I am new to this area; could you explain why this happens?
Thank you. @masashitsubaki
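
For context on how such a number arises: AUPR is computed from the ranked prediction scores, not from thresholded labels. The sketch below is a minimal pure-Python version of average precision, one common estimator of AUPR. The function name `average_precision` and the toy scores are illustrative assumptions, not taken from the repository. It shows that a model whose scores rank every positive above every negative reaches exactly 1.0, so a value close to 1 means the held-out pairs are almost perfectly ranked.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k of the positives.

    Illustrative helper (not from the repository). Ties in `scores` are
    broken by input order, which a library implementation may treat differently.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank samples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos

# Toy data (made up): a perfect ranking vs. one positive ranked below a negative.
perfect = average_precision([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
imperfect = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
print(perfect, imperfect)
```

Whether a near-perfect score on this dataset reflects genuine generalization or an easy split is a separate question that this sketch does not answer.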

About the precision

I followed the author's advice to train on my own dataset, but I ran into the following problem during training:

Training...
Epoch Time(sec) Loss_train AUC_dev AUC_test Precision_test Recall_test
C:\Anaconda3\envs\my-rdkit-env\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
C:\Anaconda3\envs\my-rdkit-env\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
1 18.5711581 2328.3497102856636 0.4819866010391031 0.5351374101374101 0.0 0.0

How should I deal with this problem? Thanks!
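
The warning itself accounts for the zeros in the last two columns of the log: precision is TP / (TP + FP), and when the model predicts no positive samples at all the denominator is zero. The sketch below reproduces this in plain Python. `precision_recall` and its `zero_division` argument are hypothetical helpers that mirror the behavior named in the warning; they are not scikit-learn's implementation.

```python
def precision_recall(y_true, y_pred, zero_division=0.0):
    """Precision and recall for binary labels (1 = interaction, 0 = none).

    Hypothetical helper: `zero_division` is the value returned when a
    denominator is zero, e.g. when no sample is predicted positive.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) > 0 else zero_division
    recall = tp / (tp + fn) if (tp + fn) > 0 else zero_division
    return precision, recall

# Made-up labels: the model predicts "no interaction" for every sample.
y_true = [1, 0, 1, 0, 1]
all_negative = [0, 0, 0, 0, 0]
print(precision_recall(y_true, all_negative))
print("predicted positives:", sum(all_negative))
```

Counting the predicted positives, as in the last line, is a quick way to confirm that the model outputs a single class. Why it does so (too few epochs, class imbalance, or a preprocessing problem) cannot be read off this log.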

A question about the "tanh" in the "attention_cnn" module

Hi, I have a question about the "attention_cnn" module.
Attention weights are generally computed with "softmax" rather than "tanh". However, as the screenshot of your code shows, you use "tanh", so I am a little confused. Is there a reason for this choice? Thanks!
[image: screenshot of the code, not preserved]
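
To make the difference between the two functions concrete, the sketch below applies both to the same raw scores in plain Python. The scores and helper names are made up for illustration, and nothing here is taken from the repository's attention_cnn code. Softmax yields non-negative weights that sum to 1, so the positions compete with each other; tanh squashes each score independently into (-1, 1), so the weights need not sum to 1 and can be negative.

```python
import math

def softmax(scores):
    """Normalized weights: non-negative and summing to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def tanh_weights(scores):
    """Element-wise tanh: each weight lies in (-1, 1), independent of the others."""
    return [math.tanh(s) for s in scores]

raw_scores = [2.0, 0.5, -1.0]  # made-up attention scores
soft = softmax(raw_scores)
tanh_w = tanh_weights(raw_scores)
print(soft, sum(soft))
print(tanh_w, sum(tanh_w))
```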

How to make predictions?

Thanks for this excellent method for predicting protein-compound interactions. I have trained a model on my own data, and the training curve looks right. How can I now use this trained model to make predictions for other compounds?

About the prediction

Hi, it seems that your code cannot handle the "new drug" or "new target" setting: because an embedding vector is learned for every drug, an unseen drug cannot be predicted, and the same applies to an unseen target. Is this true?
Thank you :)

Negative construction script

Hi,

Could you please provide the script for the negative sample construction? The link to the negative sample construction provided by Liu et al. (2015) is no longer valid, so I am hoping you still have the original code.

Thank you very much!

How to use this model?

Excuse me, I have completed data preprocessing and training and generated the model file. How do I use the model now?

How to use this model?

Hello, excuse me, how do I use this model after training? For example, how do I predict the interaction of a single protein with a small number of compounds?

not enough values to unpack (expected 14, got 0)

Traceback (most recent call last):
  File "C:/Users/user/Downloads/15647NeonBand.RarZipExtractorPro_g3b9h1p9bdemw!App/CPI_prediction-master/CPI_prediction-master/code/newtest1.py", line 153, in <module>
    setting) = sys.argv[1:]
ValueError: not enough values to unpack (expected 14, got 0)
################
I am getting this error. How can I solve it? Could you please provide the original code?
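
The message means that the script unpacks `sys.argv[1:]` into 14 names but was started without any command-line arguments, for example from an IDE's Run button. A minimal sketch of a guard that fails with a readable message instead is shown below. `EXPECTED_ARGS` comes from the error message, while `parse_args` and the placeholder values are hypothetical and do not reflect the script's actual parameter names.

```python
EXPECTED_ARGS = 14  # taken from the error message: "expected 14, got 0"

def parse_args(argv):
    """Return the positional arguments, or exit with a readable message.

    Hypothetical helper: the script in the traceback unpacks sys.argv[1:] directly.
    """
    args = argv[1:]  # argv[0] is the script name
    if len(args) != EXPECTED_ARGS:
        raise SystemExit(
            f"expected {EXPECTED_ARGS} command-line arguments, got {len(args)}; "
            "run the script from a shell and pass all of them"
        )
    return args

# Simulated command line with placeholder values (not the real settings).
fake_argv = ["script.py"] + [f"value{i}" for i in range(EXPECTED_ARGS)]
print(parse_args(fake_argv))
```

In a real script the call would be `parse_args(sys.argv)` after `import sys`. The repository is listed as containing shell scripts, which is presumably where the arguments are normally supplied; that is an assumption based on the language statistics, not on the scripts themselves.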

Some questions

Thanks for sharing the code!

I read the code carefully and have some questions.

  • 1 Is there an end-to-end test? By end-to-end I mean: molecule and protein in, interaction out.

  • 2 The molecular input carries no features. Does it make sense to use it with a GNN?

  • 3 Where are the edge features? They are described in the paper, but I cannot find them in the code.

Cannot reproduce the results in the paper

Torch version: 1.1.0

As the title says, I ran the code directly on the "human" dataset but got the results below, which differ from those in the paper. Is the code not the same as the one used in the paper?

The code uses GPU...
Training...
Epoch Time(sec) Loss_train AUC_dev AUC_test Precision_test Recall_test
1 1.043790566000098 55.68943989276886 0.75 0.9583333333333334 0.6 1.0
2 1.8972396280005341 55.123449206352234 0.8125 1.0 0.6 1.0
3 2.755670120001014 54.722046077251434 0.8125 1.0 0.6 1.0
4 3.6050277929971344 54.029464304447174 0.9375 0.9583333333333334 0.6 0.5
..........................................
98 84.71962930099835 0.011957645416259766 1.0 0.7916666666666667 0.75 0.5
99 85.56498634699528 0.01192164421081543 1.0 0.7916666666666667 0.75 0.5
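
One observation about the log above, offered as an assumption rather than a diagnosis: every AUC_dev value shown is a multiple of 1/16 and every AUC_test value a multiple of 1/24. AUC equals the fraction of (positive, negative) pairs that are ranked correctly, so it can take only that few distinct values when the evaluation set contains very few samples. This would be consistent with a run on a small sample file rather than the full "human" dataset. The sketch below uses a hypothetical helper `pairwise_auc` and made-up scores to show the effect.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Illustrative helper; ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Made-up scores for 6 positives and 4 negatives, i.e. 24 pairs in total.
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.2, 0.3, 0.15, 0.1, 0.05]
auc = pairwise_auc(labels, scores)
print(auc, auc * 24)
```

With 24 pairs, a single misranked pair already moves the AUC by about 0.04, so results on such a small set are not comparable with figures reported on a full benchmark.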

Version

Could you please specify the Python version and the versions of the required packages?

Could you provide the processed DUD-E dataset?

I am referring to your paper. I processed the DUD-E dataset following the setting described there ("We randomly divided 102 DUD-E targets into 72 targets as a training dataset and 30 targets as a test dataset"), but I could not reproduce the results in the paper. What can I do to get the best results? Do I need to change the hyper-parameters?
I look forward to your reply. Thank you!
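
Because the quoted split is random, two people following the same sentence will generally end up with different 72/30 partitions unless they share a seed, and that alone can change the measured numbers. A minimal sketch of a reproducible target-level split is given below. The target names, the seed value, and the function name `split_targets` are placeholders, not the split used in the paper.

```python
import random

def split_targets(targets, n_train=72, seed=1234):
    """Shuffle targets with a fixed seed and split them into train and test.

    Hypothetical helper; the seed and the names are placeholders.
    """
    shuffled = sorted(targets)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder names standing in for the 102 DUD-E targets.
targets = [f"target_{i:03d}" for i in range(102)]
train, test = split_targets(targets)
print(len(train), len(test))
```

Splitting by target rather than by individual compound-protein pair keeps every pair of a test protein out of the training set, which matches the wording of the quoted sentence.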
