masashitsubaki / cpi_prediction

This is a code for compound-protein interaction (CPI) prediction based on a graph neural network (GNN) for compounds and a convolutional neural network (CNN) for proteins.

License: Apache License 2.0

Python 93.58% Shell 6.42%

cpi_prediction's Issues

Can you provide complete data?

I am referring to your paper. Can you provide data for the 1:3 and 1:5 versions? My email is [email protected]. Thanks a lot for your help.

AUPR

I used the "human" dataset from your paper, ran your code, and calculated the AUPR. The result was an AUPR of almost 1. I think that is too high and a little odd. I am new to this area; could you give some explanation?
Thank you. @masashitsubaki
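
One way to sanity-check an AUPR value is to recompute it independently from the raw labels and predicted scores. The sketch below is a plain-Python average precision (one common way to summarize the precision-recall curve); it is not the repository's evaluation code, and the function name and toy data are hypothetical. It ignores tied scores, which a library implementation would handle more carefully.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision values at each positive hit."""
    # Rank examples from the highest predicted score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at the rank of this positive
    return total / n_pos


# Toy data (hypothetical): 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 0, 1, 0, 0]
perfect_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]  # all positives ranked first
noisy_scores = [0.9, 0.2, 0.8, 0.7, 0.3, 0.1]    # one positive ranked low

print(average_precision(labels, perfect_scores))  # 1.0
print(average_precision(labels, noisy_scores))    # roughly 0.76
```

A value near 1 therefore only says that almost every positive pair is scored above the negatives in that test set; whether that is plausible depends on how the negatives were constructed.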

About the precision

I followed the author's advice to train on my own dataset, but I ran into a problem during training, as follows:

Training...
Epoch Time(sec) Loss_train AUC_dev AUC_test Precision_test Recall_test
C:\Anaconda3\envs\my-rdkit-env\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use zero_division parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
C:\Anaconda3\envs\my-rdkit-env\lib\site-packages\sklearn\metrics\_classification.py:1221: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples. Use zero_division parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
1 18.5711581 2328.3497102856636 0.4819866010391031 0.5351374101374101 0.0 0.0

How should I deal with this problem? Thanks!
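
The warning in the log above says that no sample was predicted as positive, so precision = TP / (TP + FP) has a zero denominator and is reported as 0.0. The following is a minimal sketch of that situation in plain Python, assuming binary labels; the function name and toy data are hypothetical and this is not the repository's code.

```python
from collections import Counter


def precision_recall(labels, predictions):
    """Precision and recall for binary labels, guarding against 0/0."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    # With no predicted positives, tp + fp == 0 and precision is undefined;
    # this sketch falls back to 0.0, matching the value shown in the log.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


labels = [1, 0, 1, 0]
all_negative = [0, 0, 0, 0]  # a model that never predicts an interaction

print(Counter(all_negative))                   # shows only class 0 is predicted
print(precision_recall(labels, all_negative))  # (0.0, 0.0)
```

Counting the predicted classes, as in the first print, is a quick way to confirm that the model has collapsed to a single class; a very large training loss in the first epoch and an AUC near 0.5 point in the same direction.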

a question about the "tanh" in "attention_cnn" module

Hi, I have a question about the "attention_cnn" module.
In general, attention uses "softmax" rather than "tanh" to compute the weights, but your code uses "tanh", so I am a little confused. Is there a reason for this choice? Thanks!
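
To make the difference in the question concrete, here is a small plain-Python sketch that turns the same raw attention scores into weights with tanh and with softmax. It is only an illustration on hypothetical scores, not the repository's "attention_cnn" module.

```python
import math


def tanh_weights(scores):
    """Squash each score independently into the open interval (-1, 1)."""
    return [math.tanh(s) for s in scores]


def softmax_weights(scores):
    """Normalize scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 0.5, -1.0]  # hypothetical raw attention scores

print(tanh_weights(scores))     # independent weights, may be negative
print(softmax_weights(scores))  # a probability distribution over positions
```

With tanh, each position is weighted on its own and can even contribute negatively, and the weights do not compete with one another; with softmax, increasing one weight necessarily decreases the others.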

how to make predictions?

Thanks for this excellent method for predicting protein-compound interactions. I am wondering how to make predictions for new compounds. I have trained a model on my own data, and the training curve looks right. How can I use this trained model to make predictions for other data?

How to use the trained model for prediction?

Greetings, sir,
Thank you for this work.

How can I use this model for prediction?

Thanks

About the prediction

Hi, it seems that your code cannot handle the "new drug" or "new target" problem: since every drug learns its own embedding vector, a "new drug" cannot be predicted, and the same holds for a "new target". Is this true?
Thank you:)

Negative construction script

Hi,

Could you please provide the script for the negative sample construction? I found that the negative sample construction link provided by Liu et al. (2015) is no longer valid, so I am hoping you still have the original code.

Thank you very much!

How to use this model?

Excuse me, I have completed data preprocessing and training and generated the model file, but how do I use the model?

How to use this model?

Hello, how do I use this model after training? For example, to predict the interaction of a single protein with a small number of compounds.

How to use this model?

Is the 8:1:1 data split used to stop training early, with one of the parts serving as the validation set?
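
For readers unfamiliar with the setup being asked about, the sketch below shows one common way an 8:1:1 split is used: the middle part acts as a development (validation) set for choosing the epoch, and the last part is only used for the final report. This is an assumption about typical practice, not a statement of what the repository does; all names and numbers are hypothetical.

```python
import random


def split_811(dataset, seed=1234):
    """Shuffle and split a dataset into train/dev/test parts (8:1:1)."""
    data = list(dataset)
    random.Random(seed).shuffle(data)  # fixed seed for a reproducible split
    n_train = len(data) * 8 // 10
    n_dev = len(data) // 10
    train = data[:n_train]
    dev = data[n_train:n_train + n_dev]
    test = data[n_train + n_dev:]
    return train, dev, test


def best_epoch(history):
    """Pick the epoch with the highest dev AUC from (epoch, auc_dev, auc_test)."""
    return max(history, key=lambda row: row[1])


train, dev, test = split_811(range(100))
print(len(train), len(dev), len(test))  # 80 10 10

# Hypothetical training log: the test AUC is read off at the dev-selected epoch.
history = [(1, 0.75, 0.95), (2, 0.81, 0.97), (3, 0.94, 0.96)]
print(best_epoch(history))  # (3, 0.94, 0.96)
```

Selecting on the dev set keeps the test set untouched by any tuning decision, which is the usual reason for having three parts rather than two.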

not enough values to unpack (expected 14, got 0)

Traceback (most recent call last):
File "C:/Users/user/Downloads/15647NeonBand.RarZipExtractorPro_g3b9h1p9bdemw!App/CPI_prediction-master/CPI_prediction-master/code/newtest1.py", line 153, in <module>
setting) = sys.argv[1:]
ValueError: not enough values to unpack (expected 14, got 0)
################
I am getting this error. How can I solve it? Could you please provide me with the original code?
My email is [email protected]
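
The traceback above shows the script unpacking 14 values from sys.argv[1:] and receiving none, which happens when it is started without command-line arguments (for example, by running the file directly from an IDE). A minimal sketch of a guard that makes this failure explicit follows; the function name and the placeholder argument values are hypothetical, and only the count of 14 comes from the traceback.

```python
EXPECTED_ARGS = 14  # number of values the unpacking in the traceback expects


def parse_args(argv):
    """Return the command-line arguments, or exit with a readable message."""
    args = argv[1:]  # argv[0] is the script name
    if len(args) != EXPECTED_ARGS:
        raise SystemExit(
            f"expected {EXPECTED_ARGS} command-line arguments, got {len(args)}"
        )
    return args


# Simulated invocation with 14 placeholder values instead of real settings.
fake_argv = ["newtest1.py"] + [f"value{i}" for i in range(EXPECTED_ARGS)]
print(len(parse_args(fake_argv)))  # 14
```

In a real script this would be called as parse_args(sys.argv) after importing sys, and the 14 values would be supplied on the command line, for instance from a shell script that sets each hyperparameter.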

Can you provide complete data?

I am referring to your paper. I found that the data you provided is only the positive:negative = 1:1 version. Can you provide data for the 1:3 and 1:5 versions? My email is [email protected]. Thank you very much.

Some questions

Thanks for sharing the code!

I read the code carefully and have some questions.

1. Is there an end-to-end test? By end-to-end I mean: molecule and protein in, interaction out.
2. The molecular input has no features; does that make sense with a GNN?
3. Where are the edge features? They are described in the paper, but I cannot find them in the code.

Cannot get the same results as the paper

Torch version: 1.1.0

As the title says, I ran the code directly on the human dataset but got the results below, which differ from the paper. Is the code not the same as in the paper?

The code uses GPU...
Training...
Epoch Time(sec) Loss_train AUC_dev AUC_test Precision_test Recall_test
1 1.043790566000098 55.68943989276886 0.75 0.9583333333333334 0.6 1.0
2 1.8972396280005341 55.123449206352234 0.8125 1.0 0.6 1.0
3 2.755670120001014 54.722046077251434 0.8125 1.0 0.6 1.0
4 3.6050277929971344 54.029464304447174 0.9375 0.9583333333333334 0.6 0.5
..........................................
98 84.71962930099835 0.011957645416259766 1.0 0.7916666666666667 0.75 0.5
99 85.56498634699528 0.01192164421081543 1.0 0.7916666666666667 0.75 0.5

version

Please provide the Python version and the versions of the requirements.

Could you provide processed DUD-E Dataset?

I'm referring to your paper. I followed the setting mentioned in your paper, "We randomly divided 102 DUD-E targets into 72 targets as a training dataset and 30 targets as a test dataset", to process the DUD-E dataset, but I couldn't reproduce the results in your paper. What can I do to get the best results? Do I need to change the hyper-parameters?
I look forward to your reply. Thank you!
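
The quoted setting splits by target rather than by individual compound-protein pair. A minimal sketch of such a split is shown below, assuming placeholder target names and an arbitrary seed; neither the names nor the seed come from the paper.

```python
import random


def split_targets(targets, n_train=72, seed=0):
    """Randomly split a list of targets into training and test targets."""
    shuffled = list(targets)
    random.Random(seed).shuffle(shuffled)  # the seed here is an arbitrary choice
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder names standing in for the 102 DUD-E targets.
targets = [f"target_{i:03d}" for i in range(102)]
train_targets, test_targets = split_targets(targets)

print(len(train_targets), len(test_targets))  # 72 30
```

Because the split is random, a different seed puts different targets in the test set, so some variation from the published numbers is to be expected even with identical hyper-parameters.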
