xuhanliu / drugex

Deep learning toolkit for Drug Design with Pareto-based Multi-Objective optimization in Polypharmacology

License: MIT License

Python 100.00%

cheminformatics deep-learning graph-transformer multi-objective-optimization reinforcement-learning

drugex's Issues

Running Multitarget with smiles?

Thanks for the interesting tool.
How do we run the multitarget mode with our own database of SMILES?

UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 63: ordinal not in range(128)

In DrugEx v2, I ran into this issue:
Traceback (most recent call last):
File "F:\drugex-v2.0\drugex-v2.0\trainer.py", line 27, in <module>
A1 = utils.Predictor('output/single/DNN_%s_CHEMBL226_4.pkg' % z, type=z)
File "F:\drugex-v2.0\drugex-v2.0\utils\objective.py", line 27, in __init__
self.model = joblib.load(path)
File "F:\Anaconda3\envs\david4\lib\site-packages\joblib\numpy_pickle.py", line 587, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "F:\Anaconda3\envs\david4\lib\site-packages\joblib\numpy_pickle.py", line 506, in _unpickle
obj = unpickler.load()
File "F:\Anaconda3\envs\david4\lib\pickle.py", line 1088, in load
dispatch[key[0]](self)
File "F:\Anaconda3\envs\david4\lib\pickle.py", line 1123, in load_persid
"persistent IDs in protocol 0 must be ASCII strings")
_pickle.UnpicklingError: persistent IDs in protocol 0 must be ASCII strings
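
One way to narrow this down, sketched under assumptions the thread does not confirm: in CPython's pickle module this message is raised while handling the PERSID opcode, which is the byte 'P'. A file that is not a pickle at all but happens to start with 'P', such as a zip archive beginning with 'PK', would be consistent with the traceback. The helper below (sniff_file is a hypothetical name, not part of DrugEx) only looks at the first bytes of a file and guesses its container format. It does not load anything.

```python
def sniff_file(path):
    """Guess a file's container format from its first bytes (heuristic only)."""
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.startswith(b"version https://git-lfs"):
        # A Git LFS pointer is a small text stub, not the real model file.
        return "git-lfs pointer"
    if head.startswith(b"PK\x03\x04"):
        # Zip archives start with "PK"; recent torch.save output is a zip file.
        return "zip archive"
    if head.startswith(b"\x1f\x8b"):
        return "gzip-compressed data"
    if len(head) >= 2 and head[:1] == b"\x80":
        # Binary pickles (protocol 2 and later) start with 0x80 plus the protocol number.
        return "binary pickle, protocol %d" % head[1]
    return "unknown"
```

If the result for 'output/single/DNN_%s_CHEMBL226_4.pkg' is anything other than a binary pickle, joblib.load is probably being given a file written by a different tool or version than the one that reads it.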

DrugEx-1.0 about figure6

Hi Xuhan,
I am trying to reproduce Figure 6 (the violin plot comparing the physicochemical properties of molecules generated by the pre-trained and fine-tuned models), but I am missing 'mol_p.txt' and 'mol_ex.txt'. How can I obtain these data sets, especially 'mol_p.txt'? And how do I use the pre-trained model to generate molecules? I would be grateful for an answer!

AttributeError: 'int' object has no attribute 'predict_proba'

When I run agent.py, the following error occurs and I have not been able to debug it.

Could you give me some advice? Thank you.

Traceback (most recent call last):
File "agent.py", line 159, in <module>
main()
File "agent.py", line 124, in main
Policy_gradient(agent, environ, explore=explore)
File "agent.py", line 38, in Policy_gradient
preds = environ(smiles)
File "C:\Users\wmy\DL\DrugEx\util.py", line 194, in __call__
preds = self.clf.predict_proba(fps)[:, 1]
AttributeError: 'int' object has no attribute 'predict_proba'
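
The traceback shows that self.clf holds an int rather than a fitted classifier by the time predict_proba is called. Why that happens is not stated in the report. A small guard, sketched here with the hypothetical name require_classifier, would at least make the failure appear where the object is loaded instead of deep inside the training loop.

```python
def require_classifier(obj, source="<unknown>"):
    """Return obj unchanged if it has a callable predict_proba, else raise."""
    if not callable(getattr(obj, "predict_proba", None)):
        raise TypeError(
            "%s contains a %s, not a classifier with predict_proba"
            % (source, type(obj).__name__)
        )
    return obj
```

Wrapping whatever is assigned to self.clf in this check would report the offending file and the actual type immediately.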

No utils.canonicalize_list in Drug_ex v2.0

Thanks for sharing such an interesting tool, but I ran into a problem when running the project.
In DrugEx v2.0 -> rlearner.py -> class Evolve -> def fit()

but there is no 'canonicalize_list' function in utils.
I hope you can reply.

Link to Dataset.

Greetings,

Thanks for the great work.
Could you please provide links to the dataset, or to the files used in dataset.py?
When I run dataset.py with the files I collected myself, it throws a missing-file error:
FileNotFoundError: [Errno 2] No such file or directory: 'D:/pycharmprojects/Drugg_Disc/data/ligand_mf_brics.txt'

Thanks
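
Several reports on this page come down to an input file that is not in the repository. A pre-flight check, sketched below, lists every missing input at once instead of failing on the first one. The function name missing_files is hypothetical, and the list of required paths is an assumption assembled from the error messages quoted in these issues, not from the DrugEx source.

```python
from pathlib import Path


def missing_files(root, relative_paths):
    """Return the entries of relative_paths that are not regular files under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).is_file()]


# Paths taken from the FileNotFoundError messages reported in the issues.
REPORTED_INPUTS = [
    "data/ligand_mf_brics.txt",
    "data/chembl_voc.txt",
    "data/LIGAND_RAW.tsv",
]

if __name__ == "__main__":
    for path in missing_files(".", REPORTED_INPUTS):
        print("missing:", path)
```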

Zinc models

Is it possible for you to provide models already trained on ZINC/ChEMBL? Pretraining on the ZINC dataset is time-consuming, so I would like to use such a model if one is already available. Thanks.

Missing file data/chembl_voc.txt

Hi, great work! I wanted to try it on my own datasets, but it returned the error "FileNotFoundError: [Errno 2] No such file or directory: 'data/chembl_voc.txt'." Could you please provide this file? Thank you very much in advance!

KeyError occurs when I run pretrainer.py

Hello, Xuhan
I got a KeyError when I ran the pretrainer.py script.

Could you please let me know why?

Thank you in advance.

`environ.py` does not work

environ.py does not work.
I tried commit abcc509, but an error occurred at:

DrugEx/environ.py

Line 319 in abcc509

df = df.unstack(pair[0])

I believe newer versions do not work either.
With which version did you get the mt_task method to work?

Package code and make pip installable

It would be nice to make this code pip installable and to use entrypoints to make a vanity script through which the scripts can be run. I'd be happy to send a PR.
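
A minimal sketch of what such a single entry point could look like, using only argparse. The subcommand names mirror scripts mentioned elsewhere in these issues. The command name drugex, the function names, and the idea of one subcommand per script are assumptions for illustration, not the project's actual layout.

```python
import argparse

# Script names mentioned in the issues; one subcommand per script (assumed design).
SCRIPTS = ("dataset", "environ", "pretrainer", "agent", "designer")


def build_parser():
    """Build a parser with one subcommand per existing script."""
    parser = argparse.ArgumentParser(prog="drugex")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in SCRIPTS:
        sub.add_parser(name, help="run %s.py" % name)
    return parser


def main(argv=None):
    """Parse argv and return the selected subcommand name."""
    args = build_parser().parse_args(argv)
    # A packaged version would import and call the matching module here.
    return args.command
```

With setuptools, a console_scripts entry such as "drugex = drugex.cli:main" would expose this function as a command after pip install. The module path drugex.cli is hypothetical.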

Error using designer with the default agent

I can use the code in the repo to train my own DrugEx agent model and in that case the designer.py script works just as it should when I point it to the agent I created.

However, I ran into a problem when I tried to use the data/agent.pkg file that was already committed to the repository (62423a7). I got the following error:

Traceback (most recent call last):
  File "/home/sichom/projects/DrugEx/designer.py", line 52, in <module>
    generate(agent_path, out_path, num=pop_size)
  File "/home/sichom/projects/DrugEx/designer.py", line 26, in generate
    agent.load_state_dict(torch.load(agent_path))
  File "/home/sichom/anaconda/envs/drugex/lib/python3.5/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Generator:
	Missing key(s) in state_dict: "embed.weight", "rnn.weight_ih_l0", "rnn.bias_hh_l2", "rnn.weight_hh_l2", "rnn.bias_hh_l0", "rnn.bias_ih_l1", "rnn.bias_hh_l1", "rnn.bias_ih_l0", "rnn.bias_ih_l2", "rnn.weight_ih_l1", "rnn.weight_ih_l2", "rnn.weight_hh_l1", "rnn.weight_hh_l0". 
	Unexpected key(s) in state_dict: "embedding.weight", "gru_1.weight_ih", "gru_1.weight_hh", "gru_1.bias_ih", "gru_1.bias_hh", "gru_2.weight_ih", "gru_2.weight_hh", "gru_2.bias_ih", "gru_2.bias_hh", "gru_3.weight_ih", "gru_3.weight_hh", "gru_3.bias_ih", "gru_3.bias_hh". 
	size mismatch for linear.bias: copying a param with shape torch.Size([58]) from checkpoint, the shape in current model is torch.Size([50]).
	size mismatch for linear.weight: copying a param with shape torch.Size([58, 512]) from checkpoint, the shape in current model is torch.Size([50, 512]).

Could it be because the repository version of the data/voc.txt does not correspond to the data/voc.txt that was used to create this data/agent.pkg perhaps?

I think it would be nice to have the files under data/ documented in the README.md so it is clear where this data/agent.pkg came from and what it can be used for (is it the one trained for the purpose of the publication or something else?)

Many thanks for any info.
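
Reading the error message itself, two separate things differ: the parameter names ("gru_1.*" and "embedding.weight" in the checkpoint versus "rnn.*" and "embed.weight" in the current Generator), and the output size of the linear layer (58 versus 50). The second is consistent with a different vocabulary size, and the first suggests the checkpoint was saved from a different model definition. The sketch below compares two mappings of parameter name to shape without loading any weights. The function name diff_state_dicts is hypothetical.

```python
def diff_state_dicts(expected, found):
    """Compare two {parameter name: shape} mappings.

    Returns (missing, unexpected, mismatched), where mismatched maps a shared
    name to the pair (shape in found, shape in expected).
    """
    missing = sorted(set(expected) - set(found))
    unexpected = sorted(set(found) - set(expected))
    mismatched = {
        name: (tuple(found[name]), tuple(expected[name]))
        for name in sorted(set(expected) & set(found))
        if tuple(found[name]) != tuple(expected[name])
    }
    return missing, unexpected, mismatched


# Subset of the names in the error above. The embedding shapes are not given
# in the report, so () is only a placeholder.
expected = {"embed.weight": (), "linear.bias": (50,), "linear.weight": (50, 512)}
found = {"embedding.weight": (), "linear.bias": (58,), "linear.weight": (58, 512)}
missing, unexpected, mismatched = diff_state_dicts(expected, found)
```

With PyTorch, the mappings can be built as {k: tuple(v.shape) for k, v in model.state_dict().items()}, and in the same way from the object returned by torch.load, assuming the checkpoint is a plain state dict.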

NameError: name 'Dataset' is not defined

Hi Dr Liu,
Many thanks for the amazing work. I also watched your lecture on Bilibili.
I tried to reproduce your work and hit the error below several times. Could you please suggest a possible solution? Thank you and happy new year!


# Trial run: python autodl-nas/DrugEx-master/dataset.py

(drugex) root@autodl-container-a64e118552-fa88b88c:~# python autodl-nas/DrugEx-master/dataset.py
Traceback (most recent call last):
  File "autodl-nas/DrugEx-master/dataset.py", line 7, in <module>
    from utils import VocSmiles as Voc
  File "/root/autodl-nas/DrugEx-master/utils/__init__.py", line 5, in <module>
    from .vocab import *
  File "/root/autodl-nas/DrugEx-master/utils/vocab.py", line 303, in <module>
    class TgtData(Dataset):
NameError: name 'Dataset' is not defined
(drugex) root@autodl-container-a64e118552-fa88b88c:~#

# Problem: class TgtData(Dataset): in the file utils/vocab.py

Best,
Sarai
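
The NameError means that the name Dataset is not bound when the class statement in utils/vocab.py runs. In PyTorch code a base class with that name usually comes from "from torch.utils.data import Dataset", but whether that is the import vocab.py intends is an assumption. The check below, sketched with the hypothetical name unresolved_bases, parses a module without importing it and lists base classes that are never imported or defined anywhere in the file.

```python
import ast
import builtins


def unresolved_bases(source):
    """Return (names, has_star_import) for a module's source text.

    names lists simple base-class names that are never bound in the module.
    Order of definition is ignored, and names provided by a star import
    cannot be resolved, hence the second flag.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))
    has_star_import = False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    has_star_import = True
                else:
                    bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            bound.add(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
    bases = [
        base.id
        for node in ast.walk(tree)
        if isinstance(node, ast.ClassDef)
        for base in node.bases
        if isinstance(base, ast.Name)
    ]
    return [name for name in bases if name not in bound], has_star_import
```

Running it on the text of utils/vocab.py should show whether Dataset is simply never imported there.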

No such file or directory: 'data/LIGAND_RAW.tsv'

How should I handle that?

Missing zinc/ folder

The dataset.py script requires a folder called zinc/ but it's not included in the repository. Is this some data that should be downloaded from ZINC?

Would you be able to provide either a shell script or another python script that could take care of getting this data from the source?

MT_DNN

Dear Dr. Liu,
I tried to run your code and found that it is incomplete. For example, there is no MT_DNN function in the environ.py script. Could you please post the complete code?

Yours,
Mukuo

how to use or simply start? any example?

RuntimeError: expected device cuda:0 and dtype Byte but got device cuda:0 and dtype Bool

Hi!

An error occurs when I run pretrainer.py. Can you give me some advice? Thanks a lot!

Issue summary

prior.fit(zinc, out=netP_path)
/tmp/pip-req-build-58y_cjjl/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
Traceback (most recent call last):

  File "<ipython-input-17-9e97d71a0301>", line 1, in <module>
    prior.fit(zinc, out=netP_path)

  File "/home/tensorflow/DrugEx/model.py", line 422, in fit
    seqs = self.sample(1000)

  File "/home/tensorflow/DrugEx/model.py", line 388, in sample
    is_end = torch.ge(is_end + end_token, 1)

RuntimeError: expected device cuda:0 and dtype Byte but got device cuda:0 and dtype Bool
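
The deprecation warning and the error both point at a mix of torch.uint8 and torch.bool masks, which is typical of code written for an older PyTorch and run on a release where comparisons return bool tensors. For 0/1 flags, torch.ge(is_end + end_token, 1) is a logical OR, so one possible rewrite, assuming both operands are meant as masks, converts them to bool and combines them with |. The function name update_finished is hypothetical, and this is a sketch rather than the author's fix.

```python
import torch


def update_finished(is_end, end_token):
    """Logical OR of two masks, accepting uint8 or bool tensors.

    Matches torch.ge(is_end + end_token, 1) when both hold only 0/1 values.
    """
    return is_end.bool() | end_token.bool()
```

The result is a bool tensor, so any later code that indexes or masks with is_end would then use torch.bool consistently.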

Error module 'rdkit.Chem' has no attribute 'MolFromSimles'

With rdkit 2021.03.4
Error in file environ.py,

385     data = df.drop(test.index)
386

--> 387 test_x = utils.Predictor.calc_fp([Chem.MolFromSimles(mol) for mol in test.index])
388 data_x = utils.Predictor.calc_fp([Chem.MolFromSimles(mol) for mol in data.index])
389 out = 'output/single/%s_%s_%s' % (alg, 'REG' if reg else 'CLS', feat)

AttributeError: module 'rdkit.Chem' has no attribute 'MolFromSimles'

Suggested fix: change "Chem.MolFromSimles" to "AllChem.MolFromSmiles". The misspelled "Simles" should be "Smiles"; "Chem.MolFromSmiles" is the same function.

It will be great if you would like to provide some examples

Thank you for your repo. It is very interesting that it can handle more than one target. If possible, please add some examples or instructions for replicating your paper.
