xuhanliu / drugex
Deep learning toolkit for Drug Design with Pareto-based Multi-Objective optimization in Polypharmacology
License: MIT License
Thanks for the interesting tool.
How do we run the multi-target workflow with our own database of SMILES?
In DrugEx v2, I ran into this issue:
Traceback (most recent call last):
File "F:\drugex-v2.0\drugex-v2.0\trainer.py", line 27, in <module>
A1 = utils.Predictor('output/single/DNN_%s_CHEMBL226_4.pkg' % z, type=z)
File "F:\drugex-v2.0\drugex-v2.0\utils\objective.py", line 27, in __init__
self.model = joblib.load(path)
File "F:\Anaconda3\envs\david4\lib\site-packages\joblib\numpy_pickle.py", line 587, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "F:\Anaconda3\envs\david4\lib\site-packages\joblib\numpy_pickle.py", line 506, in _unpickle
obj = unpickler.load()
File "F:\Anaconda3\envs\david4\lib\pickle.py", line 1088, in load
dispatch[key[0]](self)
File "F:\Anaconda3\envs\david4\lib\pickle.py", line 1123, in load_persid
"persistent IDs in protocol 0 must be ASCII strings")
_pickle.UnpicklingError: persistent IDs in protocol 0 must be ASCII strings
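One plausible cause, offered as an assumption rather than a confirmed diagnosis: the `.pkg` file is a ZIP archive (the container that `torch.save` writes by default since PyTorch 1.6). ZIP files begin with the bytes `PK`, and a pickle reader interprets a leading `P` as the protocol-0 persistent-ID opcode, which is consistent with the message above. The sketch below uses only the standard library to sniff the file format; `sniff_model_file` is a hypothetical helper, not part of DrugEx.

```python
import zipfile


def sniff_model_file(path):
    """Guess the on-disk format of a saved model file."""
    if zipfile.is_zipfile(path):
        # ZIP container, e.g. the default torch.save format since PyTorch 1.6.
        return "zip"
    with open(path, "rb") as fh:
        head = fh.read(1)
    if head == b"\x80":
        # PROTO opcode: a binary pickle (protocol 2 or later). This could be
        # an uncompressed joblib dump or a legacy torch.save file.
        return "pickle"
    return "unknown"
```

If this returns `"zip"` for `DNN_%s_CHEMBL226_4.pkg`, loading the file with `torch.load` instead of `joblib.load` may be the right call, depending on how the file was produced.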
Hi Xuhan,
I want to reproduce Figure 6 (the violin plot comparing the physicochemical properties of molecules generated by the pre-trained and fine-tuned models), but I am missing 'mol_p.txt' and 'mol_ex.txt'. How can I obtain these data sets? In particular, for 'mol_p.txt', how do I use the pre-trained model to generate molecules? I would be grateful for an answer!
When I run agent.py, I get the error below and have not been able to debug it.
Could you give me some advice? Thank you.
Traceback (most recent call last):
File "agent.py", line 159, in <module>
main()
File "agent.py", line 124, in main
Policy_gradient(agent, environ, explore=explore)
File "agent.py", line 38, in Policy_gradient
preds = environ(smiles)
File "C:\Users\wmy\DL\DrugEx\util.py", line 194, in __call__
preds = self.clf.predict_proba(fps)[:, 1]
AttributeError: 'int' object has no attribute 'predict_proba'
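The message indicates that `self.clf` holds an `int`, so whatever was loaded or assigned as the classifier is not a fitted model with a `predict_proba` method. A minimal sketch of a guard that fails earlier with a clearer message follows; `require_classifier` and `DummyClf` are hypothetical names for illustration, not DrugEx code.

```python
def require_classifier(obj, source="<unknown>"):
    """Return obj if it exposes predict_proba(), otherwise raise TypeError."""
    if not callable(getattr(obj, "predict_proba", None)):
        raise TypeError(
            "expected a model with predict_proba(), got %s from %s"
            % (type(obj).__name__, source)
        )
    return obj


class DummyClf:
    """Stand-in for a fitted classifier (illustration only)."""

    def predict_proba(self, fps):
        # One [p(class 0), p(class 1)] row per input fingerprint.
        return [[0.5, 0.5] for _ in fps]


clf = require_classifier(DummyClf(), source="example")
```

Calling such a guard right after the classifier is loaded would point at the file or assignment that produced the `int`, rather than at the later `predict_proba` call.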
Greetings,
Thanks for the great work.
Could you please provide links to the dataset or the files used in dataset.py?
When I run dataset.py with the files I collected, it throws a missing-file error:
FileNotFoundError: [Errno 2] No such file or directory: 'D:/pycharmprojects/Drugg_Disc/data/ligand_mf_brics.txt'
Thanks
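Until the input files are documented, a pre-flight check can at least list everything that is missing in one pass instead of failing on the first file. This is a sketch only: `missing_inputs` is a hypothetical helper, and the list of required files has to be filled in by reading dataset.py, since the issue names only `data/ligand_mf_brics.txt`.

```python
from pathlib import Path


def missing_inputs(paths, root="."):
    """Return the entries of `paths` that are not existing files under `root`."""
    root = Path(root)
    return [p for p in paths if not (root / p).is_file()]


# Only this file name comes from the error message; extend the list as needed.
required = ["data/ligand_mf_brics.txt"]
for path in missing_inputs(required):
    print("missing input file:", path)
```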
Is it possible for you to provide the models already trained on ZINC/ChEMBL? Pretraining on the ZINC dataset is time-consuming, so I would like to use a pretrained model if one is available. Thanks
Hi, great work! I would like to try it on my own datasets, but it raised "FileNotFoundError: [Errno 2] No such file or directory: 'data/chembl_voc.txt'". Could you please include this file? Thank you very much in advance!
It would be nice to make this code pip installable and to use entrypoints to make a vanity script through which the scripts can be run. I'd be happy to send a PR.
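As a rough sketch of what such a single entry point could look like, the dispatcher below maps a subcommand to one of the existing scripts. The command name `drugex`, the `main` function, and the choice of subcommands are assumptions for illustration; the script names are the ones mentioned in the issues on this page.

```python
import argparse

# Script names taken from the issues on this page; the dispatcher is hypothetical.
SCRIPTS = ("dataset", "pretrainer", "agent", "designer")


def build_parser():
    parser = argparse.ArgumentParser(prog="drugex")
    parser.add_argument("command", choices=SCRIPTS, help="script to run")
    return parser


def main(argv=None):
    # Unrecognized arguments are collected and passed through to the script.
    ns, passthrough = build_parser().parse_known_args(argv)
    # A real entry point would import the module and call its main() here.
    return "%s.py" % ns.command, passthrough
```

With setuptools, a function like `main` could then be registered under the `console_scripts` entry point group so that installing the package creates the command.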
I can use the code in the repo to train my own DrugEx agent model, and in that case the designer.py script works just as it should when I point it to the agent I created.
However, I ran into a problem when I tried to use the data/agent.pkg file that was already committed to the repository (62423a7). I got the following error:
Traceback (most recent call last):
File "/home/sichom/projects/DrugEx/designer.py", line 52, in <module>
generate(agent_path, out_path, num=pop_size)
File "/home/sichom/projects/DrugEx/designer.py", line 26, in generate
agent.load_state_dict(torch.load(agent_path))
File "/home/sichom/anaconda/envs/drugex/lib/python3.5/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Generator:
Missing key(s) in state_dict: "embed.weight", "rnn.weight_ih_l0", "rnn.bias_hh_l2", "rnn.weight_hh_l2", "rnn.bias_hh_l0", "rnn.bias_ih_l1", "rnn.bias_hh_l1", "rnn.bias_ih_l0", "rnn.bias_ih_l2", "rnn.weight_ih_l1", "rnn.weight_ih_l2", "rnn.weight_hh_l1", "rnn.weight_hh_l0".
Unexpected key(s) in state_dict: "embedding.weight", "gru_1.weight_ih", "gru_1.weight_hh", "gru_1.bias_ih", "gru_1.bias_hh", "gru_2.weight_ih", "gru_2.weight_hh", "gru_2.bias_ih", "gru_2.bias_hh", "gru_3.weight_ih", "gru_3.weight_hh", "gru_3.bias_ih", "gru_3.bias_hh".
size mismatch for linear.bias: copying a param with shape torch.Size([58]) from checkpoint, the shape in current model is torch.Size([50]).
size mismatch for linear.weight: copying a param with shape torch.Size([58, 512]) from checkpoint, the shape in current model is torch.Size([50, 512]).
Could it be that the repository version of data/voc.txt does not correspond to the data/voc.txt that was used to create this data/agent.pkg?
I think it would be nice to have the files under data/ documented in the README.md so it is clear where this data/agent.pkg came from and what it can be used for (is it the one trained for the purpose of the publication or something else?).
Many thanks for any info.
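The error above mixes two kinds of mismatch: different parameter names (`embedding`/`gru_N` versus `embed`/`rnn`) and a different output size (58 versus 50), the latter being consistent with a different vocabulary. A small, framework-free sketch for listing such differences before calling `load_state_dict` follows. `diff_state_dicts` is a hypothetical helper; with PyTorch, the shape dictionaries could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def diff_state_dicts(checkpoint_shapes, model_shapes):
    """Compare two {parameter name: shape} mappings."""
    ckpt, model = set(checkpoint_shapes), set(model_shapes)
    return {
        "missing": sorted(model - ckpt),      # in the model, not in the file
        "unexpected": sorted(ckpt - model),   # in the file, not in the model
        "size_mismatch": sorted(
            k for k in ckpt & model
            if checkpoint_shapes[k] != model_shapes[k]
        ),
    }


# Subset of the names and shapes in the error message; None marks shapes
# that the message does not show.
checkpoint = {"embedding.weight": None, "linear.weight": (58, 512), "linear.bias": (58,)}
model = {"embed.weight": None, "linear.weight": (50, 512), "linear.bias": (50,)}
report = diff_state_dicts(checkpoint, model)
```

Renaming keys would not be enough here on its own: the size mismatch on `linear.weight` and `linear.bias` means the checkpoint and the current model disagree on the number of output tokens.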
Hi Dr Liu,
Many thanks for the amazing work. I also watched your lecture on Bilibili.
I tried to reproduce your work and hit the error below several times. Could you please point me toward a possible solution? Thank you and happy new year!
# Trial run: python autodl-nas/DrugEx-master/dataset.py
(drugex) root@autodl-container-a64e118552-fa88b88c:~# python autodl-nas/DrugEx-master/dataset.py
Traceback (most recent call last):
File "autodl-nas/DrugEx-master/dataset.py", line 7, in <module>
from utils import VocSmiles as Voc
File "/root/autodl-nas/DrugEx-master/utils/__init__.py", line 5, in <module>
from .vocab import *
File "/root/autodl-nas/DrugEx-master/utils/vocab.py", line 303, in <module>
class TgtData(Dataset):
NameError: name 'Dataset' is not defined
(drugex) root@autodl-container-a64e118552-fa88b88c:~#
# Problem: class TgtData(Dataset): in the file utils/vocab.py
Best,
Sarai
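The `NameError` means that `Dataset` was never bound in utils/vocab.py before the class statement ran. One guess, which the traceback does not confirm, is that an import such as `from torch.utils.data import Dataset` is missing. The sketch below checks a module's source for this kind of problem without importing it; `undefined_base_classes` is a hypothetical helper, and it cannot see names brought in by `from x import *`, so it may report false positives for modules that rely on star imports.

```python
import ast
import builtins


def undefined_base_classes(source):
    """List (class, base, line) for top-level classes whose base name is
    neither a builtin nor bound earlier at module level."""
    defined = set(dir(builtins))
    problems = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "from m import x as y" binds "y".
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ClassDef):
            for base in node.bases:
                if isinstance(base, ast.Name) and base.id not in defined:
                    problems.append((node.name, base.id, node.lineno))
            defined.add(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    defined.add(target.id)
    return problems
```

Running it on the text of utils/vocab.py would list `TgtData` with base `Dataset` if no earlier explicit import or definition binds that name.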
how to handle that?
The dataset.py script requires a folder called zinc/ but it's not included in the repository. Is this some data that should be downloaded from ZINC?
Would you be able to provide either a shell script or another python script that could take care of getting this data from the source?
Hi!
An error occurs when I run pretrainer.py. Can you give me some advice? Thanks a lot!
Issue summary
prior.fit(zinc, out=netP_path)
/tmp/pip-req-build-58y_cjjl/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
Traceback (most recent call last):
File "<ipython-input-17-9e97d71a0301>", line 1, in <module>
prior.fit(zinc, out=netP_path)
File "/home/tensorflow/DrugEx/model.py", line 422, in fit
seqs = self.sample(1000)
File "/home/tensorflow/DrugEx/model.py", line 388, in sample
is_end = torch.ge(is_end + end_token, 1)
RuntimeError: expected device cuda:0 and dtype Byte but got device cuda:0 and dtype Bool
With rdkit 2021.03.4
Error in file environ.py,
385 data = df.drop(test.index)
386
--> 387 test_x = utils.Predictor.calc_fp([Chem.MolFromSimles(mol) for mol in test.index])
388 data_x = utils.Predictor.calc_fp([Chem.MolFromSimles(mol) for mol in data.index])
389 out = 'output/single/%s_%s_%s' % (alg, 'REG' if reg else 'CLS', feat)
AttributeError: module 'rdkit.Chem' has no attribute 'MolFromSimles'
Fix: the method name is misspelled. Change "Chem.MolFromSimles" to "Chem.MolFromSmiles" ("AllChem.MolFromSmiles" also works).
Thank you for your repo. It is very interesting that it can handle more than one target. If possible, please add some examples or an introduction showing how to replicate the results in your paper.