wangxr0526 / retroprime

Code for RetroPrime, a single-step retrosynthesis model

License: MIT License

Shell 4.64% Python 75.14% Jupyter Notebook 9.74% Makefile 0.08% TeX 2.82% Perl 4.62% Smalltalk 0.24% Emacs Lisp 2.15% JavaScript 0.11% NewLisp 0.20% Ruby 0.21% Slash 0.04% SystemVerilog 0.02%

retroprime's Issues

May I contact you?

If it is convenient, could you add me on QQ (729868721) so we can discuss? Thank you!

Running run_example.sh raises RuntimeError: "index_select_out_cuda_impl" not implemented for 'Float'

Products to Synthons
0%| | 0/1 [00:00<?, ?it/s]/home/lzf/software/anaconda3/envs/seq_gr/lib/python3.7/site-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
var = torch.tensor(arr, dtype=self.dtype, device=device)
/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py:613: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
return torch.tensor(a, requires_grad=False)
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "retroprime/transformer_model/translate.py", line 53, in <module>
main(opt)
File "retroprime/transformer_model/translate.py", line 34, in main
attn_debug=opt.attn_debug)
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py", line 238, in translate
batch_data = self.translate_batch(batch, data, fast=self.fast)
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py", line 375, in translate_batch
return self._translate_batch(batch, data)
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py", line 712, in _translate_batch
beam_attn.data[:, j, :memory_lengths[j]])
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/beam.py", line 140, in advance
self.attn.append(attn_out.index_select(0, prev_k))
RuntimeError: "index_select_out_cuda_impl" not implemented for 'Float'
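
A plausible cause, which the thread does not confirm: the traceback ends in `attn_out.index_select(0, prev_k)`, and `index_select` requires an integer index tensor. If `prev_k` is obtained by dividing a flattened beam index by the vocabulary size with `/`, recent PyTorch releases return a floating-point result for integer inputs, which would produce exactly this kind of dtype error. The pure-Python sketch below mirrors that arithmetic without PyTorch; the names `flat_index`, `vocab_size`, and `split_flat_index` are illustrative and not taken from the repository.

```python
def split_flat_index(flat_index: int, vocab_size: int) -> tuple:
    """Recover (beam_id, token_id) from an index into a flattened beam x vocab table."""
    beam_id = flat_index // vocab_size   # floor division keeps an integer
    token_id = flat_index % vocab_size
    return beam_id, token_id

beams = ["beam-0", "beam-1", "beam-2"]
vocab_size = 10
flat_index = 27  # third beam (index 2), token 7

beam_id, token_id = split_flat_index(flat_index, vocab_size)
selected = beams[beam_id]

# True division yields a float, which cannot be used as an index.
float_beam_id = flat_index / vocab_size
try:
    beams[float_beam_id]
    float_index_error = None
except TypeError as exc:
    float_index_error = str(exc)
```

In tensor code the analogous change would be floor division, for example `torch.div(a, b, rounding_mode="floor")` in PyTorch versions that support that argument. The other route is to pin the older PyTorch release described in the "Environment setup failure" issue.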

How to make the reaction type unknown?

Hi,
I am trying to see the results on our dataset when the reaction type is unknown.
Which parameter is related to the reaction type during training?
In other words, what should I do to make the reaction type unknown?
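
The thread has no answer. In many seq2seq retrosynthesis setups the reaction class is passed to the model as a leading token on the tokenized product SMILES, and the "unknown type" setting simply omits that token. Whether RetroPrime's data files follow this layout is an assumption; the `<RX_n>` token format and the helper name below are hypothetical.

```python
import re

# Hypothetical class-token pattern such as "<RX_3>"; adjust to the real data format.
CLASS_TOKEN = re.compile(r"^<RX_\d+>$")

def strip_reaction_class(tokenized_line: str) -> str:
    """Drop a leading reaction-class token from a space-tokenized source line."""
    tokens = tokenized_line.split()
    if tokens and CLASS_TOKEN.match(tokens[0]):
        tokens = tokens[1:]
    return " ".join(tokens)

with_class = "<RX_3> C C ( = O ) O C"
without_class = strip_reaction_class(with_class)
unchanged = strip_reaction_class("C C ( = O ) O C")
```

If the data does use such a token, the training, validation, and test source files would all need the same treatment so that the model never sees the class at any stage.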

FileNotFoundError

Hello! When I run the run_example.sh script, it gives me a FileNotFoundError. Could you please let me know where I can obtain the files USPTO-50K_pos_pred_model_step_90000.pt and USPTO-50K_S2R_model_step_100000.pt? Thank you very much!
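
The thread does not say where the checkpoints are hosted. A small pre-flight check can at least report which files are missing before the script starts. The file names come from the question above; the `checkpoints` directory is a placeholder, and the helper name is illustrative.

```python
from pathlib import Path

REQUIRED_CHECKPOINTS = [
    "USPTO-50K_pos_pred_model_step_90000.pt",
    "USPTO-50K_S2R_model_step_100000.pt",
]

def missing_checkpoints(directory, names=REQUIRED_CHECKPOINTS):
    """Return the checkpoint file names that are not present in `directory`."""
    root = Path(directory)
    return [name for name in names if not (root / name).is_file()]

# "checkpoints" is a placeholder; use the path that run_example.sh expects.
for name in missing_checkpoints("checkpoints"):
    print(f"missing checkpoint: {name}")
```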

Always Loading Data

When I run train.py, the "loading data" message is displayed continuously. Is this normal?

How to get the top-k accuracy
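
The issue has no body and no reply. As a general illustration (not the repository's evaluation script), top-k accuracy counts a test reaction as correct when the ground-truth reactants appear among the model's first k ranked predictions. The sketch assumes predictions and targets are already canonical SMILES strings, so plain string comparison is valid; the function name and toy data are made up for the example.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of examples whose target is among the first k ranked predictions."""
    if len(ranked_predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        1 for preds, target in zip(ranked_predictions, targets)
        if target in preds[:k]
    )
    return hits / len(targets)

# Toy data: three examples, beam outputs ordered best-first.
predictions = [
    ["CCO.CC(=O)O", "CCO.CC(=O)Cl"],
    ["c1ccccc1Br", "c1ccccc1I"],
    ["CN", "CO"],
]
targets = ["CCO.CC(=O)O", "c1ccccc1I", "CC"]

top1 = top_k_accuracy(predictions, targets, k=1)
top2 = top_k_accuracy(predictions, targets, k=2)
```

With real model output, the predictions and targets would normally be canonicalized with the same toolkit before comparison, since two different SMILES strings can describe the same molecule.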

How to get the data needed in the new_raw_all.csv file?

In the data directory provided, the file retrosim/retrosim/data/get_data.py, which is used to split data_preprocessed.csv into train, validation and test sets, is incomplete: it raises an error about a missing function definition. How can I get the train, validation and test data needed to compile new_raw_all.csv?

Even this data: https://raw.githubusercontent.com/connorcoley/retrosim/0a272f0b5de833c448f41491e81e4dc00b4d85b0/retrosim/data/data_processed.csv
does not follow the format that retroprime needs.
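
The thread does not resolve this. If the goal is only to produce train, validation, and test partitions from a single table of rows, a seeded shuffle and split is one option. The 80/10/10 ratio, the seed, and the function name are assumptions; this sketch does not reproduce retrosim's own split or the column layout that new_raw_all.csv expects.

```python
import random

def split_rows(rows, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle rows deterministically and split them into train/val/test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains after train and val
    return train, val, test

# Toy rows standing in for records read from a CSV file.
rows = [{"id": i} for i in range(100)]
train, val, test = split_rows(rows)
```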

Acc drops rapidly when training the P2S model in the uspto-full dataset

I would like to reproduce the result on the USPTO-full dataset, but ran into a problem: the accuracy of the P2S model drops rapidly.

I have trained for over 50,000 steps and the accuracy was about 50%. Is this normal?

[2022-08-09 14:28:33,864 INFO] encoder: 41252864
[2022-08-09 14:28:33,865 INFO] decoder: 54924817
[2022-08-09 14:28:33,865 INFO] * number of parameters: 96177681
[2022-08-09 14:28:33,889 INFO] Start training...
[2022-08-09 14:28:41,843 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.0.pt, number of examples: 1000000
[2022-08-09 14:28:41,844 INFO] train_iter finished
[2022-08-09 15:03:38,255 INFO] Step 1000/250000; acc:  81.60; ppl:  1.78; xent: 0.58; lr: 0.00012; 7395/7535 tok/s;   2096 sec
[2022-08-09 15:38:33,482 INFO] Step 2000/250000; acc:  95.38; ppl:  1.15; xent: 0.14; lr: 0.00025; 7380/7566 tok/s;   4192 sec
[2022-08-09 16:05:19,712 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.1.pt, number of examples: 1000000
[2022-08-09 16:13:46,337 INFO] Step 3000/250000; acc:  97.17; ppl:  1.09; xent: 0.09; lr: 0.00037; 7480/7610 tok/s;   6304 sec
[2022-08-09 16:48:51,050 INFO] Step 4000/250000; acc:  96.94; ppl:  1.11; xent: 0.11; lr: 0.00049; 7355/7484 tok/s;   8409 sec
[2022-08-09 17:23:55,865 INFO] Step 5000/250000; acc:  96.57; ppl:  1.11; xent: 0.10; lr: 0.00062; 7270/7407 tok/s;  10514 sec
[2022-08-09 17:42:44,180 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.2.pt, number of examples: 1000000
[2022-08-09 17:59:22,875 INFO] Step 6000/250000; acc:  96.90; ppl:  1.09; xent: 0.09; lr: 0.00074; 7399/7564 tok/s;  12641 sec
[2022-08-09 18:34:50,745 INFO] Step 7000/250000; acc:  95.93; ppl:  1.12; xent: 0.11; lr: 0.00086; 7320/7514 tok/s;  14769 sec
[2022-08-09 19:10:08,914 INFO] Step 8000/250000; acc:  96.79; ppl:  1.10; xent: 0.09; lr: 0.00099; 7289/7435 tok/s;  16887 sec
[2022-08-09 19:20:38,308 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.3.pt, number of examples: 1000000
[2022-08-09 19:45:17,775 INFO] Step 9000/250000; acc:  96.71; ppl:  1.11; xent: 0.10; lr: 0.00093; 7414/7583 tok/s;  18996 sec
[2022-08-09 20:20:10,783 INFO] Step 10000/250000; acc:  96.74; ppl:  1.11; xent: 0.10; lr: 0.00088; 7208/7353 tok/s;  21089 sec
[2022-08-09 20:20:10,787 INFO] Saving checkpoint experiments/checkpoints/uspto_full_pos_pred/151_uspto_full_pos_pred_model_step_10000.pt
[2022-08-09 20:55:25,013 INFO] Step 11000/250000; acc:  88.51; ppl:  1.44; xent: 0.36; lr: 0.00084; 6400/6718 tok/s;  23203 sec
[2022-08-09 21:00:21,009 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.4.pt, number of examples: 1000000
[2022-08-09 21:30:28,027 INFO] Step 12000/250000; acc:  92.56; ppl:  1.27; xent: 0.24; lr: 0.00081; 7225/7443 tok/s;  25306 sec
[2022-08-09 22:05:20,563 INFO] Step 13000/250000; acc:  85.15; ppl:  1.60; xent: 0.47; lr: 0.00078; 7314/7448 tok/s;  27399 sec
[2022-08-09 22:40:05,992 INFO] Step 14000/250000; acc:  62.15; ppl:  3.08; xent: 1.13; lr: 0.00075; 7438/7593 tok/s;  29484 sec
[2022-08-09 22:40:22,370 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.5.pt, number of examples: 1000000
[2022-08-09 23:14:50,982 INFO] Step 15000/250000; acc:  48.91; ppl:  4.95; xent: 1.60; lr: 0.00072; 7404/7537 tok/s;  31569 sec
[2022-08-09 23:49:25,286 INFO] Step 16000/250000; acc:  51.57; ppl:  4.75; xent: 1.56; lr: 0.00070; 6632/7134 tok/s;  33643 sec
[2022-08-10 00:19:41,798 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.6.pt, number of examples: 1000000
[2022-08-10 00:24:05,406 INFO] Step 17000/250000; acc:  52.31; ppl:  4.61; xent: 1.53; lr: 0.00068; 7293/7659 tok/s;  35724 sec
[2022-08-10 00:58:41,297 INFO] Step 18000/250000; acc:  50.13; ppl:  4.72; xent: 1.55; lr: 0.00066; 6996/7350 tok/s;  37799 sec
[2022-08-10 01:33:25,863 INFO] Step 19000/250000; acc:  52.59; ppl:  4.30; xent: 1.46; lr: 0.00064; 7378/7524 tok/s;  39884 sec

Any reply would be greatly appreciated!
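
To pinpoint where the collapse begins, the step and accuracy can be pulled out of log lines like those above. This is a minimal sketch: the regular expression matches the line format shown in this log, and the 5-point drop threshold and function names are arbitrary choices.

```python
import re

STEP_LINE = re.compile(r"Step\s+(\d+)/\d+; acc:\s*([\d.]+); ppl:\s*([\d.]+)")

def parse_steps(log_text):
    """Return a list of (step, accuracy, perplexity) tuples found in the log text."""
    return [
        (int(step), float(acc), float(ppl))
        for step, acc, ppl in STEP_LINE.findall(log_text)
    ]

def first_drop(records, threshold=5.0):
    """Return the first step whose accuracy fell by more than `threshold` points."""
    for (_, prev_acc, _), (step, acc, _) in zip(records, records[1:]):
        if prev_acc - acc > threshold:
            return step
    return None

# Three lines copied from the log above.
log = """\
[2022-08-09 20:20:10,783 INFO] Step 10000/250000; acc:  96.74; ppl:  1.11; xent: 0.10; lr: 0.00088; 7208/7353 tok/s;  21089 sec
[2022-08-09 20:55:25,013 INFO] Step 11000/250000; acc:  88.51; ppl:  1.44; xent: 0.36; lr: 0.00084; 6400/6718 tok/s;  23203 sec
[2022-08-09 21:30:28,027 INFO] Step 12000/250000; acc:  92.56; ppl:  1.27; xent: 0.24; lr: 0.00081; 7225/7443 tok/s;  25306 sec
"""

records = parse_steps(log)
drop_step = first_drop(records)
```

Applied to the full log above, the first drop of more than 5 points is at step 11000, immediately after the checkpoint saved at step 10000, so that checkpoint is the last one written before the accuracy degrades.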

Environment setup failure

Opening this as more of a note to future readers. The current Anaconda environment setup instructions fail to solve for me and after much experimentation I was able to get a working environment like so:

conda create -n retroprime-env python=3.6 pytorch=1.5.0 torchvision torchtext cudatoolkit=10.1 -c pytorch
conda activate retroprime-env
pip install rdkit-pypi
conda install pandas tqdm six

For whatever reason, this would only resolve for me if I specify all of the PyTorch dependencies when first creating the environment like that. I found that RDKit had to be installed from pip (conda complains about conflicts).

Newer versions of PyTorch will not work and will encounter quite an opaque error to do with index data types. Additionally, the project is using legacy data structures from the older versions of torchtext.

Also note that some of these are undocumented dependencies, but you'll find you need them when trying to train and test.
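
A quick way to confirm that an environment built this way has everything importable is to probe for each module. The module list below is derived from the packages named in the commands above (the `rdkit-pypi` package provides the `rdkit` module); the helper name is illustrative.

```python
import importlib.util

# Modules provided by the packages listed in the setup commands.
REQUIRED_MODULES = ["torch", "torchvision", "torchtext", "rdkit", "pandas", "tqdm", "six"]

def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules()
print("all modules found" if not missing else "missing: " + ", ".join(missing))
```

This only checks that the modules can be located, not that their versions match the ones pinned above.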
