wangxr0526 / retroprime
Code for Single-step Retrosynthesis model RetroPrime
License: MIT License
If it's convenient, could you add my QQ (729868721) so we can discuss? Thanks!
Products to Synthons
0%| | 0/1 [00:00<?, ?it/s]/home/lzf/software/anaconda3/envs/seq_gr/lib/python3.7/site-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
var = torch.tensor(arr, dtype=self.dtype, device=device)
/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py:613: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
return torch.tensor(a, requires_grad=False)
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "retroprime/transformer_model/translate.py", line 53, in <module>
main(opt)
File "retroprime/transformer_model/translate.py", line 34, in main
attn_debug=opt.attn_debug)
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py", line 238, in translate
batch_data = self.translate_batch(batch, data, fast=self.fast)
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py", line 375, in translate_batch
return self._translate_batch(batch, data)
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/translator.py", line 712, in _translate_batch
beam_attn.data[:, j, :memory_lengths[j]])
File "/home/lzf/programme/Retroprime/RetroPrime/retroprime/transformer_model/onmt/translate/beam.py", line 140, in advance
self.attn.append(attn_out.index_select(0, prev_k))
RuntimeError: "index_select_out_cuda_impl" not implemented for 'Float'
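For anyone hitting this: a likely cause (my reading of the traceback, not confirmed against the upstream code) is that newer PyTorch versions make `/` on integer tensors return a float tensor, so the beam-search index tensor ends up `Float` instead of `Long`, which `index_select` rejects. A minimal sketch of the failure mode and a fix (the tensor names mirror the traceback but are otherwise hypothetical):

```python
import torch

# Stand-in tensors mirroring beam.py's advance(): attn_out is the
# attention for each beam, and the beam index is derived by dividing
# flat score ids by the vocabulary size.
attn_out = torch.rand(4, 7)
best_scores_id = torch.tensor([5, 9, 12, 2])
num_words = 4

# In newer PyTorch, true division of integer tensors yields a float
# tensor, and attn_out.index_select(0, prev_k_float) then raises the
# "not implemented for 'Float'" RuntimeError seen above.
prev_k_float = best_scores_id / num_words

# Fix: keep the division integral so the index tensor stays Long
# (alternatively, pin PyTorch 1.5 as suggested later in this thread).
prev_k = best_scores_id // num_words
selected = attn_out.index_select(0, prev_k)
```

Casting with `prev_k.long()` at the `index_select` call site would work too; keeping the division integral just fixes the dtype at its origin.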
Hi,
I am trying to evaluate the model on our own dataset, where the reaction type is unknown.
Which training parameter is related to the reaction type?
In other words, what should I change if I want to train with the reaction type unknown?
Hello! When I run the run_example.sh script, it gives me a FileNotFoundError. Could you please let me know where I can obtain the files USPTO-50K_pos_pred_model_step_90000.pt and USPTO-50K_S2R_model_step_100000.pt? Thank you very much!
In the data directory provided, the file retrosim/retrosim/data/get_data.py, which is used to split data_processed.csv into train, validation, and test sets, is incomplete: it raises an error about a missing function definition. How can I obtain the train, validation, and test splits needed to compile the new_raw_all.csv data?
Even this data: https://raw.githubusercontent.com/connorcoley/retrosim/0a272f0b5de833c448f41491e81e4dc00b4d85b0/retrosim/data/data_processed.csv
does not follow the format that RetroPrime needs.
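Until the upstream get_data.py is fixed, a minimal stand-in split can be sketched as follows. The 80/10/10 ratios and the seed are my assumptions for illustration, not necessarily what retrosim actually used:

```python
import random

# Hypothetical replacement for the missing split logic in retrosim's
# get_data.py: a simple seeded 80/10/10 random split over the rows of
# data_processed.csv (placeholder strings stand in for real rows here).
rows = [f"reaction_{i}" for i in range(1000)]

random.Random(42).shuffle(rows)          # fixed seed for reproducibility
n_train = int(0.8 * len(rows))
n_valid = int(0.1 * len(rows))
train = rows[:n_train]
valid = rows[n_train:n_train + n_valid]
test = rows[n_train + n_valid:]

print(len(train), len(valid), len(test))
```

Note this will not reproduce the exact splits from the paper; for comparable accuracy numbers you would still need the original split files.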
I would like to reproduce the results on the USPTO-full dataset, but I have run into a problem: the accuracy of the P2S (Products to Synthons) stage drops rapidly during training.
I have trained for over 50,000 steps and the accuracy is only about 50%. Is this normal?
[2022-08-09 14:28:33,864 INFO] encoder: 41252864
[2022-08-09 14:28:33,865 INFO] decoder: 54924817
[2022-08-09 14:28:33,865 INFO] * number of parameters: 96177681
[2022-08-09 14:28:33,889 INFO] Start training...
[2022-08-09 14:28:41,843 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.0.pt, number of examples: 1000000
[2022-08-09 14:28:41,844 INFO] train_iter finished
[2022-08-09 15:03:38,255 INFO] Step 1000/250000; acc: 81.60; ppl: 1.78; xent: 0.58; lr: 0.00012; 7395/7535 tok/s; 2096 sec
[2022-08-09 15:38:33,482 INFO] Step 2000/250000; acc: 95.38; ppl: 1.15; xent: 0.14; lr: 0.00025; 7380/7566 tok/s; 4192 sec
[2022-08-09 16:05:19,712 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.1.pt, number of examples: 1000000
[2022-08-09 16:13:46,337 INFO] Step 3000/250000; acc: 97.17; ppl: 1.09; xent: 0.09; lr: 0.00037; 7480/7610 tok/s; 6304 sec
[2022-08-09 16:48:51,050 INFO] Step 4000/250000; acc: 96.94; ppl: 1.11; xent: 0.11; lr: 0.00049; 7355/7484 tok/s; 8409 sec
[2022-08-09 17:23:55,865 INFO] Step 5000/250000; acc: 96.57; ppl: 1.11; xent: 0.10; lr: 0.00062; 7270/7407 tok/s; 10514 sec
[2022-08-09 17:42:44,180 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.2.pt, number of examples: 1000000
[2022-08-09 17:59:22,875 INFO] Step 6000/250000; acc: 96.90; ppl: 1.09; xent: 0.09; lr: 0.00074; 7399/7564 tok/s; 12641 sec
[2022-08-09 18:34:50,745 INFO] Step 7000/250000; acc: 95.93; ppl: 1.12; xent: 0.11; lr: 0.00086; 7320/7514 tok/s; 14769 sec
[2022-08-09 19:10:08,914 INFO] Step 8000/250000; acc: 96.79; ppl: 1.10; xent: 0.09; lr: 0.00099; 7289/7435 tok/s; 16887 sec
[2022-08-09 19:20:38,308 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.3.pt, number of examples: 1000000
[2022-08-09 19:45:17,775 INFO] Step 9000/250000; acc: 96.71; ppl: 1.11; xent: 0.10; lr: 0.00093; 7414/7583 tok/s; 18996 sec
[2022-08-09 20:20:10,783 INFO] Step 10000/250000; acc: 96.74; ppl: 1.11; xent: 0.10; lr: 0.00088; 7208/7353 tok/s; 21089 sec
[2022-08-09 20:20:10,787 INFO] Saving checkpoint experiments/checkpoints/uspto_full_pos_pred/151_uspto_full_pos_pred_model_step_10000.pt
[2022-08-09 20:55:25,013 INFO] Step 11000/250000; acc: 88.51; ppl: 1.44; xent: 0.36; lr: 0.00084; 6400/6718 tok/s; 23203 sec
[2022-08-09 21:00:21,009 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.4.pt, number of examples: 1000000
[2022-08-09 21:30:28,027 INFO] Step 12000/250000; acc: 92.56; ppl: 1.27; xent: 0.24; lr: 0.00081; 7225/7443 tok/s; 25306 sec
[2022-08-09 22:05:20,563 INFO] Step 13000/250000; acc: 85.15; ppl: 1.60; xent: 0.47; lr: 0.00078; 7314/7448 tok/s; 27399 sec
[2022-08-09 22:40:05,992 INFO] Step 14000/250000; acc: 62.15; ppl: 3.08; xent: 1.13; lr: 0.00075; 7438/7593 tok/s; 29484 sec
[2022-08-09 22:40:22,370 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.5.pt, number of examples: 1000000
[2022-08-09 23:14:50,982 INFO] Step 15000/250000; acc: 48.91; ppl: 4.95; xent: 1.60; lr: 0.00072; 7404/7537 tok/s; 31569 sec
[2022-08-09 23:49:25,286 INFO] Step 16000/250000; acc: 51.57; ppl: 4.75; xent: 1.56; lr: 0.00070; 6632/7134 tok/s; 33643 sec
[2022-08-10 00:19:41,798 INFO] Loading train dataset from data/uspto_full_pos_pred/uspto_full_pos_pred.train.6.pt, number of examples: 1000000
[2022-08-10 00:24:05,406 INFO] Step 17000/250000; acc: 52.31; ppl: 4.61; xent: 1.53; lr: 0.00068; 7293/7659 tok/s; 35724 sec
[2022-08-10 00:58:41,297 INFO] Step 18000/250000; acc: 50.13; ppl: 4.72; xent: 1.55; lr: 0.00066; 6996/7350 tok/s; 37799 sec
[2022-08-10 01:33:25,863 INFO] Step 19000/250000; acc: 52.59; ppl: 4.30; xent: 1.46; lr: 0.00064; 7378/7524 tok/s; 39884 sec
Any reply would be greatly appreciated!
Opening this as more of a note to future readers. The current Anaconda environment setup instructions fail to resolve for me; after much experimentation I was able to get a working environment like so:
conda create -n retroprime-env python=3.6 pytorch=1.5.0 torchvision torchtext cudatoolkit=10.1 -c pytorch
conda activate retroprime-env
pip install rdkit-pypi
conda install pandas tqdm six
For whatever reason, this would only resolve for me if I specified all of the PyTorch dependencies up front when creating the environment, as above. I also found that RDKit had to be installed from pip (conda complains about conflicts).
Newer versions of PyTorch will not work: you will hit a rather opaque error involving index data types (the index_select 'Float' RuntimeError reported earlier in this thread). The project also relies on legacy data structures from older versions of torchtext.
Also note that some of these are undocumented dependencies, but you'll find you need them when trying to train and test.