ReTReK: data-driven ReTrosynthesis planning application using Retrosynthesis Knowledge

License: MIT License

Java 2.69% CSS 1.29% Python 19.42% Pawn 76.37% HTML 0.23%

retrek's Introduction

ReTReK: ReTrosynthesis planning application using Retrosynthesis Knowledge

This package provides a data-driven computer-aided synthesis planning tool that uses retrosynthesis knowledge. The ReTReK model in this package was trained on the US Patent dataset instead of the Reaxys reaction dataset, so we cannot guarantee that it reproduces the synthetic routes reported in the manuscript.

Note: The pure Python version of ReTReK is available at https://github.com/clinfo/ReTReKpy

Dependency

Environment (confirmed)

  • Ubuntu: 18.04 (model training & synthetic route prediction)
  • macOS Catalina: 10.15.7 (synthetic route prediction)

Package

Setup

Please refer to the following link.

Example usage

Note: The order of the knowledge arguments corresponds to that of the knowledge_weight arguments.

javac CxnUtils.java  # for the first time only

# use all knowledge
python run.py --config config/sample.json --target data/sample.mol --knowledge cdscore rdscore asscore stscore --knowledge_weights 1.0 1.0 1.0 1.0

# use CDScore with a weight of 2.0
python run.py --config config/sample.json --target data/sample.mol --knowledge cdscore --knowledge_weights 2.0 0.0 0.0 0.0
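The pairing between --knowledge and --knowledge_weights described in the note above can be sketched with a small helper that assembles the command line. This is only an illustration: build_command is a hypothetical function that is not part of ReTReK, and padding the weights with 0.0 up to four values is an assumption taken from the examples above, not a confirmed requirement of run.py.

```python
from typing import Dict, List


def build_command(
    target: str,
    weights: Dict[str, float],
    config: str = "config/sample.json",
    n_slots: int = 4,  # assumption: the examples always pass four weights
) -> List[str]:
    """Assemble a run.py command line (hypothetical helper, not part of ReTReK).

    Knowledge names and weights are emitted in the same order, and the
    weight list is padded with 0.0 so that it always has n_slots entries.
    """
    names = list(weights)  # dicts keep insertion order
    values = [float(weights[name]) for name in names]
    if len(values) > n_slots:
        raise ValueError(f"at most {n_slots} knowledge terms are supported here")
    values += [0.0] * (n_slots - len(values))
    return [
        "python", "run.py",
        "--config", config,
        "--target", target,
        "--knowledge", *names,
        "--knowledge_weights", *[str(v) for v in values],
    ]


if __name__ == "__main__":
    # Reproduces the "CDScore with a weight of 2.0" example.
    print(" ".join(build_command("data/sample.mol", {"cdscore": 2.0})))
```

Building both argument lists from one mapping keeps each weight next to the knowledge name it belongs to, which makes ordering mistakes harder to introduce.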

If you want to try your own molecule, prepare it in MDL MOLfile format and replace data/sample.mol with the path to the prepared file.

The target molecules used in the manuscript are stored in data/evaluation_compounds. To try the molecules in that directory, run the commands as follows:

NOTE: The commands below need additional files that are stored with git-lfs. Before running them, run git lfs install && git lfs pull to download data/starting_materials_zinc.smi.

python run.py --config config/sample2.json --target data/evaluation_compounds/drug-like-compounds/MtbTMPK_inhibitor.mol --knowledge cdscore --knowledge_weights 5.0 0.0 0.0 0.0 --sel_const 10 --expansion_num 500

python run.py --config config/sample2.json --target data/evaluation_compounds/drug-like-compounds/α7_nicotinic_acetylcholine_receptor_silent_agonist.mol --knowledge cdscore --knowledge_weights 5.0 0.0 0.0 0.0 --sel_const 10 --expansion_num 500
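Before running the commands above, it can help to confirm that data/starting_materials_zinc.smi was actually fetched by git-lfs. When git lfs pull has not been run, the file on disk is a small text pointer whose first line starts with "version https://git-lfs.github.com/spec/v1". The check below is a minimal sketch; is_lfs_pointer is a hypothetical helper name, not something ReTReK provides.

```python
import sys

# First bytes of a git-lfs pointer file (the real data has not been pulled yet).
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: str) -> bool:
    """Return True if the file at `path` is still a git-lfs pointer stub."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_HEADER)) == LFS_HEADER


if __name__ == "__main__":
    # File name taken from the note above; pass another path to override it.
    path = sys.argv[1] if len(sys.argv) > 1 else "data/starting_materials_zinc.smi"
    if is_lfs_pointer(path):
        print(f"{path} is a git-lfs pointer; run 'git lfs install && git lfs pull'")
    else:
        print(f"{path} looks like real data")
```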

Optional arguments

  • --sel_const: constant value for selection (default value is set to 3).
  • --expansion_num: number of reaction templates used in the expansion step (default value is set to 50).
  • --starting_material: path to SMILES format file containing starting materials.
  • --search_count: the maximum number of iterations of MCTS (default value is set to 100).
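The optional arguments above can be summarized as an argparse sketch. The flag names and default values come from the list; the argument types (int for the three numeric options) and the None default for --starting_material are assumptions, and the actual parser in run.py may be defined differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the documented options (not run.py itself)."""
    parser = argparse.ArgumentParser(description="Sketch of the optional arguments")
    # Types below are assumptions; only names and defaults are documented.
    parser.add_argument("--sel_const", type=int, default=3,
                        help="constant value for selection")
    parser.add_argument("--expansion_num", type=int, default=50,
                        help="number of reaction templates used in the expansion step")
    parser.add_argument("--starting_material", type=str, default=None,
                        help="path to a SMILES file containing starting materials")
    parser.add_argument("--search_count", type=int, default=100,
                        help="maximum number of MCTS iterations")
    return parser


if __name__ == "__main__":
    # Same overrides as in the evaluation-compound commands.
    args = build_parser().parse_args(["--sel_const", "10", "--expansion_num", "500"])
    print(args)
```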

Terms

Convergent Disconnection Score (CDScore)

CDScore aims to favor convergent synthesis, which is known as an efficient strategy in multi-step chemical synthesis.

Available Substances Score (ASScore)

With a purpose similar to that of CDScore, ASScore counts the number of available substances generated in a reaction step.

Ring Disconnection Score (RDScore)

A ring construction strategy is preferred if a target compound has complex ring structures.

Selective Transformation Score (STScore)

A synthetic reaction with few by-products is generally preferred in terms of yield.

Contact

Reference

@article{Ishida2022,
  doi = {10.1021/acs.jcim.1c01074},
  url = {https://doi.org/10.1021/acs.jcim.1c01074},
  year = {2022},
  month = mar,
  publisher = {American Chemical Society ({ACS})},
  volume = {62},
  number = {6},
  pages = {1357--1367},
  author = {Shoichi Ishida and Kei Terayama and Ryosuke Kojima and Kiyosei Takasu and Yasushi Okuno},
  title = {{AI}-Driven Synthetic Route Design Incorporated with Retrosynthesis Knowledge},
  journal = {Journal of Chemical Information and Modeling}
}

This application is developed as part of a kGCN project.

retrek's People

Contributors

sishida21

retrek's Issues

jchem version related question

Error:

(retrek) root@12d7c7a2ac87:~/ReTReK# javac CxnUtils.java
CxnUtils.java:1: error: cannot access ValenceErrorChecker
import chemaxon.checkers.ValenceErrorChecker;
^
bad class file: /opt/opt/chemaxon/jchemsuite/lib/com.chemaxon-structurechecker.jar(chemaxon/checkers/ValenceErrorChecker.class)
class file has wrong version 55.0, should be 52.0
Please remove or make sure it appears in the correct subdirectory of the classpath.

Question:

Currently, only JChem version 20.19 is available for download. Is there any way to use it instead of the 20.13 version suggested in the instructions?

If not, could you please share the Linux installation file for that version?
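For context on the error above: the numbers in "class file has wrong version 55.0, should be 52.0" are Java class file major versions, where 52 corresponds to Java 8 and 55 to Java 11. In other words, the jar was compiled for a newer Java release than the javac in use can read. The sketch below reads the major version from a class file header; the function names are illustrative, and the mapping major - 44 holds for Java 5 and later.

```python
import struct

CLASS_MAGIC = 0xCAFEBABE  # first four bytes of every Java class file


def class_file_major_version(data: bytes) -> int:
    """Return the major version stored in the first 8 bytes of a class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    # Header layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, _minor, major = struct.unpack(">IHH", data[:8])
    if magic != CLASS_MAGIC:
        raise ValueError("not a Java class file")
    return major


def java_release(major: int) -> int:
    """Map a class file major version to a Java release (valid for Java 5+)."""
    return major - 44


if __name__ == "__main__":
    # Synthetic header with major version 55, as reported in the error.
    header = struct.pack(">IHH", CLASS_MAGIC, 0, 55)
    major = class_file_major_version(header)
    print(f"major version {major} -> Java {java_release(major)}")
```

Under this reading, compiling CxnUtils.java with a JDK at least as new as the one the jar targets would be the first thing to try, although whether ReTReK then works with JChem 20.19 is not confirmed here.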

Deprecated Dependency Versions in Environment.yml

Hi Team,

I need a little help. I am trying to create an environment using the default environment.yml on an Ubuntu 18.04.6 ARM machine.
But I am facing issues with the following packages:

  • freetype==2.9.1=h8a8886c_1
  • libxslt==1.1.33=h7d1a2b0_0
  • setuptools==46.4.0=py37_0
  • yaml==0.1.7=had09818_2
  • intel-openmp==2020.1=217
  • mkl==2020.1=217
  • tensorflow==1.13.1=gpu_py37hc158e3b_0
  • itsdangerous==1.1.0=py37_0
  • scikit-learn==0.22.1=py37hd81dba3_0
  • libstdcxx-ng==9.1.0=hdf63c60_0
  • pyzmq==18.1.1=py37he6710b0_0
  • lxml==4.5.0=py37hefd8a0e_0
  • beautifulsoup4==4.9.1=py37_0
  • icu==58.2=he6710b0_3
  • lz4-c==1.9.2=he6710b0_0
  • webencodings==0.5.1=py37_1
  • ipykernel==5.1.4=py37h39e3cac_0
  • mkl-service==2.3.0=py37he904b0f_0
  • libtiff==4.1.0=h2733197_1
  • docopt==0.6.2=py37_0
  • libboost==1.67.0=h46d08c1_4
  • sip==4.19.8=py37hf484d3e_0
  • blas==1.0=mkl
  • libffi==3.3=he6710b0_1
  • openssl==1.1.1g=h7b6447c_0
  • cffi==1.14.0=py37he30daa8_1
  • send2trash==1.5.0=py37_0
  • astor==0.8.0=py37_0
  • zeromq==4.3.1=he6710b0_3
  • _tflow_select==2.1.0=gpu
  • dbus==1.13.14=hb2f20db_0
  • gst-plugins-base==1.14.0=hbbd80ab_1
  • ld_impl_linux-64==2.33.1=h53a641e_7
  • tensorboard==1.13.1=py37hf484d3e_0
  • sqlite==3.31.1=h62c20be_1
  • pyqt==5.9.2=py37h05f1152_2
  • hdf5==1.10.4=hb1b8bf9_0
  • py4j==0.10.8.1=py37_0
  • cycler==0.10.0=py37_0
  • cudnn==7.6.5=cuda10.0_0
  • cudatoolkit==10.0.130=0
  • libgfortran-ng==7.3.0=hdf63c60_0
  • qt==5.9.7=h5867ecd_1
  • c-ares==1.15.0=h7b6447c_1001
  • numpy-base==1.18.1=py37hde5b4d6_1
  • glib==2.63.1=h3eb4bd4_1
  • libedit==3.1.20181209=hc058e9b_0
  • kiwisolver==1.2.0=py37hfd86e86_0
  • libgcc-ng==9.1.0=hdf63c60_0
  • ncurses==6.2=he6710b0_1
  • pyyaml==5.3.1=py37h7b6447c_0
  • matplotlib-base==3.1.3=py37hef1b27d_0
  • expat==2.2.6=he6710b0_0
  • pickleshare==0.7.5=py37_0
  • libxml2==2.9.9=hea5a465_1
  • cairo==1.14.12=h8948797_3
  • markdown==3.1.1=py37_0
  • entrypoints==0.3=py37_0
  • numpy==1.18.1=py37h4f9e942_0
  • libuuid==1.0.3=h1bed415_2
  • mistune==0.8.4=py37h7b6447c_0
  • ipython==7.13.0=py37h5ca1d4c_0
  • pyrsistent==0.16.0=py37h7b6447c_0
  • zstd==1.4.4=h0b5b093_3
  • pandas==1.0.3=py37h0573a6f_0
  • tensorflow-base==1.13.1=gpu_py37h8d69cac_0
  • libpng==1.6.37=hbc83047_0
  • importlib-metadata==1.6.0=py37_0
  • markupsafe==1.1.1=py37h7b6447c_0
  • py-boost==1.67.0=py37h04863e7_4
  • ipython_genutils==0.2.0=py37_0
  • tensorflow-gpu==1.13.1=h0d30ee6_0
  • absl-py==0.9.0=py37_0
  • urllib3==1.25.8=py37_0
  • python==3.7.7=hcff3b4d_5
  • gmp==6.1.2=h6c8ec71_1
  • cupti==10.0.130=0
  • jpeg==9b=h024ee3a_2
  • readline==8.0=h7b6447c_0
  • pcre==8.43=he6710b0_0
  • h5py==2.10.0=py37h7918eee_0
  • mkl_random==1.1.1=py37h0573a6f_0
  • pillow==7.1.2=py37hb39fc2d_0
  • jedi==0.17.0=py37_0
  • ca-certificates==2020.1.1=0
  • keras-base==2.3.1=py37_0
  • pandocfilters==1.4.2=py37_1
  • pixman==0.38.0=h7b6447c_0
  • gstreamer==1.14.0=hb31296c_0
  • xz==5.2.5=h7b6447c_0
  • tk==8.6.8=hbc83047_0
  • fontconfig==2.13.0=h9420a91_0
  • bzip2==1.0.8=h7b6447c_0
  • pandoc==2.2.3.2=0
  • pip==20.0.2=py37_3
  • libsodium==1.0.16=h1bed415_0
  • jupyter==1.0.0=py37_7
  • scipy==1.4.1=py37h0b6359f_0
  • certifi==2020.4.5.1=py37_0
  • libprotobuf==3.11.4=hd408876_0
  • sqlalchemy==1.3.17=py37h7b6447c_0
  • tornado==6.0.4=py37h7b6447c_1
  • ptyprocess==0.6.0=py37_0
  • libxcb==1.13=h1bed415_1
  • grpcio==1.27.2=py37hf8bcb03_0
  • protobuf==3.11.4=py37he6710b0_0
  • backcall==0.1.0=py37_0
  • rdkit==2020.03.2.0=py37hc20afe1_1
  • zlib==1.2.11=h7b6447c_3
  • keras-gpu==2.3.1=0
  • mkl_fft==1.0.15=py37ha843d7b_0
  • termcolor==1.1.0=py37_1
  • cryptography==2.9.2=py37h1ba5d50_0

While checking the conda channels, I found that the versions and builds mentioned above are not available.
I have not worked with conda before.
Kindly let me know if there is an easy way to replace them with existing versions/builds without having to run conda search for each package and update the information manually.

If you can update the environment.yml file from your side, that would be much appreciated.
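One way to avoid editing each entry by hand is to drop the build strings (the part after the second "=") so that conda is free to pick any available build of the pinned version. The script below is a minimal sketch of that idea, assuming dependency lines of the form "  - name=version=build"; strip_build and strip_builds are hypothetical helper names. Removing build strings does not guarantee that a package exists for ARM: some entries in the list (for example mkl, intel-openmp, and the CUDA-related packages) may have no ARM counterpart at all and would need to be removed or replaced.

```python
import re

# Matches "  - name=version=build" (also "name==version=build").
_SPEC = re.compile(r"^(\s*-\s+)([A-Za-z0-9_.\-]+)==?([^=\s]+)=\S+\s*$")


def strip_build(line: str) -> str:
    """Drop the build string from one dependency line; leave other lines as-is."""
    match = _SPEC.match(line)
    if match is None:
        return line
    prefix, name, version = match.groups()
    return f"{prefix}{name}={version}"


def strip_builds(text: str) -> str:
    """Apply strip_build to every line of an environment.yml-style text."""
    return "\n".join(strip_build(line) for line in text.splitlines())


if __name__ == "__main__":
    sample = (
        "name: retrek\n"
        "dependencies:\n"
        "  - python=3.7.7=hcff3b4d_5\n"
        "  - rdkit=2020.03.2.0=py37hc20afe1_1\n"
        "  - pip\n"
    )
    print(strip_builds(sample))
```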
