GithubHelp home page GithubHelp logo

morld's Introduction

MORLD

MORLD is a molecule optimization method based on reinforcement learning and docking. This repository provides the source code of the main part of the MORLD software and its usage.

To run a demo, you need to prepare the enviroment described below.

Or simply go to MORLD web service (http://morld.kaist.ac.kr) and see the Tutorial page. The demo prepared at MORLD web service takes 1~2 days to get the result.

Prepare

Enviroment setting

  1. MolDQN and its requirements (RL framework): https://github.com/google-research/google-research/tree/master/mol_dqn

The MORLD is based on MolDQN. Therefore, to run MORLD in standalone, you must have an environment that you can run MolDQN. Usage of the MORLD is similar with MolDQN too.

  1. rdkit (QED score and molecule modification): https://www.rdkit.org/docs/Install.html
  2. gym-molecule library (SA score): https://github.com/bowenliu16/rl_graph_generation/tree/master/gym-molecule
  3. QuickVina2 (docking score): https://github.com/QVina/qvina
  4. open babel (converting file types of a molecule): https://openbabel.org/docs/dev/Installation/install.html
  5. mgltools for linux (preprocessing a target protein): http://mgltools.scripps.edu/downloads

Verified dependencies

The working of MORLD has been verified in the versions below.

  1. python: 3.7.6
  2. MolDQN: Latest github version (https://github.com/aksub99/MolDQN-pytorch)
  3. rdkit: 2018.09.1
  4. mgltools: mgltools_Linux-x86_64_1.5.7 (https://ccsb.scripps.edu/mgltools/downloads/)
  5. gym-molecule: Latest github version (https://github.com/bowenliu16/rl_graph_generation/tree/master/gym-molecule)
  6. QuickVina2: Latest github version (https://github.com/QVina/qvina)
  7. open babel: 2.4.1
  8. pandas: 1.0.1
  9. baselines: Latest github version (https://github.com/openai/baselines#installation)
  10. absl-py: 0.9.0
  11. networkx: 2.4
  12. numpy: 1.18.1
  13. tensorflow: 1.14.0

Preprocessing of a target protein

For running QuickVina2, the PDB file should have no ligand molecules. You need to remove the ligand molecules with tools like pymol before docking. And PDB file should be protonated to appropriately. You can use PDB2PQR server to protonate PDB file.

Also, the target protein is given as a pdbqt file format. Please follow the intruction of the below link to convert a pdb file format to a pdbqt file format.

http://autodock.scripps.edu/faqs-help/how-to/how-to-prepare-a-receptor-file-for-autodock4 or https://bioinformaticsreview.com/20200716/prepare-receptor-and-ligand-files-for-docking-using-python-scripts/

We provide an example pdbqt file of the protein DDR1 (discoidin domain receptor 1) 3zosA_prepared.pdbqt for demo in this repository.

Configuration file for docking

For running QuickVina2, you need a configuration file. Make the configuration file looks like below as "config.txt".

receptor = receptor.pdbqt
ligand = ligand.pdbqt

#binding_pocket
center_x = ###
center_y = ###
center_z = ###

size_x = ###
size_y = ###
size_z = ###

Fill the file name of the receptor at the placeholder of receptor. (You do not need to change the name of ligand file.) Fill the binding pocket information with the coordinate and the size of the grid box in Angstrom (ร…).

An example configuration file for demo is also provides as config.txt in this repository.

Place the required files

MORLD works inside the MolDQN. Place the below files into mol_dqn/chemgraph/ directory.

  1. optimized_BE.py file
  2. 3zos.pdbqt the receptor file with pdbqt format.
  3. config.txt file

Usage

Choose the output directory

export OUTPUT_DIR="./save"

Set the initial molecule (lead molecule)

export INIT_MOL="C1CC2=CC=CC=C2N(C1)C(=O)CN3CCC(CC3)NC4=NC(=CC(=O)N4)C(F)(F)F"

Set your own initial molecule with a SMILES representation. The example SMILES is ZINC12114041 which is found by virtual screening against the protein DDR1 (3zos).

Set the hyperparameters

At mol_dqn/configs/ directory, there are json files for the hyperparameters. You can change thoes hyperparameters as your desire.

Optimization of binding affinity

python optimize_BE.py --model_dir=${OUTPUT_DIR} --start_molecule=${INIT_MOL} --hparams="./configs/bootstrap_dqn_step1.json"

hparams could be a custom json file.

Output

The output file optimized_result_total.txt contains the optimized molecules with SMILES, docking score, SA score, and QED score by tab delimiter.

MORLD web server (http://morld.kaist.ac.kr), in addition, provides files of docking pose of each optimized molecules.

morld's People

Contributors

wsjeon92 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.