quantaosun / dock-md-fep

Open-source, automated calculation of the absolute binding free energy between a protein and a small molecule. An all-in-one workflow for free energy perturbation with OpenMM.

License: MIT License

Jupyter Notebook 73.26% HTML 26.74%

gpu medicinal-chemistry simulation free-energy-perturbation

dock-md-fep's Introduction

Dock-MD-FEP

################ PLEASE NOTE ################################################################

The current workflow may be unable to handle multi-chain protein structures from the Protein Data Bank.

The Colab version of this notebook is currently unstable because of dependency and incompatibility issues. This will be fixed later. For now, you are advised to run Dock-MD-FEP as three separate stages, each with an independent notebook:

The very first stage you can finish from here.
The second stage is a free MD (https://github.com/pablo-arantes/making-it-rain/blob/main/Protein_ligand.ipynb), or at least an MD input generation process that you can finish from here.
The third stage is the FEP. (This is only a test to assure yourself that the FEP is good to go; to run the actual job, please use a GPU platform. Remember to change the platform to CPU during the test and back to CUDA for the real job.)

############################################################################################

Installing the workflow on a Linux computer

When using the code locally, you only need to do the installation once; afterwards you can skip the installation cell.

An RTX 30-series or better GPU is recommended for the local computer.

This workflow requires a dedicated Python 3.8 environment before installation.

conda create --name Dock-MD-FEP python=3.8
conda activate Dock-MD-FEP
conda install jupyter --yes
(Dock-MD-FEP) jupyter lab Dock-MD-FEP_local_installation.ipynb

A few small things must be changed compared to the online version, such as the path names. Smina is used instead of Gnina because of possible GPU incompatibilities. The rest of the code remains the same.

The first import cell may raise an error. This does not affect docking or the MD simulation; it may only affect the later analysis, which you can alternatively do locally.

############################## Updated on 24/12/2022####################################

To use the workflow online with Colab

$Google_Drive_Path is the path in your Google Drive where you want to store the simulation data, and you should provide it.

You can use this workflow purely for docking purposes, docking plus MD simulation, or docking, MD and absolute binding free energy calculation.

After you've imported Google Drive, just fill in the four lines that define your working path, providing your PDB ID and small molecule structure, and the rest will run automatically with just one click.

Default parameters (a longer simulation time is necessary for a proper simulation)

-- MD: all MD inputs use the Amber GAFF2 force field for the small molecule.

Before the FEP simulation, the workflow proceeds through one MD run (MD0), a first docking (docking 1), a second MD run (MD1), a second docking (docking 2), and a final MD run (MD2), for a total of 25 ns of MD simulation.

MD0: equilibration 1 ns, production 2 ns. OpenMM as the simulation engine.

MD0 result in $Google_Drive_Path (MD result for the PDB structure)

1st docking result in $Google_Drive_Path and $Google_Drive_Path/MD1

MD1: equilibration 2 ns, production 5 ns. OpenMM as the simulation engine.

MD1 result in $Google_Drive_Path/MD1 (MD result for the docked small molecule)

2nd docking result in $Google_Drive_Path/MD1 and $Google_Drive_Path/MD1/MD2

MD2: equilibration 5 ns, production 10 ns. OpenMM as the simulation engine.

MD2 result in $Google_Drive_Path/MD1/MD2 (MD result for the docked small molecule)
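The 25 ns total can be checked by adding up the three MD stages. The snippet below is only a bookkeeping sketch: the names STAGES and total_md_time_ns are hypothetical and are not part of the notebook, and the numbers are copied from the defaults listed here.

```python
# Default stage lengths in ns and result folders, copied from the list above.
# STAGES and total_md_time_ns are illustrative names, not notebook variables.
STAGES = [
    # (stage, equilibration_ns, production_ns, result_folder)
    ("MD0", 1.0, 2.0, "$Google_Drive_Path"),
    ("MD1", 2.0, 5.0, "$Google_Drive_Path/MD1"),
    ("MD2", 5.0, 10.0, "$Google_Drive_Path/MD1/MD2"),
]


def total_md_time_ns(stages=STAGES):
    """Sum equilibration and production time over all MD stages."""
    return sum(eq + prod for _, eq, prod, _ in stages)


for name, eq, prod, folder in STAGES:
    print(f"{name}: {eq:g} ns equilibration + {prod:g} ns production -> {folder}")
print(f"Total MD time: {total_md_time_ns():g} ns")
```

If you lengthen any stage, the same sum gives a quick estimate of the GPU time you are committing to before the FEP step.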


Dock: gnina, --exhaustiveness=200. The pose with the best docking score and the best CNN score is carried forward to FEP.

FEP: the simulation stops when the error falls below 0.1 kT. Simulation engine: OpenMM with the YANK Python library.

Free energy of binding(benzene) : -11.229 +- 0.352 kT (-6.694 +- 0.210 kcal/mol)
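The two sets of units in this result are related by the thermal energy kT. As a minimal sketch, assuming a temperature of 300 K (the temperature is not stated here; 300 K is an assumption that happens to reproduce the quoted figures), the conversion looks like this. The function name kt_to_kcal_per_mol is hypothetical.

```python
# Boltzmann constant expressed per mole (the gas constant R), in kcal/(mol*K).
K_B_KCAL_PER_MOL_K = 0.0019872041


def kt_to_kcal_per_mol(value_kt, temperature_k=300.0):
    """Convert an energy given in units of kT to kcal/mol.

    temperature_k defaults to 300 K, which is an assumption of this sketch.
    """
    return value_kt * K_B_KCAL_PER_MOL_K * temperature_k


dg = kt_to_kcal_per_mol(-11.229)
err = kt_to_kcal_per_mol(0.352)
print(f"{dg:.3f} +- {err:.3f} kcal/mol")  # -6.694 +- 0.210 kcal/mol
```

The same factor applies to the stopping criterion: an error of 0.1 kT is roughly 0.06 kcal/mol at this temperature.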

Restart the simulation

Open Dock-MD-FEP-restart.ipynb and paste in the working directory path of the last simulation. Import the drive manually, then run all the other cells at once.

Troubleshooting

For the CUDA-version-related error CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222), please refer to #2 or the OpenMM issue openmm/openmm#3585.

Those with access to AI studio can try this workflow with automatic analysis. (Extend the simulation steps if you have enough GPU time)

The user is expected to upload the ligand.mol2 + starting_end.pdb to the default path, then submit the job to the queue.

https://aistudio.baidu.com/clusterprojectdetail/7677668


dock-md-fep's Issues

Error in yank step

Hi,

I am getting the following error, during the yank step.

AttributeError: 'MBAR' object has no attribute 'getFreeEnergyDifferences'

I am running the program locally
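The issue does not state a cause. One plausible explanation, offered here as an assumption, is that pymbar 4.x is installed locally: that release renamed MBAR.getFreeEnergyDifferences() to MBAR.compute_free_energy_differences() and changed the return value to a dictionary. The sketch below shows a small helper that accepts either naming scheme. The helper name free_energy_differences is hypothetical, and the sketch does not patch YANK itself; if this is indeed the cause, installing a pymbar 3.x release in the environment may be the simpler route.

```python
def free_energy_differences(mbar):
    """Return (Delta_f, dDelta_f) from an MBAR-like object.

    Works with the pymbar 3.x method name (camelCase, returns a sequence)
    and the pymbar 4.x method name (snake_case, returns a dict).
    """
    if hasattr(mbar, "getFreeEnergyDifferences"):
        # pymbar 3.x style: first two entries are the estimates and their errors.
        result = mbar.getFreeEnergyDifferences()
        return result[0], result[1]
    if hasattr(mbar, "compute_free_energy_differences"):
        # pymbar 4.x style: results are returned in a dictionary.
        result = mbar.compute_free_energy_differences()
        return result["Delta_f"], result["dDelta_f"]
    raise AttributeError("object provides neither free-energy method")
```

Checking for the method by name keeps the calling code independent of which pymbar version happens to be installed.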

Script for local run

Hi quantao,
Your script is really helpful. Could you provide a local version of the script instead of the Colab version? GPU resources on Colab are not freely available all the time. Many thanks!

Open Babel incorrect hydrogen addition warning

It has been repeatedly observed that Open Babel sometimes adds too many H atoms to complex aromatic structures. For example,
a pyridine may receive more H atoms than it should, or the aromatic ring may fail to Kekulize. In such cases, please open
the wrong ligand_H.pdb in PyMOL or Maestro and fix the problem. If you use Maestro, you can just run ProteinPrep to recover the correct aromatic bond orders, then save the file to overwrite the wrong one.

It should be mentioned that you are unlikely to encounter this issue if the ring system of your small molecule contains few N atoms. All the errors observed so far have involved an N or O atom inside or adjacent to the ring system.

Possible Open Babel related error

A possible error has been noticed; still working on it.

ValueError                                Traceback (most recent call last)

<ipython-input-4-7aaef16462be> in <module>
     34   ppdb.to_pdb(path="temp.pdb", records=['ATOM', 'HETATM'], gz=False, append_newline=True)
     35 
---> 36   mol= [m for m in pybel.readfile(filename="temp.pdb", format='pdb')][0]
     37   mol.calccharges
     38   mol.addh()

/usr/local/lib/python3.8/site-packages/openbabel/pybel.py in readfile(format, filename, opt)
    157             obconversion.AddOption(k, obconversion.INOPTIONS, str(v))
    158     if not formatok:
--> 159         raise ValueError("%s is not a recognised Open Babel format" % format)
    160     if not os.path.isfile(filename):
    161         raise IOError("No such file: '%s'" % filename)

ValueError: pdb is not a recognised Open Babel format

REMINDER: up-to-date Python has not been tested; please use Python 3.8 for stability

Directly installing the packages from the ipynb file may not work, because many packages have not been updated for newer Python environments. You can work it out yourself, but it will take some time. Please use the environment.yml file when possible; it pins Python to 3.8.16, for example, and it should still work as of April 2024. On Colab, by contrast, Python is an up-to-date version (maybe 3.11 in April 2024), which causes incompatibilities with other packages.

Also, the simulation is a bit slow even with the V100 that Colab provides; an RTX 4090 is recommended when available.

Local installation reminder

Using this workflow on Colab is handy but requires installation every time, which is not handy at all. One way to address this problem is to install the whole workflow locally.

conda create --name Dock-MD-FEP python=3.8
conda activate Dock-MD-FEP
(Dock-MD-FEP)conda install jupyter --yes
(Dock-MD-FEP)pip install $everything
(Dock-MD-FEP)conda install $everything
(Dock-MD-FEP) jupyter lab Dock-MD-FEP_local_installation.ipynb

After starting Dock-MD-FEP_local_installation.ipynb, first import all the packages you have installed. You do not have to install those packages again later.

In a perfect world, you only need to customise the input cell with a proper PDB ID and a SMILES string representing the small molecule you want to dock and simulate. Everything else should run automatically with a single click on "run all".

CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)

In case of the CUDA version error.

CUDA_ERROR_UNSUPPORTED_PTX_VERSION (222)

There are two ways to get rid of this:

Use nvidia-smi to print the CUDA version supported by your driver; in my case it is 11.2. Use conda list | grep cudatoolkit to print the cudatoolkit version; in my case it is 11.7. The mismatch between the driver's CUDA version and the cudatoolkit version is the cause of the problem. An easy fix is to change cudatoolkit to the same 11.2 version with conda install -c conda-forge cudatoolkit=11.2
Alternatively, you could modify the YAML file to change the platform from CUDA to OpenCL, if your platform supports OpenCL. However, this might not be as fast as CUDA in terms of utilizing GPU power.
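The version comparison in the first option can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the version strings are typed in by hand from the output of nvidia-smi and conda list rather than read automatically.

```python
def parse_version(text):
    """Turn a version string such as '11.7' or '11.7.0' into (major, minor)."""
    return tuple(int(part) for part in text.strip().split(".")[:2])


def toolkit_exceeds_driver(driver_cuda, toolkit):
    """Return True when cudatoolkit is newer than the CUDA version the driver supports."""
    return parse_version(toolkit) > parse_version(driver_cuda)


# Values from the example above: driver reports 11.2, conda has 11.7.
print(toolkit_exceeds_driver("11.2", "11.7"))  # True
# After changing cudatoolkit to 11.2, the versions match.
print(toolkit_exceeds_driver("11.2", "11.2"))  # False
```

A True result corresponds to the situation described above, where the installed cudatoolkit is newer than what the driver supports.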

NaN error when using explicit solvent.

This error will not be encountered unless you use explicit-1000_per_interation_doubled_lambda.yml instead of the default implicit-1000_per_interation_doubled_lambda.yml.

This issue is a reminder to myself, and you are welcome to solve it if you are in a better position to do so than I am.

The issue is suspected to be related to the periodic boundary condition (PBC) settings in the Amber files used by OpenMM. The way OpenMM expects PBC to be defined and the way Amber defines it might conflict. But I must say that sometimes it just works and sometimes it does not, so the nature of this error is unclear to me.

One thing is certain: once we switch from explicit to implicit solvent, this error is gone.