futianfan / reinforced-genetic-algorithm
Structure-based Drug Design; Reinforcement Learning and Genetic Algorithm
I loaded the pre-trained checkpoint and evaluated it on a set of different protein targets. After several rounds, it output the following error.
Traceback (most recent call last):
File "RGA.py", line 994, in <module>
pdbqtvina_list = pdbqtvina_list)
File "/home/siqiouyang/work/projects/GEKO/baselines/reinforced-genetic-algorithm/model.py", line 324, in forward_ligand_list
feature_list = featurize_receptor_and_ligand_list(name_of_receptor, pdbqtvina_list)
File "/home/siqiouyang/work/projects/GEKO/baselines/reinforced-genetic-algorithm/model.py", line 204, in featurize_receptor_and_ligand_list
ligand_atom_idx, ligand_positions, ligand_mask = pdbqtvina2feature(pdbqt_file)
File "/home/siqiouyang/work/projects/GEKO/baselines/reinforced-genetic-algorithm/model.py", line 167, in pdbqtvina2feature
line_indx = lines.index("MODEL 2")
ValueError: 'MODEL 2' is not in list
It seems to be a pdbqt file error, but I checked the results directory and the pdbqt files do have the string "MODEL 2" inside. Could you please help me resolve this issue?
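One possible cause, offered only as a guess: list.index("MODEL 2") succeeds only when some element of the list is exactly that string, so a line such as "MODEL 2 " or "MODEL 2\n" does not match, and a docking output that contains a single pose has no second MODEL record at all. The sketch below shows a more tolerant lookup. The helper names find_model_line and count_models are hypothetical and are not part of model.py.

```python
def find_model_line(lines, model_number):
    """Return the index of the 'MODEL <n>' record, or None if it is absent.

    Hypothetical helper: compares whitespace-split tokens, so trailing
    spaces or newlines on the record do not break the match.
    """
    target = ["MODEL", str(model_number)]
    for idx, line in enumerate(lines):
        if line.split() == target:
            return idx
    return None


def count_models(lines):
    """Count how many MODEL records (docked poses) the file contains."""
    return sum(1 for line in lines if line.split()[:1] == ["MODEL"])


# Made-up file content for illustration; note the trailing space and newline.
example = [
    "MODEL 1\n",
    "ENDMDL\n",
    "MODEL 2 \n",
    "ENDMDL\n",
]

print(find_model_line(example, 2))  # 2
print(find_model_line(example, 3))  # None
print(count_models(example))        # 2
```

Running count_models over every pdbqt file in the results directory would show whether any of them holds fewer than two poses, which a spot check of a few files could miss.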
I think one needs to load the ckpt files that contain pre-trained ENN models to run the code.
I installed rga on Ubuntu 22.04. Conda environments are listed at the end.
Then I tried to run the following commands:
$ MGLTOOLS=/usr/local/MGLTools/
$ python RunAutogrow.py \
--filename_of_receptor ./tutorial/PARP/4r6eA_PARP1_prepared.pdb \
--center_x -70.76 --center_y 21.82 --center_z 28.33 \
--size_x 25.0 --size_y 16.0 --size_z 25.0 \
--source_compound_file ./source_compounds/naphthalene_smiles.smi \
--root_output_folder ./output \
--number_of_mutants_first_generation 50 \
--number_of_crossovers_first_generation 50 \
--number_of_mutants 50 \
--number_of_crossovers 50 \
--top_mols_to_seed_next_generation 50 \
--number_elitism_advance_from_previous_gen 50 \
--number_elitism_advance_from_previous_gen_first_generation 10 \
--diversity_mols_to_seed_first_generation 10 \
--diversity_seed_depreciation_per_gen 10 \
--num_generations 5 \
--mgltools_directory $MGLTOOLS/ \
--number_of_processors -1 \
--scoring_choice VINA \
--LipinskiLenientFilter \
--start_a_new_run \
--rxn_library click_chem_rxns \
--selector_choice Rank_Selector \
--dock_choice VinaDocking \
--max_variants_per_compound 5 \
--redock_elite_from_previous_gen False \
--generate_plot True \
--reduce_files_sizes True \
--use_docked_source_compounds True
I got errors as below:
Traceback (most recent call last):
File "RunAutogrow.py", line 688, in <module>
AutogrowMainExecute.main_execute(vars)
File "/opt/modeling/molecule-generation/reinforced-genetic-algorithm/autogrow/autogrow_main_execute.py", line 105, in main_execute
current_generation_dir, smile_file_new_gen)
File "/opt/modeling/molecule-generation/reinforced-genetic-algorithm/autogrow/docking/execute_docking.py", line 166, in run_docking_common
deleted_smiles_names_list = deleted_smiles_names_list_convert + deleted_smiles_names_list_dock
NameError: name 'deleted_smiles_names_list_convert' is not defined
So I looked into execute_docking.py and uncommented the following part:
########### print #############
deleted_smiles_names_list_convert = [x for x in smiles_names_failed_to_convert if x is not None]
deleted_smiles_names_list_convert = list(set(deleted_smiles_names_list_convert))
# if len(deleted_smiles_names_list_convert) != 0:
# print("THE FOLLOWING LIGANDS WHICH FAILED TO CONVERT:")
# print(deleted_smiles_names_list_convert)
# print("####################")
Then I ran the command again and got a different error message:
Finished generation 0
/opt/modeling/molecule-generation/reinforced-genetic-algorithm/output/Run_10/generation_1/
Traceback (most recent call last):
File "RunAutogrow.py", line 688, in <module>
AutogrowMainExecute.main_execute(vars)
File "/opt/modeling/molecule-generation/reinforced-genetic-algorithm/autogrow/autogrow_main_execute.py", line 114, in main_execute
crossover_ligand1_policy_net, crossover_ligand2_policy_net, )
TypeError: populate_generation() takes 2 positional arguments but 6 were given
Because it looked like "autogrow_main_execute.py" calls the "populate_generation" defined in "autogrow/operators/operations1.py", I changed "autogrow/autogrow_main_execute.py" as below:
#import autogrow.operators.operations as operations
import autogrow.operators.operations1 as operations
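A quick way to confirm which module's populate_generation matches the six-argument call is to inspect the signature before running a full generation. The sketch below uses two stand-in functions, because the real parameter lists are not shown in this thread. All parameter names other than the last two are hypothetical.

```python
import inspect


def accepts_positional_count(func, n_args):
    """Return True if func can be called with n_args positional arguments."""
    signature = inspect.signature(func)
    try:
        # bind() applies the same matching rules as a real call,
        # without executing the function body.
        signature.bind(*([None] * n_args))
    except TypeError:
        return False
    return True


# Stand-ins for the two variants; in practice, import the real functions
# from autogrow.operators.operations and autogrow.operators.operations1.
def populate_generation_two_args(vars, generation_num):
    return None


def populate_generation_six_args(vars, generation_num,
                                 policy_net_a, policy_net_b,  # hypothetical names
                                 crossover_ligand1_policy_net,
                                 crossover_ligand2_policy_net):
    return None


print(accepts_positional_count(populate_generation_two_args, 6))  # False
print(accepts_positional_count(populate_generation_six_args, 6))  # True
```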
Then I ran the command again and got another error:
No PDB folder to concatenate and compress. This is likely generation 0 seeded with a Ranked .smi file.
Finished generation 0
/opt/modeling/molecule-generation/reinforced-genetic-algorithm/output/Run_11/generation_1/
There were no available ligands in previous generation ranked ligand file.
Check formatting or if file has been moved.
Traceback (most recent call last):
File "RunAutogrow.py", line 688, in <module>
AutogrowMainExecute.main_execute(vars)
File "/opt/modeling/molecule-generation/reinforced-genetic-algorithm/autogrow/autogrow_main_execute.py", line 115, in main_execute
crossover_ligand1_policy_net, crossover_ligand2_policy_net, )
File "/opt/modeling/molecule-generation/reinforced-genetic-algorithm/autogrow/operators/operations1.py", line 85, in populate_generation
source_compounds_list = get_complete_list_prev_gen_or_source_compounds(vars, generation_num)
File "/opt/modeling/molecule-generation/reinforced-genetic-algorithm/autogrow/operators/operations1.py", line 721, in get_complete_list_prev_gen_or_source_compounds
raise Exception(printout)
Exception:
There were no available ligands in previous generation ranked ligand file.
Check formatting or if file has been moved.
I gave up on going further and came here to ask for help.
Thanks.
-Don
$ conda --version
conda 23.1.0
$ conda list
# packages in environment at /opt/anaconda310/envs/autogrow:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
aom 3.5.0 h27087fc_0 conda-forge
boost 1.78.0 py37h48bf904_0 conda-forge
boost-cpp 1.78.0 h6582d0a_3 conda-forge
bottleneck 1.3.5 py37hda87dfa_0 conda-forge
brotlipy 0.7.0 py37h540881e_1004 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2022.12.7 ha878542_0 conda-forge
cairo 1.16.0 h35add3b_1015 conda-forge
certifi 2022.12.7 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py37h43b0acd_1 conda-forge
charset-normalizer 3.1.0 pyhd8ed1ab_0 conda-forge
cryptography 39.0.1 py37h9ce1e76_0
cudatoolkit 11.7.0 hd8887f6_11 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
expat 2.5.0 hcb278e6_1 conda-forge
ffmpeg 4.4.2 gpl_h8dda1f0_112 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.14.2 h14ed4e7_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
freetype 2.12.1 hca18f0e_1 conda-forge
func-timeout 4.3.5 pypi_0 pypi
future 0.18.2 py37h89c1867_5 conda-forge
gettext 0.21.1 h27087fc_0 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
gnutls 3.7.8 hf3e180e_0 conda-forge
greenlet 1.1.3 py37hd23a5d3_0 conda-forge
icu 72.1 hcb278e6_0 conda-forge
idna 3.4 pyhd8ed1ab_0 conda-forge
importlib-metadata 4.11.4 py37h89c1867_0 conda-forge
jpeg 9e h0b41bf4_3 conda-forge
kiwisolver 1.4.4 py37h7cecad7_0 conda-forge
lame 3.100 h166bdaf_1003 conda-forge
lcms2 2.14 h6ed2654_0 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
lerc 4.0.0 h27087fc_0 conda-forge
libblas 3.9.0 16_linux64_openblas conda-forge
libcblas 3.9.0 16_linux64_openblas conda-forge
libdeflate 1.14 h166bdaf_0 conda-forge
libdrm 2.4.114 h166bdaf_0 conda-forge
libexpat 2.5.0 hcb278e6_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 12.2.0 h65d4601_19 conda-forge
libgfortran-ng 12.2.0 h69a702a_19 conda-forge
libgfortran5 12.2.0 h337968e_19 conda-forge
libglib 2.76.2 hebfc3b9_0 conda-forge
libgomp 12.2.0 h65d4601_19 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
libidn2 2.3.4 h166bdaf_0 conda-forge
liblapack 3.9.0 16_linux64_openblas conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.21 pthreads_h78a6416_3 conda-forge
libpciaccess 0.17 h166bdaf_0 conda-forge
libpng 1.6.39 h753d276_0 conda-forge
libprotobuf 3.20.3 h3eb15da_0 conda-forge
libsqlite 3.40.0 h753d276_1 conda-forge
libstdcxx-ng 12.2.0 h46fd767_19 conda-forge
libtasn1 4.19.0 h166bdaf_0 conda-forge
libtiff 4.4.0 h82bc61c_5 conda-forge
libunistring 0.9.10 h7f98852_0 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libva 2.18.0 h0b41bf4_0 conda-forge
libvpx 1.11.0 h9c3ff4c_3 conda-forge
libwebp-base 1.3.0 h0b41bf4_0 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.10.4 hfdac1af_0 conda-forge
libzlib 1.2.13 h166bdaf_4 conda-forge
matplotlib-base 3.4.3 py37h1058ff1_2 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
nettle 3.8.1 hc379101_1 conda-forge
ninja 1.11.1 h924138e_0 conda-forge
nomkl 1.0 h5ca1d4c_0 conda-forge
numexpr 2.8.3 py37h85a3170_100 conda-forge
numpy 1.21.6 py37h976b520_0 conda-forge
nvidia-cublas-cu11 11.10.3.66 pypi_0 pypi
nvidia-cuda-nvrtc-cu11 11.7.99 pypi_0 pypi
nvidia-cuda-runtime-cu11 11.7.99 pypi_0 pypi
nvidia-cudnn-cu11 8.5.0.96 pypi_0 pypi
openh264 2.3.1 hcb278e6_2 conda-forge
openjpeg 2.5.0 h7d73246_1 conda-forge
openssl 3.1.0 hd590300_3 conda-forge
p11-kit 0.24.1 hc5aa10d_0 conda-forge
packaging 23.1 pyhd8ed1ab_0 conda-forge
pandas 1.3.5 py37h8c16a72_0
pcre2 10.40 hc3806b6_0 conda-forge
pillow 9.2.0 py37h850a105_2 conda-forge
pip 23.1.2 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
pycairo 1.21.0 py37h0afab05_1 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pyopenssl 23.1.1 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 py37h89c1867_5 conda-forge
python 3.7.12 hf930737_100_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.7 3_cp37m conda-forge
pytz 2023.3 pyhd8ed1ab_0 conda-forge
pyyaml 6.0 py37h540881e_4 conda-forge
rdkit 2022.09.1 py37h97e29ec_1 conda-forge
readline 8.2 h8228510_1 conda-forge
reportlab 3.5.68 py37h69800bb_1 conda-forge
requests 2.29.0 pyhd8ed1ab_0 conda-forge
scipy 1.7.3 pypi_0 pypi
setuptools 67.7.2 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlalchemy 1.4.42 py37h540881e_0 conda-forge
sqlite 3.40.0 h4ff8645_1 conda-forge
svt-av1 1.4.1 hcb278e6_0 conda-forge
tk 8.6.12 h27826a3_0 conda-forge
torch 1.13.1 pypi_0 pypi
torchvision 0.12.0 cpu_py37hb263d47_1 conda-forge
tornado 6.2 py37h540881e_0 conda-forge
typing-extensions 4.5.0 hd8ed1ab_0 conda-forge
typing_extensions 4.5.0 pyha770c72_0 conda-forge
urllib3 1.26.15 pyhd8ed1ab_0 conda-forge
wheel 0.40.0 pyhd8ed1ab_0 conda-forge
x264 1!164.3095 h166bdaf_2 conda-forge
x265 3.5 h924138e_3 conda-forge
xorg-fixesproto 5.0 h7f98852_1002 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.8.4 h0b41bf4_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h0b41bf4_2 conda-forge
xorg-libxfixes 5.0.3 h7f98852_1004 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h0b41bf4_1003 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
zipp 3.15.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 h166bdaf_4 conda-forge
zstd 1.5.2 h3eb15da_6 conda-forge
$ pip list
Package Version
------------------------ ----------------
Bottleneck 1.3.5
brotlipy 0.7.0
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 3.1.0
cryptography 39.0.1
cycler 0.11.0
func-timeout 4.3.5
future 0.18.2
greenlet 1.1.3
idna 3.4
importlib-metadata 4.11.4
kiwisolver 1.4.4
matplotlib 3.4.3
numexpr 2.8.3
numpy 1.21.6
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
packaging 23.1
pandas 1.3.5
Pillow 9.2.0
pip 23.1.2
pycairo 1.21.0
pycparser 2.21
pyOpenSSL 23.1.1
pyparsing 3.0.9
PySocks 1.7.1
python-dateutil 2.8.2
pytz 2023.3
PyYAML 6.0
reportlab 3.5.68
requests 2.29.0
scipy 1.7.3
setuptools 67.7.2
six 1.16.0
SQLAlchemy 1.4.42
torch 1.13.1
torchvision 0.12.0a0+76b4a42
tornado 6.2
typing_extensions 4.5.0
urllib3 1.26.15
wheel 0.40.0
zipp 3.15.0
Thanks for publishing such interesting research!
I have a question. In the paper, you state that you generate 200 offspring molecules in each state and keep the 50 with the best docking scores as the next state (I referenced https://arxiv.org/pdf/2211.16508.pdf for the hyperparameter values).
In the experiment section, you state that you allow 1000 oracle calls. Does that mean the generation process consists of 5 iterations in total?
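The arithmetic behind the question can be written out as a small sketch. It assumes that every offspring costs exactly one oracle call and that no calls are spent on the initial population. Whether RGA counts calls this way is not confirmed in this thread, and the function name and the initial_calls parameter are hypothetical.

```python
def generations_within_budget(oracle_budget, offspring_per_generation,
                              initial_calls=0):
    """Number of full generations that fit in the oracle budget.

    Assumes one oracle (docking) call per offspring; initial_calls is a
    hypothetical knob for calls spent scoring the starting population.
    """
    if offspring_per_generation <= 0:
        raise ValueError("offspring_per_generation must be positive")
    remaining = max(oracle_budget - initial_calls, 0)
    return remaining // offspring_per_generation


print(generations_within_budget(1000, 200))      # 5
print(generations_within_budget(1000, 200, 50))  # 4
```

Under these assumptions, 1000 calls at 200 offspring per generation gives 5 generations. If scoring the initial population also counts against the budget, only 4 full generations fit.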
Hi @futianfan, thanks for sharing this wonderful piece of work!
I'd like to reproduce the RGA experiment results, but found that pre-specified center_x, center_y, and center_z values are needed for each of the 10 protein PDB files. Could you please share the docking coordinates used for each PDB file? Many thanks in advance!
Following the tutorial, when I execute this line of code: smiles_list = docking(smiles_folder = results_folder, smiles_file = smiles_file, args_dict = vars), the following error appears:
setting PYTHONHOME environment
Failed to convert 0 times: ./results_4r6e_000_PDB/5978009__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/5155218__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/2065594__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7918451__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/4485828__2.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/9066744__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/9592795__2.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/3884053__2.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7918451__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/5867488__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/8905482__2.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/4025756__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/2719455__3.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/1200527__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7292583__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7595926__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/8424926__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/8095622__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7872732__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/4691782__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/2073335__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/9980521__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/3098257__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/6056760__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/4024833__2.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/4024833__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/1653654__3.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/8181518__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/8625472__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/2442229__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/6354861__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/5809515__5.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7851679__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7107562__4.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/9490483__1.pdb
Failed to convert 0 times: ./results_4r6e_000_PDB/7107562__3.pdb
Can you help me with this?
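Since every ligand fails at the conversion step, one thing worth ruling out is an incomplete or mislocated MGLTools install. The sketch below checks for the files that AutoGrow-style pipelines commonly expect under the MGLTools directory. The relative paths are an assumption based on a typical MGLTools 1.5.x layout and may differ on your machine. The function name is hypothetical.

```python
import os

# Assumed layout of an MGLTools 1.5.x install; adjust if yours differs.
EXPECTED_RELATIVE_PATHS = [
    os.path.join("bin", "pythonsh"),
    os.path.join("MGLToolsPckgs", "AutoDockTools", "Utilities24",
                 "prepare_ligand4.py"),
    os.path.join("MGLToolsPckgs", "AutoDockTools", "Utilities24",
                 "prepare_receptor4.py"),
]


def missing_mgltools_files(mgltools_directory):
    """Return the expected files that are absent under mgltools_directory."""
    return [
        relative_path
        for relative_path in EXPECTED_RELATIVE_PATHS
        if not os.path.isfile(os.path.join(mgltools_directory, relative_path))
    ]


# Usage: pass the same directory you give to the docking arguments, e.g.
#   print(missing_mgltools_files("/path/to/MGLTools"))
# An empty list means all assumed files were found.
```

If the list comes back empty, the install layout is probably not the problem and the conversion command's own output would be the next place to look.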