gcorso / torsional-diffusion

Implementation of Torsional Diffusion for Molecular Conformer Generation (NeurIPS 2022)

Home Page: https://arxiv.org/abs/2206.01729

License: MIT License

Python 100.00%

chemistry conformer-generator equivariance geometry machine-learning molecules torsion-angles neurips-2022

torsional-diffusion's Issues

the result fairness?

When you compare your model's results with others, they are far better than the baselines. But your model is trained on different training data; is that fair to the other models? In your ablation experiments, your model trained on the true conformations performs about the same as the others, or even worse.

conformer_matching

Positive log likelihoods

I trained the torsional diffusion model on my own dataset and was interested in generating conformers with likelihoods. I use the following command to generate the conformers:
python generate_confs.py --test_csv test.csv --inference_steps 20 --model_dir run/ --out conformers_20steps.pkl --tqdm --batch_size 128 --ode --likelihood full

But when I check the euclidean_dlogp the values are between -30 and +30. Do you know why this is? This is the log likelihood, right? So shouldn't it always be negative? I have tried both the full and hutchinson methods.

Thanks for your help in advance!
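
One possible explanation, offered as a general fact about continuous distributions rather than a confirmed statement about this codebase: a log-likelihood over continuous variables is the log of a probability density, not of a probability, and a density can exceed 1, so its log can be positive. Whether this accounts for the values seen in euclidean_dlogp is an assumption. The sketch below uses a plain one-dimensional Gaussian (not the repository's model) to show the effect:

```python
import math

def gaussian_log_density(x, mean, std):
    """Log of the normal probability density evaluated at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

# A wide Gaussian has peak density below 1, so its log density is negative.
wide = gaussian_log_density(0.0, mean=0.0, std=1.0)

# A narrow Gaussian has peak density above 1, so its log density is positive.
narrow = gaussian_log_density(0.0, mean=0.0, std=0.01)

print(wide, narrow)
```

The sharper the distribution is concentrated around a point, the larger the density there, which is why positive values are not by themselves a sign of a bug.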

training data setup

I would like to provide my own dataset for retraining torsional-diffusion. For some fields in the pickle file, I do not know what values to use. For example, each entry in the conformers dictionary looks like this:

{'geom_id': 123368967, 'set': 1, 'degeneracy': 3, 'totalenergy': -23.59133734, 'relativeenergy': 0.0, 'boltzmannweight': 0.8585, 'conformerweights': [0.28617, 0.28617, 0.28616], 'rd_mol': <rdkit.Chem.rdchem.Mol at 0x7f7b42014bd0>}

What should I put for boltzmannweight and degeneracy? Is there a setup script to take molfiles and convert them into the dataset for training?
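
In the example entry above, the three conformerweights sum to the boltzmannweight (0.28617 + 0.28617 + 0.28616 = 0.8585) and their count matches the degeneracy of 3. A minimal sketch of how such weights could be computed with the standard Boltzmann formula follows. It assumes relative energies in kcal/mol and a temperature of 298.15 K; neither is stated in the issue, and `boltzmann_weights` is a hypothetical helper, not a function from the repository.

```python
import math

# Assumption: energies are in kcal/mol and the temperature is 298.15 K.
KT_KCAL_PER_MOL = 0.0019872041 * 298.15  # gas constant (kcal/mol/K) times temperature

def boltzmann_weights(relative_energies, degeneracies):
    """Return normalized weights g_i * exp(-E_i / kT) / Z for each conformer."""
    unnormalized = [
        g * math.exp(-e / KT_KCAL_PER_MOL)
        for e, g in zip(relative_energies, degeneracies)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical example: three unique conformers with made-up energies.
weights = boltzmann_weights([0.0, 1.0, 2.5], [3, 1, 2])
print(weights)
```

If each conformer in a custom dataset is unique, a degeneracy of 1 per conformer is the natural choice under this formula.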

Why sigma is repeated 10000 times

Hi,

Your work is really interesting! However, when I ran your code, I found a block that runs very slowly:

torsional-diffusion/diffusion/torus.py, line 68

score_norm_ = score(
    sample(sigma[None].repeat(10000, 0).flatten()),
    sigma[None].repeat(10000, 0).flatten()
).reshape(10000, -1)
score_norm_ = (score_norm_ ** 2).mean(0)

I wonder why sigma is repeated 10000 times? Is there any way to make it faster?
Thanks!
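
Reading the snippet, the repeat appears to draw 10000 samples for every sigma value so that the mean squared score per sigma can be estimated by Monte Carlo. The sketch below reproduces that pattern with plain Gaussian stand-ins for `sample` and `score`; the repository's versions work on the torus and are not reproduced here, and `expected_sq_score` is a hypothetical name. For a Gaussian the exact answer is 1 / sigma^2, which makes the estimate easy to check.

```python
import numpy as np

def sample(sigma, rng):
    """Stand-in sampler: zero-mean Gaussian noise with the given std per entry."""
    return rng.normal(size=sigma.shape) * sigma

def score(x, sigma):
    """Stand-in score: gradient of the Gaussian log density with respect to x."""
    return -x / sigma ** 2

def expected_sq_score(sigma, n_samples=10000, seed=0):
    """Monte Carlo estimate of E[score^2] for each sigma value."""
    rng = np.random.default_rng(seed)
    # Tile sigma to shape (n_samples, len(sigma)), then flatten, as in the snippet.
    flat = sigma[None].repeat(n_samples, 0).flatten()
    s = score(sample(flat, rng), flat).reshape(n_samples, -1)
    # Averaging over axis 0 gives one estimate per sigma value.
    return (s ** 2).mean(0)

sigma = np.array([0.5, 1.0, 2.0])
estimate = expected_sq_score(sigma)
exact = 1 / sigma ** 2
print(estimate, exact)
```

Under this reading, lowering `n_samples` would make the block faster at the cost of a noisier estimate, and saving the result to disk once would avoid recomputing it on every run.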

Cannot use batch normalization for layers without any scalar irrep (0e)

torsional-diffusion/diffusion/score_model.py

Line 118 in fcad6fb

out_irreps=f'{ns}x0o',

The variable new_means ends up as an empty list in the e3nn code, which leads to this error: the BatchNorm layer needs at least one scalar irrep to compute new_means.

How can I get the split.npy file

Hi.
I want to train the torsional-diffusion model on the QM9 dataset. How can I get the split.npy file, or generate it with your code?

Please describe the steps for generating the dataset in detail.

Thanks!!
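
A minimal sketch of how a split file could be generated for a custom dataset. It assumes that split.npy holds three arrays of molecule indices in train, validation, test order; that layout is an assumption and should be checked against the loading code in the repository. `make_split` and the 80/10/10 fractions are hypothetical.

```python
import numpy as np

def make_split(n_molecules, val_frac=0.1, test_frac=0.1, seed=0):
    """Randomly partition molecule indices into train, validation and test sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_molecules)
    n_val = int(n_molecules * val_frac)
    n_test = int(n_molecules * test_frac)
    val = perm[:n_val]
    test = perm[n_val:n_val + n_test]
    train = perm[n_val + n_test:]
    # Object array so the three parts may have different lengths.
    split = np.empty(3, dtype=object)
    split[0], split[1], split[2] = train, val, test
    return split

split = make_split(1000)
print([len(part) for part in split])

# To write the file (object arrays need pickling enabled on load):
# np.save("split.npy", split, allow_pickle=True)
```

A file saved this way would be read back with np.load("split.npy", allow_pickle=True).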

Set up issue

Hi,
First of all thank you for making your code available. I was curious to test out your conformer generation, but unfortunately ran into a few issues that I was hoping you could help me with.
A short report on what I did and which errors I ran into (my machine is running Ubuntu 22.04)

git cloned your repo
$ conda env create -f environment.yml
$ conda activate torsional_diffusion
$ pip install e3nn
checked the torch and cuda versions: 1.13 and 11.7 (are these the versions you worked with?)
installed pyg
$ pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
downloaded your workdir, created the smiles.csv
when running
$ python generate_confs.py --test_csv smiles.csv --inference_steps 20 --model_dir workdir/drugs_default --out conformers_20steps.pkl --tqdm --batch_size 128 --no_energy
I encountered the following warnings:

../torsional-diffusion/diffusion/torus.py:33: RuntimeWarning: invalid value encountered in divide
score_ = grad(x, sigma[:, None], N=100) / p_
.../.conda/envs/torsional_diffusion/lib/python3.9/site-packages/torch/jit/_check.py:181: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in __init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
0it [00:00, ?it/s]
Generated conformers for 0 molecules

Would you have any idea what might be causing this behaviour?

Thanks in advance for any pointers you can give me and kind regards,
Jessica

num_samples should be a positive integer value, but got num_samples=0

After putting QM9 in the data folder, I ran the following command:
python train.py --log_dir ./test_run --cache data/QM9/cache --data_dir data/QM9/qm9 --std_pickles data/QM9/standardized_pickles --split_path data/QM9/split.npy --dataset=qm9

It keeps failing with the error "num_samples should be a positive integer value, but got num_samples=0". I checked, and all the directories are correct. Is there something special about qm9 that requires different treatment? Thanks.

Any checkpoint provided?

Hello, thanks for your good work. Would you be willing to provide checkpoints in the future?
Best wishes to you.

from utils.xtb import * cannot Find the path

Traceback (most recent call last):
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\generate_confs.py", line 11, in <module>
    from diffusion.sampling import *
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\diffusion\sampling.py", line 4, in <module>
    from diffusion.likelihood import *
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\diffusion\likelihood.py", line 7, in <module>
    from utils.xtb import *
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\utils\xtb.py", line 12, in <module>
    os.mkdir(my_dir)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '/tmp/8508'

This happens when the code tries to set up its temp directory.
How can I change the temp directory? Many thanks,

best,
Sh-Y
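
The traceback shows os.mkdir failing on '/tmp/8508', a POSIX-style path whose parent usually does not exist on Windows. A portable way to obtain a per-process scratch directory is sketched below using the standard tempfile module. How my_dir is built in utils/xtb.py is not shown in the traceback, so using the process ID as the directory name is an assumption, and `make_scratch_dir` is a hypothetical helper.

```python
import os
import tempfile

def make_scratch_dir():
    """Create (if needed) a per-process directory under the platform temp dir."""
    # tempfile.gettempdir() resolves to a valid location on Windows, Linux and macOS.
    path = os.path.join(tempfile.gettempdir(), str(os.getpid()))
    # makedirs with exist_ok avoids errors if the directory is already there.
    os.makedirs(path, exist_ok=True)
    return path

scratch = make_scratch_dir()
print(scratch)
```

The temp location picked by tempfile can also be redirected without code changes by setting the TMPDIR, TEMP or TMP environment variable before launching Python.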

Alternative to Conformer Matching

Hi,

Thanks for your great work! I wonder why we should use the optimization to perform conformer matching. Could we just set the ground truth torsions to our RDKit-generated conformers as our targets? Am I missing something here?

Thank you in advance!

Best regards,
Lin
