gcorso / torsional-diffusion

Implementation of Torsional Diffusion for Molecular Conformer Generation (NeurIPS 2022)

Home Page: https://arxiv.org/abs/2206.01729

License: MIT License

Language: Python 100.00%

Topics: chemistry, conformer-generator, equivariance, geometry, machine-learning, molecules, torsion-angles, neurips-2022

torsional-diffusion's People

Contributors

bjing2016, gcorso

torsional-diffusion's Issues

Is the comparison with baselines fair?

When you compare your model with other methods, your results are far better than the baselines. But your model is trained on different training data. Is that fair to the other models? In your ablation experiments, your model trained on the true conformations performs about the same as the others, or even worse.

Positive log likelihoods

I trained the torsional diffusion model on my own dataset and want to generate conformers together with their likelihoods. I used the following command to generate the conformers:
python generate_confs.py --test_csv test.csv --inference_steps 20 --model_dir run/ --out conformers_20steps.pkl --tqdm --batch_size 128 --ode --likelihood full

But when I check euclidean_dlogp, the values range from -30 to +30. Do you know why this is? This is the log-likelihood, right? So shouldn't it always be negative? I have tried both the full and hutchinson methods.

Thanks for your help in advance!
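
One general point that may be relevant here: for a continuous variable, a log-likelihood is the log of a probability density, and a density can exceed 1, so positive values are not by themselves evidence of a bug. Whether this fully accounts for the euclidean_dlogp values reported above is an assumption; the source does not confirm it. The sketch below uses a plain 1-D Gaussian (not the model's distribution) to show the effect.

```python
import math


def gaussian_logpdf(x, mean, sigma):
    """Log density of a 1-D normal distribution evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


# A narrow Gaussian has a peak density above 1, so its log density is positive there.
narrow = gaussian_logpdf(0.0, mean=0.0, sigma=0.01)

# A wide Gaussian has density below 1 everywhere, so its log density is negative.
wide = gaussian_logpdf(0.0, mean=0.0, sigma=10.0)

print(f"sigma=0.01 -> log density {narrow:.3f}")
print(f"sigma=10   -> log density {wide:.3f}")
```

The same reasoning applies in higher dimensions: the sign of a log density depends on the units and scale of the coordinates, unlike the log of a discrete probability.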

training data setup

I would like to provide my own dataset for retraining torsional-diffusion. For some fields in the pickle file, I do not know what values to put in. For example, each entry in the conformers list is a dictionary like the following:

{'geom_id': 123368967, 'set': 1, 'degeneracy': 3, 'totalenergy': -23.59133734, 'relativeenergy': 0.0, 'boltzmannweight': 0.8585, 'conformerweights': [0.28617, 0.28617, 0.28616], 'rd_mol': <rdkit.Chem.rdchem.Mol at 0x7f7b42014bd0>}

What should I put for boltzmannweight and degeneracy? Is there a setup script that converts molfiles into the dataset format used for training?
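
As a rough illustration of what such fields usually mean, the sketch below computes normalised Boltzmann weights from relative energies. Everything in it is an assumption rather than the repository's procedure: the function name boltzmann_weights is hypothetical, the energies are taken to be in kcal/mol, and the temperature is set to 298.15 K. In the example record above, the three conformerweights sum to the boltzmannweight, which is consistent with a degeneracy of 3 being folded into the weight. Whether the training code actually reads these fields is not confirmed by the source.

```python
import math

# Gas constant in kcal/(mol*K); assumes energies are given in kcal/mol.
R_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(relative_energies, degeneracies=None, temperature=298.15):
    """Return normalised Boltzmann weights for a list of conformer energies.

    degeneracies (optional) multiplies each conformer's unnormalised weight.
    """
    if degeneracies is None:
        degeneracies = [1] * len(relative_energies)
    kt = R_KCAL_PER_MOL_K * temperature
    e_min = min(relative_energies)  # shift for numerical stability
    unnormalised = [
        g * math.exp(-(e - e_min) / kt)
        for e, g in zip(relative_energies, degeneracies)
    ]
    total = sum(unnormalised)
    return [w / total for w in unnormalised]


weights = boltzmann_weights([0.0, 0.5, 2.0])
print([round(w, 4) for w in weights])
```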

Why is sigma repeated 10000 times?

Hi,

Your work is really interesting! However, when I ran your code, I found a block that runs very slowly:

torsional-diffusion/diffusion/torus.py, line 68

score_norm_ = score(
    sample(sigma[None].repeat(10000, 0).flatten()),
    sigma[None].repeat(10000, 0).flatten()
).reshape(10000, -1)
score_norm_ = (score_norm_ ** 2).mean(0)

I wonder why sigma is repeated 10000 times? Is there any way to make it faster?
Thanks!
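
One reading of the quoted snippet, offered as an assumption rather than the authors' explanation: repeating sigma 10000 times draws 10000 independent samples per sigma value, so that the final mean over axis 0 is a Monte Carlo estimate of the expected squared score. The sketch below reproduces that pattern with a self-contained wrapped normal; the helper names (wrapped_normal_score, sample_wrapped_normal, estimate_score_norm) are hypothetical stand-ins, not the functions in torus.py.

```python
import numpy as np


def wrapped_normal_score(x, sigma, n_images=10):
    """Score d/dx log p(x) of a zero-mean wrapped normal on [-pi, pi)."""
    k = np.arange(-n_images, n_images + 1)          # periodic images
    shifted = x[..., None] + 2 * np.pi * k          # shape (..., 2*n_images + 1)
    weights = np.exp(-shifted ** 2 / (2 * sigma[..., None] ** 2))
    p = weights.sum(-1)                             # unnormalised density
    dp = (-shifted / sigma[..., None] ** 2 * weights).sum(-1)
    return dp / p                                   # normalisation cancels


def sample_wrapped_normal(sigma, rng):
    """Draw one sample per entry of sigma and wrap it into [-pi, pi)."""
    x = sigma * rng.standard_normal(sigma.shape)
    return (x + np.pi) % (2 * np.pi) - np.pi


def estimate_score_norm(sigma, n_samples=10000, seed=0):
    """Monte Carlo estimate of E[score^2] for each value in sigma."""
    rng = np.random.default_rng(seed)
    # Same idea as sigma[None].repeat(n_samples, 0): one row per sample.
    sigma_rep = np.repeat(sigma[None], n_samples, axis=0)
    x = sample_wrapped_normal(sigma_rep, rng)
    return (wrapped_normal_score(x, sigma_rep) ** 2).mean(0)


sigma = np.array([0.1, 0.5, 1.0])
print(estimate_score_norm(sigma))
```

Under this reading, the cost scales linearly with the number of samples, so a smaller n_samples trades accuracy for speed. Since the result depends only on sigma, it could also be computed once and stored with np.save for later runs.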

How can I get the split.npy file?

Hi,
I want to train the torsional-diffusion model on the QM9 dataset. How can I get the split.npy file, or generate it with your code?

Please let me know the detailed steps for generating the dataset.

Thanks!
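
For readers who need a split file for their own data, here is a minimal sketch of creating one. The layout is an assumption, not something the source confirms: it stores three index arrays (train, validation, test) in a single object array and loads them back with allow_pickle=True. The function name make_split, the fractions, and the output path are all hypothetical.

```python
import os
import tempfile

import numpy as np


def make_split(n_molecules, val_frac=0.1, test_frac=0.1, seed=0):
    """Return an object array holding [train, val, test] index arrays."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_molecules)
    n_val = int(val_frac * n_molecules)
    n_test = int(test_frac * n_molecules)
    split = np.empty(3, dtype=object)        # avoids accidental 2-D stacking
    split[0] = np.sort(perm[n_val + n_test:])            # train
    split[1] = np.sort(perm[:n_val])                     # validation
    split[2] = np.sort(perm[n_val:n_val + n_test])       # test
    return split


split = make_split(1000)
path = os.path.join(tempfile.gettempdir(), "split_example.npy")
np.save(path, split)                         # object arrays are pickled on save
loaded = np.load(path, allow_pickle=True)
print([len(part) for part in loaded])
```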

Set up issue

Hi,
First of all, thank you for making your code available. I was curious to test your conformer generation but unfortunately ran into a few issues that I was hoping you could help me with.
Here is a short report on what I did and which errors I ran into (my machine is running Ubuntu 22.04):

  • git cloned your repo
  • $ conda env create -f environment.yml
  • $ conda activate torsional_diffusion
  • $ pip install e3nn
  • checked the torch and cuda versions: 1.13 and 11.7 (are these the versions you worked with?)
  • installed pyg
  • $ pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
  • downloaded your workdir, created the smiles.csv
  • when running
    $ python generate_confs.py --test_csv smiles.csv --inference_steps 20 --model_dir workdir/drugs_default --out conformers_20steps.pkl --tqdm --batch_size 128 --no_energy
    I encountered the following warnings:

../torsional-diffusion/diffusion/torus.py:33: RuntimeWarning: invalid value encountered in divide
score_ = grad(x, sigma[:, None], N=100) / p_
.../.conda/envs/torsional_diffusion/lib/python3.9/site-packages/torch/jit/_check.py:181: UserWarning: The TorchScript type system doesn't support instance-level annotations on empty non-base types in __init__. Instead, either 1) use a type annotation in the class body, or 2) wrap the type in torch.jit.Attribute.
warnings.warn("The TorchScript type system doesn't support "
0it [00:00, ?it/s]
Generated conformers for 0 molecules

Would you have any idea what might be causing this behaviour?

Thanks in advance for any pointers you can give me and kind regards,
Jessica
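
When reporting a setup problem like this one, it can help to list the installed package versions in one place. The sketch below does that with the standard library only. The package names passed in are assumptions about how the distributions are named in the environment, and the helper installed_versions is hypothetical; it is a diagnostic aid, not a fix for the behaviour described above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names below are guesses; adjust them to match your environment.
for name, version in installed_versions(
    ["torch", "torch-geometric", "e3nn", "rdkit"]
).items():
    print(f"{name}: {version or 'not installed'}")
```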

num_samples should be a positive integer value, but got num_samples=0

With QM9 in the data folder, running the following command
python train.py --log_dir ./test_run --cache data/QM9/cache --data_dir data/QM9/qm9 --std_pickles data/QM9/standardized_pickles --split_path data/QM9/split.npy --dataset=qm9

keeps giving the error "num_samples should be a positive integer value, but got num_samples=0". I checked, and all the directories are correct. Is there something special about QM9 that requires different treatment? Thanks.
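
This message matches the ValueError that PyTorch's RandomSampler raises when the dataset it is given has length zero, so a reasonable first check is whether any molecules were actually found and kept. The sketch below is a hypothetical pre-flight check using only the standard library. The "*.pickle" file pattern and the helper names are assumptions, and the training code may apply further filtering that this check does not see.

```python
import glob
import os
import tempfile


def count_data_files(data_dir, pattern="*.pickle"):
    """Count files in data_dir that match pattern (the pattern is a guess)."""
    return len(glob.glob(os.path.join(data_dir, pattern)))


def check_dataset_inputs(data_dir, split_indices, pattern="*.pickle"):
    """Raise early if the inputs would obviously produce an empty dataset."""
    n_files = count_data_files(data_dir, pattern)
    if n_files == 0:
        raise FileNotFoundError(f"no files matching {pattern!r} in {data_dir!r}")
    if len(split_indices) == 0:
        raise ValueError("the selected split contains no indices")
    return n_files


# Demo on a throwaway directory holding two empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.pickle", "b.pickle"):
        open(os.path.join(demo_dir, name), "w").close()
    print(check_dataset_inputs(demo_dir, split_indices=[0, 1]))
```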

Any checkpoint provided?

Hello, thanks for your good work. Will you provide checkpoints in the future?
Best wishes.

from utils.xtb import * cannot find the path

Traceback (most recent call last):
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\generate_confs.py", line 11, in <module>
    from diffusion.sampling import *
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\diffusion\sampling.py", line 4, in <module>
    from diffusion.likelihood import *
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\diffusion\likelihood.py", line 7, in <module>
    from utils.xtb import *
  File "e:\Cheminfo_Workshop\5_Docking_Lab\torsional-diffusion-master\utils\xtb.py", line 12, in <module>
    os.mkdir(my_dir)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '/tmp/8508'

The error occurs when the script tries to create its temporary directory.
How can I change the temporary directory? Many thanks.

Best,
Sh-Y
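
The traceback shows os.mkdir failing on a hard-coded POSIX-style path under /tmp, which typically does not exist on Windows. A portable alternative, sketched below as an assumption about how one might patch this locally (it is not the repository's code), is to build the directory under the platform's temporary directory using the standard tempfile module. The function name make_scratch_dir is hypothetical.

```python
import os
import tempfile


def make_scratch_dir():
    """Create (if needed) and return a per-process scratch directory.

    tempfile.gettempdir() honours the TMPDIR, TEMP, and TMP environment
    variables and otherwise falls back to a platform-specific default.
    """
    scratch = os.path.join(tempfile.gettempdir(), str(os.getpid()))
    os.makedirs(scratch, exist_ok=True)   # no error if it already exists
    return scratch


my_dir = make_scratch_dir()
print(my_dir)
```

With this approach, setting the TMPDIR (or TEMP/TMP on Windows) environment variable before launching Python changes where the scratch directory is created.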

Alternative to Conformer Matching

Hi,

Thanks for your great work! I wonder why an optimization is needed to perform conformer matching. Could we instead set the ground-truth torsion angles on our RDKit-generated conformers and use those as targets? Am I missing something here?

Thank you in advance!

Best regards,
Lin

model failure on generated conformers

Hey, when I evaluate generated conformers, some molecules are reported as a model failure (screenshot attached).

Why do these model failures happen? I can't find where in the code these failures occur.

The loss is sometimes over 1e17

Hi, I am trying to train a torsional diffusion model. Since torus.score_norm() is very small when sigma is large, the training loss can be enormous. Is this the expected behavior? Do you suggest using a smaller sigma_max for numerical stability?
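
To make the effect concrete, the sketch below computes the expected squared score of a zero-mean wrapped normal by numerical quadrature for several values of sigma. It is a standalone illustration, not the repository's torus.score_norm; the function name score_norm_quadrature and the chosen sigma values are assumptions. As sigma grows, the wrapped density approaches uniform and its score approaches zero. A leading-order Fourier argument suggests the expectation decays roughly like 2 exp(-sigma^2), so any loss that divides by it, as the question implies, grows correspondingly.

```python
import numpy as np


def score_norm_quadrature(sigma, n_grid=4096, n_images=10):
    """E[score^2] of a zero-mean wrapped normal, integrated over [-pi, pi)."""
    x = np.linspace(-np.pi, np.pi, n_grid, endpoint=False)
    k = np.arange(-n_images, n_images + 1)
    shifted = x[:, None] + 2 * np.pi * k[None, :]        # periodic images
    gauss = np.exp(-shifted ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    p = gauss.sum(1)                                     # wrapped density
    dp = (-shifted / sigma ** 2 * gauss).sum(1)          # its derivative
    dx = 2 * np.pi / n_grid
    # E[(p'/p)^2] = integral of p'^2 / p over one period.
    return float(np.sum(dp ** 2 / p) * dx)


sigmas = [0.5, 1.0, 2.0, np.pi]
norms = [score_norm_quadrature(s) for s in sigmas]
for s, n in zip(sigmas, norms):
    print(f"sigma={s:.3f}  E[score^2]={n:.3e}  1/E[score^2]={1 / n:.3e}")
```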

Check for rotatable bonds

Hi, I have a question about how you check whether a bond is rotatable. It seems you do it by checking whether a networkx graph is still connected after removing that bond. Do you take the bond order into account anywhere?
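
The connectivity check described in the question can be sketched as follows. This is a plain-Python stand-in for the networkx approach, not the repository's code: a bond is kept if removing it disconnects the molecular graph, which excludes ring bonds. The single_only filter on bond order is a hypothetical addition that shows where such a check could go; the source does not say whether the repository applies one.

```python
from collections import deque


def is_connected(n_atoms, bonds):
    """Breadth-first search from atom 0; True if every atom is reached."""
    adjacency = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == n_atoms


def bridge_bonds(n_atoms, bonds, bond_orders=None, single_only=False):
    """Return bonds whose removal disconnects the graph.

    If single_only is True, bonds with order != 1 are skipped (hypothetical filter).
    """
    result = []
    for i, bond in enumerate(bonds):
        if single_only and bond_orders is not None and bond_orders[i] != 1:
            continue
        remaining = bonds[:i] + bonds[i + 1:]
        if not is_connected(n_atoms, remaining):
            result.append(bond)
    return result


# Heavy atoms of methylcyclopropane: ring 0-1-2 with a methyl (atom 3) on atom 0.
ring_bonds = [(0, 1), (1, 2), (2, 0), (0, 3)]
print(bridge_bonds(4, ring_bonds))

# Four-atom chain with a central double bond (2-butene-like heavy-atom skeleton).
chain_bonds = [(0, 1), (1, 2), (2, 3)]
print(bridge_bonds(4, chain_bonds, bond_orders=[1, 2, 1], single_only=True))
```

Without the bond-order filter, the central double bond in the second example would also pass the connectivity test, which is the situation the question asks about.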
