
hips / molecule-autoencoder


A project to enable optimization of molecules by transforming them to and from a continuous representation.

Python 100.00%

molecule-autoencoder's People

Contributors

duvenaud, t1m0thy


molecule-autoencoder's Issues

How to choose an inducing point in the latent space and sample from the latent space?

Hi,
I tried to perform Bayesian optimization on SMILES strings decoded from 292-dimensional latent vectors. Following your paper, I first used the latent vector of the SMILES string 'CCN(CC)C(=O)Cc1ccc(S(=O)(=O)N2CCCc3ccccc32)cc1' as the inducing point. I then modified five dimensions of this vector (with values in the range [-0.8, 1]) to obtain new SMILES strings, but the strings I got are not valid. Do you have any suggestions on how to modify the vectors in order to get new, valid SMILES? Thanks a lot.
Output (the first entry is identical to the starting SMILES):
CCN(CC)C(=O)Cc1ccc(S(=O)(=O)N2CCCc3ccccc32)cc1
CCN(CC)(=O)CSc1ccc(S(=O)(=O)N2CCCc3ccccc33)cc1C
CCN(CC(C(=O)Oc1cccc1S(])(==)NCCCcc2ccccc2)2C1
CC1CCN1C(=O)Nc2cccc1C((=O)=O)N1CCc2ccccc3)cc1
CCCCCN1C(=O)Nc2cccc1S(C)((=O)N(CCc2ccccc3))c21
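When many decoded strings come back broken, a cheap syntactic screen can show how they fail before a full cheminformatics toolkit is involved. The sketch below is not part of the repository; the function name `looks_well_formed` is hypothetical, and it only checks bracket balance and ring-closure pairing, so passing it does not make a string chemically valid. It is run on the five strings quoted in this issue.

```python
def looks_well_formed(smiles):
    """Cheap syntactic screen (hypothetical helper, not a full SMILES parser).

    Checks that parentheses and square brackets are balanced and that every
    single-digit ring-closure label appears an even number of times.
    Two-digit '%nn' ring closures are not handled.
    """
    depth = 0            # current nesting depth of '(' branches
    in_bracket = False   # True while inside a '[...]' bracket atom
    open_rings = set()   # ring-closure digits opened but not yet closed
    for ch in smiles:
        if ch == "[":
            if in_bracket:
                return False
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                return False
            in_bracket = False
        elif in_bracket:
            continue     # digits inside brackets are isotopes, charges, H counts
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch.isdigit():
            open_rings ^= {ch}   # toggle: first occurrence opens, second closes
    return depth == 0 and not in_bracket and not open_rings


decoded = [
    "CCN(CC)C(=O)Cc1ccc(S(=O)(=O)N2CCCc3ccccc32)cc1",
    "CCN(CC)(=O)CSc1ccc(S(=O)(=O)N2CCCc3ccccc33)cc1C",
    "CCN(CC(C(=O)Oc1cccc1S(])(==)NCCCcc2ccccc2)2C1",
    "CC1CCN1C(=O)Nc2cccc1C((=O)=O)N1CCc2ccccc3)cc1",
    "CCCCCN1C(=O)Nc2cccc1S(C)((=O)N(CCc2ccccc3))c21",
]
results = [looks_well_formed(s) for s in decoded]
```

Only the starting string passes this screen; the other four fail on unmatched ring-closure digits, a stray `]`, or an unbalanced `)`.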

Details about all_drugs.smi

Could you please provide some details about the data included in all_drugs.smi?

What is the source of this data (PubChem, DrugBank, or something else)?
What inclusion criteria were used when selecting it?

Thank you!

Error in running sample_autoencoder.py file

I was following the instructions on the homepage of this GitHub repository, trying to run the sample_autoencoder.py file exactly as in the example. However, this is what showed up:

python sample_autoencoder.py ../data/best_vae_model.json ../data/best_vae_annealed_weights.h5 ../data/250k_rndm_zinc_drugs_clean.smi ../data/zinc_char_list.json -l5000
Using Theano backend.
Traceback (most recent call last):
  File "sample_autoencoder.py", line 97, in <module>
    model = model_from_json(json.dumps(model_dict))
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 213, in model_from_json
    return layer_from_config(config, custom_objects=custom_objects)
  File "/usr/local/lib/python2.7/dist-packages/keras/utils/layer_utils.py", line 27, in layer_from_config
    class_name = config['class_name']
KeyError: 'class_name'

Can anyone please tell me what is going on here? Thank you very much!
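The traceback shows Keras looking up a top-level `class_name` key in the model configuration and not finding it. One way to narrow this down is to inspect the JSON directly. The sketch below is a diagnostic only: the helper name `describe_model_config` and both sample strings are hypothetical and do not reproduce the contents of best_vae_model.json. A possible cause, stated here as an assumption, is that the file was written by a different Keras version than the one installed.

```python
import json


def describe_model_config(json_text):
    """Return (has_class_name, sorted top-level keys) for a model JSON string.

    Hypothetical diagnostic helper; it parses the JSON without importing Keras.
    """
    config = json.loads(json_text)
    if not isinstance(config, dict):
        return False, []
    return "class_name" in config, sorted(config)


# Hypothetical stand-ins for a model file, for illustration only.
with_key = '{"class_name": "Model", "config": {"layers": []}}'
without_key = '{"name": "Graph", "nodes": {}}'

print(describe_model_config(with_key))
print(describe_model_config(without_key))
```

To apply this to a real file, read its text with `open(path).read()` and pass it to the helper; the list of top-level keys shows which layout the file actually uses.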

Baselines

Hi,
Are the details (and/or the code) for implementing the baselines, esp. the Genetic Algorithm, available? I can't seem to find them here or in the paper.

Thank you very much for the information!

The charset

As I proposed in maxhodak/keras-molecules#54, I am interested in why the charset is designed this way; the reasoning is not obvious. From a chemistry standpoint, chlorine "Cl" should not be split into "C" and "l". Redesigning the charset might bring some improvement. I used the implementation from keras-molecules, and when I tried to interpolate between two chemical structures (CC=C(C(=CC)c1ccc(O)cc1)c1ccc(O)cc1 and CN1C(=O)CCS(=O)(=O)C1c1ccc(Cl)cc1), I got invalid structures like the ones below, so I suspect the charset is the cause.
CC(C)(O)CCC1CCC(Cr)So2c1ccc(C)cc1
CCNC(=O)CN(CC1((l)CN1c1ccc(OC)cc1
CN1C(=O)CN(CC1((#)CN1c1ccc(OC)cc1
CN1C(=O)CC(CC**()(=O)C1c1ccc(Cl)cc1
CN1C(=O)CC(NC()(=O)C1**c1ccc(Cl)cc1
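The charset redesign suggested in this issue can be sketched as a tokenizer that keeps two-letter halogens together. This is an illustration of the idea, not code from molecule-autoencoder or keras-molecules; the names `TOKEN_RE` and `tokenize` are hypothetical. It treats `Cl` and `Br` as single tokens, keeps bracket atoms such as `[nH]` whole, and falls back to one character per token otherwise.

```python
import re

# Alternatives are tried left to right, so bracket atoms and the two-letter
# halogens are matched before the single-character fallback.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into tokens, keeping 'Cl' and 'Br' intact."""
    return TOKEN_RE.findall(smiles)


target = "CN1C(=O)CCS(=O)(=O)C1c1ccc(Cl)cc1"
tokens = tokenize(target)
charset = sorted(set(tokens))
print(tokens)
print(charset)
```

With this tokenization a decoder can no longer emit a bare `l`, which rules out fragments like `((l)` from the list above. It does not by itself guarantee valid output.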

Showing an error

How can I correct this error? For details, please see the attached image.

Keras version 1.2.0
Theano version 0.8.2
[Attached image: untitled]

softmax activation in GRU

Hi, I noticed that you put the softmax activation inside the GRU cell. As I understand it, in this case the activations at each timestep will not sum to 1. Here is the link for the GRU cell, and the same situation applies to the terminal GRU: https://github.com/HIPS/molecule-autoencoder/blob/master/autoencoder/train_autoencoder.py#L225

I also checked with your version of Keras that the output does not sum to 1; here is a link to the gist: https://gist.github.com/fgvbrt/1f2e1828c6d8c0eb88614f14c60874ad

Was this done on purpose, or was it a mistake?
Thanks in advance.
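The effect described in this issue can be reproduced without Keras. The sketch below assumes the standard GRU update, h = z * h_prev + (1 - z) * candidate, with softmax used as the candidate activation; the function names and input numbers are hypothetical and chosen only for illustration. The candidate sums to 1, but the per-dimension gate z rescales each entry differently, so the emitted state h generally does not.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_output(h_prev, candidate_logits, gate_logits):
    """One GRU state update with softmax as the candidate activation.

    Assumes the standard form h = z * h_prev + (1 - z) * candidate, where the
    update gate z is computed per dimension with a sigmoid.
    """
    candidate = softmax(candidate_logits)
    z = [sigmoid(g) for g in gate_logits]
    return [zi * hp + (1.0 - zi) * c for zi, hp, c in zip(z, h_prev, candidate)]


# Hypothetical inputs: zero initial state, arbitrary logits.
candidate_logits = [2.0, 1.0, 0.1]
h = gru_output([0.0, 0.0, 0.0], candidate_logits, [0.5, -1.0, 2.0])

candidate_sum = sum(softmax(candidate_logits))
output_sum = sum(h)
print(candidate_sum, output_sum)
```

If a per-timestep probability distribution is required, one option is to apply the softmax to the layer's output rather than inside the cell. Whether the original placement was intentional is the open question of this issue.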

Unclear best model configuration

Hi, thanks for releasing the code. However, how can I train "the best model" myself? Such a configuration is missing in hyperparams.py (simple_params does not correspond to it and makes many random choices anyway). Also, one cannot fully reconstruct it from best_vae_model.json, especially regarding optimization details.
