
This project forked from robertcsordas/molecule_gen


Implementation of Learning Deep Generative Models of Graphs in PyTorch

For the paper, see arXiv:1803.03324, "Learning Deep Generative Models of Graphs".

The implementation is batched and processes roughly 10k molecules per minute with a batch size of 128.

Results

This implementation generates ~72% valid molecules (versus the 97.5% reported in the paper) with a loss of 19.5 (versus 20.5 in the paper). Possible reasons for the difference:

  • A different validation method. The paper does not specify how validity is checked. This implementation uses the Chem.SanitizeMol() function of RDKit.
  • What should happen with multiple edge additions between the same atoms? Ignore them? Mask them out when choosing the destination atom? This implementation counts them as invalid.
  • Exactly which parameters are shared among the propagators/aggregators, and which are separate?
  • Where is dropout applied? The paper says it is applied to the last layer of the output modules, but what is an output module? Applying it directly before the decisions makes no sense.
  • Molecules are not uniquely represented by their graph. They also need the "NumExplicitHs" and "FormalCharge" attributes, because of kekulization and because hydrogen atoms are not represented in the graph. What does the paper do with these? In this implementation they are simply filtered out.
  • The paper does not specify which version of ChEMBL it uses.
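The validity criterion described in the first two bullets can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the helper names (`has_duplicate_edge`, `sanitizes`, `valid_fraction`) and the `(edges, mol)` sample layout are assumptions made for the example. The only RDKit call used is `Chem.SanitizeMol()`, which raises an exception when a molecule fails sanitization.

```python
from typing import Callable, Iterable, Tuple

Edge = Tuple[int, int]  # hypothetical representation: (source atom index, destination atom index)


def has_duplicate_edge(edges: Iterable[Edge]) -> bool:
    """Return True if the same undirected atom pair is connected more than once."""
    seen = set()
    for a, b in edges:
        key = frozenset((a, b))  # (0, 1) and (1, 0) describe the same bond
        if key in seen:
            return True
        seen.add(key)
    return False


def sanitizes(mol) -> bool:
    """Return True if RDKit's Chem.SanitizeMol accepts the molecule.

    Note that SanitizeMol modifies the molecule in place.
    """
    from rdkit import Chem  # imported lazily so the other helpers work without RDKit

    try:
        Chem.SanitizeMol(mol)  # raises on valence, kekulization and similar problems
        return True
    except Exception:
        return False


def valid_fraction(samples, mol_check: Callable = sanitizes) -> float:
    """Fraction of (edges, mol) samples with no repeated edge and a passing mol_check."""
    samples = list(samples)
    if not samples:
        return 0.0
    n_valid = sum(
        1
        for edges, mol in samples
        if not has_duplicate_edge(edges) and mol_check(mol)
    )
    return n_valid / len(samples)
```

Passing a different `mol_check` makes it easy to compare validation rules, which is relevant here because the reported validity depends heavily on which rule is used.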

A lower validation loss combined with a significantly lower fraction of valid molecules suggests that the paper validates molecules differently. Achieving 97.5% valid graphs seems quite astonishing, given that every single choice in the decision problem could very easily result in an invalid molecule.
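To get a feeling for how demanding 97.5% is, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that a molecule is built from a fixed number of independent decisions, each with the same probability of keeping the molecule valid. Neither assumption comes from the paper or from this implementation, and the count of 50 decisions is hypothetical.

```python
def overall_validity(per_step_success: float, n_decisions: int) -> float:
    """Probability that all n independent decisions keep the molecule valid."""
    return per_step_success ** n_decisions


def required_per_step_success(target_validity: float, n_decisions: int) -> float:
    """Per-decision success rate needed to reach the target overall validity."""
    return target_validity ** (1.0 / n_decisions)


if __name__ == "__main__":
    n = 50  # hypothetical number of decisions per molecule
    for target in (0.72, 0.975):
        p = required_per_step_success(target, n)
        print(f"target {target:.3f}: per-decision success {p:.5f}")
```

Under these assumptions, the gap between 72% and 97.5% overall validity corresponds to a much smaller gap in per-decision accuracy, so small differences in how individual decisions are masked or validated could account for a large part of it.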

How to use it

./main.py -save_dir <train dir> -gpu 0

For more options, check out main.py.

Visualization during training

Requirements

Tested with Python 3, PyTorch 1.1 and RDKit 2018.09.1. Other dependencies can be installed by running pip3 install -r requirements.txt.

License

The software is released under the Apache 2.0 license. See http://www.apache.org/licenses/LICENSE-2.0 for further details.

Contributors

robertcsordas
