
josejimenezluna / delfta


Δ-QML for medicinal chemistry

License: GNU Affero General Public License v3.0

Python 99.18% Makefile 0.38% Dockerfile 0.29% Shell 0.09% Batchfile 0.06%
deep-learning machine-learning pytorch quantum-chemistry

delfta's People

Contributors

atzkenneth, cisert, josejimenezluna


delfta's Issues

Cannot download files

Hi! Thanks for a great repository.
I am trying to download the files as suggested in the README, but unfortunately I get this error:

(delfta) mduranfrigola@raluy:~/Desktop$ python -c "import runpy; _ = runpy.run_module('delfta.download', run_name='__main__')"
2024/04/24 05:17:59 PM | DelFTa | INFO: Now downloading trained models and utils...
Traceback (most recent call last):                                                                                                  
  File "<string>", line 1, in <module>
  File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/runpy.py", line 228, in run_module
    return _run_code(code, {}, init_globals, run_name, mod_spec)
  File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/site-packages/delfta/download.py", line 134, in <module>
    _download_required()
  File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/site-packages/delfta/download.py", line 93, in _download_required
    with tarfile.open(models_tar) as handle:
  File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/tarfile.py", line 1797, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully

Do you know what might be going on?
Thanks a lot in advance.
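tarfile.ReadError with this message means Python could not read the downloaded file as a tar archive in any supported format. A minimal diagnostic sketch follows; the function name describe_download is hypothetical, and the path of the downloaded archive is not shown in the traceback, so it must be supplied by the user. One possible cause, which the peek at the first bytes would reveal, is an error page or a truncated download saved in place of the archive.

```python
import tarfile
from pathlib import Path


def describe_download(path):
    """Return a short diagnosis of a file that is expected to be a tar archive."""
    path = Path(path)
    if not path.exists():
        return "missing"
    if path.stat().st_size == 0:
        return "empty"
    if tarfile.is_tarfile(str(path)):
        return "valid tar archive"
    # Not a tar archive: peek at the first bytes, which may reveal an
    # HTML error page or similar content saved in place of the real file.
    with open(path, "rb") as handle:
        head = handle.read(60)
    return f"not a tar archive; file starts with {head!r}"
```

Calling describe_download on the file that delfta.download saved (location to be checked in download.py) indicates whether the download itself failed.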

Installation: Docker image missing libXau.so.6

When using the docker image from DockerHub, the import of openbabel fails:

>>> from openbabel.pybel import readstring
==============================
*** Open Babel Error  in openLib
  /opt/conda/envs/delfta/lib/openbabel/3.1.0/png2format.so did not load properly.
 Error: libXau.so.6: cannot open shared object file: No such file or directory

This can be solved by running (inside the container):

apt-get update
apt-get install libxtst6

(see also https://stackoverflow.com/questions/17355863/cant-find-install-libxtst-so-6)

Change order of 3D-check and hydrogen check; modify hydrogen check

  • make3D automatically assigns hydrogens, so when a molecule without hydrogens is passed with force3d=True, the logger gives no feedback that hydrogens have been added. This also means that force3d=True overrides addh=False. I suggest first checking whether hydrogens are present and, if they are not but force3d=True, returning an error.
  • Checking for hydrogens with 1 in set([atom.atomicnum for atom in mol.atoms]) risks missing molecules where only some hydrogens are present. It's a strange edge case, but I suggest checking whether e.g. [atom.atomicnum for atom in mol.atoms] changes after adding hydrogens to a copy of the molecule. Not elegant, but I haven't found an alternative yet.
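The copy-and-compare check suggested in the second point could be sketched as follows. This is an illustration only, not the repository's implementation: the function name hydrogens_complete is hypothetical, and it assumes a pybel-style molecule whose clone property returns an independent copy, whose addh() method adds missing hydrogens in place, and whose atoms expose atomicnum.

```python
def hydrogens_complete(mol):
    """Return True if adding hydrogens to a copy of `mol` changes nothing.

    Assumes a pybel-style interface: `mol.clone` (property returning a copy),
    `addh()` (adds missing hydrogens in place) and `atoms` (list of atoms
    that each have an `atomicnum` attribute).
    """
    probe = mol.clone  # work on a copy so the caller's molecule is untouched
    before = [atom.atomicnum for atom in probe.atoms]
    probe.addh()
    after = [atom.atomicnum for atom in probe.atoms]
    return before == after
```

Unlike the membership test for atomic number 1, this returns False for a molecule that carries only some of its hydrogens.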

Fix tests to refer to correct directory

  • need to download to and run from the same folder
  • assert len(mols) == 100 in test_xtb.py and test_calculator.py to make sure we actually have something to run the tests on

tarfile.ReadError

I installed delfta with conda install delfta -c delfta -c pytorch -c rusty1s -c conda-forge on WSL (Ubuntu 20.04).
I'm now trying to execute the "First Run" commands, but both
python -c "import runpy; _ = runpy.run_module('delfta.download', run_name='__main__')" and

from delfta.download import _download_required
_download_required()

throw tarfile.ReadError: file could not be opened successfully.
The file in question is under home/user/miniconda3/envs/delfta/lib/python3.8/tarfile.py.

Check for validity of molecules when reading

When an invalid input file is provided, OpenBabel does print a warning, but we try to keep going and return somewhat confusing errors.

  • example_file_1.sdf returns Molecules at position [0] have no 3D conformations available. Either provide a mol with one or re-run calculator with force3D=True.
  • example_file_2.sdf returns need at least one array to concatenate

example_files.zip

Checking if molecules are valid and throwing an error if not might be clearer for the user. In the two cases above, input_ is either a list of empty (no atoms) pybel molecules (for example_file_1.sdf) or just an empty list (for example_file_2.sdf), so we could check for that.
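A minimal sketch of such a check is shown below. The function name validate_molecules and the error messages are hypothetical; the only assumption about the molecules is that each exposes an atoms list, as pybel molecules do.

```python
def validate_molecules(mols):
    """Raise ValueError if `mols` is empty or contains molecules without atoms."""
    mols = list(mols)
    if not mols:
        raise ValueError("No molecules could be read from the input.")
    # Collect the positions of molecules that were parsed but contain no atoms.
    empty = [idx for idx, mol in enumerate(mols) if len(mol.atoms) == 0]
    if empty:
        raise ValueError(f"Molecules at position {empty} contain no atoms.")
    return mols
```

Running this right after reading the input would cover both cases above: the list of empty molecules and the empty list.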

Use single progress bar for inputs > batch_size

When running DelftaCalculator on more than batch_size molecules, a new progress bar is created for each batch. This somewhat defeats the purpose of a progress bar: each bar only goes from 0/1 to 1/1 before a new one is created, so there is no estimate of how long the overall process will take or what percentage of the overall task has been completed. As a minor point, we can probably skip the progress bar if the input is < batch_size, although the user could also turn that off themselves.
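One way to track overall progress is to count processed molecules across batches rather than per batch. The sketch below is library-agnostic and only illustrates the bookkeeping; iter_batches is a hypothetical helper, not part of delfta.

```python
def iter_batches(items, batch_size):
    """Yield (batch, n_done) pairs; n_done counts items up to and including this batch."""
    items = list(items)
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        yield batch, start + len(batch)


# Usage: a single running total across all batches.
mols = list(range(10))  # stand-in for a list of molecules
for batch, n_done in iter_batches(mols, batch_size=4):
    print(f"{n_done}/{len(mols)} molecules processed")
```

With tqdm, the same idea would amount to creating one bar with total=len(mols) outside the batch loop and calling update(len(batch)) after each batch.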


Parallelize xtb code

For large numbers of molecules (e.g. an SDF file), it may make more sense to use joblib with large batch sizes.
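As a rough sketch of the idea, the snippet below maps a per-molecule function over the input in parallel. Note that it uses the standard library's ThreadPoolExecutor as a stand-in for joblib, and that run_in_parallel and the per-molecule function are hypothetical. Threads are a reasonable choice only under the assumption that each call spends most of its time waiting on an external xtb process.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(func, mols, n_workers=4):
    """Apply `func` to every molecule with a pool of worker threads.

    Results are returned in the same order as the input.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(func, mols))
```

The joblib equivalent would be Parallel(n_jobs=n_workers)(delayed(func)(mol) for mol in mols).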

E_gap doesn't always equal E_lumo - E_homo

While I realize E_gap is a separate model itself, the value it returns is sometimes way different from calculating E_lumo - E_homo. For example, take the following molecule with the following settings:

calc = DelftaCalculator(delta=True, xtbopt=True, return_optmols=True, addh=True)
mol = readstring("smi", "C[C@@H]1CC[C@@H]2C(C)(C)[C@H](O)[C@]3(C)CC[C@@]21C3")

Results are:

'E_homo': -0.23676376
'E_lumo': 0.11727003
'E_gap': 0.12012529

E_gap should be close to 0.354
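The expected value can be checked directly from the reported numbers:

```python
# Values reported above for the example molecule.
e_homo = -0.23676376
e_lumo = 0.11727003
e_gap_predicted = 0.12012529

# Gap implied by the separately predicted orbital energies.
e_gap_expected = e_lumo - e_homo
discrepancy = e_gap_expected - e_gap_predicted

print(f"E_lumo - E_homo = {e_gap_expected:.4f}")       # 0.3540
print(f"difference from E_gap = {discrepancy:.4f}")    # 0.2339
```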
