josejimenezluna / delfta
Δ-QML for medicinal chemistry
License: GNU Affero General Public License v3.0
For completeness, as we say we provide those in the manuscript.
Hi! Thanks for a great repository.
I am trying to download the files as suggested in the README, but unfortunately I get this error:
(delfta) mduranfrigola@raluy:~/Desktop$ python -c "import runpy; _ = runpy.run_module('delfta.download', run_name='__main__')"
2024/04/24 05:17:59 PM | DelFTa | INFO: Now downloading trained models and utils...
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/runpy.py", line 228, in run_module
return _run_code(code, {}, init_globals, run_name, mod_spec)
File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/site-packages/delfta/download.py", line 134, in <module>
_download_required()
File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/site-packages/delfta/download.py", line 93, in _download_required
with tarfile.open(models_tar) as handle:
File "/home/mduranfrigola/miniconda3/envs/delfta/lib/python3.9/tarfile.py", line 1797, in open
raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
Do you know what might be going on?
Thanks a lot in advance.
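A quick check that may help narrow this down: tarfile.ReadError with this message generally means the file on disk could not be read as a tar archive, which can happen when a download is incomplete or when an error page was saved under the archive's name. The sketch below uses only the standard library; describe_download is a hypothetical helper, not part of delfta, and the path you pass in is whatever location the download step wrote to on your machine.

```python
import tarfile
from pathlib import Path


def describe_download(path):
    """Return a short diagnosis for a downloaded archive (hypothetical helper)."""
    p = Path(path)
    if not p.exists():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"
    # is_tarfile returns False for anything tarfile cannot open,
    # e.g. an HTML error page saved with a .tar.gz name.
    if not tarfile.is_tarfile(str(p)):
        return "not a tar archive"
    return "ok"
```

If this reports anything other than "ok", deleting the file and re-running the download is a reasonable first step, assuming the remote file itself is still available.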
When using the docker image from DockerHub, the import of openbabel fails:
>>> from openbabel.pybel import readstring
==============================
*** Open Babel Error in openLib
/opt/conda/envs/delfta/lib/openbabel/3.1.0/png2format.so did not load properly.
Error: libXau.so.6: cannot open shared object file: No such file or directory
This can be solved by running (inside the container):
apt-get update
apt-get install libxtst6
(see also https://stackoverflow.com/questions/17355863/cant-find-install-libxtst-so-6)
Hi, I would like to use the QMugs dataset used in this work with PyTorch Geometric. I am trying to set it up as an InMemoryDataset similar to how QM9 has been setup in PyG: https://pytorch-geometric.readthedocs.io/en/latest/_modules/torch_geometric/datasets/qm9.html#QM9
Could you please guide me on converting QMugs to a PyG dataset? For example, which URLs should I use for raw_url and raw_dir?
make3D automatically assigns hydrogens, so when passing a molecule without hydrogens but using force3d=True, there's no feedback in the logger that hydrogens have been added. Additionally, this means that force3d=True overrides addh=False. I suggest first checking if hydrogens are present, and, if that is not the case but force3d=True, to return an error.
1 in set([atom.atomicnum for atom in mol.atoms]) runs the risk of ignoring molecules where only certain hydrogens are present. It's a strange edge case, but I suggest checking if e.g. [atom.atomicnum for atom in mol.atoms] changes after adding hydrogens to a copy of the molecule. Not elegant, but I haven't found an alternative yet.
to provide meV errors in the manuscript
assert len(mols) == 100 in test_xtb.py and test_calculator.py to make sure we actually have something to run the tests on
Line 65 in bd8da83
I have installed delfta with conda install delfta -c delfta -c pytorch -c rusty1s -c conda-forge on WSL Ubuntu 20.04.
Now I'm trying to execute the "First Run" commands, but both
python -c "import runpy; _ = runpy.run_module('delfta.download', run_name='__main__')"
and
from delfta.download import _download_required
_download_required()
throw tarfile.ReadError: file could not be opened successfully
The file in question is under home/user/miniconda3/envs/delfta/lib/python3.8/tarfile.py
Having an option to set the logger level (e.g., suppress all INFO-level messages) would be nice.
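As a possible stopgap, the level can be raised from user code with the standard logging module. This is only a sketch: it assumes the package logs through a standard-library logger named "DelFTa" (the name shown in the log lines, e.g. "| DelFTa | INFO:"), which has not been verified against the source, and set_log_level is a hypothetical helper.

```python
import logging

# Assumption: delfta uses a stdlib logger with this name (taken from its log output).
LOGGER_NAME = "DelFTa"


def set_log_level(level=logging.WARNING, name=LOGGER_NAME):
    """Set the threshold of the named logger and return it (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger


logger = set_log_level(logging.WARNING)
logger.info("this INFO message is now suppressed")
logger.warning("warnings and errors still get through")
```

A built-in option could expose the same thing as a keyword argument or an environment variable, so that users don't need to know the logger name.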
When an invalid input file is provided, OpenBabel does print a warning, but we try to keep going and return somewhat confusing errors.
example_file_1.sdf returns Molecules at position [0] have no 3D conformations available. Either provide a mol with one or re-run calculator with force3D=True.
example_file_2.sdf returns need at least one array to concatenate
Checking if molecules are valid and throwing an error if not might be clearer for the user. In the two cases above, input_ is either a list of empty (no atoms) pybel molecules (for example_file_1.sdf) or just an empty list (for example_file_2.sdf), so we could check for that.
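A rough sketch of what such a check could look like. validate_molecules is a hypothetical name, and the only thing it relies on is that each molecule exposes an atoms list, as pybel molecules do. The demo uses simple stand-in objects so it runs without Open Babel.

```python
from types import SimpleNamespace


def validate_molecules(mols):
    """Raise a clear error for empty input or molecules without atoms (hypothetical helper)."""
    if len(mols) == 0:
        raise ValueError("No molecules could be read from the input.")
    # Positions of molecules that were parsed but contain no atoms.
    empty = [i for i, mol in enumerate(mols) if len(mol.atoms) == 0]
    if empty:
        raise ValueError(f"Molecules at position {empty} contain no atoms.")
    return mols


# Stand-ins for pybel molecules: only the `atoms` attribute matters here.
for case in ([], [SimpleNamespace(atoms=[])]):
    try:
        validate_molecules(case)
    except ValueError as err:
        print(err)
```

Running the check before any 3D or batching logic would make both example files fail early with a message that points at the input rather than at a later step.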
Instead of building the container each time with the Dockerfile. TBD once repo goes public
Line 38 in 67f1703
When running DelftaCalculator for more than batch_size molecules, new progress bars are created for each batch. This somewhat defeats the purpose of a progress bar, as it always only goes from 0/1 to 1/1, and then a new bar is created, so there's no estimate on how long the overall process will take, or which percentage of the overall task has been completed. As a minor point, we can probably skip the progress if the input is < batch_size, although the user could also turn that off themselves.
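One way to picture the suggested behaviour is a single counter that spans all batches instead of one bar per batch. The sketch below is library-agnostic and uses hypothetical names (iter_batches, run_with_overall_progress, process_batch); it is not how DelftaCalculator is currently structured. With a progress-bar library such as tqdm, the same idea would be one bar created with total=len(items) and advanced by len(batch) after each batch.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_with_overall_progress(items, batch_size, process_batch, report=print):
    """Process items batch by batch while reporting progress over the whole input."""
    total = len(items)
    done = 0
    results = []
    for batch in iter_batches(items, batch_size):
        results.extend(process_batch(batch))
        done += len(batch)
        # One running total, so the user sees overall completion.
        report(f"{done}/{total} molecules processed")
    return results
```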
Python 3.7 is approaching EOL.
e.g. smiles, inchi
This error (
Line 64 in 5b4328d
To see if error propagation still works correctly.
Line 54 in ca79d57
mol.addhs()
Line 18 in 67f1703
Line 182 in e7fd6b9
mol.atoms
can only be accessed if the previous check passed for all items in the list.
Useful when computing a large number of molecules.
It is quite ugly to expose this parameter to the user.
Alternatively, consider making this a class variable as in self.offset_idx.
Line 161 in 91da43a
For large numbers of molecules (i.e. an sdf file) it may make more sense to use joblib with large batch sizes.
This is more of a thought than a bug issue, but couldn't it all go to the calculator.py file?
Using a modern GPU
Combinations of force3d with addh can lead to unexpected results.
While I realize E_gap is a separate model itself, the value it returns is sometimes way different from calculating E_lumo - E_homo. For example, take the following molecule with the following settings:
calc = DelftaCalculator(delta=True, xtbopt=True, return_optmols=True, addh=True)
mol = readstring("smi", "C[C@@H]1CC[C@@H]2C(C)(C)[C@H](O)[C@]3(C)CC[C@@]21C3")
Results are:
'E_homo': -0.23676376
'E_lumo': 0.11727003
'E_gap': 0.12012529
E_gap should be close to 0.354
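For reference, the arithmetic behind that expectation, using only the three values reported above (variable names are just for illustration):

```python
e_homo = -0.23676376
e_lumo = 0.11727003
e_gap_model = 0.12012529  # value returned by the separate E_gap model

# Gap implied by the two orbital-energy predictions.
gap_from_orbitals = e_lumo - e_homo

# How far the E_gap model is from that implied gap.
discrepancy = gap_from_orbitals - e_gap_model

print(f"E_lumo - E_homo = {gap_from_orbitals:.4f}")
print(f"difference to E_gap = {discrepancy:.4f}")
```

So the predicted E_gap is roughly a third of the gap implied by the E_homo and E_lumo predictions for this molecule.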
via conda
xtb
Line 114 in 091ef8a