Comments (2)
With cache_mols enabled, your entire dataset will be loaded into memory (after processing - so just the xyz coordinates and type information). If your dataset doesn't fit in memory, you will run out. What you should see is a steady increase in memory usage until you have done one complete epoch at which point it levels off.
If your dataset isn't very large and/or you are seeing increased memory usage after performing a complete epoch, we will investigate, but otherwise it is the expected behavior.
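One way to check for the plateau described above is to log the process's peak resident memory after each epoch. This is only a minimal sketch, assuming a Unix-like system (the `resource` module is not available on Windows). `run_epoch` is a hypothetical placeholder for your own training loop and is not part of libmolgrid.

```python
import resource


def peak_rss_per_epoch(run_epoch, num_epochs):
    """Call run_epoch(epoch) num_epochs times and record peak RSS after each call.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    peaks = []
    for epoch in range(num_epochs):
        run_epoch(epoch)  # hypothetical: one full pass over the dataset
        peaks.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return peaks


def growth_after_first_epoch(peaks):
    """Return how much the peak memory grew after the first complete epoch."""
    return peaks[-1] - peaks[0]
```

If the value returned by `growth_after_first_epoch` keeps climbing as you add epochs, that would be the unexpected case worth reporting; a value near zero matches the expected caching behavior.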
If you aren't using vector typing, you can use the create_caches2.py script (https://github.com/gnina/scripts/blob/master/create_caches2.py) to assemble your dataset into one very efficient memory-mappable file. Memory mapping lets you benefit from in-memory data without having to worry about out-of-memory errors (and is especially valuable when running multiple jobs on the same node).
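To illustrate the general idea of memory mapping (not the actual file format produced by create_caches2.py), here is a small sketch using only the Python standard library. The record layout of four 32-bit floats (x, y, z, type index) and the function names are assumptions made up for this example.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: x, y, z, type index as little-endian float32.
RECORD = struct.Struct("<4f")


def write_records(path, atoms):
    """Write a sequence of (x, y, z, type) tuples as fixed-size binary records."""
    with open(path, "wb") as f:
        for atom in atoms:
            f.write(RECORD.pack(*atom))


def read_record(path, index):
    """Read one record through a read-only memory map of the file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the pages backing this record need to be resident.
            return RECORD.unpack_from(mm, index * RECORD.size)


# Small demonstration with a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "atoms.bin")
write_records(demo_path, [(1.0, 2.0, 3.0, 0.0), (4.0, 5.0, 6.0, 1.0)])
print(read_record(demo_path, 1))
```

With a mapping like this, the operating system pages data in on demand and can evict it under memory pressure, and processes that map the same file read-only share the same cached pages, which is why it helps when several jobs run on one node.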
from libmolgrid.
The models I'm using are on the edge of what's possible to store in the memory that I have available, so the 23 GB dataset on top of that is indeed giving me OOM problems. I suppose this can be closed, though adding something to the docs warning about large datasets and the default constructor (cache_structs=True) would be useful.
Thanks for the prompt response.