Comments (7)
I'm hitting this exact same exception - the encode keeps going which isn't ideal.
from bio2zarr.
I pushed an update in #80 which should help debug this @shz9. If you run encode with -v, it should give you helpful messages about the minimum RAM required per array.
Doing things better will require some refactoring, which we should probably do as part of making the encode job work in parallel over a cluster.
I just hit the same issue and it was due to the worker getting killed by the OOM killer.
I suspect what happened here is that you had just enough memory reserved for 4 workers for all the fields except PL. These fields are huge (each chunk is nearly 1GB), so I'm not surprised the cluster killed it.
This is not obvious, so we should potentially intercept the BrokenProcessPool exception in the main process and add an informative message like "you probably ran out of memory".
I think the simplest thing for now is to just remove PL from your experiments. Edit the schema JSON and delete the PL field, and it should all work fine.
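The interception suggested above could look something like the following. This is a hypothetical sketch, not bio2zarr's actual code: the wrapper name and the assumption that the encode work runs through a ProcessPoolExecutor are mine.

```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def encode_with_oom_hint(func, work_items, max_workers=4, pool_cls=ProcessPoolExecutor):
    # Hypothetical wrapper: run the encode work in a process pool and
    # translate BrokenProcessPool -- which is what the main process sees
    # when the OOM killer reaps a worker -- into a more informative error.
    try:
        with pool_cls(max_workers=max_workers) as pool:
            return list(pool.map(func, work_items))
    except BrokenProcessPool as err:
        raise RuntimeError(
            "An encode worker died unexpectedly; you probably ran out of "
            "memory. Try fewer workers, or drop very large fields (e.g. PL) "
            "from the schema."
        ) from err
```

The `pool_cls` parameter is only there so the happy path can be exercised without spawning processes; the point is that the catch happens once, in the main process.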
Also, a general question related to this: do you think it's possible to pick up the encoding work from where it left off when things like this happen, instead of starting over?
I think that's closely related to how we're going to split this up into manageable bits for cluster scheduling. See #71 and #77 for discussions on how we're doing this for explode (and I think some high-level discussion about encode too).
Good to know, I think I know how to fix this.
In addition to the informative exception, do you think it'd be possible to allow the user to set a --max-memory flag, based on which we can determine memory-friendly chunk sizes for the encoding stage? We could re-chunk afterwards for optimal compression if needed. Alternatively, if we don't want to change the chunk sizes, we could automatically reduce the number of workers for arrays that may have large chunks?
If it's of interest, I have this function that determines chunking patterns based on the number of cores / data type:
https://github.com/shz9/magenpy/blob/579504c7cd8a61808ab8b880e1627ef3ffe5fc8d/magenpy/stats/ld/utils.py#L547
    def optimize_chunks_for_memory(chunked_array, cpus=None, max_mem=None):
        """
        Determine optimal chunks that fit in max_mem (a numerical value in GiB).
        Modified from: Sergio Hleap
        """
        import psutil
        import dask.array as da

        if cpus is None:
            cpus = psutil.cpu_count()
        if max_mem is None:
            max_mem = psutil.virtual_memory().available / (1024.0 ** 3)

        # Give each core an equal share of the memory budget.
        chunk_mem = max_mem / cpus
        chunks = da.core.normalize_chunks(
            f"{chunk_mem}GiB", shape=chunked_array.shape, dtype=chunked_array.dtype
        )
        # Note: .chunk() assumes an xarray-style object; for a plain dask
        # array, use .rechunk(chunks) instead.
        return chunked_array.chunk(chunks)
from bio2zarr.
Ooh, max-memory is a great idea! We could associate a memory value with each future (say 3 times the number of bytes in one chunk of the array) and then stop submitting when the total for the outstanding futures exceeds this. I expect this would work quite well, especially if we try to mix up the big chunks with smaller ones.
We should follow this up in a separate issue.
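The scheme above could be sketched roughly like this. All names here are made up for illustration; the 3x-per-chunk memory charge is the estimate suggested in the comment, not a measured figure.

```python
from concurrent.futures import FIRST_COMPLETED, wait


def submit_memory_bounded(executor, tasks, chunk_nbytes, max_memory):
    # Hypothetical scheduler sketch: charge each future an estimated peak
    # memory of 3x one chunk's byte size, and pause submission while the
    # total charged to outstanding futures would exceed max_memory.
    per_task = 3 * chunk_nbytes
    outstanding = {}  # future -> charged memory estimate
    results = []
    for func, args in tasks:
        # Wait for in-flight work to finish until there is budget for one more.
        while outstanding and sum(outstanding.values()) + per_task > max_memory:
            done, _ = wait(list(outstanding), return_when=FIRST_COMPLETED)
            for fut in done:
                del outstanding[fut]
                results.append(fut.result())
        outstanding[executor.submit(func, *args)] = per_task
    # Drain whatever is still running.
    for fut in list(outstanding):
        results.append(fut.result())
    return results
```

Mixing big and small chunks, as suggested, would fall out naturally here: small-chunk tasks carry a small charge, so many of them can be in flight alongside one large one.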
Closing this as we've added the --max-memory argument as well.