Comments (5)
Open the bigWig file inside extract_data().
from pybigwig.
I tried that. It works in the sense that I don't get errors, but with that setup the computing time increases as I increase the number of processes.
For small regions that are near each other multithreading won't help you. Once you have 100kb or megabase regions the overhead of reading and decompressing is no longer rate limiting. In general, opening files inside worker forks is the only way to reliably access files in parallel with python.
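As a minimal sketch of the advice above (open the file inside the worker rather than sharing one handle across processes), something like the following could work. It assumes `extract_data()` receives a `(path, chrom, start, end)` tuple and returns the mean signal for that region; the task layout, the `run()` helper, and the `opener` parameter are hypothetical and only there to make the function easy to test without a real bigWig file.

```python
import multiprocessing as mp


def extract_data(task, opener=None):
    """Return the mean signal for one region, opening the file in this worker."""
    path, chrom, start, end = task
    if opener is None:
        # Imported here so each worker process creates its own file handle.
        import pyBigWig
        opener = pyBigWig.open
    bw = opener(path)
    try:
        # stats() returns a list with one value per bin (one bin by default).
        return bw.stats(chrom, start, end, type="mean")[0]
    finally:
        bw.close()


def run(path, regions, processes=4):
    """Map extract_data over (chrom, start, end) regions with a process pool."""
    tasks = [(path, chrom, start, end) for chrom, start, end in regions]
    with mp.Pool(processes) as pool:
        return pool.map(extract_data, tasks)
```

Call `run()` from under an `if __name__ == "__main__":` guard so it also works on platforms that spawn rather than fork. As noted above, for small, nearby regions the extra processes may not pay for themselves.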
I have done some more research and it seems that, as you pointed out, increasing the number of processes indeed does not help much. However, the biggest cause of the slowdown I was seeing was actually appending new rows to the pandas data frame thousands of times. I changed that to writing to a file directly and things improved A LOT.
Thanks again for the help.
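The change described in this comment, writing each result straight to a file instead of appending rows to a pandas data frame one at a time, might look roughly like the sketch below. The column names and the `write_rows()` helper are assumptions for illustration, not the poster's actual code.

```python
import csv


def write_rows(rows, out_path):
    """Stream (chrom, start, end, value) rows to a tab-separated file."""
    count = 0
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["chrom", "start", "end", "value"])
        for row in rows:
            # Each row is written as it arrives; nothing accumulates in memory.
            writer.writerow(row)
            count += 1
    return count
```

Because `rows` can be any iterable, results can be passed in as they are produced (for example from a generator), and the finished file can be loaded into pandas once at the end if a data frame is still needed.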
Glad you got things resolved!