Comments (13)
How to test dpryan's new PR:
git clone -b numpy [email protected]:dpryan79/pyBigWig.git
cd pyBigWig
python setup.py install # I guess this should overwrite the already installed pyBigWig?
Code you can copy into ipython afterwards:
import pyBigWig
import pandas as pd
header = [("chr1", 500)]
bw = pyBigWig.open("test.bw", "w")
bw.addHeader(header)
c = ["chr1"] * 3
s = pd.Series([1, 5, 7])
e = s + 1
v = pd.Series([5, 0, -5])
bw.addEntries(c, list(s), ends=list(e), values=list(v)) # error happens here
bw.close()
For me, running the above code led to the following error:
# RuntimeError: You must provide a valid set of entries. These can be comprised of any of the following:
# 1. A list of each of chromosomes, start positions, end positions and values.
# 2. A list of each of start positions and values. Also, a chromosome and span must be specified.
# 3. A list values, in which case a single chromosome, start position, span and step must be specified.
Edit: come to think of it, this would have segfaulted in the previous version, AFAICR, so good job. It might be an error on my part somewhere.
from pybigwig.
Great! I'd like to sit on this a little while and play around with it before I make the next release, since the large increase in code size inevitably created a bug or two.
I don't have any good reference for python/numpy/C interop. This was my first C/Python hybrid and was quite a learning curve to make (in comparison, I started and completed py2bit/lib2bit in 2 days, so feel free to "borrow" the boilerplate initialization/finalization stuff from my code). I personally found the python C API documentation to be...not the most helpful thing in the world (especially in regards to reference counting). If you're interested in python/C/numpy interop then think early on about python 2/3 differences and the variety of different numeric types that can get passed in by numpy. The latter is relatively easy to handle but the former I at least personally still find to be confusing.
from pybigwig.
Good suggestion, I'll add that in.
Regarding parallelism, it depends on how you try to go about things. You can't pickle bigWigFile objects, so can't pass one to a function and have that work (the same is true for AlignmentFile objects from pysam, python's multithreading support just leaves much to be desired). What we do to get around that is to have each thread open the bigWig file(s). That works well for deepTools. Writing is innately single-threaded.
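A minimal sketch of the "each worker opens the file itself" pattern described above. This is not the deepTools implementation: the names region_mean, region_means and the opener parameter are hypothetical, and the sketch assumes only pyBigWig.open, stats() and close(). Only picklable values (a path and coordinates) cross the process boundary, never the open file object.

```python
from concurrent.futures import ProcessPoolExecutor


def region_mean(task, opener=None):
    """Open the bigWig inside the worker, query one region, then close it.

    `task` is a plain (path, chrom, start, end) tuple, so it can be pickled.
    `opener` is a hypothetical hook for swapping in a stand-in during testing.
    """
    path, chrom, start, end = task
    if opener is None:
        import pyBigWig  # imported lazily, inside the worker process

        opener = pyBigWig.open
    bw = opener(path)
    try:
        # stats() returns a list with one entry per bin; one bin by default.
        return bw.stats(chrom, start, end, type="mean")[0]
    finally:
        bw.close()


def region_means(path, regions, workers=2):
    """Map region_mean over (chrom, start, end) tuples using worker processes."""
    tasks = [(path, chrom, start, end) for chrom, start, end in regions]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(region_mean, tasks))
```

Calling region_means("test.bw", [("chr1", 0, 100), ("chr1", 100, 200)]) from under an `if __name__ == "__main__":` guard would open the file once per task. Reopening per task keeps the sketch short; a longer-running worker could open the file once and reuse it.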
from pybigwig.
I've now been tripped up by this multiple times, so I'll echo the request. However, I'm also curious why you can't work with numpy floats. I'd prefer to use float16 to trade precision for lower memory use. Thanks a lot for putting this code online!
from pybigwig.
@davek44: I certainly could work with numpy input, it's just a matter of having another dependency.
from pybigwig.
@endrebak @davek44 If either of you have a chance, give the numpy branch a try. I've started adding support for numpy arrays when creating bigWig files. I'll try to add a method to output a numpy array in the values() method too, since that'd be faster and more memory efficient. Note that this is only possible if you have numpy installed before you install pyBigWig, since pyBigWig is written in C and needs to link against the numpy .so file.
from pybigwig.
Most people use conda, so installation order should very seldom be a problem.
Super btw. Will report back when I get around to it.
from pybigwig.
Yup, I'll modify the bioconda recipe accordingly if this works out OK.
from pybigwig.
The numpy branch now has some automated tests. Further, the values() function can return a numpy array. This still needs a fair bit of testing, since it's likely I've missed some things.
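A hedged sketch of how a caller might use the array output while still coping with a build that lacks numpy support. It assumes, as in released pyBigWig versions, that values() accepts a numpy=True keyword and that the module exposes a pyBigWig.numpy flag set to 1 when compiled against numpy; the branch discussed here may differ. The helper name fetch_values is hypothetical.

```python
import numpy as np


def fetch_values(bw, chrom, start, end, numpy_supported):
    """Return per-base values as a numpy array.

    `numpy_supported` is expected to come from the (assumed) pyBigWig.numpy
    flag. Without numpy support, fall back to converting the returned list.
    """
    if numpy_supported:
        return bw.values(chrom, start, end, numpy=True)
    return np.asarray(bw.values(chrom, start, end), dtype=float)
```

A call would look like fetch_values(bw, "chr1", 0, 100, pyBigWig.numpy == 1), with bw opened for reading beforehand.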
from pybigwig.
You can also just pip install --upgrade git+https://github.com/dpryan79/pyBigWig@numpy to install it.
I've only tested directly inputting a numpy array (note that there's a nosetest that uses this), so I'd have to look and see what pandas is actually giving things.
As an aside, this is the downside to writing python extensions in C. They're much faster, but also much less flexible.
from pybigwig.
@endrebak Values need to be floats in bigWig files. That's why you're getting the error.
from pybigwig.
For pandas, it looks like you can use the values attribute rather than making a list:
import pyBigWig
import pandas as pd
header = [("chr1", 500)]
bw = pyBigWig.open("test.bw", "w")
bw.addHeader(header)
c = ["chr1"] * 3
s = pd.Series([1, 5, 7])
e = s + 1
v = pd.Series([5, 0, -5], dtype=float)
bw.addEntries(c, s.values, ends=e.values, values=v.values)
bw.close()
As an aside, there was a bug preventing this from working with 64-bit floats (what pandas uses) that I just fixed.
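Since integer values are what triggered the original error, one option is to coerce all three columns before calling addEntries. The sketch below is illustrative only: to_bigwig_arrays is a hypothetical helper, and the choice of int64 coordinates and float64 values is an assumption based on the dtypes mentioned in this thread.

```python
import numpy as np


def to_bigwig_arrays(starts, ends, values):
    """Coerce interval columns to int64 coordinates and float64 values.

    Accepts lists, numpy arrays or pandas Series. Integer values such as
    [5, 0, -5] become floats, which is what bigWig files require.
    """
    starts = np.asarray(starts, dtype=np.int64)
    ends = np.asarray(ends, dtype=np.int64)
    values = np.asarray(values, dtype=np.float64)
    if not (len(starts) == len(ends) == len(values)):
        raise ValueError("starts, ends and values must have the same length")
    if np.any(ends <= starts):
        raise ValueError("each end must be greater than its start")
    return starts, ends, values
```

With the variables from the example above, this would be used as s2, e2, v2 = to_bigwig_arrays(s, e, v) followed by bw.addEntries(c, s2, ends=e2, values=v2).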
from pybigwig.
With the latest PR (Fix the minimum floating point value check) it worked. Thanks!
PS: I am trying to recreate some of S4Vectors' functionality in Python. Do you have a good reference for reading about Python/numpy/C interop? If not, then please do not go looking for my sake (obviously :) ).
from pybigwig.