ensembl / wiggletools

Basic operations on the space of numerical functions defined on the genome using lazy evaluators for flexibility and efficiency

License: Apache License 2.0

Makefile 0.44% Shell 0.24% C 88.50% Python 9.20% TeX 1.36% Dockerfile 0.24%

wiggletools's Introduction

WiggleTools 1.2

Author: Daniel Zerbino

Copyright holder: EMBL-European Bioinformatics Institute (Apache 2 License)

The WiggleTools package allows genomewide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon's rank sum test, etc).

Conda Installation

Install conda, then run:

conda install -c bioconda wiggletools

Brew Installation

Install Homebrew, then run:

brew install brewsci/bio/wiggletools

Docker Installation

Pull the latest image from Dockerhub:

docker pull ensemblorg/wiggletools:latest

Run the resulting wiggletools executable, bind-mounting the current working directory into the container:

docker container run --rm --mount type=bind,source="$(pwd)",target=/mnt ensemblorg/wiggletools  [...arguments...]

Guix Installation

Install GNU Guix, then run:

guix pull
guix install wiggletools

Build from source

Pre-requisites

WiggleTools requires three main dependencies: the LibBigWig, HTSLib and GSL (GNU Scientific Library) libraries. These in turn require zlib, bzip2 and libcurl.

Installing LibBigWig

git clone https://github.com/dpryan79/libBigWig.git
cd libBigWig
make install

Installing the htslib library

git clone --recurse-submodules https://github.com/samtools/htslib.git
cd htslib 
make install

Installing the GSL library

wget ftp://www.mirrorservice.org/sites/ftp.gnu.org/gnu/gsl/gsl-latest.tar.gz 
tar -xvzpf gsl-latest.tar.gz
cd gsl*
./configure
make
make install

Installing WiggleTools

If you haven't downloaded WiggleTools yet:

git clone https://github.com/Ensembl/WiggleTools.git

Once you have installed the previous libraries and downloaded WiggleTools, you can compile the WiggleTools library:

cd WiggleTools
make

The make process produces a number of outputs:

  • A statically linked library in lib/
  • A header for that library in inc/
  • Various executables in bin/

There is no installation routine, meaning that you should copy the relevant files onto your path, library path, etc. Note that the executable does not require the libraries to be available.

If the compiler cannot find 'gsl/gsl_cdf.h', you need to install the GNU Scientific Library (GSL) as described above.

Just to check, you can launch the tests (requires Python):

make test

Basics

The WiggleTools library, and the derived program, are centered around the use of iterators. An iterator is a function which produces a sequence of values. The cool thing is that iterators can be built off other iterators, offering many combinations.

The wiggletools executable is run by giving it a string which describes an iterator function, which it executes, printing the output into stdout.

wiggletools <program>
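To make the idea of iterators built off other iterators concrete, here is a minimal Python sketch using generators. It is only an analogy: the tuple layout and the function names (read_regions, scale, absolute) are assumptions made for this illustration, not the WiggleTools C API.

```python
from typing import Iterable, Iterator, Tuple

# A region is (chrom, start, end, value). This layout is an assumption
# made for the sketch, not the structure used inside WiggleTools.
Region = Tuple[str, int, int, float]


def read_regions(rows: Iterable[Region]) -> Iterator[Region]:
    """Stand-in for a file reader: yields regions one at a time."""
    for row in rows:
        yield row


def scale(factor: float, source: Iterator[Region]) -> Iterator[Region]:
    """Lazily multiply every value by a scalar."""
    for chrom, start, end, value in source:
        yield chrom, start, end, value * factor


def absolute(source: Iterator[Region]) -> Iterator[Region]:
    """Lazily take the absolute value of every region."""
    for chrom, start, end, value in source:
        yield chrom, start, end, abs(value)


data = [("chr1", 0, 5, -2.0), ("chr1", 5, 9, 3.0)]

# Similar in spirit to: wiggletools abs scale 10 <file>
pipeline = absolute(scale(10, read_regions(data)))
result = list(pipeline)  # nothing is computed until the pipeline is consumed
print(result)
```

Each stage pulls one region at a time from the stage below it, so no intermediate track is ever held in memory as a whole.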

If you need a refresher:

wiggletools --help

If you are an intensive user, you may find that processing many files exceeds the length limits on command lines, especially when shelling out from a scripting language. In that case, you can copy the program into a text file, then execute it:

wiggletools run program.txt
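When the program string is generated by a script, the same approach can be automated. The following is a minimal sketch, assuming the program file simply contains the text you would otherwise pass on the command line; the helper names and the sample_*.bw file names are hypothetical.

```python
import tempfile
from pathlib import Path


def build_program(operator, files):
    """Join an operator and its inputs into one WiggleTools program string."""
    return " ".join([operator, *files])


def save_program(program, directory):
    """Write the program string to program.txt and return its path."""
    path = Path(directory) / "program.txt"
    path.write_text(program)
    return path


# Hypothetical input names, standing in for a long list of real files.
files = [f"sample_{i}.bw" for i in range(3)]

with tempfile.TemporaryDirectory() as tmp:
    program_path = save_program(build_program("mean", files), tmp)
    program_text = program_path.read_text()
    # Argument list for subprocess.run(command, check=True) on a machine
    # where wiggletools is installed; it is not executed in this sketch.
    command = ["wiggletools", "run", str(program_path)]

print(program_text)
```

Passing the arguments as a list rather than a single shell string also avoids quoting problems when file names contain spaces.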

Input files

By default, the executable recognizes the file format from the suffix of the file name:

  • Wiggle files
wiggletools test/fixedStep.wig 
  • BigWig files
wiggletools test/fixedStep.bw 
  • BedGraph files
wiggletools test/bedfile.bg 
  • Bed files
wiggletools test/overlapping.bed 
  • BigBed files
wiggletools test/overlapping.bb 
  • Bam files

Requires a .bai index file in the same directory

wiggletools test/bam.bam
  • Cram files

Requires a .crai index file in the same directory

wiggletools test/cram.cram
  • VCF files
wiggletools test/vcf.vcf
  • BCF files

Requires a .tbi index file in the same directory

wiggletools test/bcf.bcf

Streaming data

You can stream data into WiggleTools, e.g.:

cat test/fixedStep.wig | wiggletools -

The input data is assumed to be in Wig or BedGraph format, but can also be in Sam format:

samtools view test/bam.bam | wiggletools sam -

Operators

Iterators can be constructed from other iterators, allowing arbitrarily complex constructs to be built. We call these iterators operators. In all the examples below, the iterators are built off simple file readers (for simplicity), but you are free to replace the inputs with other iterators.

1 Unary operators

The following operators are the most straightforward, because they only read data from a single other iterator.

  • abs

Returns the absolute value of an iterator's output:

wiggletools abs test/fixedStep.bw 
  • ln

Returns the natural log of an iterator's output:

wiggletools ln test/fixedStep.bw 
  • log

Returns the logarithm in an arbitrary base of an iterator's output:

wiggletools log 10 test/fixedStep.bw 
  • scale

Returns an iterator's output multiplied by a scalar (i.e. decimal number):

wiggletools scale 10 test/fixedStep.bw 
  • offset

Returns an iterator's output added to a scalar (i.e. decimal number):

wiggletools offset 10 test/fixedStep.bw 
  • gt

Returns contiguous boolean regions where the iterator is strictly greater than a given cutoff:

wiggletools gt 5 test/fixedStep.bw 

This is useful to define regions in the apply function, or to compute information content (see below).

  • lt

Returns contiguous boolean regions where the iterator is strictly less than a given cutoff:

wiggletools lt 5 test/fixedStep.bw 

This is useful to define regions in the apply function, or to compute information content (see below).

  • gte

Returns contiguous boolean regions where the iterator is greater than or equal to a given cutoff:

wiggletools gte 5 test/fixedStep.bw 

This is useful to define regions in the apply function, or to compute information content (see below).

  • lte

Returns contiguous boolean regions where the iterator is less than or equal to a given cutoff:

wiggletools lte 5 test/fixedStep.bw 

This is useful to define regions in the apply function, or to compute information content (see below).

  • unit

Returns 1 if the operator is non-zero, 0 otherwise, and merges contiguous positions with the same output value into blocks:

wiggletools unit test/fixedStep.bw 

This is useful to define regions in the apply function (see below).

  • coverage

Returns a coverage plot of overlapping regions, typically read from a bed file:

wiggletools coverage test/overlapping.bed
  • isZero

Does not print anything, just exits with return value 1 (i.e. error) if it encounters a non-zero value:

wiggletools isZero test/fixedStep.bw 
  • seek

Outputs only the points of an iterator within a given genomic region:

wiggletools seek chr1 2 8 test/fixedStep.bw 
  • bin

Sums results into fixed-size bins:

wiggletools bin 2 test/fixedStep.bw 
  • toInt

Casts the iterator's output to an int, effectively rounding any floating point values toward zero.

wiggletools toInt test/fixedStep.bw
  • floor

Returns the floor of an iterator's output. Note that floor rounds the output toward negative infinity.

wiggletools floor test/fixedStep.bw
  • shiftPos

Returns the iterator given with start and end positions shifted downwards by a specified value. Note the given value must be non-negative, as default behavior is to shift coordinates toward zero.

wiggletools shiftPos 10 test/fixedStep.bw
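As an illustration of what "contiguous boolean regions" means for the gt operator above, here is a minimal Python sketch. It assumes regions arrive sorted as (chrom, start, end, value) tuples and that only the regions passing the cutoff are emitted; both are assumptions of the sketch, and the function is not the WiggleTools implementation.

```python
def gt(cutoff, regions):
    """Yield merged (chrom, start, end, 1.0) blocks where value > cutoff."""
    current = None  # block being grown: [chrom, start, end]
    for chrom, start, end, value in regions:
        if value > cutoff:
            if current and current[0] == chrom and current[2] == start:
                current[2] = end  # adjacent region: extend the open block
                continue
            if current:
                yield (current[0], current[1], current[2], 1.0)
            current = [chrom, start, end]
        elif current:
            # Value at or below the cutoff closes the open block.
            yield (current[0], current[1], current[2], 1.0)
            current = None
    if current:
        yield (current[0], current[1], current[2], 1.0)


data = [
    ("chr1", 0, 2, 6.0),
    ("chr1", 2, 4, 7.0),
    ("chr1", 4, 6, 1.0),
    ("chr1", 6, 8, 9.0),
]
blocks = list(gt(5, data))
print(blocks)
```

The first two input regions are adjacent and both exceed the cutoff, so they come out as a single block; the region with value 1.0 separates them from the last block.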

2 Binary operators

The following operators read data from exactly two iterators, allowing comparisons:

  • diff

Returns the difference between two iterators' outputs:

wiggletools diff test/fixedStep.bw test/variableStep.bw 
  • ratio

Returns the output of the first iterator divided by the output of the second (divisions by 0 are squashed, and no result is given for those bases):

wiggletools ratio test/fixedStep.bw test/variableStep.bw 
  • overlaps

Returns the output of the second iterator that overlaps regions of the first.

wiggletools overlaps test/fixedStep.bw test/variableStep.bw 
  • trim

Same as above but trims the regions to the overlapping portions:

wiggletools trim test/fixedStep.bw test/variableStep.bw
  • trimFill

Same as trim, but fills in trimmed regions with the default value of the second iterator.

wiggletools trimFill test/fixedStep.bw test/overlapping_coverage.wig
  • nearest

Returns the regions of the second iterator and their distance to the nearest region in the first iterator.

wiggletools nearest test/fixedStep.bw test/variableStep.bw 
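The difference between overlaps and trim above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a single chromosome, half-open (start, end, value) tuples, and a quadratic scan instead of the synchronised iteration a real implementation would use.

```python
def overlaps(first, second):
    """Whole regions of `second` that overlap any region of `first`."""
    return [
        (start, end, value)
        for start, end, value in second
        if any(start < f_end and f_start < end for f_start, f_end, _ in first)
    ]


def trim(first, second):
    """Regions of `second`, cut down to their intersection with `first`."""
    trimmed = []
    for start, end, value in second:
        for f_start, f_end, _ in first:
            low, high = max(start, f_start), min(end, f_end)
            if low < high:
                trimmed.append((low, high, value))
    return trimmed


first = [(10, 20, 1.0)]
second = [(5, 12, 3.0), (18, 25, 4.0), (30, 40, 5.0)]

kept = overlaps(first, second)
cut = trim(first, second)
print(kept)
print(cut)
```

Here overlaps keeps the two regions that touch 10-20 with their original boundaries, while trim clips them to 10-12 and 18-20. The region at 30-40 is dropped by both.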

3 Multiplexed iterators

Sometimes you want to compute statistics across many iterators. In this case, the function is followed by an arbitrary list of iterators, separated by spaces. The list is terminated by a colon (:), itself separated by spaces from other words. At the very end of a command string, the colon can be omitted (see the example for sum).

  • sum

The sum function sums all the listed iterators. The two following commands are equivalent:

wiggletools sum test/fixedStep.bw test/variableStep.bw :
wiggletools sum test/fixedStep.bw test/variableStep.bw

However, the colon can be necessary for the program string to be unambiguous, e.g.:

wiggletools diff sum test/fixedStep.bw test/variableStep.bw \
            : test/fixedStep
  • mult

Multiplies the subsequent list of iterators:

wiggletools mult test/fixedStep.bw test/variableStep.bw 
  • mean

Computes the mean of the subsequent list of iterators at each position:

wiggletools mean test/fixedStep.bw test/variableStep.bw 
  • median

Computes the median of the subsequent list of iterators at each position:

wiggletools median test/fixedStep.bw test/variableStep.bw 
  • var

Computes the variance of the subsequent list of iterators at each position:

wiggletools var test/fixedStep.bw test/variableStep.bw 
  • stddev

Computes the standard deviation of the subsequent list of iterators at each position:

wiggletools stddev test/fixedStep.bw test/variableStep.bw 
  • entropy

Computes the Shannon entropy of the subsequent list of iterators at each position, separating 0 from non-0 values. This is probably most useful with the gt (greater than) filter:

wiggletools entropy gt 5 test/fixedStep.bw test/overlapping.bb
  • CV

Computes the coefficient of variation ( = standard deviation / mean) of the subsequent list of iterators at each position:

wiggletools CV test/fixedStep.bw test/variableStep.bw 
  • min

Computes the minimum of the subsequent list of iterators at each position:

wiggletools min test/fixedStep.bw test/variableStep.bw 
  • max

Computes the maximum of the subsequent list of iterators at each position:

wiggletools max test/fixedStep.bw test/variableStep.bw 
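To illustrate the entropy reducer described above, here is a minimal Python sketch of a Shannon entropy that only distinguishes zero from non-zero values at one position. The use of log base 2 (entropy in bits) is an assumption of this sketch; the base used by WiggleTools is not stated here.

```python
import math


def binary_entropy(values):
    """Shannon entropy (in bits) of the zero / non-zero split of `values`."""
    p = sum(1 for v in values if v != 0) / len(values)
    if p in (0.0, 1.0):
        return 0.0  # all values fall in the same class: no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# One list per genomic position: the values of each input track there,
# for example after thresholding each track with gt.
mixed = binary_entropy([1.0, 0.0])
uniform = binary_entropy([1.0, 1.0, 1.0])
print(mixed, uniform)
```

The entropy is highest where half of the tracks are non-zero and falls to zero where all tracks agree, which is why thresholding the inputs first makes the result easier to interpret.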

4 Comparing sets of sets

  • Welch's t-test

Computes the two-tailed p-value of Welch's t-test comparing two sets of numbers, each assumed to have a normal distribution:

wiggletools ttest test/fixedStep.bw test/variableStep.bw test/fixedStep.wig \
            : test/fixedStep.wig test/variableStep.bw test/fixedStep.wig
  • F-test

Computes the p-value of the F-test comparing sets of numbers, each assumed to have a normal distribution:

wiggletools ftest test/fixedStep.bw test/variableStep.bw test/fixedStep.wig \
            : test/fixedStep.wig test/variableStep.bw test/fixedStep.wig
  • Wilcoxon's rank sum test

Non-parametric equivalent of Welch's t-test above:

wiggletools wilcoxon test/fixedStep.bw test/variableStep.bw test/fixedStep.wig \
            : test/fixedStep.wig test/variableStep.bw test/fixedStep.wig
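For reference, the statistic behind Welch's t-test can be computed at a single position as in the following Python sketch. It uses the textbook formulas for the t statistic and the Welch-Satterthwaite degrees of freedom; turning these into a two-tailed p-value additionally needs the Student t distribution, which is left out to keep the sketch dependency-free. The function name is hypothetical and this is not the WiggleTools code.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for two samples."""
    var_a = variance(a) / len(a)  # sample variance over sample size
    var_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df


# Values of each set of tracks at one genomic position.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(round(t_stat, 6), round(dof, 6))
```

Unlike Student's t-test, the two sets are not assumed to share the same variance, which is why the degrees of freedom are generally not an integer.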

5 Mapping a unary function to an iterator list:

If you wish to apply the same function to a list of iterators without typing redundant keywords, you can use the map function, which applies said operator to each element of the list:

wiggletools sum map ln test/fixedStep.bw test/variableStep.bw
wiggletools sum map scale -1 test/fixedStep.bw test/variableStep.bw

Writing into files

Stdout is great and all, but sometimes you want to specify an output file on the command line without the use of pipes. This is done with the write function. It writes the output of an iterator into a wiggle file, and simultaneously returns the same output:

wiggletools write copy.wiggle test/fixedStep.wig 

The write instruction is itself an iterator, such that you can store data in a file, yet keep it in memory for more computation. For example, the following computes the mean of two files, stores the result in a file, and also compares that result to another iterator:

wiggletools diff test/fixedStep.bw \
write sum.wig mean test/fixedStep.bw test/variableStep.bw 
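The way write stores data while passing it on can be pictured with a Python generator. This is a minimal sketch: the tab-separated line format and the function name are assumptions of the illustration, not the exact output of WiggleTools.

```python
import io


def write(handle, source):
    """Write each region to `handle`, then pass it on unchanged."""
    for chrom, start, end, value in source:
        handle.write(f"{chrom}\t{start}\t{end}\t{value:f}\n")
        yield chrom, start, end, value


data = [("chr1", 0, 5, 1.5), ("chr1", 5, 9, 3.0)]
buffer = io.StringIO()  # stands in for the output file

# Downstream computation consumes the same stream that is being written.
doubled = [value * 2 for _, _, _, value in write(buffer, iter(data))]
written = buffer.getvalue()
print(doubled)
print(written, end="")
```

Because writing happens as a side effect of iteration, the data is read only once even though it ends up both in the file and in the downstream computation.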

For convenience, if a command starts with a write instruction, the standard output is squashed. Otherwise, if you want to silence standard out, use the do command, which simply runs an iterator and returns nothing:

wiggletools do test/fixedStep.wig 

If you wish to write into standard output, simply use the dash - symbol.

wiggletools write - test/fixedStep.wig 

If you wish to have your output in BedGraph format (takes more space but easier to parse line-by-line), use the write_bg command:

wiggletools write_bg - test/fixedStep.wig 

Note that BedGraphs and the BedGraph sections within wiggle files are 0-based, whereas the 'normal' wiggle lines have 1-based coordinates.

The BedGraph output respects the layout of the input iterator. However, if you want consecutive regions with the same value to be collapsed together, you can use the "compress" keyword:

wiggletools write_bg - compress test/fixedStep.wig 
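The coordinate note and the compress keyword above can be illustrated together. The following Python sketch converts a fixedStep wiggle block (1-based start) into 0-based, half-open BedGraph rows, then merges consecutive rows that carry the same value. Function names are hypothetical and the sketch is independent of the WiggleTools code.

```python
def wig_to_bedgraph(chrom, start, step, span, values):
    """fixedStep wiggle block (1-based start) -> 0-based half-open rows."""
    for index, value in enumerate(values):
        begin = start - 1 + index * step
        yield chrom, begin, begin + span, value


def compress(rows):
    """Merge adjacent rows on the same chromosome with identical values."""
    previous = None
    for chrom, begin, end, value in rows:
        if (
            previous
            and previous[0] == chrom
            and previous[2] == begin
            and previous[3] == value
        ):
            previous = (chrom, previous[1], end, value)
        else:
            if previous:
                yield previous
            previous = (chrom, begin, end, value)
    if previous:
        yield previous


rows = list(wig_to_bedgraph("chr1", start=1, step=1, span=1, values=[2.0, 2.0, 5.0]))
merged = list(compress(rows))
print(rows)
print(merged)
```

The first wiggle position, 1, becomes the BedGraph interval 0-1, and the two leading rows with value 2.0 collapse into a single 0-2 interval.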

Writing multidimensional wiggles into files

Sometimes, for your own reasons, you may want to print out multiple wiggles side by side. This can be done with the dedicated mwrite and mwrite_bg operators.

Warning!! Multidimensional wiggles are not part of the BigWig/BedGraph specs, and will probably spark an error with the Kent apps. These are solely designed for your own usage.

wiggletools mwrite_bg - test/overlapping.bed test/fixedStep.bw

Statistics

Sometimes, you just want a statistic across the genome. The following functions do not return a sequence of numbers, just a single number. Some of these integrating functions have "I" appended to them to distinguish them from the iterators with related (yet different) functions.

  • AUC

Computes the area under the curve (AUC) of an iterator:

wiggletools AUC test/fixedStep.bw
  • meanI

Computes the mean of an iterator across all of its points:

wiggletools meanI test/fixedStep.bw 
  • varI

Computes the variance of an iterator across all of its points:

wiggletools varI test/fixedStep.bw 
  • stddevI

Computes the standard deviation of an iterator across all of its points:

wiggletools stddevI test/fixedStep.bw 
  • CVI

Computes the coefficient of variation of an iterator across all of its points:

wiggletools CVI test/fixedStep.bw 
  • maxI

Computes the maximum of an iterator across all of its points:

wiggletools maxI test/fixedStep.bw 
  • minI

Computes the minimum of an iterator across all of its points:

wiggletools minI test/fixedStep.bw 
  • pearson

Computes the Pearson correlation between two iterators across all their points:

wiggletools pearson test/fixedStep.bw test/fixedStep.bw
  • energy

Computes the energy density at a given wavelength:

wiggletools energy 10 test/fixedStep.bw

Chaining statistics

All the above functions are actually iterators that transmit the same data as they are given, e.g.:

wiggletools test/fixedStep.bw 
wiggletools scale 1 AUC test/fixedStep.bw 

This allows you to plug multiple statistics in a daisy chain off the same iterator. Note how results of the operators are concatenated as they are read from left to right:

wiggletools meanI varI minI maxI test/fixedStep.bw 
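One way to picture statistics that transmit their input unchanged is a chain of Python generators that each accumulate a value while passing regions through. This is a sketch under assumptions: the names are hypothetical, the mean is weighted by region width, and results are collected in a dictionary rather than printed in the order WiggleTools uses.

```python
def mean_i(source, report):
    """Pass regions through; record the base-weighted mean at the end."""
    total = 0.0
    length = 0
    for chrom, start, end, value in source:
        total += value * (end - start)
        length += end - start
        yield chrom, start, end, value
    report["meanI"] = total / length if length else float("nan")


def max_i(source, report):
    """Pass regions through; record the maximum value at the end."""
    best = None
    for region in source:
        best = region[3] if best is None else max(best, region[3])
        yield region
    report["maxI"] = best


data = [("chr1", 0, 2, 1.0), ("chr1", 2, 4, 3.0)]
report = {}

# Similar in spirit to: wiggletools meanI maxI <file>
passed_through = list(mean_i(max_i(iter(data), report), report))
print(report)
```

Both statistics are computed in a single pass over the data, and the regions coming out of the chain are identical to those going in.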

If you want to save the output of a statistic into a file, you can use the print statement:

wiggletools print output.txt AUC test/fixedStep.bw

As with other write functions, if a command starts with a print statement, the standard output is squashed.

Apply

The apply function reads the regions from one iterator, then computes a given statistic on another iterator across those regions. You can chain the statistics as above. Because of this feature, the apply operator returns a multiplexer (i.e. a multidimensional wiggle), hence the mwrite operator before it:

wiggletools mwrite_bg - apply meanI stddevI unit test/variableStep.bw test/fixedStep.bw

For convenience, if the apply operator is used in a context which expects a standard unidimensional wiggle, it is transformed into one. In particular, if you computed multiple statistics in parallel as above, only the first is retained.
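As a rough picture of what apply does with a single statistic, the following Python sketch computes a mean of a signal over each region. It assumes a single chromosome, half-open (start, end) regions, a mean weighted by the number of covered bases, and bases without signal being left out of the mean; the function name is hypothetical and this is not the WiggleTools implementation.

```python
def apply_mean(regions, signal):
    """Base-weighted mean of `signal` over each (start, end) region."""
    results = []
    for r_start, r_end in regions:
        total = 0.0
        covered = 0
        for s_start, s_end, value in signal:
            overlap = min(r_end, s_end) - max(r_start, s_start)
            if overlap > 0:
                total += value * overlap
                covered += overlap
        # Regions with no signal at all get no value.
        results.append((r_start, r_end, total / covered if covered else None))
    return results


regions = [(0, 4), (10, 12)]
signal = [(0, 2, 1.0), (2, 6, 3.0)]
means = apply_mean(regions, signal)
print(means)
```

The first region covers two bases at 1.0 and two bases at 3.0, giving a mean of 2.0; the second region has no signal and therefore no value.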

  • Apply and Paste

This is a convenience wrapper around the above function: it reads the regions directly from a Bed file, then prints out each line of the file, with the resulting statistic appended at the end of the line. This is useful to keep identifiers and other metadata contained in the same file as the results. Note that the mwrite operator is unnecessary in this case:

wiggletools apply_paste output_file.txt meanI test/overlapping.bed test/fixedStep.bw

Profiles

To generate a fixed width summary of an iterator across a collection of regions, you can request the profiles function. This will print out the profiles, one for each region:

wiggletools profiles results.txt 3 test/overlapping.bed test/fixedStep.wig

If you just want a single profile, which sums up the results of all those profiles, you simply do:

wiggletools profile results.txt 3 test/overlapping.bed test/fixedStep.wig

As above, the output file name can be replaced by a dash (-) to print to standard output.

Histograms

To generate a histogram of values across the iterator, simply use the histogram command. The number of bins must be pre-defined:

wiggletools histogram results.txt 10 test/fixedStep.wig

A histogram can hold multiple distributions:

wiggletools histogram results.txt 10 test/fixedStep.wig test/variableStep.wig

The format of the output should be fairly self-explanatory: each line starts with the midpoint value of a bin, followed by the values for that bin, tab-delimited.

The algorithm used to compute these histograms is approximate: it adapts the width of the bins to the data received, and requires very little memory or computation. However, the values of the bins are not quite exact, as some points might be counted in a bin neighbouring the one they should belong to. Normally, over large datasets, these approximations should roughly even out.

Parallel processing

To aid in running WiggleTools efficiently, a script, parallelWiggletools.py, was designed to automate the batching of multiple jobs and the merging of their output. At the moment, this script requires an LSF job queueing system.

To run this script, you must first provide a tab-delimited file that specifies the names and lengths of all the chromosomes in your genome; see test/chrom_sizes for an example.

You then specify a WiggleTools command. Note how the write function now points to a BigWig file:

parallelWiggletools.py test/chrom_sizes 'write copy.bw test/fixedStep.bw'

Because these are asynchronous jobs, they generate a number of files for input, stdout and stderr. If these files get in your way, you can change the DUMP_DIR variable in the parallelWiggletools.py script to another directory which is visible to all the nodes in the LSF farm.

Default Values

A basic underlying question is how to deal with missing values. In some cases, no value in a BigWig file implicitly means 0, typically when working with coverage statistics or peaks. However, sometimes, you want positions with no values to be disregarded.

  • To deal with this, every iterator has a default value. By default, any file being read has a default value of 0. The default value of composed iterators is computed from their inputs. For example, if B is equal to A multiplied by 10, then the default value of B is 10 times that of A. The default value of an iterator can be directly set with the default keyword:
wiggletools sum test/fixedStep.wig test/variableStep.wig
wiggletools sum test/fixedStep.wig default 10 test/variableStep.wig
  • When a set of iterators A1, A2 ... is composed by an n-ary iterator M, M will skip the regions which are skipped by all the input iterators. However, when there is more than one input iterator and they do not perfectly overlap, there will be regions which are covered by, say, A1 but not A2. Two behaviours are defined: if M is strict, it skips those regions; otherwise it replaces missing values with the corresponding default values.

By default, n-ary iterators are not strict, but they can be made so with the strict keyword after the n-ary function name:

wiggletools sum test/fixedStep.wig test/variableStep.wig
wiggletools sum strict test/fixedStep.wig test/variableStep.wig
  • However, when integrating statistics across the genome, or regions of the genome, missing values are discarded by default, because it is not known which regions need to be filled in.

The fillIn operator allows you to define which regions should be filled in with the default value. It takes two iterators: the first defines the regions to cover, the second supplies the values. Within those regions, it returns the value of the second iterator wherever one exists, and the default value of the second iterator otherwise. Note that this incurs a slight processing cost, so only use this operator at the last step of your computations, right before computing a statistic.

wiggletools test/variableStep.wig
wiggletools meanI - test/variableStep.wig
wiggletools fillIn test/fixedStep.wig test/variableStep.wig
wiggletools meanI - fillIn test/fixedStep.wig test/variableStep.wig

For convenience, the fillIn keyword can be used in the apply commands, as is:

wiggletools apply meanI unit test/variableStep.bw test/fixedStep.bw
wiggletools apply meanI fillIn unit test/variableStep.bw test/fixedStep.bw
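The rules for default values and the strict keyword described in this section can be summarised with a small Python sketch of a sum at one position. Tracks are modelled as a dictionary of values by position plus a default value; this representation and the function name are assumptions made for the illustration only.

```python
def sum_at(position, tracks, strict=False):
    """Sum of all tracks at `position`, or None if the position is skipped.

    Each track is a (values_by_position, default_value) pair.
    """
    present = [values.get(position) for values, _ in tracks]
    if all(v is None for v in present):
        return None  # skipped by every input, so skipped by the sum
    if strict and any(v is None for v in present):
        return None  # strict: every input must cover the position
    return sum(
        default if v is None else v
        for v, (_, default) in zip(present, tracks)
    )


a = ({1: 2.0, 2: 2.0}, 0.0)  # default value 0, as for any file being read
b = ({2: 5.0}, 10.0)         # as if declared with: default 10

lenient = sum_at(1, [a, b])
strict_result = sum_at(1, [a, b], strict=True)
print(lenient, strict_result, sum_at(2, [a, b]), sum_at(3, [a, b]))
```

At position 1 only the first track has a value, so the non-strict sum substitutes the default of the second track, while the strict sum skips the position. Position 3 is covered by neither track and is skipped in both modes.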

Creating your own functions

If you're looking for guidance on creating your own iterators, you can have a look at:

  • Iterator -> Float: src/statistics.c, MeanIntegrator
  • Iterator -> Iterator: src/unaryOps.c, ScaleWiggleIterator
  • Set of iterators -> Iterator: src/reducers.c, MaxReduction
  • Set of Set of Iterators -> Iterator: src/setComparisons.c, TTestReduction

More info on the basic objects at:

  • wiggleIterator.h: Simple iterator.
  • multiplexer.h: Set of synchronised iterators.
  • multiSet.h: Set of set of synchronised iterators.

You may need some help hooking your new functions to the parser; we can help you out.

Citing WiggleTools

Zerbino DR, Johnson N, Juettemann T, Wilder SP and Flicek PR: WiggleTools: parallel processing of large collections of genome-wide datasets for visualization and statistical analysis. Bioinformatics 2014 30:1008-1009.

wiggletools's People

Contributors

dahlo, dzerbino, heavywatal, hoffman, jmarshall, joshuak94, juettemann, lelouar, nathanweeks, purcaro, rekado, sgiorgetti

wiggletools's Issues

Bug in 'bin'

Hi,
I found a bug in bin. I have a bigwig whose max values are 1.0:

$wiggletools write_bg - fillIn genome.bed blah.bw | head -n 1000000 | cut -f 4 | sort -nr | head
1.000000
1.000000
1.000000
1.000000

However, when I do bin 20 scale 0.05, I am getting values > 1 (either with or without write_bg), which should be impossible:

$wiggletools bin 20 scale 0.05 fillIn genome.bed blah.bw
chr1	109840	109860	0.982538
chr1	109860	110020	1.000000
chr1	110020	110040	1.321969
chr1	110040	110060	0.959469
chr1	110060	110080	0.961057
$wiggletools write_bg - bin 20 scale 0.05 fillIn genome.bed 
chr1	109980	110000	1.000000
chr1	110000	110020	1.000000
chr1	110020	110040	1.321969
chr1	110040	110060	0.959469
chr1	110060	110080	0.961057

It is happening due to bin, because without scale, it is still happening (max value should be 20):

$ wiggletools bin 20 fillIn genome.bed blah.bw
chr1	109820	109840	19.169865
chr1	109840	109860	19.650755
chr1	109860	110020	20.000000
chr1	110020	110040	26.439389
chr1	110040	110060	19.189385
chr1	110060	110080	19.221145

This is the original area of that issue without binning:

$wiggletools fillIn genome.bed blah.bw
chr1    109845  109847  0.954545
chr1    109847  109848  0.950000
chr1    109848  110027  1.000000
chr1    110027  109848  0.000000
chr1    109848  110027  1.000000
chr1    110027  110032  0.954545
chr1    110032  110041  0.958333

I'm not sure of the source of the bug, but I suspect that binning is not properly weighting the proportion of the bin that is overlapped by the bigwig values.
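To make the expectation in this report concrete, here is a minimal Python sketch of overlap-weighted binning. It is an illustration of the expected arithmetic, not the WiggleTools code, and it assumes a single chromosome with half-open (start, end, value) regions. The input is taken from the unbinned output above, leaving out the out-of-order line and the repeated region.

```python
def bin_sum(width, regions):
    """Sum per-base values into fixed-size bins aligned on multiples of width."""
    totals = {}
    for start, end, value in regions:
        position = start
        while position < end:
            bin_start = (position // width) * width
            chunk_end = min(end, bin_start + width)
            # Each base contributes its value once to the bin it falls in.
            totals[bin_start] = totals.get(bin_start, 0.0) + value * (chunk_end - position)
            position = chunk_end
    return sorted((b, b + width, total) for b, total in totals.items())


regions = [
    (109848, 110027, 1.000000),
    (110027, 110032, 0.954545),
    (110032, 110041, 0.958333),
]
bins = {start: total for start, _, total in bin_sum(20, regions)}
expected = bins[110020]
print(round(expected, 6))
```

With this weighting the bin 110020-110040 sums to about 19.439389 (7 x 1.0 + 5 x 0.954545 + 8 x 0.958333), which stays below 20. The reported value of 26.439389 is exactly 7.0 higher, which would be consistent with the 7 bases from 110020 to 110027 being counted twice, although this is only an inference from the numbers shown.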

Error accessing remote bigWig

I am attempting to access an ENCODE bigWig by URL:

wiggletools overlaps promoter-region.bb https://encode-public.s3.amazonaws.com/2017/10/03/ad2c0f17-0824-4647-a749-74276daca7da/ENCFF278CUB.bigWig

I get the error:

[urlOpen] curl_easy_perform received an error: Failed writing received data to disk/application
File https://encode-public.s3.amazonaws.com/2017/10/03/ad2c0f17-0824-4647-a749-74276daca7da/ENCFF278CUB.bigWig is not in BigWig format

If the bigWig is downloaded locally, this works fine.

The bigBed contains one record:
chr19 6925784 6926845

thanks!

Ubuntu14 installation error

Hi,

I've been trying to install wiggletools on my desktop. Following the installation instructions, I installed all the prerequisite packages, but I still can't install wiggletools.

This is the output when I run sudo make:

mkdir -p bin
cd src; make -e
make[1]: Entering directory `/home/yichao/Documents/software/WiggleTools/src'
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c wiggleIterator.c -o wiggleIterator.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c wigReader.c -o wigReader.o
wigReader.c: In function ‘WiggleReaderSeek’:
wigReader.c:216:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c bigWiggleReader.c -o bigWiggleReader.o
bigWiggleReader.c: In function ‘BigWiggleReaderSeek’:
bigWiggleReader.c:98:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c multiplexer.c -o multiplexer.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c reducers.c -o reducers.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c bedReader.c -o bedReader.o
bedReader.c: In function ‘BedReaderSeek’:
bedReader.c:95:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c bigBedReader.c -o bigBedReader.o
bigBedReader.c: In function ‘BigBedReaderSeek’:
bigBedReader.c:98:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c bamReader.c -o bamReader.o
bamReader.c: In function ‘BamReaderSeek’:
bamReader.c:201:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c apply.c -o apply.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c commandParser.c -o commandParser.o
commandParser.c: In function ‘parseFile’:
commandParser.c:872:7: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(buffer, 1, length, file);
       ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c wigWriter.c -o wigWriter.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c statistics.c -o statistics.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c unaryOps.c -o unaryOps.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c multiSet.c -o multiSet.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c setComparisons.c -o setComparisons.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c bufferedReader.c -o bufferedReader.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c vcfReader.c -o vcfReader.o
vcfReader.c: In function ‘VcfReaderSeek’:
vcfReader.c:70:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c bcfReader.c -o bcfReader.o
bcfReader.c: In function ‘downloadBCFFile’:
bcfReader.c:53:3: warning: passing argument 2 of ‘pushValuesToBuffer’ discards ‘const’ qualifier from pointer target type [enabled by default]
   if (pushValuesToBuffer(data->bufferedReaderData, bcf_hdr_id2name(data->bcf_header, vcf_line->rid), vcf_line->pos+1, vcf_line->pos+2, 1))
   ^
In file included from bcfReader.c:19:0:
bufferedReader.h:26:6: note: expected ‘char *’ but argument is of type ‘const char *’
 bool pushValuesToBuffer(BufferedReaderData * data, char * chrom, int start, int finish, double value);
      ^
bcfReader.c: In function ‘BcfReaderSeek’:
bcfReader.c:94:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c plots.c -o plots.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c mWigWriter.c -o mWigWriter.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c recycleBin.c -o recycleBin.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c fib.c -o fib.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c samReader.c -o samReader.o
samReader.c: In function ‘SamReaderSeek’:
samReader.c:186:14: warning: assignment discards ‘const’ qualifier from pointer target type [enabled by default]
  data->chrom = chrom;
              ^
mkdir -p ../lib
ar rcs ../lib/libwiggletools.a *.o
cc -g -Wall -O3 -std=gnu99  -D_PBGZF_USE -c wiggletools.c -o wiggletools.o
mkdir -p ../bin
cc -g -Wall -O3 -std=gnu99 -L../lib wiggletools.c -lwiggletools -lBigWig -lcurl -lhts -lgsl  -lgslcblas -lz -lpthread -lm -o ../bin/wiggletools 
make[1]: Leaving directory `/home/yichao/Documents/software/WiggleTools/src'
cd python/wiggletools; make
make[1]: Entering directory `/home/yichao/Documents/software/WiggleTools/python/wiggletools'
mkdir -p ../../bin
cp [^_]*.py *.sh ../../bin
make[1]: Leaving directory `/home/yichao/Documents/software/WiggleTools/python/wiggletools'
chmod 755 bin/*

When I run sudo make test, there is an error:

cd test; python2.7 test.py
Testing: ../bin/wiggletools do isZero diff fixedStep.bw variableStep.wig
../bin/wiggletools: error while loading shared libraries: libBigWig.so: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "test.py", line 22, in <module>
    assert test('../bin/wiggletools do isZero diff fixedStep.bw variableStep.wig') == 1
AssertionError
make: *** [tests] Error 1

Can you help me figure out what might have gone wrong? Thank you so much!

easy_install.sh chokes on gsl install on Ubuntu 14.04 LTS

Trying to install WiggleTools using easy_install.sh on Ubuntu 14.04 LTS results in an error when attempting to install prerequisite GSL. It appears the the apt package manager has different naming convention for the package libgsl0 -> libgsl0ldbl. Also see this thread for gsl package names

Manually installing libgsl0ldbl via apt-get install libgsl0ldbl followed by running easy_install.sh results in successful compilation.

BigWig.h missing from src folder, make cannot run

Hello all, I am trying to install WiggleTools from source. I have downloaded and installed the prerequisites, but after I clone WiggleTools from GitHub and try to make it in my folder on my server, I get this error:

cd src; make -e
make[1]: Entering directory 'path-to-software/WiggleTools/src'
cc -g -Wall -O3 -std=gnu99 -D_PBGZF_USE -c bigWiggleReader.c -o bigWiggleReader.o
bigWiggleReader.c:16:20: fatal error: bigWig.h: No such file or directory
#include "bigWig.h"

compilation terminated.
make[1]: *** [bigWiggleReader.o] Error 1
make[1]: Leaving directory 'path-to-software/WiggleTools/src'
make: *** [Wiggletools] Error 2

(Directory simplified in the above output.) The src folder in the GitHub repository lacks this bigWig.h file, but it seems to be required for installation.

how to average bw files?

Hi,

I tried to run the following command to average bw files and output a bw file. But the resulting bw file is not a binary file. How do I generate an averaged bw?

wiggletools write test_comb_wiggletool.bw mean test_rep*_spike.bw

gt not working properly after bin and scale

Hi,
This command is not working. It is giving unexpected and incorrect results. It should sum in bins of 20 bp, then scale by 0.05, then filter only for bins with value > 0.5. But this is not what is happening.
Thanks.

wiggletools bin 20 scale 0.05 gt 0.5 file.bw

Perhaps it is not operating on the binned output and is still operating on the wiggle output? If so, it might be nice to have a way to operate on binned output in iterator chains.
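One possible reading, based on the grammar printed by wiggletools --help (iterator = (unary_operator) (iterator)), is that operators are written in prefix order: each operator wraps everything to its right, so the operator closest to the file is applied first. Under that assumption the command above thresholds first, then scales, then bins. The sketch below is a toy Python model of that evaluation order, not WiggleTools code; the function names and the per-base list are hypothetical, and gt is simplified to return 0.0 where WiggleTools would emit no region.

```python
def gt(threshold, values):
    # 1.0 where the value is strictly greater than the threshold, else 0.0
    return [1.0 if v > threshold else 0.0 for v in values]


def scale(factor, values):
    # Multiply every value by a constant
    return [v * factor for v in values]


def bin_sum(width, values):
    # Sum consecutive, non-overlapping windows of `width` values
    return [sum(values[i:i + width]) for i in range(0, len(values), width)]


signal = [1.0, 1.0, 4.0, 0.0]  # hypothetical per-base values

# "bin 2 scale 0.5 gt 1.5 signal" read as prefix notation:
# threshold first, then scale, then bin
as_written = bin_sum(2, scale(0.5, gt(1.5, signal)))

# Bin first, then scale, then threshold, which in prefix order
# would be written "gt 1.5 scale 0.5 bin 2 signal"
as_intended = gt(1.5, scale(0.5, bin_sum(2, signal)))

print(as_written)   # [0.0, 0.5]
print(as_intended)  # [0.0, 1.0]
```

If this reading is right, moving gt to the front of the chain would apply the threshold to the binned and scaled values.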

Recommendations to compute normalizations with severals samples and one input

I have been testing WiggleTools for a few days and was wondering whether you have any advice on the normalisation step. Does Ensembl recommend a specific process?

For example, say you have done ChIP-seq for one histone mark, with two (or more) replicates and one input, all in BAM format (from a bwa alignment, for example). In the end you want a single "normalised" wiggle file to display in a viewer, or to plot the signal profile around several coordinates.

The final wiggle file should be normalised in RPKM (I know this is what deeptools does) or RPM (scaled by the number of mapped reads only, which is what I do).

I was thinking of trying the following. Does it seem OK? Thanks:

Normalise Replicate 1:
wiggletools write Rep1.normalised.wig scale 1/TotalMappedReads scale 1000000 Rep1.bam

Normalise Replicate 2:
wiggletools write Rep2.normalised.wig scale 1/TotalMappedReads scale 1000000 Rep2.bam

Normalise Control:
wiggletools write Control.normalised.wig scale 1/TotalMappedReads scale 1000000 Control.bam

Final normalisation to get 1 wiggle file:
wiggletools write mean.normalised.posOnly.wig trim lengths.bed gt 0 diff mean Rep1.normalised.wig Rep2.normalised.wig : Control.normalised.wig

How would you do it for RPKM?
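The arithmetic behind the two units can be sketched as follows. The help text lists the operator as scale (float), so the factor presumably has to be a plain number computed beforehand rather than an expression like 1/TotalMappedReads; that is an assumption, not something the source confirms. The function names are hypothetical, and the mapped-read count is assumed to come from an external tool (for instance samtools view -c -F 4).

```python
def rpm_scale_factor(total_mapped_reads):
    """Factor that converts raw coverage into reads per million (RPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return 1_000_000 / total_mapped_reads


def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    per_kilobase = read_count / (region_length_bp / 1000)
    return per_kilobase / (total_mapped_reads / 1_000_000)


# Hypothetical library with 20 million mapped reads
factor = rpm_scale_factor(20_000_000)
print(factor)                        # 0.05
print(rpkm(100, 2000, 20_000_000))   # 2.5
```

With these numbers the two scale steps in the post collapse into a single one, scale 0.05. RPKM additionally divides by the region length in kilobases, so it only has a defined meaning per region or per bin, not per base.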

Output of wiggletools overlaps cannot be opened in IGV

Hello,

I used the overlaps function to combine my two bw files. The command is:

wiggletools overlaps /Volumes/PBLAB/Seq_data_temp/F03_PBS_A1/F03_PBS_A1.bw /Volumes/PBLAB/Seq_data_temp/F03_PBS_A2/F03_PBS_A2.bw > /Volumes/PBLAB/Seq_data_temp/F03_PBS_A1.bw

Both original files can be opened in IGV, but the combined bw file cannot.

Do you know why this happens? How can I solve this?

No two-digit chromosomes in the output wig

I am trying to run a command that used to work one year ago (I did not change the executable):

wiggletools write final.wig trim length.chromosomes.sorted.bed diff mean A.rep1.bw B.rep2.bw : C.control.bw
To explain a little: I compute the mean of two replicates and subtract the input signal. I don't remember why I added the trim step, but it is the part that causes the problem. Without the trim step, the output is fine.

In the output final.wig, I only have signal for chr1 to chr9 and chrX/chrY.

I converted these files (A.rep1.bw B.rep2.bw C.control.bw) back to wig to check whether they were also empty for chr10, for example, but they do contain signal for two-digit chromosomes.

awk '{ print $1}' final.wig | sort -u

chr1
chr2
chr3
chr4
chr5
chr6
chr7
chr8
chr9
chrX
chrY

Any ideas?

My file with chromosome lengths, length.chromosomes.sorted.bed:

GL000008.2 0 209709
GL000009.2 0 201709
GL000194.1 0 191469
GL000195.1 0 182896
GL000205.2 0 185591
GL000208.1 0 92689
GL000213.1 0 164239
GL000214.1 0 137718
GL000216.2 0 176608
GL000218.1 0 161147
GL000219.1 0 179198
GL000220.1 0 161802
GL000221.1 0 155397
GL000224.1 0 179693
GL000225.1 0 211173
GL000226.1 0 15008
KI270302.1 0 2274
KI270303.1 0 1942
KI270304.1 0 2165
KI270305.1 0 1472
KI270310.1 0 1201
KI270311.1 0 12399
KI270312.1 0 998
KI270315.1 0 2276
KI270316.1 0 1444
KI270317.1 0 37690
KI270320.1 0 4416
KI270322.1 0 21476
KI270329.1 0 1040
KI270330.1 0 1652
KI270333.1 0 2699
KI270334.1 0 1368
KI270335.1 0 1048
KI270336.1 0 1026
KI270337.1 0 1121
KI270338.1 0 1428
KI270340.1 0 1428
KI270362.1 0 3530
KI270363.1 0 1803
KI270364.1 0 2855
KI270366.1 0 8320
KI270371.1 0 2805
KI270372.1 0 1650
KI270373.1 0 1451
KI270374.1 0 2656
KI270375.1 0 2378
KI270376.1 0 1136
KI270378.1 0 1048
KI270379.1 0 1045
KI270381.1 0 1930
KI270382.1 0 4215
KI270383.1 0 1750
KI270384.1 0 1658
KI270385.1 0 990
KI270386.1 0 1788
KI270387.1 0 1537
KI270388.1 0 1216
KI270389.1 0 1298
KI270390.1 0 2387
KI270391.1 0 1484
KI270392.1 0 971
KI270393.1 0 1308
KI270394.1 0 970
KI270395.1 0 1143
KI270396.1 0 1880
KI270411.1 0 2646
KI270412.1 0 1179
KI270414.1 0 2489
KI270417.1 0 2043
KI270418.1 0 2145
KI270419.1 0 1029
KI270420.1 0 2321
KI270422.1 0 1445
KI270423.1 0 981
KI270424.1 0 2140
KI270425.1 0 1884
KI270429.1 0 1361
KI270435.1 0 92983
KI270438.1 0 112505
KI270442.1 0 392061
KI270448.1 0 7992
KI270465.1 0 1774
KI270466.1 0 1233
KI270467.1 0 3920
KI270468.1 0 4055
KI270507.1 0 5353
KI270508.1 0 1951
KI270509.1 0 2318
KI270510.1 0 2415
KI270511.1 0 8127
KI270512.1 0 22689
KI270515.1 0 6361
KI270516.1 0 1300
KI270517.1 0 3253
KI270518.1 0 2186
KI270519.1 0 138126
KI270521.1 0 7642
KI270522.1 0 5674
KI270528.1 0 2983
KI270529.1 0 1899
KI270530.1 0 2168
KI270538.1 0 91309
KI270539.1 0 993
KI270544.1 0 1202
KI270548.1 0 1599
KI270579.1 0 31033
KI270580.1 0 1553
KI270581.1 0 7046
KI270582.1 0 6504
KI270583.1 0 1400
KI270584.1 0 4513
KI270587.1 0 2969
KI270588.1 0 6158
KI270589.1 0 44474
KI270590.1 0 4685
KI270591.1 0 5796
KI270593.1 0 3041
KI270706.1 0 175055
KI270707.1 0 32032
KI270708.1 0 127682
KI270709.1 0 66860
KI270710.1 0 40176
KI270711.1 0 42210
KI270712.1 0 176043
KI270713.1 0 40745
KI270714.1 0 41717
KI270715.1 0 161471
KI270716.1 0 153799
KI270717.1 0 40062
KI270718.1 0 38054
KI270719.1 0 176845
KI270720.1 0 39050
KI270721.1 0 100316
KI270722.1 0 194050
KI270723.1 0 38115
KI270724.1 0 39555
KI270725.1 0 172810
KI270726.1 0 43739
KI270727.1 0 448248
KI270728.1 0 1872759
KI270729.1 0 280839
KI270730.1 0 112551
KI270731.1 0 150754
KI270732.1 0 41543
KI270733.1 0 179772
KI270734.1 0 165050
KI270735.1 0 42811
KI270736.1 0 181920
KI270737.1 0 103838
KI270738.1 0 99375
KI270739.1 0 73985
KI270740.1 0 37240
KI270741.1 0 157432
KI270742.1 0 186739
KI270743.1 0 210658
KI270744.1 0 168472
KI270745.1 0 41891
KI270746.1 0 66486
KI270747.1 0 198735
KI270748.1 0 93321
KI270749.1 0 158759
KI270750.1 0 148850
KI270751.1 0 150742
KI270752.1 0 27745
KI270753.1 0 62944
KI270754.1 0 40191
KI270755.1 0 36723
KI270756.1 0 79590
KI270757.1 0 71251
chr1 0 248956422
chr10 0 133797422
chr11 0 135086622
chr12 0 133275309
chr13 0 114364328
chr14 0 107043718
chr15 0 101991189
chr16 0 90338345
chr17 0 83257441
chr18 0 80373285
chr19 0 58617616
chr2 0 242193529
chr20 0 64444167
chr21 0 46709983
chr22 0 50818468
chr3 0 198295559
chr4 0 190214555
chr5 0 181538259
chr6 0 170805979
chr7 0 159345973
chr8 0 145138636
chr9 0 138394717
chrM 0 16569
chrX 0 156040895
chrY 0 57227415

unexpected output - wiggletools profile

I'm confused by the output of wiggletools profile - using a single entry bed file (TSS around ActB)

chr7    5528101 5533101 NM_001101.3_ACTB        .       -

and wiggletools profile out.txt 250 actb.bed input.bw, this is the result (showing values near the peak)

115     109.794479
116     131.095570
117     144.408720
118     155.737980
119     164.613380
120     180.223719
121     189.255841
122     193.275845
123     194.267832
124     181.737781
125     165.605360
126     158.609439
127     150.256069
128     146.183820
129     132.087501
130     118.200050
131     102.276440
132     88.806671
133     82.698241
134     79.043671
135     82.907081

However, looking at the same bigwig file in IGV, the maximum value near the peak of ActB is <10.

I'd expect the maximum value of wiggletools profile output to be around 10 as well.

Memory Issue

When I run the wiggletools mean function on two files of size 92M and 127M, I get the following error:
Segmentation fault: 11
Wiggletools works for smaller files, but otherwise doesn't print out anything.

build script without sudo?

Is it possible to have a build script that doesn't use sudo? The libraries can all be installed locally, with LD_LIBRARY_PATH, LIBRARY_PATH, and the header paths adjusted. If you don't have root access, building these manually is a big investment just to try something out.

Scaling and computing mean of > 1000 files

Hi,

I'm scaling some bigwig files and then computing their mean for several sets of bigwig files. This works well on small sets that have, let's say, 20 files. But it fails when using a set of 1700 files. The command looks like this:

wiggletools write mean.wig mean scale 1 file1.bw scale 2 file2.bw scale 3 file3.bw scale 4 file4.bw

On the large set I get the following error:

Could not create new thread 11

after which the job core dumps.

From looking at the help, I don't see how to control the number of threads.

$ wiggletools --help
WiggleTools

Copyright EMBL-EBI, 2013.
Development contact: Daniel Zerbino [email protected]

This library parses wiggle files and executes various operations on them streaming through lazy evaluators.

Inputs:
    The program takes in Wig, BigWig, BedGraph, Bed, BigBed and Bam files, which are distinguished thanks to their suffix (.wig, .bw, .bg, .bed, .bb, .bam respectively).
    Note that wiggletools assumes that every bam file has an index .bai file next to it.

Outputs:
    The program outputs a wiggle file in stdout unless the output is squashed

Command line:
    wiggletools --help
    wiggletools program

Program grammar:
    program = (iterator) | do (iterator) | (statistic) | (extraction)
    statistic = AUC (output) (iterator) | meanI (output) (iterator) | varI (output) (iterator) | pearson (output) (iterator) (iterator) | isZero (iterator)
    output = filename | -
    extraction = profile (output) (int) (iterator) (iterator) | profiles (output) (int) (iterator) (iterator) | histogram (output) (width) (iterator_list)
        | apply_paste (out_filename) (statistic) (bed_file) (iterator)
    iterator = (filename) | (unary_operator) (iterator) | (binary_operator) (iterator) (iterator) | (reducer) (multiplex) | (setComparison) (multiplex) (multiplex)
    unary_operator = unit | write (output) | write_bg (ouput) | smooth (int) | exp | ln | log (float) | pow (float) | offset (float) | scale (float) | gt (float)
    binary_operator = diff | ratio | overlaps | apply (statistic)
    reducer = cat | sum | product | mean | var | stddev | entropy | CV | median | min | max
    iterator_list = (iterator) : | (iterator) (iterator_list)
    multiplex = (iterator_list) | map (unary_operator) (multiplex)
    setComparison = ttest | wilcoxon
    filename = *.wig | *.bw | *.bed | *.bb | *.bg | *.bam

Now, the above error happened when requesting a single core on a cluster managed by SGE. If I run it when requesting 15 cores (as shown below), it also leads to the same error.

qrsh -pe local 15 -l mem_free=5G,h_vmem=10G

Anyhow, I don't know if this is a bug in WiggleTools. I didn't see a way to get the version installed on our cluster, in case that's information you need.

If there is a limit on the number of files I can process at a time, I could always compute sums of subsets, then the overall sum before scaling by 1/n where n = number of samples.
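The workaround described above relies on the mean being a sum divided by n, so partial sums over subsets can be added together before a single final division. A minimal sketch in plain Python, with each track modelled as a hypothetical list of per-position values (this is not WiggleTools code, and chunk_size is an arbitrary illustrative parameter):

```python
def chunks(items, size):
    # Yield successive groups of at most `size` items
    for i in range(0, len(items), size):
        yield items[i:i + size]


def mean_via_partial_sums(tracks, chunk_size):
    """Mean across tracks, computed from per-subset sums."""
    n = len(tracks)
    # One partial sum per subset of tracks
    partial_sums = [
        [sum(column) for column in zip(*group)]
        for group in chunks(tracks, chunk_size)
    ]
    # Sum of the partial sums, then a single scaling by 1/n
    total = [sum(column) for column in zip(*partial_sums)]
    return [value / n for value in total]


tracks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(mean_via_partial_sums(tracks, chunk_size=2))  # [3.0, 4.0]
```

Averaging the subset means instead would only give the same answer when every subset has the same number of files, which is why the division by n happens once at the end.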

Thanks,
Leo

New version is not on brew or conda

Hi, the new version is not on brew or conda. I tried installing from source, but I am getting several errors even though the required libraries are installed.

GSL not found using OSX binary build

Many thanks for providing an OSX binary!

However, on a clean OSX install, the binary does not work:
dyld: Library not loaded: /usr/local/lib/libgsl.0.dylib
OSX does not ship with a GSL library.

I fixed it by installing the homebrew package manager and doing a "brew install gsl".
You could fix it by installing a static version of the gsl library and compiling your binary statically on OSX.

Compilation issue: libBigWig.a not found

I'm getting this error, even though libBigWig.a is definitely in my library path:

evrong01-i27-02 WiggleTools$ export LIBDIR=/usr/local/lib

$ l /usr/local/lib/libBigWig.a 
lrwxr-xr-x  1 evrong01  admin    41B Dec 27 21:08 /usr/local/lib/libBigWig.a -> ../Cellar/libbigwig/0.4.4/lib/libBigWig.a

$ l /usr/local/Cellar/libbigwig/0.4.4/lib/libBigWig.a         
-rwxr-x---  1 evrong01  NYUMC\Domain Users    57K May 14  2019 /usr/local/Cellar/libbigwig/0.4.4/lib/libBigWig.a

evrong01-i27-02 WiggleTools$ make
cd src; make -e
mkdir -p /usr/local/lib
ar rcs /usr/local/lib/libwiggletools.a *.o
mkdir -p ../bin
cc -g -Wall -O3 -std=gnu99 -L/usr/local/lib -L../../libBigWig -L../../htslib wiggletools.c -lwiggletools -l:libBigWig.a -lcurl -l:libhts.a -lgsl  -lgslcblas -lz -lpthread -lm -llzma -lbz2 -o ../bin/wiggletools 
ld: warning: directory not found for option '-L../../libBigWig'
ld: warning: directory not found for option '-L../../htslib'
ld: library not found for -l:libBigWig.a
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [../bin/wiggletools] Error 1
make: *** [Wiggletools] Error 2

histogram not working

Hi, the histogram command seems to no longer be working. It is producing nonsensical results.
Perhaps something in the recent changes affected it.

Conda installation not working

Hi,
Conda installation is not working:

conda install -c bioconda wiggletools
Collecting package metadata (repodata.json): done
Solving environment: / 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - defaults/linux-64::attrs==18.1.0=py36_0
  - defaults/linux-64::datashape==0.5.4=py36h3ad6b5c_0
  - defaults/linux-64::qtconsole==4.3.1=py36h8f73b5b_0
  - defaults/linux-64::curl==7.60.0=h84994c4_0
  - defaults/linux-64::clyent==1.2.2=py36h7e57e65_1
  - defaults/linux-64::contextlib2==0.5.5=py36h6c84a62_0
  - defaults/linux-64::greenlet==0.4.13=py36h14c3975_0
  - defaults/linux-64::requests==2.22.0=py36_0
  - defaults/linux-64::sip==4.19.8=py36hf484d3e_0
  - defaults/noarch::tqdm==4.36.1=py_0
  - defaults/linux-64::jupyter==1.0.0=py36_4
  - defaults/linux-64::bottleneck==1.2.1=py36haac1ea0_0
  - defaults/linux-64::pycrypto==2.6.1=py36h14c3975_8
  - defaults/linux-64::nbformat==4.4.0=py36h31c9010_0
  - defaults/linux-64::pycparser==2.18=py36hf9f622e_1
  - defaults/linux-64::cytoolz==0.9.0.1=py36h14c3975_0
  - defaults/linux-64::sphinxcontrib-websupport==1.0.1=py36hb5cb234_1
  - defaults/linux-64::xlsxwriter==1.0.4=py36_0
  - defaults/linux-64::asn1crypto==0.24.0=py36_0
  - defaults/linux-64::partd==0.3.8=py36h36fd896_0
  - defaults/linux-64::conda==4.7.12=py36_0
  - defaults/linux-64::heapdict==1.0.0=py36_2
  - defaults/linux-64::pathlib2==2.3.2=py36_0
  - defaults/linux-64::filelock==3.0.4=py36_0
  - defaults/linux-64::numpy==1.14.3=py36hcd700cb_1
  - defaults/linux-64::libssh2==1.8.0=h9cfc8f7_4
  - defaults/linux-64::packaging==17.1=py36_0
  - defaults/linux-64::pycurl==7.43.0.1=py36hb7f436b_0
  - defaults/linux-64::cryptography==2.7=py36h1ba5d50_0
  - defaults/linux-64::msgpack-python==0.5.6=py36h6bb024c_0
  - defaults/linux-64::_ipyw_jlab_nb_ext_conf==0.1.0=py36he11e457_0
  - defaults/linux-64::markupsafe==1.0=py36hd9260cd_1
  - defaults/linux-64::colorama==0.3.9=py36h489cec4_0
  - defaults/linux-64::pycosat==0.6.3=py36h0a5515d_0
  - defaults/linux-64::simplegeneric==0.8.1=py36_2
  - defaults/linux-64::beautifulsoup4==4.6.0=py36h49b8c8c_1
  - defaults/linux-64::backports==1.0=py36hfa02d7e_1
  - defaults/linux-64::mccabe==0.6.1=py36h5ad9710_1
  - defaults/linux-64::unicodecsv==0.14.1=py36ha668878_0
  - defaults/linux-64::send2trash==1.5.0=py36_0
  - defaults/linux-64::anaconda==5.2.0=py36_3
  - defaults/linux-64::ipython_genutils==0.2.0=py36hb52b0d5_0
  - defaults/linux-64::pip==19.1.1=py36_0
  - defaults/linux-64::isort==4.3.4=py36_0
  - defaults/linux-64::qtpy==1.4.1=py36_0
  - defaults/linux-64::jupyterlab==0.32.1=py36_0
  - defaults/linux-64::sortedcollections==0.6.1=py36_0
  - defaults/linux-64::docutils==0.14=py36hb0f60f5_0
  - defaults/linux-64::pyqt==5.9.2=py36h751905a_0
  - defaults/linux-64::notebook==5.5.0=py36_0
  - defaults/linux-64::olefile==0.45.1=py36_0
  - defaults/linux-64::sqlalchemy==1.2.7=py36h6b74fdf_0
  - defaults/linux-64::spyder==3.2.8=py36_0
  - defaults/linux-64::idna==2.8=py36_0
  - defaults/linux-64::cffi==1.11.5=py36h9745a5d_0
  - defaults/linux-64::astropy==3.0.2=py36h3010b51_1
  - defaults/linux-64::testpath==0.3.1=py36h8cadb63_0
  - defaults/linux-64::nbconvert==5.3.1=py36hb41ffb7_0
  - defaults/linux-64::widgetsnbextension==3.2.1=py36_0
  - defaults/linux-64::pandocfilters==1.4.2=py36ha6701b7_1
  - defaults/linux-64::psutil==5.4.5=py36h14c3975_0
  - defaults/linux-64::get_terminal_size==1.0.0=haa9412d_0
  - defaults/linux-64::itsdangerous==0.24=py36h93cc618_1
  - defaults/linux-64::webencodings==0.5.1=py36h800622e_1
  - defaults/linux-64::openpyxl==2.5.3=py36_0
  - defaults/linux-64::distributed==1.21.8=py36_0
  - defaults/linux-64::pyflakes==1.6.0=py36h7bd6a15_0
  - defaults/linux-64::toolz==0.9.0=py36_0
  - defaults/linux-64::nltk==3.3.0=py36_0
  - defaults/linux-64::pylint==1.8.4=py36_0
  - defaults/linux-64::anaconda-navigator==1.8.7=py36_0
  - defaults/linux-64::alabaster==0.7.10=py36h306e16b_0
  - defaults/linux-64::libcurl==7.60.0=h1ad7b7a_0
  - defaults/linux-64::xlrd==1.1.0=py36h1db9f0c_1
  - defaults/linux-64::seaborn==0.8.1=py36hfad7ec4_0
  - defaults/linux-64::pytest-astropy==0.3.0=py36_0
  - defaults/linux-64::anaconda-client==1.6.14=py36_0
  - defaults/linux-64::entrypoints==0.2.3=py36h1aec115_2
  - defaults/linux-64::mkl_fft==1.0.1=py36h3010b51_0
  - defaults/linux-64::more-itertools==4.1.0=py36_0
  - defaults/linux-64::numba==0.38.0=py36h637b7d7_0
  - defaults/linux-64::cycler==0.10.0=py36h93f1223_0
  - defaults/linux-64::imageio==2.3.0=py36_0
  - defaults/linux-64::conda-build==3.10.5=py36_0
  - defaults/linux-64::odo==0.5.1=py36h90ed295_0
  - defaults/linux-64::pytest-openfiles==0.3.0=py36_0
  - defaults/linux-64::et_xmlfile==1.0.1=py36hd6bccc3_0
  - defaults/linux-64::ipykernel==4.8.2=py36_0
  - defaults/linux-64::flask==1.0.2=py36_1
  - defaults/linux-64::python-dateutil==2.7.3=py36_0
  - defaults/linux-64::conda-verify==2.0.0=py36h98955d8_0
  - defaults/linux-64::multipledispatch==0.5.0=py36_0
  - defaults/linux-64::pkginfo==1.4.2=py36_1
  - defaults/linux-64::gmpy2==2.0.8=py36hc8893dd_2
  - defaults/linux-64::locket==0.2.0=py36h787c0ad_1
  - defaults/linux-64::mpmath==1.0.0=py36hfeacd6b_2
  - defaults/linux-64::pyopenssl==19.0.0=py36_0
  - defaults/linux-64::jinja2==2.10=py36ha16c418_0
  - defaults/linux-64::navigator-updater==0.2.1=py36_0
  - defaults/linux-64::qt==5.9.5=h7e424d6_0
  - defaults/linux-64::blaze==0.11.3=py36h4e06776_0
  - defaults/linux-64::mistune==0.8.3=py36h14c3975_1
  - defaults/linux-64::backports.shutil_get_terminal_size==1.0.0=py36hfea85ff_2
  - defaults/linux-64::py==1.5.3=py36_0
  - defaults/linux-64::astroid==1.6.3=py36_0
  - defaults/linux-64::backcall==0.1.0=py36_0
  - defaults/linux-64::ipywidgets==7.2.1=py36_0
  - defaults/linux-64::qtawesome==0.4.4=py36h609ed8c_0
  - defaults/linux-64::anaconda-project==0.8.2=py36h44fb852_0
  - defaults/linux-64::cython==0.28.2=py36h14c3975_0
  - defaults/linux-64::jsonschema==2.6.0=py36h006f8b5_0
  - defaults/linux-64::sphinxcontrib==1.0=py36h6d0f590_1
  - defaults/linux-64::fastcache==1.0.2=py36h14c3975_2
  - defaults/linux-64::pysocks==1.6.8=py36_0
  - defaults/linux-64::bitarray==0.8.1=py36h14c3975_1
  - defaults/linux-64::zict==0.1.3=py36h3a3bf81_0
  - defaults/linux-64::boto==2.48.0=py36h6e4cd66_1
  - defaults/linux-64::terminado==0.8.1=py36_1
  - defaults/linux-64::jupyter_console==5.2.0=py36he59e554_1
  - defaults/linux-64::pyodbc==4.0.23=py36hf484d3e_0
  - defaults/linux-64::jupyter_client==5.2.3=py36_0
  - defaults/linux-64::chardet==3.0.4=py36h0f667ec_1
  - defaults/linux-64::babel==2.5.3=py36_0
  - defaults/linux-64::glob2==0.6=py36he249c77_0
  - defaults/linux-64::sphinx==1.7.4=py36_0
  - defaults/linux-64::jupyter_core==4.4.0=py36h7c827e3_0
  - defaults/linux-64::pep8==1.7.1=py36_0
  - defaults/linux-64::bokeh==0.12.16=py36_0
  - omnia/label/cuda91/linux-64::openmm==7.3.1=py36_cuda91_rc_2
  - defaults/linux-64::mkl_random==1.0.1=py36h629b387_0
  - defaults/linux-64::tblib==1.3.2=py36h34cf8b6_0
  - defaults/linux-64::nose==1.3.7=py36hcdf7029_2
  - defaults/linux-64::singledispatch==3.4.0.3=py36h7a266c3_0
  - defaults/linux-64::mkl-service==1.1.2=py36h17a0993_4
  - defaults/linux-64::rope==0.10.7=py36h147e2ec_0
  - defaults/linux-64::conda-package-handling==1.6.0=py36h7b6447c_0
  - defaults/linux-64::python==3.6.8=h0371630_0
  - defaults/linux-64::pytest-doctestplus==0.1.3=py36_0
  - defaults/linux-64::lazy-object-proxy==1.3.1=py36h10fcdad_0
  - defaults/linux-64::gevent==1.3.0=py36h14c3975_0
  - defaults/linux-64::wheel==0.33.4=py36_0
  - defaults/linux-64::numpydoc==0.8.0=py36_0
  - defaults/linux-64::traitlets==4.3.2=py36h674d592_0
  - defaults/linux-64::xlwt==1.3.0=py36h7b00a1f_0
  - defaults/linux-64::ply==3.11=py36_0
  - defaults/linux-64::pytest==3.5.1=py36_0
  - defaults/linux-64::sortedcontainers==1.5.10=py36_0
  - defaults/linux-64::pluggy==0.6.0=py36hb689045_0
  - defaults/linux-64::flask-cors==3.0.4=py36_0
  - defaults/linux-64::imagesize==1.0.0=py36_0
  - defaults/linux-64::pycodestyle==2.4.0=py36_0
  - defaults/linux-64::ruamel_yaml==0.15.35=py36h14c3975_1
  - defaults/linux-64::urllib3==1.24.2=py36_0
  - defaults/linux-64::wcwidth==0.1.7=py36hdf4376a_0
  - defaults/linux-64::jupyterlab_launcher==0.10.5=py36_0
  - defaults/linux-64::sympy==1.1.1=py36hc6d1c1c_0
  - defaults/linux-64::snowballstemmer==1.2.1=py36h6febd40_0
  - defaults/linux-64::path.py==11.0.1=py36_0
  - defaults/linux-64::dask==0.17.5=py36_0
  - defaults/linux-64::pytest-arraydiff==0.2=py36_0
  - defaults/linux-64::jdcal==1.4=py36_0
  - defaults/linux-64::lxml==4.2.1=py36h23eabaa_0
  - defaults/linux-64::llvmlite==0.23.1=py36hdbcaa40_0
  - defaults/linux-64::werkzeug==0.14.1=py36_0
  - defaults/linux-64::pytest-remotedata==0.2.1=py36_0
  - defaults/linux-64::certifi==2019.9.11=py36_0
  - defaults/linux-64::six==1.12.0=py36_0
  - defaults/linux-64::bkcharts==0.2=py36h735825a_0
failed with initial frozen solve.

Possible glibc issue when installing WiggleTools

Hello,

I tried installing WiggleTools and encountered a couple errors.

First, there seem to be some minor errors when installing kent. I don't think that the error matters much in this case as it's related to preserving the permissions for some files as shown below.

cp -pf gitFiles /home/bst/student/lcollado/software/userApps/scripts/gitFiles
cp: preserving permissions for `/home/bst/student/lcollado/software/userApps/scripts/gitFiles': Operation not supported
cp: preserving ACL for `/home/bst/student/lcollado/software/userApps/scripts/gitFiles': Operation not supported

I later encountered a warning when installing WiggleTools which led to errors. The key part seems to be:

in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

This seems different from the errors you anticipated about -lssl or -lcrypto, right? If so, Google led me to http://stackoverflow.com/questions/2725255/create-statically-linked-binary-that-uses-getaddrinfo which doesn't seem to help in this case. Then, following http://bytes.com/topic/c/answers/877929-lnk-error-using-dlopen-statically-linked-apps-requires-runtime-shared-lib, I checked the environment variable they mention:

22:04 WiggleTools $ echo $LD_LIBRARY_PATH
/hpscc/usr/local/gcc-4.1.2/install/libraries/lib:/opt/gridengine/lib/lx26-amd64:/opt/gridengine/lib/lx26-amd64::/usr/local/atlas/3.9.4/lib:/usr/local/gsl/1.12/lib

Could this be a compiler issue? Do you have any suggestions? I could also double check with the system administrators of our cluster environment.

Thank you!
Leonardo

Log

Here's the full log information https://gist.github.com/lcolladotor/9753882

Output overlapping bedgraph

When I use write_bg to write the mean of two bigwig files as a bedgraph, the tool returns a bedgraph containing overlapping regions, which does not conform to the bedgraph format. Is this normal, or is there a fix?
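To pin down where the overlaps occur, a small checker along these lines may help. It is a sketch in plain Python, not part of WiggleTools; the function name is hypothetical, and it assumes the records are sorted by chromosome and start position and only compares each record with the one before it.

```python
def find_overlaps(lines):
    """Return pairs of consecutive bedGraph records that overlap.

    Assumes records are sorted by chromosome, then by start position.
    """
    overlaps = []
    previous = None
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines such as "track ..." or "# ..."
        if (len(fields) < 4 or fields[0] in ("track", "browser")
                or fields[0].startswith("#")):
            continue
        record = (fields[0], int(fields[1]), int(fields[2]))
        # Same chromosome and the new start falls before the previous end
        if previous and previous[0] == record[0] and record[1] < previous[2]:
            overlaps.append((previous, record))
        previous = record
    return overlaps


example = [
    "chr1\t0\t10\t1.0",
    "chr1\t5\t15\t2.0",   # starts before the previous record ends
    "chr1\t15\t20\t1.0",
    "chr2\t0\t10\t3.0",
]
print(find_overlaps(example))
```

In practice the lines would come from the written file, for example by passing an open file handle instead of the example list.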

bigBed file ending not recognised

Could wiggletools be modified slightly so that the file ending bigBed is recognised for bigBed files, as well as bb? This is the file ending used by the ENCODE project and others. Thanks

Mutation rate per epigenetic state

Hi,
Using the ChromHMM states from Epigenomics Roadmap, I had a bigbed of the epigenomic state of each segment of the genome. I'd like to calculate the mutation rate in each genomic segment, based on the set of genome-wide variants in a single large VCF file.
I can see a slow way to do this via decomposing the bigbed into one BED per state, then calculating & aggregating the mutation rate via bedtools intersect, but I was wondering if there was a clever way to do this in a single pass, directly from the bb, using WiggleTools?
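For reference, the single-pass aggregation itself can be sketched in plain Python, independent of any WiggleTools feature. The sketch assumes the segments have already been read into (chrom, start, end, state) tuples with 0-based half-open coordinates and do not overlap, and that variants are (chrom, position) pairs; all names and the example state labels are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def mutation_rate_per_state(segments, variants):
    """Variants per base pair for each state label."""
    by_chrom = defaultdict(list)
    state_bp = defaultdict(int)
    for chrom, start, end, state in segments:
        by_chrom[chrom].append((start, end, state))
        state_bp[state] += end - start  # total base pairs covered by the state
    for chrom in by_chrom:
        by_chrom[chrom].sort()
    starts = {c: [s for s, _, _ in segs] for c, segs in by_chrom.items()}

    counts = defaultdict(int)
    for chrom, pos in variants:
        if chrom not in by_chrom:
            continue
        # Last segment whose start is <= pos
        idx = bisect_right(starts[chrom], pos) - 1
        if idx >= 0:
            start, end, state = by_chrom[chrom][idx]
            if start <= pos < end:
                counts[state] += 1
    return {state: counts[state] / bp for state, bp in state_bp.items()}


segments = [("chr1", 0, 100, "TssA"), ("chr1", 100, 300, "Quies")]
variants = [("chr1", 10), ("chr1", 50), ("chr1", 150), ("chr1", 300)]
print(mutation_rate_per_state(segments, variants))
# {'TssA': 0.02, 'Quies': 0.005}
```

Each variant is looked up with a binary search, so the cost is one pass over the segments plus one pass over the variants, without splitting the bigBed into one file per state.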

many thanks,
Mark

Make error

Hi,

I'm getting this error when trying to run the make:

cc -g -Wall -O3 -std=gnu99 -L../lib -L../../libBigWig -L../../htslib wiggletools.c -lwiggletools -l:libBigWig.a -lcurl -l:libhts.a -lgsl  -lgslcblas -lz -lpthread -lm -o ../bin/wiggletools 
../../htslib/libhts.a(cram_io.o): In function `lzma_mem_deflate':
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:709: undefined reference to `lzma_stream_buffer_bound'
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:715: undefined reference to `lzma_easy_buffer_encode'
../../htslib/libhts.a(cram_io.o): In function `cram_compress_by_method':
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:1092: undefined reference to `BZ2_bzBuffToBuffCompress'
../../htslib/libhts.a(cram_io.o): In function `cram_uncompress_block':
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:1012: undefined reference to `BZ2_bzBuffToBuffDecompress'
../../htslib/libhts.a(cram_io.o): In function `lzma_mem_inflate':
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:731: undefined reference to `lzma_easy_decoder_memusage'
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:731: undefined reference to `lzma_stream_decoder'
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:749: undefined reference to `lzma_code'
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:762: undefined reference to `lzma_code'
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:773: undefined reference to `lzma_end'
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:778: undefined reference to `lzma_end'
/mnt/storage/home/tk19812/scratch/software/htslib/cram/cram_io.c:773: undefined reference to `lzma_end'
collect2: error: ld returned 1 exit status
make[1]: *** [../bin/wiggletools] Error 1
make[1]: Leaving directory `/mnt/storage/scratch/tk19812/software/WiggleTools/src'
make: *** [Wiggletools] Error 2

Any help?
Thanks a lot
F

ld: error: undefined symbol: rollYourOwn

cc -O2 -pipe -fno-omit-frame-pointer  -fstack-protector-strong -isystem /usr/local/include -fno-strict-aliasing  -L../lib -L../../libBigWig -L../../htslib wiggletools.c  -o ../bin/wiggletools 
ld: error: undefined symbol: rollYourOwn
>>> referenced by wiggletools.c
>>>               /tmp/wiggletools-47fd81.o:(main)

There is no ../../libBigWig. libbigwig is installed in $(PREFIX), you need to add -L$(PREFIX)/lib -lBigWig

Unexpected behaviour with trim vs overlaps?

Either I'm misunderstanding the functionality of trim and overlaps, or the behaviour isn't quite what it should be?

Specifically, running something like:
wiggletools overlaps test/fixedStep.bw test/overlapping_coverage.wig and running wiggletools trim test/fixedStep.bw test/overlapping_coverage.wig return different outputs... Since fixedStep.bw has values from position 1 through 9, and overlapping_coverage.wig has values from position 3 through 8, shouldn't all of overlapping_coverage.wig be returned by both calls?

gte not working and compression

Hi,
I tested 'gte' and it isn't working. Example below. I'm guessing lte might have the same issue.

Note also that gt/gte/lt/lte are still causing compression with write_bg. I think that is useful in some cases, but not desirable in other cases. So I think it would be good to have it as an option: either with or without compression. Maybe:

  • write_bg is compressed, and
  • write_bgu is uncompressed
wiggletools write_bg - bin 20 scale 0.05 fillIn genome.bed blah.bw | head
chr1	0	20	0.005435
chr1	20	40	0.000000
chr1	40	60	0.000000
chr1	60	80	0.000000
chr1	80	100	0.000000
chr1	100	120	0.000000
chr1	120	140	0.000000
chr1	140	160	0.005435
chr1	160	180	0.000000
chr1	180	200	0.000000
$ wiggletools write_bg - gte 0.005435 bin 20 scale 0.05 fillIn genome.bed blah.bw | head
chr1	280	400	1.000000
chr1	420	1940	1.000000
chr1	1960	2020	1.000000
chr1	2040	2280	1.000000
chr1	2300	2400	1.000000
chr1	2420	2440	1.000000
chr1	2460	2480	1.000000
chr1	2500	2540	1.000000
chr1	2560	2600	1.000000
chr1	2640	2660	1.000000
$ wiggletools write_bg - gte 0.005434 bin 20 scale 0.05 fillIn genome.bed blah.bw | head
chr1	0	20	1.000000
chr1	140	160	1.000000
chr1	260	2280	1.000000
chr1	2300	2440	1.000000
chr1	2460	2540	1.000000
chr1	2560	2620	1.000000
chr1	2640	2880	1.000000
chr1	2900	2920	1.000000
chr1	2960	3000	1.000000
chr1	3040	3060	1.000000
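A possible explanation for the gte result above, offered as an assumption rather than a confirmed diagnosis: the bedGraph output shows six decimal places, so a bin printed as 0.005435 may hold a slightly smaller value that fails the comparison against the rounded number. The sketch below illustrates the effect in Python with a hypothetical stored value; the helper names are illustrative and not WiggleTools functions.

```python
def format_bedgraph_value(value):
    # Six decimal places, as in the output shown above
    return "%.6f" % value


def gte(threshold, value):
    # Greater than or equal, applied to the unrounded value
    return value >= threshold


stored = 0.0054348  # hypothetical unrounded bin value

print(format_bedgraph_value(stored))  # 0.005435
print(gte(0.005435, stored))          # False
print(gte(0.005434, stored))          # True
```

This would be consistent with the two runs above, where the bins at 0-20 and 140-160 appear with the 0.005434 cutoff but not with 0.005435.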

bufferedReader.c compilation error

Error below. Looks like include of string.h is missing.

cc -g -Wall -O3 -std=gnu99 -I../../libBigWig -I../../htslib -D_PBGZF_USE -c bufferedReader.c -o bufferedReader.o
bufferedReader.c:188:9: error: implicitly declaring library function 'strcmp' with type 'int (const char *, const char *)'
      [-Werror,-Wimplicit-function-declaration]
        return strcmp(cl_A->chrom, cl_B->chrom);
               ^
bufferedReader.c:188:9: note: include the header <string.h> or explicitly provide a declaration for 'strcmp'
1 error generated.
make[1]: *** [bufferedReader.o] Error 1
make: *** [Wiggletools] Error 2

Trim

The wiggletools trim function appears to give incomplete output when comparing a BED file and a BAM file that have different chromosome orderings. The BED file has to be sorted by the first column, and if the chromosome names use a "chr" prefix the sort has to be lexicographic (e.g., chr1 then chr10). However, coordinate-sorted BAM files follow the chromosome order dictated by their header, which is not always the same lexicographic or numeric order.

If the BAM file order is chr1, chr2, ... chr10, ... chr22 and the BED file order is chr1, chr10, ..., then wiggletools trim only outputs the intersection of intervals on chromosomes 1-9 and misses the overlaps on chromosomes 10-22.

If I try to sort the BED file to match the numeric ordering of the BAM file, I get the following error:
"Bed file ./tmp.bed is not sorted!
Position chr10:714133 is before chr9:140899137"

even though in the actual BED file chr9 is before chr10.

Is there a way to fix this?
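
The two orderings described above can be illustrated with a minimal sketch. It only shows how plain string sorting differs from a numeric-aware sort of chromosome names; the `natural_key` helper and the sample list are hypothetical and are not part of wiggletools.

```python
import re

def natural_key(chrom):
    """Split a name into text and integer parts so that chr2 sorts before chr10."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", chrom)]

chroms = ["chr1", "chr2", "chr9", "chr10", "chr22"]

lexicographic = sorted(chroms)             # plain string comparison
numeric = sorted(chroms, key=natural_key)  # numeric-aware comparison

print(lexicographic)
print(numeric)
```

Under plain string comparison chr10 sorts before chr9, which matches the order named in the error message quoted above.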

profiles - sum of output values changes based on bin size

Hi:
I'm wondering if wiggletools profiles (not profile) uses overlapping bins (rather than non-overlapping bins). Here is some output where I use the same bed file and bigwig file but change the bin size.

wiggletools profiles - 10 ttemp.bed test.bw
chr1    944203  959000  37328.629147    167978.831162   74658.041263    93321.661506    93321.572868    
74657.258294    111986.227220   199975.700155    109129.104598   45518.073297

wiggletools profiles - 50 ttemp.bed test.bw
chr1    944203  959000  0.000000        3732.862915     3732.862915     3732.862915     7465.725829     
5561.203934     5789.746562     11046.226993     0.000000        0.000000        11198.745338    
3732.862915     0.000000        0.000000        0.000000        11198.606472     7465.725829     0.000000        
3732.862915     3732.862915     7275.273640     3923.315104     0.000000        6132.560503     
1333.165327      7046.731012     418.994817      0.000000        11198.618290    0.000000        
11198.627154    0.000000        0.000000        14931.552115     3732.862915     3732.862915     2361.607150     
15236.254936    4799.395176     0.000000        3732.862915     7237.183202      6056.379627     
1637.888830     3732.862915     3732.862915     0.000000        0.000000        0.000000        1180.803575

The sums of the values are 1,007,875 and 187,825.3, respectively. I expected the two sums to be the same or similar.
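
For reference, a row total like the ones quoted here can be recomputed with a short sketch. It assumes, based on the output shown, that the first three fields of a row are chromosome, start and end, and that every remaining field is a per-bin value; `sum_profile_row` is a hypothetical helper name.

```python
def sum_profile_row(line):
    """Sum the per-bin values of one row; the first three fields are chrom, start, end."""
    fields = line.split()
    return sum(float(value) for value in fields[3:])

# The 10-bin row from the report, joined onto one line.
row_10_bins = (
    "chr1 944203 959000 37328.629147 167978.831162 74658.041263 "
    "93321.661506 93321.572868 74657.258294 111986.227220 "
    "199975.700155 109129.104598 45518.073297"
)

total = sum_profile_row(row_10_bins)
print(round(total))
```

Rounded to the nearest integer, this reproduces the 1,007,875 figure reported for the 10-bin run.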

make failed on macOS fatal error: 'common.h' file not found

Hi,

I am installing WiggleTools, and the build failed on macOS:

cd samtools; make
make[1]: Nothing to be done for `all'.
cd src; make -e
cc -g -Wall -O3 -std=gnu99 -I../samtools -I/inc -I -D_PBGZF_USE -c bigWiggleReader.c -o bigWiggleReader.o
In file included from bigWiggleReader.c:16:
./bigFileReader.h:24:10: fatal error: 'common.h' file not found
#include "common.h"
         ^
1 error generated.
make[1]: *** [bigWiggleReader.o] Error 1
make: *** [Wiggletools] Error 2

Any ideas why?

Thanks!

Ming

failed test - diff tmp expected

Any idea why this test would fail?

cd test; python2.7 test.py
Testing: ../bin/wiggletools do isZero diff fixedStep.bw variableStep.wig
Testing: ../bin/wiggletools diff fixedStep.bw variableStep.wig fixedStep.wig
Trailing tokens: the last tokens in your command were not read, check your syntax:
... fixedStep.wig
Testing: ../bin/wiggletools do isZero diff fixedStep.bw fixedStep.wig
Testing: ../bin/wiggletools do isZero offset -1 ratio variableStep.bw variableStep.wig
Testing: ../bin/wiggletools do isZero diff bam.bam pileup.bg
Testing: ../bin/wiggletools do isZero diff bam.bam cram.cram
Testing: ../bin/wiggletools do isZero diff bam.bam sam.sam
Testing: cat sam.sam | ../bin/wiggletools do isZero diff bam.bam sam -
Testing: ../bin/wiggletools do isZero diff overlapping.bed overlapping.bb
Testing: ../bin/wiggletools do isZero diff variableStep.bw variableStep.wig
Testing: ../bin/wiggletools do isZero diff vcf.vcf bcf.bcf
Testing: ../bin/wiggletools do isZero diff sum fixedStep.bw fixedStep.bw : scale 2 fixedStep.bw
Testing: ../bin/wiggletools do isZero diff sum fixedStep.bw fixedStep.bw : sum fixedStep.bw fixedStep.bw
Testing: ../bin/wiggletools do isZero diff ln fixedStep.bw sum map ln fixedStep.bw
Testing: ../bin/wiggletools do isZero diff ln exp fixedStep.bw fixedStep.wig
Testing: ../bin/wiggletools do isZero diff pow 2 fixedStep.bw mult fixedStep.wig fixedStep.wig
Testing: ../bin/wiggletools do isZero diff lt 5 fixedStep.wig gt -5 scale -1 fixedStep.wig
Testing: ../bin/wiggletools apply_paste tmp/regional_means.txt meanI overlapping.bed fixedStep.wig
Testing: ../bin/wiggletools print tmp/pearson.txt pearson fixedStep.wig variableStep.wig
Testing: ../bin/wiggletools profiles tmp/profiles.txt 3 overlapping.bed fixedStep.wig
Testing: ../bin/wiggletools profile tmp/profile.txt 3 overlapping.bed fixedStep.wig
Testing: ../bin/wiggletools do isZero diff fixedStep.wig overlaps fixedStep.wig fixedStep.wig
Testing: ../bin/wiggletools write_bg tmp/nearest_overlapping.bg nearest variableStep.wig overlapping.bed
Testing: ../bin/wiggletools write_bg tmp/nearest_fixedStep.bg nearest variableStep.wig fixedStep.bw
Testing: ../bin/wiggletools print - minI fixedStep.wig
Testing: ../bin/wiggletools print - maxI fixedStep.wig
Testing: ../bin/wiggletools do isZero diff overlapping_coverage.wig coverage overlapping.bed
Testing: ../bin/wiggletools do isZero diff trim overlapping.bed variableStep.wig mult overlapping.bed variableStep.wig
Testing: ../bin/wiggletools run program.txt
Testing: diff tmp expected
diff tmp/profiles.txt expected/profiles.txt
1,2c1,2
< chr1  3       7       2.000000        3.000000        9.000000
< chr1  4       9       3.000000        9.000000        13.000000
---
> chr1  3       7       2.666667        4.000000        12.000000
> chr1  4       9       5.000000        15.000000       21.666667
diff tmp/profile.txt expected/profile.txt
1,3c1,3
< 0     5.000000
< 1     12.000000
< 2     22.000000
---
> 0     7.666667
> 1     19.000000
> 2     33.666667
Traceback (most recent call last):
  File "test.py", line 112, in <module>
    assert test('diff tmp expected') == 0
AssertionError
make: *** [tests] Error 1

wiggletools seek combines regions with multiple zero points (undesirable)

Hi,

I'm trying to extract a region from a wiggle file like so:

wiggletools write output.wig seek SUPER_6 53887433 53926022 file.wig

The wig file has base-by-base values and this command tends to combine regions with multiple zero values like so:

....
0.446000
0.447000
-0.574000
0.447000
0.447000
0.429000
SUPER_6 53889550 53889558 0.000000
fixedStep chrom=SUPER_6 start=53889559 step=1
-0.323000
-0.574000
-1.049000
0.524000
-0.231000
0.447000
0.447000
...

This is undesirable, and converting the wig file to BED format fails because of such lines (e.g. SUPER_6 53889550 53889558 0.000000).

Can this merging be avoided? Many thanks.
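
As a possible workaround on the conversion side, here is a minimal sketch of a parser that accepts both kinds of line seen in the excerpt above. It does not change what wiggletools writes. It assumes the file contains only fixedStep blocks and four-column bedGraph-style lines (variableStep is not handled), that fixedStep starts are 1-based as in the UCSC wiggle format, and that four-column lines are 0-based and half-open. `wig_to_intervals` is a hypothetical helper.

```python
def wig_to_intervals(lines):
    """Yield (chrom, start, end, value) in 0-based, half-open coordinates."""
    chrom = pos = step = span = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "fixedStep":
            params = dict(field.split("=", 1) for field in fields[1:])
            chrom = params["chrom"]
            pos = int(params["start"]) - 1  # fixedStep starts are 1-based
            step = int(params.get("step", 1))
            span = int(params.get("span", 1))
        elif len(fields) == 4:
            # bedGraph-style line: coordinates are used as given
            yield fields[0], int(fields[1]), int(fields[2]), float(fields[3])
        else:
            # bare value belonging to the current fixedStep block
            yield chrom, pos, pos + span, float(fields[0])
            pos += step

sample = [
    "SUPER_6 53889550 53889558 0.000000",
    "fixedStep chrom=SUPER_6 start=53889559 step=1",
    "-0.323000",
    "-0.574000",
]
intervals = list(wig_to_intervals(sample))
for chrom, start, end, value in intervals:
    print(f"{chrom}\t{start}\t{end}\t{value:.6f}")
```

Each yielded tuple maps directly onto one BED or bedGraph row, so the merged zero run and the per-base values end up in the same tabular form.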

error of reading .bam files with a new bamReader.c

I'm trying to install WiggleTools without sudo.
I could install it through miniconda3, but ran into the known bug (only chr1-9 can be handled).

I can compile WiggleTools, but it can't read .bam files.
(It also fails "make test"; please see below.)

[usrname@sc245 WiggleTools]$ make test
cd test; python2.7 test.py
Testing: ../bin/wiggletools do isZero diff fixedStep.bw variableStep.wig
Testing: ../bin/wiggletools diff fixedStep.bw variableStep.wig fixedStep.wig
Trailing tokens: the last tokens in your command were not read, check your syntax:
... fixedStep.wig
Testing: ../bin/wiggletools do isZero diff fixedStep.bw fixedStep.wig
Testing: ../bin/wiggletools do isZero offset -1 ratio variableStep.bw variableStep.wig
Testing: ../bin/wiggletools do isZero diff bam.bam pileup.bg
Testing: ../bin/wiggletools do isZero diff bam.bam cram.cram
Failed to populate reference for id 29
Unable to fetch reference #29 167..186835
Failure to decode slice
Failed to populate reference for id 42
Unable to fetch reference #42 466..29856
Failure to decode slice
Traceback (most recent call last):
File "test.py", line 37, in
assert test('../bin/wiggletools do isZero diff bam.bam cram.cram') == 0
AssertionError
make: *** [tests] error1

(I checked with samtools view that bam.bam and cram.cram have the same content.)

I tried an older bamReader.c and it worked well (but with the known bug...).

I use htslib-1.3.2, gsl-2.4 and libBigWig, with the following environment:
export PATH=${HOME}/software/WiggleTools/bin:${PATH}
export LD_LIBRARY_PATH=${HOME}/software/libBigWig:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${HOME}/software/htslib-1.3.2:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=${HOME}/software/gsl-2.4:$LD_LIBRARY_PATH

Please help.
Thank you.

Fill in zero values

How do I fill in zero values for all unspecified genomic regions for a single bigwig file?
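
To make the requested result concrete, here is a conceptual sketch of what filling in zeros means for bedGraph-style intervals. It is not the wiggletools implementation; `fill_gaps`, the chromosome length and the sample interval are all hypothetical, and the input intervals are assumed not to overlap.

```python
def fill_gaps(intervals, chrom, chrom_length, fill_value=0.0):
    """Return rows covering [0, chrom_length), adding fill_value rows for the gaps."""
    filled = []
    cursor = 0
    for start, end, value in sorted(intervals):
        if start > cursor:
            filled.append((chrom, cursor, start, fill_value))
        filled.append((chrom, start, end, value))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        filled.append((chrom, cursor, chrom_length, fill_value))
    return filled

rows = fill_gaps([(100, 200, 1.5)], "chr1", 500)
for row in rows:
    print(*row, sep="\t")
```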

bug in bin / question

Hi,
Does the bin command automatically fill in missing values with 0?

In other words, do I need to do: bin ## fillIn ### blah.bw in this order?
Or just bin ### blah.bw ?
Or fillIn ### bin ## fillIn ### blah.bw?
Or fillIn ### bin ## blah.bw?

output of CV iterator

I am getting unexpected output from the CV function:

chr1 0 249239823 nan
chr10 0 135499788 nan
chr11 0 134946510 nan
chr12 0 133833357 nan
chr13 0 115084925 nan
etc...

Using the same set of input files, but with the sum function, I obtain the expected results:
chr1 0 10444 0.000000
chr1 10444 10609 0.043840
chr1 10609 13296 0.000000
chr1 13296 13461 0.043840
chr1 13461 17353 0.000000
chr1 17353 17517 0.043400
etc...

Do I need to fill in default values? Or is this a bug?

Reduce size of bigWig files produced

Why is the size of a bigWig file produced by the diff command so large compared to the input files? For example, I have two bigWig files of size 173M and 156M and the subtracted bigWig produced is 2.2G

wiggletools diff treatment.bigWig control.bigWig > subtracted.bigWig

Is there a setting I can use to reduce the file size?

make failed on macOS X 10.9.5

When trying to make wiggletools I am getting the following error.

cd src; make -e
mkdir -p ../bin
cc -g -Wall -O3 -std=gnu99 -L/Users/jespinosa/software/userApps/kent/src/lib/ -L../lib -L/Users/jespinosa/software/htslib wiggletools.c -lwiggletools /Users/jespinosa/software/userApps/kent/src/lib/local/jkweb.a -lhts -lz -lpthread -lssl -lcrypto -ldl -lgsl -lgslcblas -lm -o ../bin/wiggletools 
Undefined symbols for architecture x86_64:
  "_ti_close", referenced from:
      _lineFileTabixMayOpen in jkweb.a(linefile.o)
      _lineFileClose in jkweb.a(linefile.o)
  "_ti_get_tid", referenced from:
      _lineFileSetTabixRegion in jkweb.a(linefile.o)
  "_ti_index_load", referenced from:
      _lineFileTabixMayOpen in jkweb.a(linefile.o)
  "_ti_iter_destroy", referenced from:
      _lineFileSetTabixRegion in jkweb.a(linefile.o)
      _lineFileClose in jkweb.a(linefile.o)
  "_ti_iter_first", referenced from:
      _lineFileTabixMayOpen in jkweb.a(linefile.o)
  "_ti_open", referenced from:
      _lineFileTabixMayOpen in jkweb.a(linefile.o)
  "_ti_queryi", referenced from:
      _lineFileSetTabixRegion in jkweb.a(linefile.o)
  "_ti_read", referenced from:
      _lineFileNext in jkweb.a(linefile.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [../bin/wiggletools] Error 1
make: *** [Wiggletools] Error 2

Any clue?
Thanks in advance!

Make errors

Trying to build on Ubuntu Xenial:

mkdir -p ../bin
cc -g -Wall -O3 -std=gnu99 -L../lib -L../../libBigWig -L../../htslib wiggletools.c -lwiggletools -l:libBigWig.a -lcurl -l:libhts.a -lgsl -lgslcblas -lz -lpthread -lm -llzma -lbz2 -o ../bin/wiggletools
//usr/local/lib/libhts.a(hfile_s3.o): In function 's3_sha256':
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:111: undefined reference to 'SHA256'
//usr/local/lib/libhts.a(hfile_s3.o): In function 's3_sign_sha256':
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'EVP_sha256'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'HMAC'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'EVP_sha256'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'HMAC'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'EVP_sha256'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'HMAC'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'EVP_sha256'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'HMAC'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'EVP_sha256'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:116: undefined reference to 'HMAC'
//usr/local/lib/libhts.a(hfile_s3.o): In function 's3_sign':
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:104: undefined reference to 'EVP_sha1'
/home/mkiefer/software/builds/augustus-3.3.3/htslib/hfile_s3.c:104: undefined reference to 'HMAC'
collect2: error: ld returned 1 exit status
Makefile:14: recipe for target '../bin/wiggletools' failed
make[1]: *** [../bin/wiggletools] Error 1
make[1]: Leaving directory '/home/mkiefer/software/builds/WiggleTools/src'
Makefile:7: recipe for target 'Wiggletools' failed
make: *** [Wiggletools] Error 2

Any ideas what I have to change / install?

Best,
Markus

binary not working on mac OSX Yosemite version 10.10.1

not sure why:

➜  software ./wiggletools_x86_64_linux 
zsh: exec format error: ./wiggletools_x86_64_linux
➜  software brew install gsl
Warning: gsl-1.16 already installed
➜  software uname -m
x86_64

Thanks very much.

mwrite_bg cannot write zero?

Thanks for a nice tool!

I'm currently trying to merge multiple bigWig files into a single big table and thought mwrite_bg would be a good tool to use. In this case, the bigWig files sometimes cover different regions, and it seems as if mwrite_bg introduces ones instead of zeros for uncovered regions.

wiggletools_x86_64_linux mwrite_bg - *.bw |grep -n -w "0.000000"

The above command identifies zeros on the first three columns, and nothing beyond that.

Is there something obvious I'm missing?

Best,
Karl

bedGraph error line & Segmentation fault

I have two replicates and one input. They are all in bam format.
I'm trying to create a mean wig, then subtract the signal from the input BAM and keep only the positive difference values. I'm running the following commands, but I get the errors mentioned in the title of the issue. Any ideas?

wiggletools write mean.wig Rep1.bam Rep2.bam
wiggletools write mean.normalised.wig diff  mean.wig input.bam 
wiggletools write  mean.normalised.posOnly.wig  gt 0 mean.normalised.wig

 wigToBigWig -clip mean.normalised.posOnly.wig  pathToChromSize mean.normalised.posOnly.bw

bedGraph error line 19560665 of mean.normalised.pos.wig: chromosome GL000219.1 has size 179198 but item ends at 179204
bedGraph error line 19563187 of mean.normalised.pos.wig: chromosome KI270330.1 has size 1652 but item ends at 1718
line 25: 29659 Segmentation fault

In fact, trying to create a bigWig with wigToBigWig after using wiggletools seems to fail.
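
The bedGraph errors above report intervals that end past the chromosome length. As a minimal sketch of one way to guard against that before conversion, the intervals can be trimmed to the chromosome sizes. `clip_intervals` is a hypothetical helper, the interval start coordinates below are invented for illustration, and only the sizes and end positions are taken from the error messages.

```python
def clip_intervals(intervals, chrom_sizes):
    """Trim intervals to chromosome bounds; drop empty ones and unknown chromosomes."""
    clipped = []
    for chrom, start, end, value in intervals:
        size = chrom_sizes.get(chrom)
        if size is None:
            continue
        end = min(end, size)
        if start < end:
            clipped.append((chrom, start, end, value))
    return clipped

chrom_sizes = {"GL000219.1": 179198, "KI270330.1": 1652}
intervals = [
    ("GL000219.1", 179190, 179204, 1.0),  # overhangs the end by 6 bases
    ("KI270330.1", 1700, 1718, 2.0),      # lies entirely past the end
]
clipped = clip_intervals(intervals, chrom_sizes)
print(clipped)
```

This does not explain why the out-of-range intervals were produced in the first place; it only shows the trimming step on its own.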

setComparisons.c compile error

Hello Daniel,

After briefly discussing #1 with our system administrators, they recommended that I use the brand new version of our local cluster, which they had just finished installing. In this version, OpenSSL with krb5 support is installed system-wide, so that resolved #1.

However, I have now encountered a new issue when attempting to install WiggleTools. Basically it's:

setComparisons.c:17:25: error: gsl/gsl_cdf.h: No such file or directory
setComparisons.c: In function ‘TTestReductionPop2’:
setComparisons.c:116: warning: implicit declaration of function ‘gsl_cdf_tdist_Q’
make[1]: *** [setComparisons.o] Error 1
make[1]: Leaving directory `/home/bst/student/lcollado/software/userApps/WiggleTools/src'
make: *** [Wiggletools] Error 2

Have you seen such an error before? How do you recommend proceeding?

Thank you!
Leonardo

System info

12:51 WiggleTools $ openssl version -a
OpenSSL 1.0.1e-fips 11 Feb 2013
built on: Tue Jan  7 04:14:47 EST 2014
platform: linux-x86_64
options:  bn(64,64) md2(int) rc4(8x,int) des(idx,cisc,16,int) idea(int) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DKRB5_MIT -m64 -DL_ENDIAN -DTERMIO -Wall -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Wa,--noexecstack -DPURIFY -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
OPENSSLDIR: "/etc/pki/tls"
engines:  dynamic
12:52 WiggleTools $ cat /etc/*-release
Cluster Manager v6.1
slave
Red Hat Enterprise Linux Server release 6.3 (Santiago)
Red Hat Enterprise Linux Server release 6.3 (Santiago)
12:52 WiggleTools $ lsb_release -a
LSB Version:    :core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 6.3 (Santiago)
Release:    6.3
Codename:   Santiago
12:52 WiggleTools $ uname -a
Linux compute-058 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
12:52 WiggleTools $ uname -mrs
Linux 2.6.32-279.el6.x86_64 x86_64
12:52 WiggleTools $ cat /proc/version
Linux version 2.6.32-279.el6.x86_64 ([email protected]) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Wed Jun 13 18:24:36 EDT 2012
12:52 WiggleTools $ pkg-config openssl --libs --static
-Wl,-z,relro -lssl -lcrypto -ldl -lz -lgssapi_krb5 -lkrb5 -lcom_err -lk5crypto

Log

Here's the summary of commands used:

$ cd software/
12:47 software $ rm -fr userApps/
12:47 software $ git archive --format=zip -9 --remote=git://genome-source.cse.ucsc.edu/kent.git beta src/userApps > userApps.zip
12:47 software $ unzip -d userApps -j userApps.zip
12:47 software $ rm userApps.zip
12:47 software $ cd userApps
12:47 userApps $ make fetchSource
12:47 userApps $ make
12:50 userApps $ export KENT_SRC=$PWD/kent/src
12:50 userApps $ git clone https://github.com/samtools/tabix.git
12:50 userApps $ export TABIX_SRC=$PWD/tabix
12:50 userApps $ cd tabix
12:50 tabix $ make
12:50 tabix $ cd ..
12:50 userApps $ git clone git@github.com:Ensembl/WiggleTools.git
12:51 userApps $ cd WiggleTools/
12:51 WiggleTools $ make
12:51 WiggleTools $ openssl version -a
12:52 WiggleTools $ cat /etc/*-release
12:52 WiggleTools $ lsb_release -a
12:52 WiggleTools $ uname -a
12:52 WiggleTools $ uname -mrs
12:52 WiggleTools $ cat /proc/version
12:52 WiggleTools $ pkg-config openssl --libs --static

The full log is available here https://gist.github.com/lcolladotor/9766383

test fails for one diff

/WiggleTools/test$ ../bin/wiggletools do isZero diff bam.bam cram.cram
[E::cram_get_ref] Failed to populate reference for id 29
[E::cram_decode_slice] Unable to fetch reference #29 167..186835
[E::cram_next_slice] Failure to decode slice
[E::cram_get_ref] Failed to populate reference for id 42
[E::cram_decode_slice] Unable to fetch reference #42 466..29856
[E::cram_next_slice] Failure to decode slice

All other tests pass.

Unexpected output for some functions?

Perhaps I am misunderstanding usage of wiggletools, but for "meanI" I expected a single number, but instead a wiggle is returned with several BED entries. e.g.

chr1    0   2   1.000000
fixedStep chrom=chr1 start=8 step=1
4.000000
fixedStep chrom=chr1 start=14 step=1
1.000000
1.000000
fixedStep chrom=chr1 start=17 step=1
...

pearson correlation appears to output a bed file too:

>wiggletools pearson file1.bw  file2.bw | head
fixedStep chrom=chr1 start=1 step=1
2.000000
5.000000
5.000000
fixedStep chrom=chr1 start=8 step=1
5.000000
1.000000
fixedStep chrom=chr1 start=14 step=1
1.000000
1.000000

What am I missing?

bin not yielding all windows

Hi,
There's some issue with bin. It is not producing bins for regions without data in the wiggle file, even after doing fillIn.
Instead it is just making a large bin with score = 0. But I want 100 bp bins regardless of whether there is data. And fillIn should work.

$ head -n 20 blah.windows.bg
chr1	0	100	0
chr1	100	200	0
chr1	200	300	0
chr1	300	400	0
chr1	400	500	0
chr1	500	600	0
chr1	600	700	0
chr1	700	800	0
chr1	800	900	0
chr1	900	1000	0
chr1	1000	1100	0
chr1	1100	1200	0
chr1	1200	1300	0
chr1	1300	1400	0
chr1	1400	1500	0
chr1	1500	1600	0
chr1	1600	1700	0
chr1	1700	1800	0
chr1	1800	1900	0
chr1	1900	2000	0
$ wiggletools write_bg - bin 100 scale 0.01 fillIn blah.windows.bg k24.umap.sorted.bw | head
chr1	0	100	0.137970
chr1	100	200	0.702080
chr1	200	300	0.210000
chr1	300	1800	0.000000
chr1	1800	1900	0.079950
chr1	1900	2300	0.000000
chr1	2300	2400	0.149580
chr1	2400	2500	0.180420
chr1	2500	2600	0.220080
chr1	2600	2800	0.000000

As you can see, there is an entry "chr1 300 1800 0.000000".
But the expected behavior should be as below, or at least an option to do that.
chr1 300 400 0.0000
chr1 400 500 0.0000
chr1 500 600 0.0000
...
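
As a post-processing workaround, the expected rows can be produced from the merged ones with a short sketch. It assumes that merged intervals such as "chr1 300 1800" start and end on bin boundaries, as they do in the excerpt above; `split_into_bins` is a hypothetical helper and not a wiggletools option.

```python
def split_into_bins(chrom, start, end, value, bin_size):
    """Split one interval into consecutive windows of at most bin_size bases."""
    return [(chrom, lo, min(lo + bin_size, end), value)
            for lo in range(start, end, bin_size)]

windows = split_into_bins("chr1", 300, 1800, 0.0, 100)
for chrom, start, end, value in windows[:3]:
    print(f"{chrom}\t{start}\t{end}\t{value:.6f}")
```

Applied to every output row, this leaves rows that are already one bin wide unchanged and expands the merged zero runs into 100 bp windows.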
