GithubHelp home page GithubHelp logo

aallahyar / hic-spector Goto Github PK

View Code? Open in Web Editor NEW

This project forked from gersteinlab/hic-spector

0.0 1.0 0.0 50 KB

Spectral and reproducibility analysis of Hi-C contact maps

Julia 73.35% Python 26.65%

hic-spector's Introduction

HiC-spector

A matrix library for spectral and reproducibility analysis of Hi-C contact maps. Several useful functions:

get_reproducibility

  • to calculate the reproducibility metric between 2 HiC contact maps

knight_ruiz

  • Knight Ruiz algorithm for matrix balancing

get_expect_vs_d_single_chr_v0

  • to find the average contact frequency as a function of genomic distance

get_compartment_A_B

  • to find A, B compartment, using method described in Liberman et al. Science 2009

and a few functions for binning a genome, and reading HiC maps

Installation

HiC-spector is mostly written in Julia. It has been tested in Julia 0.4 and 0.5. Both the Julia language (http://julialang.org/) and the required packages have to be installed. Please refer to the beginning of the file HiC_spector.jl for the necessary packages. To get some contact maps for testing the code, please follow the instructions shown in the file data/readme_data.

There is a Python script available for quantifying reproducibility. The Python version can read files in genomic coordinates as well as the .hic format (https://github.com/theaidenlab/juicebox/wiki/Data). To do so, please download the Python version of the tool straw (straw.py) developed by the Aiden lab (https://github.com/theaidenlab/straw).

Usage

The script run_reproducibility.jl is used to get the reproducibility score from a command-line interface. Usage: > julia run_reproducibility.jl matrix_file1 matrix_file2

The input file here is a simple text delimited format with no header.

1 1 20

1 2 18

...

The first and second columns represent the row and column indices of a contact map, whereas the third column is the count. To represent a full matrix, only the upper-triangular component is required. Note that the index should begin with 1.

Please use the files stored in the folder A549 mentioned in ./data/readme_data to test the script run_reproducibility.jl.

Julia users can include the file HiC_spector.jl for their own analysis by simply using

include("./HiC_spector.jl");

Please refer to the file hic_spector_tutorial.jl for how to use some of the functions and how to read files in other formats.

For non-Julia users, one can use the Python script run_reproducibility.py to obtain the reproducibility score. Usage:

python run_reproducibility.py -F matrix_file1 matrix_file2

If the matrix files are labeled in genomic coordinates of bins, USage:

python run_reproducibility.py t matrix_file1 matrix_file2 40000

where 40000 is the bin size used in the two files

In addition to the text delimited input files, the Python script can calculate reproducibility score for contact maps stored in .hic format. Usage:

python run_reproducibility.py -f hic_file1 hic_file2 chrid resolution

A script is provided in the tool straw (https://github.com/theaidenlab/straw/tree/master/python) for reading the headers (including chr id and the available resolutions) in .hic file.

Regarding memory, given two contact maps of human chr1 binned in a bin-size of 10kb, the code works fine in a laptop (16GB memory) from our experience.

Aurthor/Support

Koon-Kiu Yan, [email protected]; Mark Gerstein, [email protected]

Reference

Yan KK, Galip Gürkan Yardımcı, William S Noble and Gerstein M. HiC-Spector: a matrix library for spectral and reproducibility analysis of Hi-C contact maps. Bioinformatics 22 March 2017. https://doi.org/10.1093/bioinformatics/btx152

hic-spector's People

Contributors

quantum-man avatar srcoulombe avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.