dekkerlab / crane-nature-2015
code associated with crane-nature-2015, 10.1038/nature14450
License: Apache License 2.0
Supplying a raw matrix of numbers throws ERROR: Must supply headered matrix!
Does this mean inputs must have headers in the format of your example inputs (something like bin#|genome|chr:bp-bp)? It would be helpful to document this either way (apologies, my Perl is too rusty to figure this out from the source).
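For readers with the same question, here is a minimal Python sketch of how headers in the `bin#|genome|chr:start-end` pattern could be generated for a raw matrix. The coordinate convention (1-based start, end = start + bin size) is inferred from the example rows quoted elsewhere on this page, such as `bin6000201|ce10|chrX:2010001-2020001`. The function names, the empty top-left cell, and the tab-delimited layout are assumptions; whether matrix2insulation.pl accepts exactly this output has not been confirmed here.

```python
def make_headers(n_bins, bin_size, chrom, genome, first_bin_id=0):
    """Build one header per bin in the pattern bin#|genome|chr:start-end.

    Hypothetical helper: the pattern is copied from the example files,
    not from the script's documentation.
    """
    headers = []
    for i in range(n_bins):
        start = i * bin_size + 1   # 1-based start, as in the example rows
        end = start + bin_size     # example rows end at start + bin_size
        headers.append(f"bin{first_bin_id + i}|{genome}|{chrom}:{start}-{end}")
    return headers


def write_headered_matrix(rows, headers, path):
    """Write a tab-delimited matrix with a header row and a header column.

    Assumption: the top-left cell is left empty.
    """
    with open(path, "w") as out:
        out.write("\t" + "\t".join(headers) + "\n")
        for name, row in zip(headers, rows):
            out.write(name + "\t" + "\t".join(str(v) for v in row) + "\n")
```

With `bin_size=10000`, the header at index 201 covers `chrX:2010001-2020001`, which matches the coordinates in the example output.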
The usage message displays:
Options:
-b [] size (bp) of the insulation square
However, after testing and checking the code, I found that '-b' is wrong; the option that actually works is '-is'.
Hi, I want to analyse an ICE-normalized matrix from HiC-Pro with the script matrix2insulation.pl. Is this possible? I ask because I got the ERROR: Must supply headered matrix.
Hi,
I wonder whether this method can be used to calculate the insulation score for a high-resolution contact matrix (e.g. 5kb)?
How should I set the insulation square (-b) and insulation delta span (-ids) parameters?
Thank you!
Best wishes.
Eillie
This is the output for the test data in your package.
The output of N2-DpnII__10kb__chrX.is500001.ids200001.insulation:
bin6000201|ce10|chrX:2010001-2020001 2010001 2020001 2015001 201 202 201.5 -0.237082756600542 -0.00117094427035152 -1
The output of N2-DpnII__10kb__chrX.is500001.ids200001.insulation.boundaries:
boundary.3|ce10|chrX:2010001-2020001 2010001 2020001 201 202 201.5 bin6000201|ce10|chrX:2010001-2020001 0.484256934770599
I want to know the difference between the insulation scores in the two files.
Thank you very much!
Yusen
Sorry, I converted the HiC-Pro output matrix to a dense matrix and added row names and column names to it, but I always get this error:
Illegal division by zero at scripts/matrix2insulation.pl line 561.
I do not know what $headerSpacing is. How should I create the input file?
thanks!
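The conversion step described above can be sketched in Python as follows. It assumes the sparse input holds whitespace-separated triplets `bin_i bin_j value` with 1-based bin indices and only one triangle of the matrix stored, which is how HiC-Pro matrices are commonly laid out. The function name and the `first_bin` parameter are hypothetical. This only builds the dense symmetric matrix; it does not explain what `$headerSpacing` expects.

```python
def triplets_to_dense(lines, n_bins, first_bin=1):
    """Expand sparse 'i j value' triplets into a dense symmetric matrix.

    Assumptions: indices are 1-based, and `first_bin` is the index of the
    first bin of the chromosome being extracted. Entries that fall outside
    the requested block are skipped.
    """
    dense = [[0.0] * n_bins for _ in range(n_bins)]
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        i = int(fields[0]) - first_bin
        j = int(fields[1]) - first_bin
        if not (0 <= i < n_bins and 0 <= j < n_bins):
            continue  # contact belongs to another chromosome block
        value = float(fields[2])
        dense[i][j] = value
        dense[j][i] = value  # mirror the stored triangle
    return dense
```

Bins with no contacts stay at 0.0 in this sketch; whether the script treats such empty rows specially is not covered here.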
Hi - first of all, thanks for making your code available!
I'd like to use it on data from mouse, but am coming across issues with the larger chromosomes at high resolution, getting errors such as:
ERROR: matrix interactions too large - cannot handle in memory [19720 x 19720] (388,878,400 > 256,000,000 limit)
The actual memory usage of the code when running is low as far as I can tell, and I'm running it on a server with 512GB RAM, so I'm wondering if this matrix size limit can be adjusted? What would you recommend for using this code with mouse or human data?
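The arithmetic behind this error can be checked with a short Python sketch. The limit of 256,000,000 cells is taken from the error message above. The helper names and the 195,000,000 bp chromosome length in the usage note are illustrative assumptions. The sketch only estimates which bin sizes stay under the limit; it does not change the limit in the script.

```python
import math

CELL_LIMIT = 256_000_000  # limit quoted in the error message


def n_bins(chrom_length_bp, bin_size_bp):
    """Number of bins needed to cover a chromosome (integer ceiling)."""
    return -(-chrom_length_bp // bin_size_bp)


def fits_limit(bins, limit=CELL_LIMIT):
    """True if a bins x bins matrix stays within the cell limit."""
    return bins * bins <= limit


def smallest_bin_size(chrom_length_bp, limit=CELL_LIMIT):
    """Smallest bin size (bp) whose square matrix stays within the limit."""
    max_bins = math.isqrt(limit)  # largest allowed matrix dimension
    return -(-chrom_length_bp // max_bins)
```

For the matrix in the error message, `fits_limit(19720)` is false because 19720 x 19720 = 388,878,400 cells. The largest square matrix under the quoted limit is 16000 x 16000, so a hypothetical 195,000,000 bp chromosome would need bins of at least `smallest_bin_size(195_000_000)` bp.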
Hello,
I tried your methods on IMR90 (Jin 2013) at 40kb. According to your paper, after getting the insulation score profile, boundaries are detected at local minima. Therefore, I used R to select all the bins that go from positive delta value to negative delta value (i.e., having a decreasing slope on the left then an increasing slope on the right). However, when I compared my boundary list with the one produced by your scripts, I found that some local maxima are included as boundaries. For example:
In *.insulation.boundaries:
header start end binStart binEnd binMidpoint header insulationScore
boundary.28|h19|chr1:13320001-13360001 13320001 13360001 333 334 333.5 rep1|h19|chr1:13320001-13360001 9.41871899515255
However, in *.insulation:
header start end midpoint binStart binEnd binMidpoint insulationScore delta deltaSquare
rep1|h19|chr1:13200001-13240001 13200001 13240001 13220001 330 331 330.5 -12.0651923157993 -3.25140591288299 -1
rep1|h19|chr1:13240001-13280001 13240001 13280001 13260001 331 332 331.5 -12.0651923157993 -3.25140591288299 -1
rep1|h19|chr1:13280001-13320001 13280001 13320001 13300001 332 333 332.5 -5.44620861391896 -1.59665998741291 -1
rep1|h19|chr1:13320001-13360001 13320001 13360001 13340001 333 334 333.5 -5.67855236614767 1.65474592547009 1
rep1|h19|chr1:13360001-13400001 13360001 13400001 13380001 334 335 334.5 -12.0651923157993 3.25140591288299 1
rep1|h19|chr1:13400001-13440001 13400001 13440001 13420001 335 336 335.5 -12.0651923157993 3.25140591288299 1
rep1|h19|chr1:13440001-13480001 13440001 13480001 13460001 336 337 336.5 -12.0651923157993 3.25140591288299 1
Looking at the insulationScore column in the *.insulation file, rep1|h19|chr1:13320001-13360001 is the local maximum. There are a few more examples like this. Therefore, I feel there must be something wrong in the scripts, or I missed something. I hope this can get your attention.
Thanks,
Ye Zheng
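As an independent cross-check of the rows quoted above, here is a minimal Python sketch that looks for local minima directly in the insulationScore column, without using the delta column. It is not the method used by matrix2insulation.pl. The function name and the plateau rule (a flat run is reported at its first bin) are assumptions made for illustration.

```python
def local_minima(scores):
    """Return indices where the score drops below the previous bin and
    does not rise above the next one (first bin of a flat valley)."""
    minima = []
    for i in range(1, len(scores) - 1):
        if scores[i] < scores[i - 1] and scores[i] <= scores[i + 1]:
            minima.append(i)
    return minima


# insulationScore values for bins 330..336, copied from the rows above
scores = [
    -12.0651923157993,
    -12.0651923157993,
    -5.44620861391896,
    -5.67855236614767,
    -12.0651923157993,
    -12.0651923157993,
    -12.0651923157993,
]
first_bin = 330
print([first_bin + i for i in local_minima(scores)])
```

Under this rule, bin 333 (the bin reported in the boundaries file) is not returned as a minimum for these seven rows, which is consistent with the observation in the post.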
Hi,
I used a 40kb matrix to get the TAD boundaries, and I found overlaps between different boundaries. Part of the results looks like this:
chr1 17200000 17480000
chr1 17440000 17720000
chr1 17560000 17840000
I don't think this is right. Could you please tell me whether it's reasonable?
Thank you so much.
Best wishes.
min
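To list overlaps like the ones shown above, a short Python sketch along these lines could be used. It assumes each boundary is a `(chrom, start, end)` tuple and only compares neighbours after sorting. The function name is hypothetical. It reports the overlaps; it does not say whether they are expected output of the script.

```python
def overlapping_neighbours(intervals):
    """Return pairs of adjacent (chrom, start, end) intervals that overlap."""
    ordered = sorted(intervals)
    pairs = []
    for a, b in zip(ordered, ordered[1:]):
        same_chrom = a[0] == b[0]
        if same_chrom and b[1] < a[2]:  # next start lies before previous end
            pairs.append((a, b))
    return pairs


# intervals copied from the post
boundaries = [
    ("chr1", 17200000, 17480000),
    ("chr1", 17440000, 17720000),
    ("chr1", 17560000, 17840000),
]
for a, b in overlapping_neighbours(boundaries):
    print(a, b, "overlap (bp):", a[2] - b[1])
```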
Hi,
Recently, I became interested in the "insulation score" method you developed, and I tried to use matrix2insulation.pl to identify TAD domains. I am curious about the result of this Perl script, because its final output is a summary of the boundary locations. So, if I want to call TAD domains, do I simply divide the genome into parts according to these boundaries? For example, if the boundary is chr1:4000000-4500000, are the TAD domains then the two parts (chr1:0-4000000, chr1:4500000-...)?
Could you give me some advice?
Thank you so much!
Best,
Garen
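The splitting described in the question above can be sketched in Python as follows. This is one possible convention (domains are the gaps between consecutive boundaries), not a confirmed answer from the authors. The function name and the chromosome length used in the example are assumptions.

```python
def boundaries_to_domains(boundaries, chrom_length):
    """Turn (start, end) boundary intervals on one chromosome into the
    domain intervals lying between them."""
    domains = []
    prev_end = 0
    for start, end in sorted(boundaries):
        if start > prev_end:
            domains.append((prev_end, start))  # gap before this boundary
        prev_end = max(prev_end, end)          # tolerate overlapping boundaries
    if prev_end < chrom_length:
        domains.append((prev_end, chrom_length))  # tail after the last boundary
    return domains


# boundary from the question; the chromosome length is a placeholder value
print(boundaries_to_domains([(4000000, 4500000)], chrom_length=10000000))
```

With the single boundary from the question, this yields the two parts the poster describes: one ending at 4000000 and one starting at 4500000.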