vgteam / sequencetubemap
displays multiple genomic sequences in the form of a tube map
License: MIT License
I'm looking for a read, so I enter the read's start node into the tube map in node mode and hit go.
I don't see the read. It's because (I think) the read maps to the reverse strand. The tube map starts at the node I entered and heads locally right, while the read wanders off locally left.
But I don't even see the part of the read that visits the node I entered. I think we might only be collecting reads that lie fully within the graph.
Starting at the read's ending node gets it to show up.
In Firefox at least, when you refresh the page, the checkboxes and select dropdowns retain their previous states. For example, if you turned off "Remove redundant nodes", the checkbox will be clear on page refresh.
But the frontend doesn't know about this; it will render things with redundant nodes removed until you re-check and re-un-check the checkbox.
Either the checkboxes and dropdowns should be reset by the page to their default values on page load, or some logic should be added to see if the browser has adjusted the state of these controls, so their settings are respected.
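A minimal sketch of the second option, assuming the frontend can hand over its controls by name. `readControlState` and the control names are hypothetical, not part of the current code. It builds the render settings from whatever state the controls actually hold instead of from hard-coded defaults:

```javascript
// Hypothetical helper: derive render settings from the controls' real state.
// `controls` maps a setting name to an element-like object.
function readControlState(controls) {
  const state = {};
  for (const [name, el] of Object.entries(controls)) {
    // Checkboxes expose `checked`; selects and text inputs expose `value`.
    state[name] = el.type === "checkbox" ? el.checked : el.value;
  }
  return state;
}

// Plain objects stand in for DOM elements here so the sketch runs in Node.
const settings = readControlState({
  mergeNodes: { type: "checkbox", checked: false },
  colorSet: { type: "select-one", value: "greys" },
});
console.log(settings);
```

In the browser this would be called with real elements once the page has loaded, assuming the browser has restored the form state by then.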
See that floating G? It's a substitution that's supposed to be on that blue read, but it isn't drawn right; it's floating below the node it belongs inside of, and not overlaid on the read that has it. The problem may be that the read loops around, later in the graph, in a way that is possible to express but which the aligner generally does not generate.
I can send the files that cause this problem, or you should be able to reproduce it (while my server is running) by entering the settings shown at http://kolossus.sdsc.edu:9000/
As a genomicist, I can select and mark a read, haplotype, or positional path, so I can keep track of it as I pan the view around.
The examples in the README, like docker exec -it <container_id> ./prepare_vg.sh <vg_file>, don't work because the scripts aren't executable.
docker exec -it 50346e057492 ls -la | grep prepare
-rw-r--r-- 1 root root 207 Nov 19 2017 prepare_gam.sh
-rw-r--r-- 1 root root 270 Nov 19 2017 prepare_vg.sh
This is worked around by adding sh to the command: docker exec -it <container_id> sh ./prepare_vg.sh <vg_file>
Either the README or the scripts should be updated to patch this...
As a VG developer, I can call up graph displays given a node ID, so that I can find the parts of the graph I am interested in.
As a genomicist, I can upload multiple .gam files and view read pileups from multiple samples at the same genomic site, so that I can do Mendelian or conserved variant analysis.
I want to be able to record a view I have brought up in the tube map (including the combination of files, track settings, and start and length settings, with the files and position being the most important).
The UCSC Genome Browser solves this problem with "sessions", but I think a permalink-based approach might be better. I should be able to copy a link that when I open it takes me back to where I was.
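One way the permalink idea could work is to serialize the view into the URL query string. This is only a sketch: the field names (`xgFile`, `gamFile`, `start`, `length`) are placeholders for whatever the frontend actually tracks.

```javascript
// Encode a view (files, position, settings) as a URL query string.
function viewToQuery(view) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(view)) {
    params.set(key, String(value));
  }
  return params.toString();
}

// Decode a query string back into a view. All values come back as strings,
// so numeric fields need converting by the caller.
function queryToView(query) {
  return Object.fromEntries(new URLSearchParams(query));
}

const query = viewToQuery({ xgFile: "graph.xg", gamFile: "reads.gam", start: 1, length: 100 });
console.log(query);
console.log(queryToView(query));
```

Keeping the whole state in the link means nothing has to be stored server-side, which is the main difference from a session-based approach.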
The backend gets information from vg by making vg dump files (like regions.tsv) in the current directory, and then reading them.
If two requests are being handled at once, files from different requests will overwrite one another.
The backend probably needs to use a separate temp directory or set of named pipes for each request.
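A sketch of the temp-directory option using Node's built-in `fs.mkdtempSync`. `withRequestDir` and the `tubemap-` prefix are made-up names, and the callback stands in for the code that would run vg with its working directory set to the scratch directory:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Create a private scratch directory, run `work` inside it, then clean up.
// `work` is a hypothetical callback; it receives the directory path.
function withRequestDir(work) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "tubemap-"));
  try {
    return work(dir);
  } finally {
    // Remove the directory and anything vg wrote into it.
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Two requests get two different directories, so regions.tsv cannot collide.
const first = withRequestDir((dir) => dir);
const second = withRequestDir((dir) => dir);
console.log(first !== second);
```

This version is synchronous for brevity. If vg is launched asynchronously, cleanup would have to wait until the child process has exited.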
While we're at it, vg chunk's -T option needs to be modified so that it takes a filename as input instead of generating its own filename to write to.
Clicking on "custom" and "reload" in the interface triggers
error:[vg chunk] unable to load xg index file ./mountedData/none
As a biologist, I can call up graph displays given a genomic coordinate on the GRCh38 reference, so that I can find the parts of the genome I study.
As a VG developer, I can upload a small raw .gam file and a small .vg graph, and have them be displayed together, so that I can see if they do what I want on a display that's more robust and intuitive than the GraphViz display.
Is it possible to generate the sequence tube map in a standalone way on the command line?
As a genomicist, I can see when several reads differ from the graph in the same way, so that I can evaluate the correctness of a variant call at that position.
As a genomicist, I can expand or contract a displayed region, so I can get a wider context or a narrower focus.
As a genomicist, I can display custom read data against an existing indexed graph, so that I can analyze new samples.
My understanding is that a .sorted.gam + .gai are supposed to provide all of the functionality of a .gam.index, but the software seems to be written to require both indexes. Is there a reason for this? If we could get rid of the RocksDB index, it could substantially reduce the preprocessing time.
@benedictpaten wants reads shown on the tube map to be able to be colored/shaded according to mapping quality, instead of just randomly as they are now.
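As a rough illustration, mapping quality could be turned into an opacity so that low-confidence reads fade out. The function name, the 0 to 60 MAPQ range, and the 0.2 opacity floor are all assumptions chosen for the sketch, not existing tube map settings:

```javascript
// Map a mapping quality onto an opacity between minOpacity and 1.
// maxMapq = 60 is an assumed upper bound; values outside 0..maxMapq are clamped.
function mapqToOpacity(mapq, maxMapq = 60, minOpacity = 0.2) {
  const clamped = Math.min(Math.max(mapq, 0), maxMapq);
  return minOpacity + (1 - minOpacity) * (clamped / maxMapq);
}

for (const mapq of [0, 30, 60]) {
  console.log(mapq, mapqToOpacity(mapq).toFixed(2));
}
```

The floor keeps MAPQ 0 reads faintly visible rather than hiding them entirely.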
I want to be able to paste a sequence into the tube map browser UI and have it automatically align it (with vg align) to the region I am looking at and display the alignment.
This would be very useful for working out why a read did not map to a certain graph region, because I would be able to visually see how good the optimal alignment there was, and I wouldn't have to mess around with the command line tools myself.
If I enter a node ID that is not in the graph as the node to start at, I get a vg error on the backend and no real error message on the front end (just a blank tube map). There should be an error on the frontend that the node is not in the graph.
As a genomicist, I can see where my reads differ from the graph that they are aligned to and where they match it, so that I can get an idea of the quality of my reads and do visual variant calling.
Insert character should be a vertical line.
You should be able to mouse over it and see the inserted sequence and where it fits in the node's sequence (with an extended vertical line to the top).
fails on vg + unitig assembly fasta with error
error: failed to include path
https://drive.google.com/open?id=1jtMG9kWA9FYKxGJU_GhiKdMaAP9whkV1
contains the files and command
As a genomicist, I can display sequencing reads in the context of both the primary reference and local unique haplotypes simultaneously, so that I can see how well my reads fit with both.
It seems like the work to make TubeMap use @adamnovak's fancy new indexing code is incomplete. I'm trying to work with the file-upload branch, which has the command line arguments fixed in vg chunk. However, the data preprocessing scripts (data/prepare.sh, data/prepare_dev.sh, backend/prepare_gam.sh, maybe more) still seem to be written to create the old RocksDB-based index. I understand that this branch is a work in progress, but I also can't use master, since it's still written for the old vg chunk API.
@adamnovak says he can update the scripts for me, since he understands what's going on in the TubeMap internals.
I want to be able to copy-paste the sequence from a node, so I can BLAT it on the genome browser. When looking around in node mode, there's no position legend, so it's hard to find out where in the linear reference I am.
We could also have the scale bar in node mode. And maybe a "go to UCSC genome browser" button or something?
I just merged a PR into VG allowing vg chunk to source its haplotypes from a GBWT in a .gbwt file, instead of from a gPBWT embedded in a .xg file.
This is the setup I am using for all my haplotype-informed mapping experiments, so I need the tube map to be able to read this new format. I think it would just consist of knowing enough to pick up the new file when present and to pass it along to the vg calls.
They should be toggleable with a checkbox somewhere (so you can just ignore them), and should be distinguished from normal inserts by a different color and/or letter.
Hi There :)
When I zoom (in or out), the scaling of the drawn bases usually does not fit anymore. They are drawn too wide.
Best,
Simon
It would be nice to be able to reload the lists of files in the dropdowns without reloading the page and messing up my visualization settings.
Hello, I would like to arrange all the nodes on the selected path horizontally.
I would also like to align all the other paths as horizontally as possible.
I am trying to do so in this branch, but it fails in some cases.
In this case, the blue path is selected and it does align horizontally. However, the green path overlaps the blue path, and the purple path could be straightened further.
I would be grateful if you could check it.
Hello, I am grateful for this library and I would like to use it to display genome graphs :)
I found that it throws an error when I click a path shown at the bottom. In this case, I received the error when I unchecked ref.
And I have several questions:
Thank you,
The README references sequenceTubeMap.js, but this appears to have been moved/renamed (to app/main.js?).
Hi,
I just played around with the provided Dockerfile and, in the process, changed to a newer vg docker image, which results in the following error:
sh
./vg/vg chunk -x ./internalData/snp1kg-BRAC1.vg.xg -a ./internalData/NA12878-BRCA1.gam.index -g -A -p 17:1-101 -T -E regions.tsv | ./vg/vg view -j - >c5717fe0-dc25-11e7-8804-b9ee87b476a9.json
received request for filenames
Error: ENOENT: no such file or directory, scandir './mountedData/'
at Error (native)
at Object.fs.readdirSync (fs.js:961:18)
at app.post (/usr/src/app/app.js:209:6)
at Layer.handle [as handle_request] (/usr/src/app/node_modules/express/lib/router/layer.js:95:5)
at next (/usr/src/app/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/usr/src/app/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/usr/src/app/node_modules/express/lib/router/layer.js:95:5)
at /usr/src/app/node_modules/express/lib/router/index.js:281:22
at Function.process_params (/usr/src/app/node_modules/express/lib/router/index.js:335:12)
at next (/usr/src/app/node_modules/express/lib/router/index.js:275:10)
err data: error:[vg chunk] context expansion steps must be specified with -c/--context when chunking on paths
My assumption is that there was a change in how vg chunk handles its parameters. Unfortunately I am pretty new to vg and not sure how to set the correct value (in this case for -c). Could someone tell me what value I should provide?
Removing the redundant nodes gives it a cleaner look, but the way it presents the node IDs is confusing. I believe the mouse-over only shows the first node ID. I spent a while confused about this because I thought there were missing nodes in the visualization. Perhaps it would be better to make the mouse-over give a list of IDs, or maybe an ID range?
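A sketch of the ID-range idea: collapse the IDs of the merged nodes into a compact label for the mouse-over. `formatNodeIds` is a hypothetical helper, and it assumes the merged node's constituent IDs are available as numbers:

```javascript
// Collapse a list of node IDs into a compact label such as "5-7, 10".
function formatNodeIds(ids) {
  // Deduplicate and sort numerically.
  const sorted = [...new Set(ids)].sort((a, b) => a - b);
  const parts = [];
  let i = 0;
  while (i < sorted.length) {
    let j = i;
    // Extend the run while the next ID is consecutive.
    while (j + 1 < sorted.length && sorted[j + 1] === sorted[j] + 1) j++;
    parts.push(i === j ? String(sorted[i]) : `${sorted[i]}-${sorted[j]}`);
    i = j + 1;
  }
  return parts.join(", ");
}

console.log(formatNodeIds([22624, 22623, 22625, 22632]));
```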
Change the header name to a logo or something next to the controls
Default to gray haplotypes and red and blue reads
Hide the radius-based nature of the region extraction and support a start and length or start and end
Move download button to the top
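For the region item above, the frontend could accept either form and normalize it before talking to the backend. This is a sketch under assumptions: the `path:start-end` form matches what is passed to vg chunk -p elsewhere in these issues, while the `path:start+length` form, the inclusive end coordinate, and the `parseRegion` name are invented for illustration.

```javascript
// Parse "path:start-end" or (hypothetical) "path:start+length" into
// { path, start, end }, with `end` treated as inclusive.
function parseRegion(text) {
  const match = /^(.+):(\d+)([-+])(\d+)$/.exec(text.trim());
  if (!match) {
    throw new Error(`Unrecognized region: ${text}`);
  }
  const [, path, startStr, sep, numStr] = match;
  const start = Number(startStr);
  const num = Number(numStr);
  // "-" gives the end directly; "+" gives a length counted from start.
  const end = sep === "-" ? num : start + num - 1;
  if (end < start) {
    throw new Error(`Region end precedes start: ${text}`);
  }
  return { path, start, end };
}

console.log(parseRegion("17:1-101"));
console.log(parseRegion("17:1+101"));
```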
The backend code needs mountedData to exist, but there's no step to create it.
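One possible fix is for the backend to create the directory at startup. A sketch using Node's `fs.mkdirSync` with `recursive: true`, which does nothing if the directory already exists (`ensureDir` is a hypothetical helper name):

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Create the directory (and any missing parents); a no-op if it exists.
function ensureDir(dir) {
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Demonstrated on a scratch location; the backend would call
// ensureDir("./mountedData") before its first directory scan.
const demo = ensureDir(path.join(os.tmpdir(), `tubemap-demo-${process.pid}`, "mountedData"));
console.log(fs.existsSync(demo));
```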
I've been looking at single reads. This involves throwing a node ID that the read touches into the tube map, and then manually panning left and right by adjusting the start node ID and distance until I can see the whole read.
If I set the start node to an ID that happens to be a SNP allele, instead of a fixed backbone node, I will only get the haplotypes that take that allele of the SNP, and the parts of the graph they touch. I may not see a particular allele at the next SNP, for example, because it is in perfect LD with the SNP I am visiting.
I want to see a bit more context; if I start on a SNP allele, I want to see the other allele of the SNP, and I definitely need to see all the alleles of downstream SNPs that my read might visit. If more haplotypes come into the view from other parts of the graph that aren't reachable going right from the node I started on, I want to see those haplotypes, too, even if I can't see their start nodes.
Hi Wolfgang ;)
I hope you had a smooth transition into the new year :)
I was wondering if there is any particular reason why you are still using D3.js v3 instead of v4?
Best,
Simon
Hello :)
I am very grateful for your tool, but I found some unexpected behavior.
In this figure, the red and green lines seem to be concatenated, but they are not adjacent according to the rank field of the input JSON, which is generated by vg find -x <xg> -P <path> -c <contexts> from our internal dataset.
{"name":"chr20","mapping":[{"position":{"node_id":22608},"rank":6042},{"position":{"node_id":22623},"rank":6057},{"position":{"node_id":22624},"rank":6058},{"position":{"node_id":22625},"rank":6059},{"position":{"node_id":22626},"rank":6060},{"position":{"node_id":22627},"rank":6061},{"position":{"node_id":22632},"rank":6066},{"position":{"node_id":22633},"rank":6067},{"position":{"node_id":22634},"rank":6068}]}
For the same reason, the scale shown on the path is not correct for the green / red path.
In my opinion, a link between two nodes that are on the same path but not adjacent should be drawn as a dotted line.
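To illustrate, the non-adjacent links can be detected from the rank values alone. The sketch below assumes the path JSON shape shown above; `findRankGaps` is a hypothetical helper, and the example uses the first three mappings of that chr20 path.

```javascript
// For each consecutive pair of mappings on a path, report whether their
// ranks are adjacent (differ by exactly 1). Non-adjacent pairs are the
// candidates for a dotted line.
function findRankGaps(path) {
  const links = [];
  for (let i = 1; i < path.mapping.length; i++) {
    const prev = path.mapping[i - 1];
    const curr = path.mapping[i];
    links.push({
      from: prev.position.node_id,
      to: curr.position.node_id,
      adjacent: curr.rank - prev.rank === 1,
    });
  }
  return links;
}

const chr20 = {
  name: "chr20",
  mapping: [
    { position: { node_id: 22608 }, rank: 6042 },
    { position: { node_id: 22623 }, rank: 6057 },
    { position: { node_id: 22624 }, rank: 6058 },
  ],
};
console.log(findRankGaps(chr20));
```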
Moreover, I have several requests:
Sincerely,
Some folks at Illumina have developed a tube map fork that understands Illumina's "paragraph" format, including node coloring for the reference path and support for transparency-based MAPQ visualization which would address #44.
We want to pull in their improvements, but since they forked off before we added some features (like softclip display and strand coloration), it might be a difficult merge to do. Also, it's not clear where exactly their code is located.
When I look up a node by ID and draw so many nodes downstream of it, sometimes I will get this situation, where the reference path (top) extends one node further than the haplotypes (bottom). The haplotypes do extend into that next node in reality, but they are getting cut off one node before the reference path is by the chunking.
This is probably some kind of bug in the vg chunk code. @yoheirosen?
Do you need to sort graph files in order for the visualization to work properly? I don't see this anywhere in the documentation. They still seem to load fine.
As an HGVM developer, I can display the haplotype threads and embedded positional paths from a custom .xg file, so that I can evaluate them visually.
I want to be able to copy-paste a read in JSON format into a text box in the tube map and visualize it on the part of the graph I am looking at.
This would be useful for visualizing alignments of reads that I have manually created, for mapper debugging purposes.
As a VG developer, I can drag and drop a small .vg file onto the visualizer to display it, so that I can easily look at my graphs.
I have successfully set up the backend, but I could not build the frontend. I followed the instructions in the README file without success. I get a directory listing of dist at http://localhost:8080.
As far as I understand, the issue is related to gulp, since it produces a dist directory without any HTML file. Also, the scripts and styles directories are empty:
dist
├── apple-touch-icon.png
├── favicon.ico
├── fonts
│ ├── glyphicons-halflings-regular.eot
│ ├── glyphicons-halflings-regular.svg
│ ├── glyphicons-halflings-regular.ttf
│ ├── glyphicons-halflings-regular.woff
│ └── glyphicons-halflings-regular.woff2
├── images
│ └── logo.png
├── robots.txt
├── scripts
└── styles
I am not familiar with "gulp.js". Do you have any idea why it does not build the frontend?
I'm curious if we could have a version of this which would work on the command line, allowing us to pipe in the data and render out a (possibly static) visualization.
Also, the "oldindex" page has disappeared: https://vgteam.github.io/sequenceTubeMap/oldIndex.html. Is there any way to bring it back? It was very useful for rendering small graphs, much more legible than the vg view -d - | dot renderings.