artic-network / rampart
Read Assignment, Mapping, and Phylogenetic Analysis in Real Time
License: GNU General Public License v3.0
We will often have many more reference genomes than there is space on the reference heatmap but only a few of them will have any significant number of reads mapping to them. This chart should dynamically adjust to only show top references. Perhaps specify a maximum number plus some cutoff for the minimum number of reads for it to be shown.
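A minimal sketch of such a filter, assuming per-reference read totals are available; the function and parameter names (`topReferences`, `maxShown`, `minReads`) are illustrative, not RAMPART's actual options:

```javascript
// Pick which references to display on the heatmap: drop references below
// a minimum read count, then keep only the top N by mapped reads.
function topReferences(readCounts, maxShown = 10, minReads = 50) {
  // readCounts: { referenceName: totalMappedReads, ... }
  return Object.entries(readCounts)
    .filter(([, n]) => n >= minReads)   // cutoff for near-empty references
    .sort((a, b) => b[1] - a[1])        // most reads first
    .slice(0, maxShown)                 // cap the number shown
    .map(([name]) => name);
}
```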
This would probably be a separate app from RAMPART as it would be run at sample receipt before the lab work begins. Would allow you to provide sample IDs, select intended protocols, assign barcodes, specify which RAMPART instance to use (i.e., what virus it is) etc.
Ultimately it should provide a customized protocol ready for lamination (you did bring the laminator didn't you @igoodfel?).
Implement a full page stats page with more detailed plots, tables, QC, and stats for any channel. This would be reached by a button on each channel panel (also same page is available for the whole run at the top). This could also have various 'action' buttons such as to assemble the consensus genome and push it to the analysis package.
Grey out switch or don't show switch? Or possibly just show solid block of colour with vertical gaps to show 0 coverage?
Need 25 (nice) colours for all the native barcodes + one for 'none'
If you click on a sample name or sample bar in the top panel then scroll to view that sample.
Currently, at line 77 in processServerData.js, state.sampleColours = createSampleColours(25);
creates an array of 25 colours on a spectrum. This allows for the 24 native barcodes + 1 extra (i.e., 'none'). This needs to be dynamic for the number of samples.
If the number of samples is greater than this, an array-out-of-bounds error is thrown in some of the components that use it. These should probably 'wrap' around to avoid the exception.
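A wrap-around lookup along these lines would avoid the exception; `getSampleColour` is a hypothetical helper, not the project's actual fix:

```javascript
// Wrap the sample index with modulo so a sample count larger than the
// colour array reuses colours instead of indexing out of bounds.
function getSampleColour(colours, sampleIdx) {
  return colours[sampleIdx % colours.length];
}
```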
Double click an icon on the desktop to start everything up with minimum fuss/command line gubbins.
At the moment the primer locations are in a JSON file. It could be possible to point towards a .bed file to get these instead.
The call to MiniMap2 in map_single_fastq.py returns the %similarity to the mapped reference. It would be good to be able to include this as a chart in each sample panel (either for the majority mapping or selecting a specific reference).
Channels are indexed from 1 in the CSV files, but when looking up the names in run_info.json they are indexed from 0, causing the names to be off by one and an 'undefined' for the last one.
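The off-by-one could be fixed with a lookup like this sketch (the fallback label is illustrative):

```javascript
// CSV rows use 1-based channel numbers; the names array from
// run_info.json is 0-based, so subtract 1 before indexing.
function channelName(names, csvChannel) {
  const name = names[csvChannel - 1];
  return name === undefined ? `channel ${csvChannel}` : name;
}
```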
Coverage plots should have option to switch to log y-axis.
This would involve mapping any reads that have appeared in the inbox since it was last run. Not entirely clear how to do this? Look at the timestamps of the actual reads?
The spline curves don't give the detail needed to see the individual amplicons. Use a stepped line chart. For the overview, perhaps a stacked chart to show the channels.
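One way to get sharp amplicon edges is to expand the binned coverage into explicit step points before drawing; the bin fields below are illustrative (with D3, a `curveStep` variant on the line generator achieves the same effect):

```javascript
// Expand (x0, x1, depth) coverage bins into the point pairs a stepped
// line needs, so each bin renders as a flat segment with sharp edges
// rather than being smoothed by a spline.
function stepPoints(bins) {
  // bins: [{x0, x1, depth}, ...] sorted by x0
  const pts = [];
  for (const { x0, x1, depth } of bins) {
    pts.push([x0, depth], [x1, depth]); // horizontal segment per bin
  }
  return pts;
}
```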
Add a button to each sample panel to bin the reads currently displayed into a file or folder (also some way of doing this as a batch across all samples). Appropriate labelling of file.
Read counts (and potentially coverage) axes are in log space. Perhaps log and linear can be toggled by clicking on the axis?
If you resize a page or zoom the browser using the cmd + or - options in Safari or Chrome, the components don't resize to fit together until a reload of the page.
A sample panel only appears if at least one read has demuxed and mapped. If you specify barcodeNames then you are expecting those samples so they should be shown even if empty.
The bottom row of the reference heatmap could be an unmapped count. It would be useful to see if your references are too divergent for a particular sample.
Reads over time loses history.
More of a stretch goal but this should go modular - Being able to plug in a module that does additional read annotation in the server process and also adds additional views/components to visualize this in the browser app.
This could include modules for metagenomics with Kraken or kraken-style classification.
I.e., add brushes on the length distribution chart, click reference matches etc. Doing it on the top panel will filter across all samples but can also filter in each individual sample panel.
Would make it possible to restart the server and show history. Would need to record timestamps in the rows of the CSV
Make the design responsive: adapt to smaller screen sizes and add touchscreen abilities. Target iPads and Mk1C screens initially?
The Reference Matches heat map currently shows percent of reads mapped per reference for each sample. It might be useful to show the absolute number of reads mapped per reference to compare sample to sample better. This could also be shown in log scale.
Time units for reads over time should switch to minutes and then hours as appropriate, to allow for comparison.
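A possible formatting helper, assuming elapsed time is tracked in seconds (the thresholds and labels are illustrative):

```javascript
// Pick a time unit for the reads-over-time axis based on elapsed time:
// seconds early in the run, then minutes, then hours.
function formatElapsed(seconds) {
  if (seconds < 60) return `${seconds}s`;
  if (seconds < 3600) return `${Math.floor(seconds / 60)}m`;
  return `${(seconds / 3600).toFixed(1)}h`;
}
```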
Roll over graphs to show actual numbers in popup box.
Currently the maximum number of references shown is hard coded at 10 (and minimum read fraction at 5%). These should be user configurable through a command line option and in the protocol config files.
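A sketch of reading the two thresholds from the command line; the flag names `--maxReferences` and `--minReadFraction` are hypothetical, and the defaults mirror the currently hard-coded values:

```javascript
// Parse the display thresholds from argv, falling back to the values
// that are currently hard coded (10 references, 5% minimum fraction).
function parseDisplayOptions(argv) {
  const opts = { maxReferences: 10, minReadFraction: 0.05 };
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === "--maxReferences") opts.maxReferences = parseInt(argv[++i], 10);
    if (argv[i] === "--minReadFraction") opts.minReadFraction = parseFloat(argv[++i]);
  }
  return opts;
}
```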
Currently the paths in the configuration json are specified as relative to the current working directory. They should be specified relative to the location of the configuration file.
If barcode names are specified on the command line, i.e.:
--barcodeNames BC01=sample1 BC02=sample2 ...
...then only those samples should be shown in RAMPART. At the moment, other barcodes occasionally come up, which just adds noise to the display. The simplest thing may be to pass these to porechop as the search set.
The order in the RAMPART display should probably be the order specified (currently it is the order they are first found).
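Parsing the pairs into an ordered list would preserve the command-line order for the display; this is a sketch and the field names are illustrative:

```javascript
// Turn ["BC01=sample1", "BC02=sample2", ...] into an ordered list of
// {barcode, name} pairs; array order matches the command-line order.
function parseBarcodeNames(args) {
  return args.map((pair) => {
    const [barcode, name] = pair.split("=");
    return { barcode, name };
  });
}
```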
Currently the heatmap shows the highest values as dark red and the lowest values as a pale yellow. This isn't the natural scale for a heatmap (hotter is more white). Particularly with the dark background, the brightest colour should represent the highest value. Suggest reverse the colour scale.
Generate reports as well formatted documents. Export figures as SVG. Export data as CSV.
Probably more attractive to have an area plot for this rather than the dots.
Set the x-axis range for all the channel read length plots to match.
As well as total reads per channel, we could have a plot of the rate of reads per unit time (also in the title of each channel panel).
Provide reference genome names in configuration JSON. Use index in CSV file to save space.
The annotation script should read the BED file (or similar) and use the MiniMap2 coordinates to infer the amplicon for each read. Some QC could also be done here - I.e. filter multi amplicon chimeras etc. The amplicon number would be added as a column in the annotation CSV file for use by the RAMPART UI.
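One way to infer the amplicon, assuming the BED intervals are parsed into {name, start, end} objects; the best-overlap rule here is an assumption for illustration, not the project's decided approach:

```javascript
// Assign a read to the amplicon its mapped coordinates overlap most.
// Returns null when the read overlaps no amplicon; a low best overlap
// relative to read length could flag multi-amplicon chimeras for QC.
function assignAmplicon(readStart, readEnd, amplicons) {
  let best = null;
  let bestOverlap = 0;
  for (const a of amplicons) {
    const overlap = Math.min(readEnd, a.end) - Math.max(readStart, a.start);
    if (overlap > bestOverlap) {
      bestOverlap = overlap;
      best = a.name;
    }
  }
  return best;
}
```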
Plan to switch from modified version of porechop to using off-the-shelf qcat (https://github.com/nanoporetech/qcat). Need to assess the best way of doing this (i.e., do we still put barcode calls into the read headers).
A new pop-out panel which allows you to add filters on the data being displayed. Filters will include band-pass lengths, reference sets.
Log scales might work better for coverage where there is high variability (would need to be a pseudo count).
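A sketch of the pseudocount transform, since log(0) is undefined at zero-coverage positions (the default pseudocount of 1 is illustrative):

```javascript
// Log-transform a coverage depth with a pseudocount so positions with
// zero coverage remain plottable on a log-scale axis.
function logDepth(depth, pseudo = 1) {
  return Math.log10(depth + pseudo);
}
```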