artic-network / rampart
Read Assignment, Mapping, and Phylogenetic Analysis in Real Time
License: GNU General Public License v3.0
Would make it possible to restart the server and show history. Would need to record timestamps in the rows of the CSV
Channels are indexed from 1 in the CSV files, but the names in run_info.json are looked up with an index that starts at 0. As a result the names are off by one and the last channel is shown as 'undefined'.
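A minimal sketch of the index correction, written in Python for brevity. It assumes the channel names from run_info.json are available as a plain list; `channel_name` is a hypothetical helper, not a function in the RAMPART code base.

```python
def channel_name(channel_names, csv_channel):
    """Return the name for a 1-based channel number taken from the CSV.

    channel_names is assumed to be the 0-based list of names read
    from run_info.json (hypothetical layout).
    """
    index = csv_channel - 1  # CSV channels start at 1, the list starts at 0
    if not 0 <= index < len(channel_names):
        raise IndexError(f"channel {csv_channel} has no entry in the name list")
    return channel_names[index]
```

Raising on an out-of-range channel makes the mismatch visible instead of silently displaying 'undefined'.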
Need 25 (nice) colours for all the native barcodes + one for 'none'
As well as total reads per channel, we could have a plot of the read rate per unit time (also shown in the title of each channel panel).
To allow for comparison
A new pop-out panel which allows you to add filters on the data being displayed. Filters will include band-pass lengths, reference sets.
Probably more attractive to have an area plot for this rather than the dots.
Provide reference genome names in configuration JSON. Use index in CSV file to save space.
Currently the maximum number of references shown is hard coded at 10 (and minimum read fraction at 5%). These should be user configurable through a command line option and in the protocol config files.
If you click on a sample name or sample bar in the top panel then scroll to view that sample.
This would involve mapping any reads that have appeared in the inbox since it was last run. Not entirely clear how to do this? Look at the timestamps of the actual reads?
Currently, at line 77 in processServerData.js, state.sampleColours = createSampleColours(25);
creates an array of 25 colours on a spectrum. This allows for the 24 native barcodes + 1 extra (i.e., 'none'). This needs to be dynamic for the number of samples.
If the number of samples is greater than this, an array out-of-bounds exception is thrown in some of the components that use it. The lookups should probably 'wrap' around to avoid the exception.
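Both ideas can be sketched together: size the palette from the sample count, and wrap the index on lookup. This is a Python illustration only (the real code is JavaScript). `make_sample_colours` and `sample_colour` are hypothetical names, and the saturation and brightness values are arbitrary assumptions.

```python
import colorsys

def make_sample_colours(n):
    """Return n hex colours evenly spaced around the hue wheel (n >= 1)."""
    colours = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.6, 0.9)
        colours.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colours

def sample_colour(colours, sample_index):
    """Look up a colour, wrapping around when there are more samples than colours."""
    return colours[sample_index % len(colours)]
```

With wrapping in place, a fixed palette of 25 would reuse colours instead of failing, and a dynamic palette avoids the reuse altogether.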
The Reference Matches heat map currently shows percent reads mapped per reference for each sample. It might be useful to show the absolute number of reads mapped per reference to compare sample to sample better. This could also be shown in log scale.
Generate reports as well formatted documents. Export figures as SVG. Export data as CSV.
More of a stretch goal, but this should go modular: it should be possible to plug in a module that does additional read annotation in the server process and also adds additional views/components to visualize this in the browser app.
This could include modules for metagenomics with Kraken or kraken-style classification.
This would probably be a separate app from RAMPART as it would be run at sample receipt before the lab work begins. Would allow you to provide sample IDs, select intended protocols, assign barcodes, specify which RAMPART instance to use (i.e., what virus it is) etc.
Ultimately it should provide a customized protocol ready for lamination (you did bring the laminator didn't you @igoodfel?).
Log scales might work better for coverage where there is high variability (would need to be a pseudo count).
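A minimal sketch of the pseudocount idea, assuming coverage arrives as a list of per-position depths. The default pseudocount of 1 is an assumption; `log_coverage` is a hypothetical helper.

```python
import math

def log_coverage(depths, pseudocount=1.0):
    """Return log10(depth + pseudocount) so zero-coverage positions stay plottable."""
    return [math.log10(depth + pseudocount) for depth in depths]
```

With a pseudocount of 1, a position with zero coverage maps to 0 on the log axis rather than being undefined.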
Double click an icon on the desktop to start everything up with minimum fuss/command line gubbins.
If barcode names are specified in the command line:
I.e., --barcodeNames BC01=sample1 BC02=sample2 ...
..then only those samples should be shown in RAMPART. At the moment, other barcodes occasionally come up which just add noise to the display. The simplest thing may be to simply pass these to porechop as the search set.
The order in the RAMPART display should probably be the order specified (currently it is the order in which they are first found).
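One way to handle both points, sketched in Python under stated assumptions: the `BC01=sample1` pairs are parsed into an insertion-ordered mapping, which then serves as both the display order and the filter. The read layout (a dict with a `barcode` key) and both function names are hypothetical.

```python
def parse_barcode_names(pairs):
    """Parse ['BC01=sample1', ...] into {barcode: sample}, keeping the given order."""
    names = {}
    for pair in pairs:
        barcode, sep, sample = pair.partition("=")
        if not sep or not barcode or not sample:
            raise ValueError(f"expected BARCODE=NAME, got {pair!r}")
        names[barcode] = sample
    return names

def keep_expected(reads, names):
    """Drop reads whose barcode was not named on the command line."""
    return [read for read in reads if read["barcode"] in names]
```

Because Python dicts preserve insertion order, iterating over the parsed mapping yields the samples in the order given on the command line.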
Roll over graphs to show actual numbers in popup box.
At the moment the primer locations are in a JSON file. It may be possible to point to a .bed file to get these instead.
The call to MiniMap2 in map_single_fastq.py returns the %similarity to the mapped reference. It would be good to be able to include this as a chart in each sample panel (either for the majority mapping or selecting a specific reference).
Make the design responsive: adapt to smaller screen sizes and add touchscreen support.
Target iPads and Mk1C screens initially?
We will often have many more reference genomes than there is space on the reference heatmap but only a few of them will have any significant number of reads mapping to them. This chart should dynamically adjust to only show top references. Perhaps specify a maximum number plus some cutoff for the minimum number of reads for it to be shown.
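A rough sketch of that selection rule, assuming per-reference read counts are held in a mapping. The defaults (10 references, at least 1 read) and the name `top_references` are assumptions for illustration.

```python
def top_references(read_counts, max_refs=10, min_reads=1):
    """Return the names of the most-hit references, highest count first.

    References with fewer than min_reads reads are dropped, then the
    list is truncated to max_refs entries.
    """
    kept = [(ref, n) for ref, n in read_counts.items() if n >= min_reads]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [ref for ref, _ in kept[:max_refs]]
```

Recomputing this list as reads arrive would let the heatmap rows change as new references cross the cutoff.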
Set the x-axis range for all the channel read length plots to match.
Reads over time loses history.
If you resize a page or zoom the browser using the cmd + or - options in Safari or Chrome, the components don't resize to fit together until a reload of the page.
Grey out switch or don't show switch? Or possibly just show solid block of colour with vertical gaps to show 0 coverage?
Add a button to each sample panel to bin the reads currently displayed into a file or folder (also some way of doing this as a batch across all samples). Appropriate labelling of file.
I.e., add brushes on the length distribution chart, click reference matches etc. Doing it on the top panel will filter across all samples but can also filter in each individual sample panel.
Plan to switch from modified version of porechop to using off-the-shelf qcat (https://github.com/nanoporetech/qcat). Need to assess the best way of doing this (i.e., do we still put barcode calls into the read headers).
Implement a full page stats page with more detailed plots, tables, QC, and stats for any channel. This would be reached by a button on each channel panel (also same page is available for the whole run at the top). This could also have various 'action' buttons such as to assemble the consensus genome and push it to the analysis package.
Coverage plots should have option to switch to log y-axis.
The bottom row of the reference heatmap could be an unmapped count. It would be useful to see if your references are too divergent for a particular sample.
The spline curves don't give the detail needed to see the individual amplicons. Use a stepped line chart. For the overview, perhaps a stacked chart to show the channels.
Time units for reads over time should switch to minutes then hours as appropriate.
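A small sketch of the unit switch, assuming the axis is driven by elapsed seconds. The thresholds (two minutes, two hours) are arbitrary assumptions, and `choose_time_unit` is a hypothetical helper.

```python
def choose_time_unit(max_seconds):
    """Return (divisor, label) for an axis whose largest value is max_seconds."""
    if max_seconds < 120:
        return 1, "seconds"
    if max_seconds < 7200:
        return 60, "minutes"
    return 3600, "hours"
```

Tick values would then be divided by the returned divisor before being drawn with the returned label.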
Currently the paths in the configuration json are specified as relative to the current working directory. They should be specified relative to the location of the configuration file.
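The intended resolution rule can be sketched as follows. This is an illustration in Python, not the server's actual code; `resolve_config_path` is a hypothetical helper, and leaving absolute paths untouched is an assumption.

```python
from pathlib import Path

def resolve_config_path(config_file, entry):
    """Resolve a path from the config relative to the config file's own directory.

    Absolute entries are returned unchanged.
    """
    entry = Path(entry)
    if entry.is_absolute():
        return entry
    return Path(config_file).parent / entry
```

This keeps a configuration file and its references portable as a folder, regardless of where the server is started from.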
The annotation script should read the BED file (or similar) and use the MiniMap2 coordinates to infer the amplicon for each read. Some QC could also be done here, e.g. filtering multi-amplicon chimeras. The amplicon number would be added as a column in the annotation CSV file for use by the RAMPART UI.
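A rough sketch of how this inference might work, under several assumptions: the BED file has one primer per row with the name in column 4, primer names contain the amplicon number immediately before `LEFT` or `RIGHT` (e.g. `scheme_3_LEFT`), and a read belongs to an amplicon when its mapped span fits inside that amplicon's interval with some slack. The function names and the slack of 20 bases are hypothetical.

```python
def parse_amplicons(bed_lines):
    """Group primer rows into {amplicon_number: (start, end)} intervals."""
    amplicons = {}
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        start, end, name = int(fields[1]), int(fields[2]), fields[3]
        parts = name.split("_")
        # The amplicon number is assumed to sit just before LEFT/RIGHT.
        side = next(i for i, p in enumerate(parts) if p in ("LEFT", "RIGHT"))
        number = int(parts[side - 1])
        lo, hi = amplicons.get(number, (start, end))
        amplicons[number] = (min(lo, start), max(hi, end))
    return amplicons

def assign_amplicon(amplicons, read_start, read_end, slack=20):
    """Return the first amplicon whose interval contains the read, else None.

    A read spanning more than one amplicon (a possible chimera) fits
    inside none of them and therefore returns None.
    """
    for number, (lo, hi) in sorted(amplicons.items()):
        if read_start >= lo - slack and read_end <= hi + slack:
            return number
    return None
```

Reads that return `None` could be counted separately as a simple QC metric before the amplicon column is written to the annotation CSV.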
A sample panel only appears if at least one read has demuxed and mapped. If you specify barcodeNames then you are expecting those samples so they should be shown even if empty.
Read counts (and potentially coverage) axes are in log space. Perhaps log and linear can be toggled by clicking on the axis?
Currently the heatmap shows the highest values as dark red and the lowest values as a pale yellow. This isn't the natural scale for a heatmap (hotter is more white). Particularly with the dark background, the brightest colour should represent the highest value. Suggest reversing the colour scale.