genspectrum / cov-spectrum-website
A web platform to detect and analyze variants of SARS-CoV-2
Home Page: https://cov-spectrum.org
License: GNU General Public License v3.0
We need a page describing the features and goals of the project.
In the explore area, we would like to show a plot presenting the sequencing intensity of the selected country.
The corresponding API is at https://github.com/cevo-public/cov-spectrum-docs/blob/develop/API.md#sequencing-intensity-through-time
A suggestion:
On hover, a tooltip should show the absolute number of cases, the absolute number of sequenced samples, and the relative number as percentage.
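A minimal sketch of how the three tooltip values could be derived from one data point. The `SequencingIntensityEntry` shape and its field names are assumptions for illustration; the actual API response may be structured differently.

```typescript
// Hypothetical shape of one data point; the real API fields may differ.
interface SequencingIntensityEntry {
  cases: number;
  sequenced: number;
}

// Returns the three tooltip lines: absolute cases, absolute sequenced
// samples, and the proportion of cases that were sequenced.
function formatIntensityTooltip(entry: SequencingIntensityEntry): string[] {
  const share =
    entry.cases > 0
      ? `${((entry.sequenced / entry.cases) * 100).toFixed(1)}%`
      : "n/a"; // avoid dividing by zero when no cases were reported
  return [
    `Cases: ${entry.cases}`,
    `Sequenced: ${entry.sequenced}`,
    `Proportion sequenced: ${share}`,
  ];
}
```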
A description of the mutation naming (e.g., ORF8:Q27*) should be added to the FAQ. It should probably contain a link to Nextstrain or Nextclade, explain clearly how the stop codon and deletions are coded and which genes exist.
We need an (at least a little) better README...
Hospitalization and death rate plots should be created for the private Switzerland area.
https://github.com/cevo-public/cov-spectrum-docs/blob/develop/API.md#sample-new provides all the needed information.
Hey. Since recharts generates SVG elements, would it be easy to export a plot as an SVG file? PNG is also good for now. Could we have it for all the plots?
The reason is again that I believe that it would be very helpful if the user can easily download the plots and use them in their papers and presentations.
Let's merge the pangolin lineage search with the mutations search!
Hi. I received a request from @tanja819 earlier:
Did we so far see any B117 which carry also a 484 mutation in Switzerland? It would be great to check for that in future regularly, as the 484 may induce immune escape
Do you think that we should be able to answer this type of request with Spectrum? If yes, how?
Right now, we tend to call a known variant if 80% of the lineage-defining mutations are present. One main reason for that is that sequencing is not perfect and we don't get full coverage so that we are not always sure about whether a mutation is present. Furthermore, missing a few mutations might not change the properties of a variant entirely. However, as in the case Tanja mentioned, there are certain mutations of special interest.
Should we implement an advanced search where the user can enter required and optional mutations?
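One way such a search could work, as a rough sketch: every required mutation must be present, and the optional ones must reach a minimum share (for example the 80% mentioned above). The `MutationQuery` type and `matchesQuery` function are hypothetical names, not existing code in this project.

```typescript
// Hypothetical query shape for an advanced search.
interface MutationQuery {
  required: string[];       // every one of these must be present
  optional: string[];       // only a share of these must be present
  minOptionalShare: number; // e.g. 0.8 for the 80% rule
}

function matchesQuery(sampleMutations: string[], query: MutationQuery): boolean {
  const has = (m: string): boolean => sampleMutations.indexOf(m) !== -1;
  if (!query.required.every(has)) {
    return false;
  }
  if (query.optional.length === 0) {
    return true;
  }
  const found = query.optional.filter(has).length;
  return found / query.optional.length >= query.minOptionalShare;
}
```

For the request above, a mutation at position 484 (for example S:E484K) would go into `required`, and the lineage-defining mutations of B.1.1.7 into `optional`.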
This will be very easy to do once #36 is merged, so I'll do it then.
As discussed in the meeting, pre-fill the "Search by mutations" fields when the selected variant changes. If the user edits the search fields, the "focus" panel (just like now) shouldn't update until they click "Search".
As suggested by @TKGZ, we would like to use the design of the time distribution plot for the age distribution plot as well.
Country and variant selection should change the URL, and vice versa, the user should be able to open a certain country and variant with a direct link.
A first proposal:
## Basic pages:
/login
/about
## The split explore/focus page:
/e/{country}/ -> No variant selected
/e/{country}/variant?name=B.1.1.7&matchPercentage=0.8 -> known variant
/e/{country}/variant?mutations=....&matchPercentage=0.8 -> not a known variant
## Deep focus:
/e/{country}/variant?mutations=....&matchPercentage=0.8/samples -> sample list
/e/{country}/variant?mutations=....&matchPercentage=0.8/demographics
...
The "e" in /e/ may be understood as an abbreviation for "explore."
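A small sketch of how the proposed paths could be generated from the current selection. The `VariantSelector` type and `buildExplorePath` function are hypothetical, and since the proposal leaves the format of the `mutations` parameter open, the comma-separated list used here is purely an assumption.

```typescript
// Hypothetical description of the selected variant.
type VariantSelector =
  | { kind: "known"; name: string; matchPercentage: number }
  | { kind: "mutations"; mutations: string[]; matchPercentage: number };

function buildExplorePath(country: string, variant?: VariantSelector): string {
  const base = `/e/${encodeURIComponent(country)}/`;
  if (variant === undefined) {
    return base; // no variant selected
  }
  const selector =
    variant.kind === "known"
      ? `name=${encodeURIComponent(variant.name)}`
      : // Assumption: mutations are joined with commas.
        `mutations=${encodeURIComponent(variant.mutations.join(","))}`;
  return `${base}variant?${selector}&matchPercentage=${variant.matchPercentage}`;
}
```

Parsing the same structure back into a selection would give the "vice versa" direction, so that a direct link opens the right country and variant.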
Hey @tehwalris and @TKGZ! Let's slowly (not urgent) start putting a color scheme together for the overall website and the plots.
Until now, I've been using the following colors for many variant plots (but not in this project): #0D4A70, #67916E, #1883C6, #99D9A4, and we could re-use them if you like.
Do you have experience in defining a color scheme for a website/plots? How many/which "types" of colors do we need?
With some kind of basic stats or plots to help decide which variants to select
We would like to have a map that shows the geographical distribution of samples of a selected variant. In this initial step, the map shall present the total number of found samples per zip code area. It should be a heat map, i.e., the more cases an area had, the darker its color should be.
The map should only be shown if (1) the user is logged in and (2) the selected country is Switzerland.
Corresponding API: https://github.com/cevo-public/cov-spectrum-docs/blob/develop/API.md#variant-time-zip-code-distribution
The amount of age information can differ a lot, and I think that the user should be able to see quickly how much information is available in order to judge whether the age plot is useful.
Maybe we could just add a sentence below the plot: "The age information is unknown for XX% of the samples." ?
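A possible way to produce that sentence, sketched under the assumption that each sample carries an age that is `null` when unknown. The function name and the input shape are illustrative only.

```typescript
// Assumption: one entry per sample, null meaning "age unknown".
function describeMissingAge(ages: Array<number | null>): string {
  if (ages.length === 0) {
    return "No samples available.";
  }
  const unknown = ages.filter((age) => age === null).length;
  const percent = Math.round((unknown / ages.length) * 100);
  return `The age information is unknown for ${percent}% of the samples.`;
}
```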
Instead of a country, the user should be able to select a region (or continent) or the whole world. The names of the regions can be retrieved from /resource/region.
I think that the URL structure should stay as it is, i.e., /explore/{country|region|"world"}/...
It depends on GenSpectrum/cov-spectrum-server#11 and #62
By default, Swiss data from three weeks ago should be loaded in the "Potential new variants" component.
Let's collect some ideas for the overall structure of the website.
I see two central workflows for the platform:
When seeing an interesting variant, the user could want to know "everything" about it, especially where it was found, how it spread through time, etc.
What would be a good structure to support these flows?
Since it is not merged to develop at the moment, we can't make any changes to it. Make sure that the component respects the global dataType ("Sampling strategy") setting once you merge.
Even though already public for a while, I still need to create a PR. What's missing is a description of the model and some code cleaning.
GitHub Actions shall build a Docker image upon each push and upload the image to the GitHub Container Registry.
The user should be able to switch the scale of the international comparison plot to logarithmic. Data points with y=0 should be omitted (since log(0) is undefined).
For B.1.1.7, it would look like this:
(Source: https://ibz-shiny.ethz.ch/covidDashboard/variant-plot/index.html)
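Omitting the y=0 points for the logarithmic scale could be done with a small filter before the data is handed to the chart. The `Point` type and the `pointsForScale` function are assumed names for this sketch.

```typescript
// Hypothetical data point of the international comparison plot.
interface Point {
  x: string; // e.g. a week label
  y: number; // e.g. the proportion of the variant
}

// On a log scale, points with y <= 0 cannot be drawn and are dropped;
// on a linear scale, all points are kept.
function pointsForScale(points: Point[], scale: "linear" | "log"): Point[] {
  return scale === "log" ? points.filter((p) => p.y > 0) : points;
}
```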
Address suggestions in #37.
/resource/sample2 now also returns the sex.
What do you think about a pie chart for a change?
Taking "N:P80R" as an example, do we know something about the N or even about N:80?
When the user clicks on "N", a short description (and maybe some references) about the N-gene should be shown in a tooltip. Later, we will integrate more detailed information about certain regions within a gene.
In the sample list view, when the user hovers over the GISAID ID...
...more details about the sample should be shown in a tooltip if the user is logged in.
Example content for the tooltip:
EPI_ISL_751193
Submitting lab: Department of Biosystems Science and Engineering, ETH Zürich
Country: Switzerland
Division: Basel-Stadt
Location: 4058 Basel
Date: 2020-12-18
---------------------------
Host: Human
Age: 30
Sex: Male
Corresponding API: https://github.com/cevo-public/cov-spectrum-docs/blob/develop/API.md#sample
A lot of the data come from https://www.gisaid.org/ and of course from our own dataset. The following acknowledgment should be added to the footer (including the links):
"Enabled by the data from the Swiss Viollier Sequencing Consortium and [the green/white GISAID Logo]"
Most of our plots have a time axis in weeks. I used tickvals so that there is a tick every week, which looks nice on most time scales. Without this, Plotly chooses to place ticks very sparsely and at pretty random intervals.
The problem is that a tick for every week is too much if the plot contains lots of weeks (the labels become vertical and can even overlap). In addition, every week is labeled with a year, which uses tons of space, but the year is only interesting at the boundary between two years, and maybe at the first and last data points of the whole plot.
Ideally we would:
We might be able to find some good existing code to do this. Maybe there's a function for spacing weekly ticks in D3 or a related library.
For all the plots we currently have, we can safely assume that the response contains the full dataset and that any weeks, ages, etc. that are not mentioned occurred zero times.
For example...
Here, we know that between the weeks 23 and 26, the variant was never sequenced. Further, making the assumption that we did perform sequencing through the whole time, we can also set the proportion to 0%.
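Filling in the unmentioned weeks with zero could look roughly like this. The `WeekCount` shape is an assumption, and the sketch deliberately uses plain week numbers within one year; weeks that cross a year boundary would need a proper ISO week type.

```typescript
// Hypothetical shape of one entry in the response.
interface WeekCount {
  week: number;
  count: number;
}

// Returns one entry for every week in [firstWeek, lastWeek],
// using count 0 for weeks that the response does not mention.
function fillMissingWeeks(
  entries: WeekCount[],
  firstWeek: number,
  lastWeek: number
): WeekCount[] {
  const byWeek: { [week: number]: number | undefined } = {};
  for (const entry of entries) {
    byWeek[entry.week] = entry.count;
  }
  const filled: WeekCount[] = [];
  for (let week = firstWeek; week <= lastWeek; week++) {
    const count = byWeek[week];
    filled.push({ week, count: count === undefined ? 0 : count });
  }
  return filled;
}
```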
On large screens we should have a maximum size for the explore panel, since it's useless when it fills half of a large screen. The explore panel should still fill exactly 50% of the width on small screens, like it does now.
Whatever @tehwalris is currently doing :) just to keep track of the ongoing work on the project board..
Every user (no login needed) should be able to bookmark a variant (especially an unknown variant). These should then be listed on the "variant list"-page.
The data should be stored in the user's browser's local storage.
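A minimal sketch of the bookmark storage. It is written against a small `KeyValueStore` interface that `window.localStorage` satisfies, so it can also be exercised outside a browser. The storage key, the `VariantBookmark` shape, and the function names are assumptions for illustration.

```typescript
// Subset of the Web Storage API; window.localStorage fits this interface.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key and bookmark shape.
const BOOKMARKS_KEY = "bookmarkedVariants";

interface VariantBookmark {
  name?: string; // set for known variants only
  mutations: string[];
}

function loadBookmarks(store: KeyValueStore): VariantBookmark[] {
  const raw = store.getItem(BOOKMARKS_KEY);
  if (raw === null) {
    return [];
  }
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // ignore corrupted data instead of breaking the page
  }
}

function addBookmark(store: KeyValueStore, bookmark: VariantBookmark): void {
  const all = loadBookmarks(store);
  all.push(bookmark);
  store.setItem(BOOKMARKS_KEY, JSON.stringify(all));
}
```

In the browser, `addBookmark(window.localStorage, { mutations: ["S:N501Y"] })` would persist the bookmark, and the variant list page could render the result of `loadBookmarks(window.localStorage)`.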
The country (and region) selection in the top bar deserves an improvement :)
See also #67
Next to the "Show samples" buttons, a new button "Show on Nextclade" should be added. On click, a new tab will be opened redirecting to Nextclade.
The following should happen in the app:
Request a temporary JWT (/internal/create-temporary-jwt).
Build the endpoint URL: GET /resource/sample-fasta?<params>&jwt=<temporary token>
Open https://clades.nextstrain.org/?input-fasta=<endpoint URL>. The endpoint URL has to be encoded (-> use encodeURIComponent()).
The button should be only available for logged-in users.
For Switzerland, the "Show on Nextclade" button can be made public since (hopefully often enough), there will be public sequence data available - see GenSpectrum/cov-spectrum-server#9.
When hovering over the button, a tooltip should explain that only samples from "BSSE, ETH Zurich" will be used.
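The URL construction for the "Show on Nextclade" button could be sketched as follows. The function name and its parameters are hypothetical; fetching the temporary token from /internal/create-temporary-jwt is left out, and `apiBase` stands for whatever base URL the server is reachable at.

```typescript
// Builds the Nextclade link from the server's base URL, the already
// encoded sample query parameters, and a temporary JWT.
function buildNextcladeUrl(
  apiBase: string,
  sampleQuery: string,
  temporaryJwt: string
): string {
  const endpoint =
    `${apiBase}/resource/sample-fasta?${sampleQuery}` +
    `&jwt=${encodeURIComponent(temporaryJwt)}`;
  // The whole endpoint URL is passed as a single parameter value,
  // so it has to be encoded as a whole.
  return `https://clades.nextstrain.org/?input-fasta=${encodeURIComponent(endpoint)}`;
}
```

The result would then be opened in a new tab, for example with `window.open(url, "_blank")`.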
Please address @tehwalris' comments in #47.
In the top bar, next to the country selection, another filter should be added. The user should be able to choose between "all samples" and "surveillance". The "surveillance"-option should be only available for Switzerland (and otherwise grayed out?). If surveillance is selected, the international comparison plot should be hidden.
The default (for Switzerland) should show all samples.
This issue depends on GenSpectrum/cov-spectrum-server#10. The API was updated: all appropriate endpoints now have an additional, optional query param called dataType. If nothing is provided, all samples will be used, and if SURVEILLANCE is passed, the server will only use selected samples.
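A short sketch of how the optional parameter could be appended to a request URL. The helper name is an assumption; only the parameter name `dataType` and the value `SURVEILLANCE` come from the description above.

```typescript
// Only SURVEILLANCE is described; omitting the parameter means "all samples".
type DataType = "SURVEILLANCE";

function withDataType(url: string, dataType?: DataType): string {
  if (dataType === undefined) {
    return url; // nothing provided: the server uses all samples
  }
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return `${url}${separator}dataType=${dataType}`;
}
```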
Hi @tehwalris, I found two bugs:
/explore/Switzerland/ is currently redirecting to /explore/Switzerland/variants which does not work. Could you change the target to /explore/Switzerland/AllSamples, please?
Currently all the charts in the "focus" panel are stacked vertically. That means you almost always have to scroll, and some charts are way too wide (e.g., age distribution). We could put some of these plots side by side on a bigger screen. When we do that, we should consider adding borders, backgrounds or shadows to keep the layout visually clear.
When no sample can be found, the focus page should show a clear note instead of a set of empty spaces.
Who has a cool idea for the 404 page? :)
A Github mark linking to this repository should be added - maybe to the footer?
A mutation (e.g., S:N501Y) has the following format: [protein]:[wildtype amino acid][position][variant amino acid]. When mentioned in a list, mutations should be sorted first by the protein, and then by the position.
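The sorting rule could be implemented along these lines. This is a sketch with assumed names; it treats `*` as a stop codon (as in ORF8:Q27*), assumes `-` for a deletion, and orders proteins by plain string comparison, which may not be the order the project ultimately wants (genome order would be another option).

```typescript
interface Mutation {
  protein: string;
  wildtype: string;
  position: number;
  variant: string;
}

// Parses "[protein]:[wildtype amino acid][position][variant amino acid]".
// Assumption: "*" marks a stop codon and "-" marks a deletion.
function parseMutation(text: string): Mutation {
  const match = /^([^:]+):([A-Z*])(\d+)([A-Z*-])$/.exec(text);
  if (match === null) {
    throw new Error(`Not a valid mutation: ${text}`);
  }
  return {
    protein: match[1],
    wildtype: match[2],
    position: parseInt(match[3], 10),
    variant: match[4],
  };
}

// Sorts first by protein, then by position; the input is not modified.
function sortMutations(mutations: string[]): string[] {
  return mutations
    .map(parseMutation)
    .sort((a, b) => {
      if (a.protein !== b.protein) {
        return a.protein < b.protein ? -1 : 1;
      }
      return a.position - b.position;
    })
    .map((m) => `${m.protein}:${m.wildtype}${m.position}${m.variant}`);
}
```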
Hi @TKGZ and @tehwalris. How do you feel about the Plotly library we currently are using? Do you like it? Would you like to use it as the main library for common charts or do you have another preference?
Anything else you would like to discuss about plotting?
(I don't have any preferences.)
Following @tehwalris' idea, a new, general endpoint to get sample information was created. It should be used to calculate all the variant-specific plots and render the /plot/variant/* endpoints obsolete.
Documentation: https://github.com/cevo-public/cov-spectrum-docs/blob/develop/API.md#sample-new
The platform will be used to perform a large variety of analyses that can consume very different amounts of time. I propose thinking in terms of the following categories:
Unordered comments:
@tehwalris and @TKGZ, what are your thoughts?