genspectrum / cov-spectrum-website

A web platform to detect and analyze variants of SARS-CoV-2

Home Page: https://cov-spectrum.org

License: GNU General Public License v3.0

Dockerfile 0.04% HTML 0.15% TypeScript 99.64% CSS 0.13% JavaScript 0.05%
cov-spectrum covid-19 epidemiology genomics research sars-cov-2

cov-spectrum-website's People

Contributors

anastasia-escher, chaoran-chen, corneliusroemer, dameyerdave, dependabot[bot], dr-david, dryak, fengelniederhammer, gautier-collab, jonaskellerer, mcarrara-bioinfo, ningxie1991, philipschulz, poolsar42, spacephoton, tehwalris, theosanderson, wszczawinski

cov-spectrum-website's Issues

About page

We need a page describing the features and goals of the project.

Describe format of mutations in FAQ

A description of the mutation naming (e.g., ORF8:Q27*) should be added to the FAQ. It should probably contain a link to Nextstrain or Nextclade, explain clearly how the stop codon and deletions are coded and which genes exist.

Export plot as PNG

Hey. Since recharts generates SVG elements, would it be easy to export a plot as an SVG file? PNG is also good for now. Could we have it for all the plots?

The reason is again that I believe that it would be very helpful if the user can easily download the plots and use them in their papers and presentations.

Improve tooltips in the international comparison plot

And another tooltip issue...

image

  • The x-axis has again a weekly scale, i.e., we would like to replace "Dec 14" with "Week XX, 2020 (from 14.12)"
  • Could we have only one tooltip per country? The text could be in this case: "0.55% [0.19%, 1.61%] | Switzerland"
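The single-line tooltip text suggested above could be produced by a small formatter. This is only a sketch: the function name and the assumption that the estimate and its interval arrive as proportions between 0 and 1 are hypothetical and not taken from the API.

```typescript
// Hypothetical input shape: a proportion with a confidence interval, all in [0, 1].
interface ProportionEstimate {
  country: string;
  value: number;
  ciLower: number;
  ciUpper: number;
}

// Formats a proportion as a percentage with two decimals, e.g. 0.0055 -> "0.55%".
function toPercent(proportion: number): string {
  return `${(proportion * 100).toFixed(2)}%`;
}

// Builds one tooltip line per country: "value [lower, upper] | country".
function formatCountryTooltip(e: ProportionEstimate): string {
  return `${toPercent(e.value)} [${toPercent(e.ciLower)}, ${toPercent(e.ciUpper)}] | ${e.country}`;
}
```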

Combined pangolin lineage and mutation search

Update (03.07.2021):

Let's merge the pangolin lineage search with the mutations search!

  • It should be possible to put in at most one pangolin lineage and an arbitrary number of amino acid mutations.
  • How to deal with "Match Percentage"?? Maybe let's leave it out in the merged search and keep the "Search by mutations" in the private version until we have found a better solution. Let's see if someone misses it.
  • The search bar should work similarly to the one in the international comparison plot (see screenshot below): the entries should be parsed in real-time. The user should get feedback when they enter an invalid value.

image


Original text:

Hi. I received a request from @tanja819 earlier:

Did we so far see any B117 which carry also a 484 mutation in Switzerland? It would be great to check for that in future regularly, as the 484 may induce immune escape

Do you think that we should be able to answer this type of request with Spectrum? If yes, how?

Right now, we tend to call a known variant if 80% of the lineage-defining mutations are present. One main reason for that is that sequencing is not perfect and we don't get full coverage so that we are not always sure about whether a mutation is present. Furthermore, missing a few mutations might not change the properties of a variant entirely. However, as in the case Tanja mentioned, there are certain mutations of special interest.

Should we implement an advanced search where the user can enter required and optional mutations?
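To make the idea concrete, here is a minimal sketch of how required and optional mutations could be combined with the 80% rule. All names (`VariantQuery`, `matchesQuery`, `minMatch`) are hypothetical, and the mutation lists in the usage note are illustrative only, not a lineage definition.

```typescript
// Hypothetical query shape for an "advanced search".
interface VariantQuery {
  required: string[]; // every one of these must be present in the sample
  optional: string[]; // at least `minMatch` of these must be present
  minMatch: number;   // e.g. 0.8 for the 80% rule described above
}

// Fraction of the given mutations that occur in the sample.
function matchFraction(mutations: string[], sample: Set<string>): number {
  if (mutations.length === 0) return 0;
  const present = mutations.filter((m) => sample.has(m)).length;
  return present / mutations.length;
}

// A sample matches if it carries all required mutations and
// a sufficient share of the optional ones.
function matchesQuery(sample: Set<string>, query: VariantQuery): boolean {
  if (!query.required.every((m) => sample.has(m))) return false;
  if (query.optional.length === 0) return true;
  return matchFraction(query.optional, sample) >= query.minMatch;
}
```

With this shape, the request above would translate to something like `required: ["S:E484K"]` plus the lineage-defining mutations as `optional` with `minMatch: 0.8`.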

Synchronize search with selection

As discussed in the meeting, pre-fill the "Search by mutations" search fields when the selected variant changes. If the user edits the search fields, the "focus" panel (just like now) shouldn't update until they click "Search".

Routing system / bookmarkable URLs for the variant page

Country and variant selection should change the URL, and vice versa, the user should be able to open a certain country and variant with a direct link.

A first proposal:

## Basic pages:
/login
/about

## The split explore/focus page:
/e/{country}/ -> No variant selected
/e/{country}/variant?name=B.1.1.7&matchPercentage=0.8 -> known variant
/e/{country}/variant?mutations=....&matchPercentage=0.8 -> not a known variant

## Deep focus:
/e/{country}/variant?mutations=....&matchPercentage=0.8/samples -> sample list
/e/{country}/variant?mutations=....&matchPercentage=0.8/demographics
...

The "e" in /e/ may be understood as an abbreviation for "explore."
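As a rough illustration of the proposal, a URL builder for the explore/focus routes might look like the following. The function and type names are hypothetical, and joining the mutations with commas is an assumption, since the proposal leaves the format of the `mutations` parameter open.

```typescript
// Hypothetical description of the current variant selection.
interface VariantSelection {
  name?: string;        // known variant, e.g. "B.1.1.7"
  mutations?: string[]; // used when the variant is not a known one
  matchPercentage: number;
}

// Builds the bookmarkable path for a country and an optional variant.
function buildExploreUrl(country: string, variant?: VariantSelection): string {
  const base = `/e/${encodeURIComponent(country)}/`;
  if (!variant) return base; // no variant selected
  const params: string[] = [];
  if (variant.name) {
    params.push(`name=${encodeURIComponent(variant.name)}`);
  } else if (variant.mutations) {
    // Assumption: mutations are serialized as a comma-separated list.
    params.push(`mutations=${encodeURIComponent(variant.mutations.join(","))}`);
  }
  params.push(`matchPercentage=${variant.matchPercentage}`);
  return `${base}variant?${params.join("&")}`;
}
```

The reverse direction (reading the selection back from the URL) would be handled by whatever router the app uses.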

Color scheme

Hey @tehwalris and @TKGZ! Let's slowly (not urgent) start putting a color scheme together for the overall website and the plots.

Until now, I've been using the following colors for many variant plots (but not in this project): #0D4A70, #67916E, #1883C6, #99D9A4, and we could re-use them if you like.

Do you have experience in defining a color scheme for a website/plots? How many/which "types" of colors do we need?

Switzerland Postal Code Map

We would like to have a map that shows the geographical distribution of samples of a selected variant. In this initial step, the map shall present the total number of samples found per zip code area. It should be a heat map, i.e., the more cases an area had, the darker its color should be.

The map should only be shown if (1) the user is logged in and (2) the selected country is Switzerland.

Corresponding API: https://github.com/cevo-public/cov-spectrum-docs/blob/develop/API.md#variant-time-zip-code-distribution

Show percentage of unknowns in age plot

The amount of age information can differ a lot, and I think that the user should be able to see quickly how much information is available in order to judge whether the age plot is useful.

Maybe we could just add a sentence below the plot: "The age information is unknown for XX% of the samples." ?
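A minimal sketch of how that sentence could be computed, assuming (hypothetically) that the age distribution arrives as a list of counts in which an unknown age is represented by `ageGroup: null`:

```typescript
// Hypothetical shape of one entry of the age distribution.
interface AgeGroupCount {
  ageGroup: string | null; // null = age unknown (assumption)
  count: number;
}

// Percentage (0-100) of samples without age information.
function unknownAgePercentage(entries: AgeGroupCount[]): number {
  let total = 0;
  let unknown = 0;
  for (const entry of entries) {
    total += entry.count;
    if (entry.ageGroup === null) unknown += entry.count;
  }
  return total === 0 ? 0 : (unknown / total) * 100;
}

// Sentence to show below the age plot.
function unknownAgeSentence(entries: AgeGroupCount[]): string {
  const pct = Math.round(unknownAgePercentage(entries));
  return `The age information is unknown for ${pct}% of the samples.`;
}
```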

Global/continent view

Instead of a country, the user should be able to select a region (or continent) or the whole world. The names of the regions can be retrieved from /resource/region.

I think that the URL structure should stay as it is, i.e., /explore/{country|region|"world"}/....

It depends on GenSpectrum/cov-spectrum-server#11 and #62

Website structure

Let's collect some ideas for the overall structure of the website.

I see two central workflows for the platform:

  • The user wants to know what's going on in a country. Then, she might want to start the journey at a dashboard that shows the table with emerging variants (our current "Find new variants" tab) but also other statistics such as the number of sequenced samples through time and their geographic distribution. A plot similar to https://covariants.org/per-country would be super cool but it might be difficult to build since variants (we define them simply as a set of mutations) are not distinct, i.e., a sample can be assigned to a large number of variants.
  • The user wants to know what's going on with a variant: both for a particular country and globally (maybe with an emphasis on the neighboring countries).

When seeing an interesting variant, the user could want to know "everything" about it, especially where it was found, how it spread through time, etc.

What would be a good structure to support these flows?

Properly merge feature/model-chen2021Fitness

Since it is not merged to develop at the moment, we can't make any changes to it. Make sure that the component respects the global dataType ("Sampling strategy") setting once you merge.

Fitness advantage estimation model

Even though the model has already been public for a while, I still need to create a PR. What's missing is a description of the model and some code cleaning.

Proper error handling

  • Treat non-200 status from the server as an error
  • Show the user something useful when stuff fails to load
  • Add error boundaries
    • Currently our application will fully crash if the server replies with invalid data (eg. because it's down)
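The first point could be covered by a small wrapper around the request function. This is a sketch under assumptions: `fetchJson`, `MinimalResponse`, and `Fetcher` are hypothetical names, and the fetcher is passed in as a parameter so the example does not depend on a particular environment. Error boundaries are a separate, React-level concern and are not shown here.

```typescript
// Minimal view of a response; the browser's Response object has this shape.
interface MinimalResponse {
  status: number;
  json(): Promise<unknown>;
}
type Fetcher = (url: string) => Promise<MinimalResponse>;

// Loads JSON and turns every failure mode into a thrown Error
// that the UI can catch and display.
async function fetchJson<T>(url: string, fetcher: Fetcher): Promise<T> {
  let response: MinimalResponse;
  try {
    response = await fetcher(url);
  } catch (err) {
    throw new Error(`Network error while loading ${url}: ${String(err)}`);
  }
  if (response.status !== 200) {
    throw new Error(`Server replied with status ${response.status} for ${url}`);
  }
  try {
    return (await response.json()) as T;
  } catch (err) {
    throw new Error(`Invalid data from ${url}: ${String(err)}`);
  }
}
```

In the browser, passing the global `fetch` as the `fetcher` argument should satisfy this shape.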

Sex plot

/resource/sample2 now also returns the sex.

What do you think about a pie chart for a change?

Show general information about genes

Taking "N:P80R" as an example, do we know something about the N or even about N:80?

When the user clicks on "N", a short description (and maybe some references) about the N-gene should be shown in a tooltip. Later, we will integrate more detailed information about certain regions within a gene.

Sample details tooltip

In the sample list view, when the user hovers over the GISAID ID...

image

...more details about the sample should be shown in a tooltip if the user is logged in.

Example content for the tooltip:

EPI_ISL_751193

Submitting lab: Department of Biosystems Science and Engineering, ETH Zürich
Country: Switzerland
Division: Basel-Stadt
Location: 4058 Basel
Date: 2020-12-18
---------------------------
Host: Human
Age: 30
Sex: Male

Corresponding API: https://github.com/cevo-public/cov-spectrum-docs/blob/develop/API.md#sample

Smart ticks for week axis

Most of our plots have a time axis in weeks. I used tickvals so that there is a tick every week, which looks nice on most time scales. Without this Plotly chooses to place ticks very sparsely and at pretty random intervals.

The problem is that a tick for every week is too much if the plot contains lots of weeks (the labels become vertical and can even overlap). In addition, every week is labeled with a year, which uses tons of space, but the year is only interesting at the boundary between two years, and maybe for the first and last data points of the whole plot.

Ideally we would:

  • Show week ticks at uniform intervals
  • Show one tick every week as long as that fits well
  • Remove ticks "smartly" if we don't have space (keep first week, last week, and weeks near year changes)
  • Hide the year on most ticks (show on first week, last week, and near year changes)

We might be able to find some good existing code to do this. Maybe there's a function for spacing weekly ticks in D3 or a related library.
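Until a library solution is found, the rules above could be approximated by hand. The following is one possible sketch, not a Plotly or D3 feature: `weekTicks` and `targetTicks` are hypothetical names, and `targetTicks` is only a soft target because the first week, the last week, and year changes are always kept.

```typescript
interface Week {
  year: number;
  week: number;
}
interface Tick {
  index: number; // position of the week in the input array
  label: string;
}

// Keeps every `step`-th week plus the first week, the last week and
// the first week after a year change; only those three show the year.
function weekTicks(weeks: Week[], targetTicks: number): Tick[] {
  if (weeks.length === 0) return [];
  const step = Math.max(1, Math.ceil(weeks.length / targetTicks));
  const last = weeks.length - 1;
  const ticks: Tick[] = [];
  weeks.forEach((w, i) => {
    const yearChanged = i > 0 && weeks[i - 1].year !== w.year;
    const isBoundary = i === 0 || i === last || yearChanged;
    if (!isBoundary && i % step !== 0) return;
    ticks.push({ index: i, label: isBoundary ? `W${w.week} ${w.year}` : `W${w.week}` });
  });
  return ticks;
}
```

The resulting indices and labels could then be passed to the plotting library's explicit tick options (for Plotly, `tickvals` and `ticktext`).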

Nothing = Zero

For all the plots we currently have, we can safely assume that the response contains the full dataset and that weeks, ages, etc. that are not mentioned occurred zero times.

For example...

image

Here, we know that between the weeks 23 and 26, the variant was never sequenced. Further, making the assumption that we did perform sequencing through the whole time, we can also set the proportion to 0%.
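A minimal sketch of this zero-filling for a weekly series, assuming (for simplicity) that all weeks fall within one year and that each entry has the hypothetical shape `{ week, count }`:

```typescript
interface WeekCount {
  week: number;
  count: number;
}

// Returns one entry per week in [firstWeek, lastWeek];
// weeks missing from the response get a count of 0.
function fillMissingWeeks(data: WeekCount[], firstWeek: number, lastWeek: number): WeekCount[] {
  const counts: { [week: number]: number } = {};
  for (const d of data) counts[d.week] = d.count;
  const filled: WeekCount[] = [];
  for (let week = firstWeek; week <= lastWeek; week++) {
    filled.push({ week, count: counts[week] ?? 0 });
  }
  return filled;
}
```

The proportion for a filled week then follows as 0% wherever sequencing took place but the variant was not found.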

Improve tooltips in the time and age distribution plots

This is how it currently looks:

image

A suggestion for improvement:

  • The date tooltip could print: "Week 1, 2021 (04.01)"
  • Instead of "72 | trace 0": "Number of sequences: 72"
  • Instead of "11.46497 | trace 1": "Proportion: 11.46%"

This would also make it much clearer what the lines and bars mean.
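The three suggested texts could come from small formatting helpers like the ones below. This is a sketch that assumes ISO 8601 week numbering and dates handled in UTC; the function names are hypothetical.

```typescript
// ISO 8601 week number and week-based year of a date (UTC).
function isoWeek(date: Date): { year: number; week: number } {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const day = d.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  d.setUTCDate(d.getUTCDate() + 4 - day); // Thursday decides the week's year
  const yearStart = Date.UTC(d.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86400000 + 1) / 7);
  return { year: d.getUTCFullYear(), week };
}

function pad2(n: number): string {
  return ("0" + n).slice(-2);
}

// e.g. "Week 1, 2021 (04.01)" for the week starting on 4 January 2021.
function formatWeekLabel(weekStart: Date): string {
  const { year, week } = isoWeek(weekStart);
  return `Week ${week}, ${year} (${pad2(weekStart.getUTCDate())}.${pad2(weekStart.getUTCMonth() + 1)})`;
}

function formatCount(count: number): string {
  return `Number of sequences: ${count}`;
}

// Expects a value that is already in percent, as in the current tooltip.
function formatProportion(percent: number): string {
  return `Proportion: ${percent.toFixed(2)}%`;
}
```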

Limit size of explore panel

On large screens we should have a maximum size for the explore panel, since it's useless when it fills half of a large screen. The explore panel should still fill exactly 50% of the width on small screens, like it does now.

Bookmark a variant

Every user (no login needed) should be able to bookmark a variant (especially an unknown variant). These should then be listed on the "variant list"-page.

The data should be stored in the user's browser's local storage.
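One way this could be sketched: bookmarks are kept as a JSON array under a single storage key. The key name, the `BookmarkedVariant` shape, and the function names are hypothetical. The store is passed in as a parameter so the logic can be tried outside a browser.

```typescript
// Subset of the Web Storage API that the sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical bookmark shape.
interface BookmarkedVariant {
  name?: string;
  mutations: string[];
}

const BOOKMARKS_KEY = "bookmarkedVariants"; // hypothetical key name

function loadBookmarks(store: KeyValueStore): BookmarkedVariant[] {
  const raw = store.getItem(BOOKMARKS_KEY);
  if (!raw) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // corrupted entry: start with an empty list
  }
}

// Identity of a variant: its name plus its sorted mutation list.
function bookmarkKey(v: BookmarkedVariant): string {
  return `${v.name ?? ""}|${v.mutations.slice().sort().join(",")}`;
}

// Adds the variant unless an equivalent bookmark already exists.
function addBookmark(store: KeyValueStore, variant: BookmarkedVariant): void {
  const bookmarks = loadBookmarks(store);
  if (bookmarks.some((b) => bookmarkKey(b) === bookmarkKey(variant))) return;
  bookmarks.push(variant);
  store.setItem(BOOKMARKS_KEY, JSON.stringify(bookmarks));
}
```

In the app, `window.localStorage` would be passed as the `store` argument.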

Nextclade integration

Next to the "Show samples" buttons, a new button "Show on Nextclade" should be added. On click, a new tab will be opened redirecting to Nextclade.

The following should happen in the app:

  1. Create a temporary JWT token (with /internal/create-temporary-jwt)
  2. The sequences in fasta format can be fetched with GET /resource/sample-fasta?<params>&jwt=<temporary token>
  3. Open the following link in a new tab https://clades.nextstrain.org/?input-fasta=<endpoint URL>. The endpoint URL has to be encoded (-> use encodeURIComponent()).

The button should be only available for logged-in users.
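Steps 2 and 3 boil down to building one URL and encoding it into another. A minimal sketch follows; `buildNextcladeUrl`, `apiBase`, and `sampleParams` are hypothetical names, and obtaining the temporary token (step 1) is left out because the response format of that endpoint is not described here.

```typescript
// Builds the Nextclade link for a set of samples.
// `sampleParams` is the already-serialized query string that selects the samples.
function buildNextcladeUrl(apiBase: string, sampleParams: string, temporaryJwt: string): string {
  const fastaEndpoint =
    `${apiBase}/resource/sample-fasta?${sampleParams}&jwt=${encodeURIComponent(temporaryJwt)}`;
  // The whole endpoint URL must be encoded, otherwise its "&" and "?"
  // would be interpreted as part of the Nextclade URL itself.
  return `https://clades.nextstrain.org/?input-fasta=${encodeURIComponent(fastaEndpoint)}`;
}
```

The result could then be opened with `window.open(url, "_blank")` from the button's click handler.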

Random samples-only filter

In the top bar, next to the country selection, another filter should be added. The user should be able to choose between "all samples" and "surveillance". The "surveillance"-option should be only available for Switzerland (and otherwise grayed out?). If surveillance is selected, the international comparison plot should be hidden.

The default (for Switzerland) should show all samples.

This issue depends on GenSpectrum/cov-spectrum-server#10. The API was updated: all appropriate endpoints now have an additional, optional query param called dataType. If nothing is provided, all samples will be used, and if SURVEILLANCE is passed, the server will only use selected samples.
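On the client side, the optional parameter could be appended by a small helper. This is a sketch: the `SamplingStrategy` type and `withDataType` are hypothetical names, while the `dataType` parameter and its `SURVEILLANCE` value are as described above.

```typescript
type SamplingStrategy = "AllSamples" | "Surveillance";

// Adds dataType=SURVEILLANCE when surveillance is selected;
// for all samples the parameter is simply omitted.
function withDataType(url: string, strategy: SamplingStrategy): string {
  if (strategy === "AllSamples") return url;
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return `${url}${separator}dataType=SURVEILLANCE`;
}
```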

Bugs related to the sampling strategy selection

Hi @tehwalris, I found two bugs:

  • /explore/Switzerland/ is currently redirecting to /explore/Switzerland/variants which does not work. Could you change the target to /explore/Switzerland/AllSamples, please?
  • When we select "Surveillance" for Switzerland and then switch the country, the surveillance selection stays and cannot be changed. I suggest switching to AllSamples automatically for countries that do not support the current selection. Screenshot:

image
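The suggested fallback could be expressed as a small pure function that runs whenever the country changes. This is a sketch: the names are hypothetical, and the rule that only Switzerland supports surveillance is hard-coded here for illustration.

```typescript
type SamplingStrategy = "AllSamples" | "Surveillance";

// Countries for which the "Surveillance" option is available (assumption: only Switzerland).
const SURVEILLANCE_COUNTRIES = ["Switzerland"];

// Returns the strategy to use after a country switch:
// an unsupported selection falls back to AllSamples.
function resolveSamplingStrategy(country: string, selected: SamplingStrategy): SamplingStrategy {
  if (selected === "Surveillance" && SURVEILLANCE_COUNTRIES.indexOf(country) === -1) {
    return "AllSamples";
  }
  return selected;
}
```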

Compact layout in focus panel

Currently all the charts in the "focus" panel are stacked vertically. That means you almost always have to scroll, and some charts are way too wide (eg. age distribution). We could put some of these plots side by side on a bigger screen. When we do that, we should consider adding borders, backgrounds or shadows to keep the layout visually clear.

404 page

Who has a cool idea for the 404 page? :)

Add GitHub mark

A GitHub mark linking to this repository should be added - maybe to the footer?

Better sorting of mutation lists

A mutation (e.g., S:N501Y) has the following format: [protein]:[wildtype amino acid][position][variant amino acid]. When mentioned in a list, mutations should be sorted first by the protein, and then by the position.
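A minimal sketch of such a sort, based on the format above. The regular expression is an assumption about which characters can occur: `*` is allowed for stop codons (as in ORF8:Q27*), and `-` is assumed to denote a deletion. Entries that do not parse are moved to the end.

```typescript
interface Mutation {
  protein: string;
  wildtype: string;
  position: number;
  variant: string;
}

// Parses "[protein]:[wildtype][position][variant]", e.g. "S:N501Y".
function parseMutation(text: string): Mutation | null {
  const match = /^([A-Za-z0-9]+):([A-Z*])(\d+)([A-Z*-])$/.exec(text);
  if (!match) return null;
  return {
    protein: match[1],
    wildtype: match[2],
    position: parseInt(match[3], 10),
    variant: match[4],
  };
}

// Sorts by protein first, then by position; returns a new array.
function sortMutations(mutations: string[]): string[] {
  return mutations.slice().sort((a, b) => {
    const ma = parseMutation(a);
    const mb = parseMutation(b);
    if (!ma || !mb) return (ma ? 0 : 1) - (mb ? 0 : 1); // unparsable entries last
    if (ma.protein !== mb.protein) return ma.protein < mb.protein ? -1 : 1;
    return ma.position - mb.position;
  });
}
```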

Chart library

Hi @TKGZ and @tehwalris. How do you feel about the Plotly library we currently are using? Do you like it? Would you like to use it as the main library for common charts or do you have another preference?

Anything else you would like to discuss about plotting?

(I don't have any preferences.)

Response times

The platform will be used to perform a large variety of analyses that can consume very different amounts of time. I propose thinking in terms of the following categories:

  • (A) <1s
  • (B) 1s to 10s
  • (C) 10s to 3 min
  • (D) 3 min to 15 min
  • (E) 15 min to multiple days
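For logging or for choosing a loading indicator, the categories could be encoded as a simple lookup. This is only a sketch with a hypothetical function name; the boundaries are taken from the list above, with each lower bound counted as inclusive.

```typescript
type ResponseCategory = "A" | "B" | "C" | "D" | "E";

// Maps an expected or measured duration in seconds to a category.
function responseCategory(seconds: number): ResponseCategory {
  if (seconds < 1) return "A";       // < 1 s
  if (seconds < 10) return "B";      // 1 s to 10 s
  if (seconds < 3 * 60) return "C";  // 10 s to 3 min
  if (seconds < 15 * 60) return "D"; // 3 min to 15 min
  return "E";                        // 15 min to multiple days
}
```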

Unordered comments:

  • Most of the requests in (A) will hopefully only need <150ms and include simple lookups and loading of pre-computed/cached information. Given our fragile setting with both the backend server and the database running in virtual and shared environments and the database also used by other services, however, I am not sure if we can give better guarantees.
  • All (or only most?) of the country-level information - i.e. what will be shown in the explore area - can be pre-computed and fall into (A).
  • On the variant-level, pre-computations are only possible to a limited extent. We can prepare the results for the known variants and those on the top of our "potential interesting variants" list and do some caching. Everything that is pre-computed will be in (A) - but obviously, we can't have everything ready for every variant.
  • We should keep the main plots in the focus area in (B). Getting them into (A) is probably possible but might take too much effort.
  • For results in (C), the user might just have to wait... We should show a good loading visualization stating that it will not take more than 3 minutes (maybe with a counter).
  • Not sure how I feel about (D). Nextclade often falls into (D) and people wait... But since they require significant server power, we have to be careful.
  • (E) covers requests that the user can't wait for. Not right now - but in a month or two - we might want to incorporate analyses that a user can trigger in the web interface and that start a job on the cluster. Not every user should be able to do it directly. We have to limit it to logged-in users, and maybe non-logged-in users may send a request that an admin can approve. Once the results come back, they can be presented on the website and the user should get a notification.

@tehwalris and @TKGZ, what are your thoughts?
