dw-data / covid19-vaccine-tracker

Researchers around the world are trying to develop safe and effective vaccines against SARS-CoV-2, the virus that causes COVID-19. Here's their progress so far.

Home Page: https://p.dw.com/p/3lUln

COVID-19 vaccine tracker

Idea: Eva Lopez

Research, data analysis and visualization, writing: Gianna-Carina Grün

Editing: Fabian Schmidt, Zulfikar Abbany

You can read the story in English, Deutsch, Español, Português do Brasil, Indonesia

Data sources

This analysis is based on two data sources. They are structured differently and are therefore used to answer different questions.

In the dataset provided by the WHO, each row corresponds to one developer and their vaccine candidate. In this analysis, this dataset is used to

  • show how many vaccine candidates are in which phase
  • classify the vaccine candidates according to their type, and subsequently also phase
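
As a rough illustration of these two counts, here is a minimal standard-library sketch. It assumes the WHO data has already been loaded as one dictionary per candidate, using the column names of the cleaned dataset described under "Data processing". The sample rows, developer names and trial numbers are hypothetical placeholders, and the notebooks in this repository may do this differently.

```python
from collections import Counter

PHASE_COLUMNS = ["Preclinical", "Phase 1", "Phase 1/2", "Phase 2", "Phase 3"]

# Hypothetical sample rows: developer names and trial numbers are placeholders.
candidates = [
    {"COVID-19 Vaccine developer/manufacturer": "Developer A",
     "Vaccine platform": "RNA",
     "Phase 1/2": "TRIAL-001", "Phase 3": "TRIAL-002"},
    {"COVID-19 Vaccine developer/manufacturer": "Developer B",
     "Vaccine platform": "Protein subunit",
     "Phase 1": "TRIAL-003"},
    {"COVID-19 Vaccine developer/manufacturer": "Developer C",
     "Vaccine platform": "RNA",
     "Preclinical": "preclinical"},
    {"COVID-19 Vaccine developer/manufacturer": "Developer D",
     "Vaccine platform": "Inactivated",
     "Phase 3": "TRIAL-004"},
]


def count_by_phase(rows):
    """Count candidates per phase column; a non-empty cell means 'in this phase'."""
    counts = Counter()
    for row in rows:
        for phase in PHASE_COLUMNS:
            if row.get(phase):
                counts[phase] += 1
    return counts


def count_by_platform_and_phase(rows):
    """Count candidates per (vaccine platform, phase) pair."""
    counts = Counter()
    for row in rows:
        for phase in PHASE_COLUMNS:
            if row.get(phase):
                counts[(row["Vaccine platform"], phase)] += 1
    return counts


phase_counts = count_by_phase(candidates)
platform_phase_counts = count_by_platform_and_phase(candidates)
```

A candidate with trials in several phases is counted once in each of them, so the per-phase counts can add up to more than the number of rows.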

In the "Clinical trials database" compiled by the London School of Hygiene and Tropical Medicine (LSHTM), each row corresponds to a registered clinical trial -- accordingly, a developer team testing its vaccine in multiple phases in parallel will have multiple entries in this dataset. In this analysis, this dataset is used to

  • show trial sizes by phase by company
  • provide estimates (min, max, median) of the duration of the clinical phases
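
The first of these uses could look roughly like the sketch below. The field names `developer`, `phase` and `participants` are assumptions made for illustration, not the actual LSHTM column names, and the rows are invented. Summing participants across a developer's trials in the same phase is also an assumption about how "trial size" is aggregated.

```python
from collections import defaultdict

# Hypothetical trial rows: one dictionary per registered clinical trial.
trials = [
    {"developer": "Developer A", "phase": "Phase I", "participants": 120},
    {"developer": "Developer A", "phase": "Phase III", "participants": 30000},
    {"developer": "Developer A", "phase": "Phase III", "participants": 6000},
    {"developer": "Developer B", "phase": "Phase II", "participants": 600},
]


def trial_size_by_developer_and_phase(rows):
    """Sum planned participants per (developer, phase) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["developer"], row["phase"])] += row["participants"]
    return dict(totals)


sizes = trial_size_by_developer_and_phase(trials)
```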

Comparisons with other vaccine trackers

Other organizations have also published vaccine development trackers (NYTimes, Guardian, Milken Institute, Zeit Online).

You will see that the numbers differ, depending on

  • the sources used,
  • whether trials or vaccine candidates are counted, and
  • how "in approval" and "approved" are distinguished; we chose not to show the approval process, but to mark a vaccine as "approved" only once the major authorities have approved it

Definitions

Assignment of clinical phases

There are five phases of clinical trials: Phase I, Phase I/II, Phase II, Phase II/III, Phase III.

If a vaccine is in a dual phase such as Phase I/II or Phase II/III, it is tested in both phases simultaneously, but for the analysis it is assigned to the higher phase: a vaccine candidate in Phase I/II is assigned to Phase II, whereas a vaccine candidate in Phase II/III is assigned to Phase III.

Beyond these dual phases, vaccine candidates can be in different clinical trial phases at the same time (e.g. Phase I and Phase III) with different trial parameters (age, pre-existing conditions). If that is the case for a candidate, it is shown in both phases.
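
The assignment rule described above can be sketched as a small lookup. The function names are hypothetical and only illustrate the rule; they are not taken from the notebooks.

```python
# Dual phases are assigned to the higher of their two phases.
DUAL_PHASE_TO_ANALYSIS_PHASE = {
    "Phase I/II": "Phase II",
    "Phase II/III": "Phase III",
}

PHASE_ORDER = ["Phase I", "Phase II", "Phase III"]


def analysis_phase(phase):
    """Map a trial phase label to the phase used in the analysis."""
    return DUAL_PHASE_TO_ANALYSIS_PHASE.get(phase, phase)


def phases_shown(trial_phases):
    """Return every analysis phase in which a candidate is shown, in phase order."""
    assigned = {analysis_phase(p) for p in trial_phases}
    return [p for p in PHASE_ORDER if p in assigned]
```

For example, a candidate with one Phase I trial and one Phase II/III trial appears under both Phase I and Phase III.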

Approval

Each country has its own national regulatory authority (NRA) responsible for approving new drugs in that country. Several NRAs are particularly relevant, among them the US FDA and the European EMA. We mark a COVID-19 vaccine as approved if one of these two bodies approves it.

We also classify a vaccine as approved if one of these two bodies clears it for emergency use, or if the WHO adds it to its "emergency use listing".

Vaccine candidate    Approved by   Date           Notes
BioNTech-Pfizer      FDA           2020-12-12     for emergency use
BioNTech-Pfizer      EMA           subsequently   for emergency use
Moderna              FDA           2020-12-19     for emergency use
Moderna              EMA           subsequently   for emergency use
Oxford/AstraZeneca   EMA           2021-01-29     for emergency use
Janssen              EMA           2021-03-11     for emergency use
Sinopharm            WHO           2021-06-03     for emergency use
Sinovac              WHO           2021-06-03     for emergency use
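
A minimal sketch of this approval rule, using the candidate and authority pairs from the table above. The last record is a hypothetical example added to show that a decision by another authority does not count under this definition.

```python
# Bodies whose approval (including emergency use) counts in this analysis.
RECOGNIZED_BODIES = {"FDA", "EMA", "WHO"}

# (candidate, approving body) pairs from the table above.
approval_records = [
    ("BioNTech-Pfizer", "FDA"),
    ("BioNTech-Pfizer", "EMA"),
    ("Moderna", "FDA"),
    ("Moderna", "EMA"),
    ("Oxford/AstraZeneca", "EMA"),
    ("Janssen", "EMA"),
    ("Sinopharm", "WHO"),
    ("Sinovac", "WHO"),
    # Hypothetical record, not from the source data:
    ("Hypothetical candidate", "Other NRA"),
]


def approved_candidates(records):
    """Return the candidates approved by at least one recognized body."""
    return sorted({name for name, body in records if body in RECOGNIZED_BODIES})


approved = approved_candidates(approval_records)
```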

Time ranges

For the duration of clinical trial phases, we rely on the LSHTM data, which publishes the estimates provided by the research groups. They provide a start date for their clinical trials and a primary completion date.

The primary completion date is defined as "The date on which the last participant in a clinical study was examined or received an intervention to collect final data for the primary outcome measure. Whether the clinical study ended according to the protocol or was terminated does not affect this date. For clinical studies with more than one primary outcome measure with different completion dates, this term refers to the date on which data collection is completed for all the primary outcome measures. The "estimated" primary completion date is the date that the researchers think will be the primary completion date for the study."
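
Given a start date and a primary completion date per trial, the duration estimates (min, max, median) per phase could be derived roughly as follows. The field names and dates are hypothetical and chosen for illustration only. Measuring duration in days is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical trials with a start date and an (estimated) primary completion date.
trials = [
    {"phase": "Phase I", "start": date(2020, 4, 1), "primary_completion": date(2020, 7, 10)},
    {"phase": "Phase I", "start": date(2020, 5, 1), "primary_completion": date(2020, 11, 17)},
    {"phase": "Phase III", "start": date(2020, 7, 1), "primary_completion": date(2021, 7, 1)},
    {"phase": "Phase III", "start": date(2020, 9, 1), "primary_completion": date(2021, 3, 1)},
]


def phase_duration_stats(rows):
    """Return min, max and median trial duration in days for each phase."""
    durations = defaultdict(list)
    for row in rows:
        days = (row["primary_completion"] - row["start"]).days
        durations[row["phase"]].append(days)
    return {
        phase: {"min": min(days), "max": max(days), "median": median(days)}
        for phase, days in durations.items()
    }


stats = phase_duration_stats(trials)
```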

Data processing

Datasets were downloaded from the above-mentioned sources. The data provided by LSHTM did not undergo any form of pre-processing, as it already came in a machine-readable format.

As the WHO data is published as a PDF, the downloaded file went through a couple of pre-processing steps before the analysis could start.

First, the PDF was converted into a machine-readable format using the software ABBYY FineReader and subsequently cleaned manually.

The dataset consists of two parts: the first table is labelled "xx candidate vaccines in clinical evaluation", where xx corresponds to the number of rows in that table. It is followed by the second table, "yy candidate vaccines in preclinical evaluation", which has slightly different columns from the first one.

By combining both tables, we created the dataset for analysis, which holds the following columns:

Column                                     Description
COVID-19 Vaccine developer/manufacturer    Name of developer/manufacturer or team of developers/manufacturers, separated by /
Vaccine platform                           Type of vaccine*
Preclinical                                preclinical if True, else no value
Phase 1                                    if there's a trial in this phase, cell holds one or multiple trial numbers
Phase 1/2                                  if there's a trial in this phase, cell holds one or multiple trial numbers
Phase 2                                    if there's a trial in this phase, cell holds one or multiple trial numbers
Phase 3                                    if there's a trial in this phase, cell holds one or multiple trial numbers

* For entries in the column Vaccine platform, we noticed different spellings of the same type, for example "Protein subunit", "Protein Subunit" and "Protein Sub-unit", or "Non-replicating viral vector" and "non replicating viral vector". These were unified to a single spelling before the dataset was exported for analysis.
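
One way to unify such spellings is sketched below. It compares values by a key that ignores case, spaces and hyphens. The choice of canonical spellings and the function names are assumptions; the original cleaning may have been done by hand or by other means.

```python
import re

# Canonical spellings, keyed by a case-, space- and hyphen-insensitive form.
# Which spelling counts as canonical is an assumption made for illustration.
CANONICAL_PLATFORMS = {
    "proteinsubunit": "Protein subunit",
    "nonreplicatingviralvector": "Non-replicating viral vector",
}


def spelling_key(value):
    """Lowercase the value and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", value.lower())


def unify_platform(value):
    """Return the canonical spelling if one is known, else the trimmed value."""
    return CANONICAL_PLATFORMS.get(spelling_key(value), value.strip())
```

Values without a known canonical form are passed through unchanged apart from trimming, so new platform types are not silently rewritten.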
