jeff-lewis / open-covid-19-data

This project forked from open-covid-19/data

Crowd-sourced COVID-19 data

Open COVID-19 Dataset

This repo contains free datasets of historical data related to COVID-19. The current datasets are:

  • World:

    • Date: ISO 8601 date (YYYY-MM-DD) of the datapoint
    • CountryCode: ISO 3166-1 alpha-2 code of the country
    • CountryName: American English name of the country
    • Confirmed: total number of cases confirmed after positive test
    • Deaths: total number of deaths from a positive COVID-19 case
    • Latitude: floating point value representing the geographic coordinate
    • Longitude: floating point value representing the geographic coordinate
  • China:

    • Date: ISO 8601 date (YYYY-MM-DD) of the datapoint
    • Region: American English name of the province
    • CountryCode: ISO 3166-1 alpha-2 code of the country
    • CountryName: American English name of the country
    • Confirmed: total number of cases confirmed after positive test
    • Deaths: total number of deaths from a positive COVID-19 case
    • Latitude: floating point value representing the geographic coordinate
    • Longitude: floating point value representing the geographic coordinate
  • USA:

    • Date: ISO 8601 date (YYYY-MM-DD) of the datapoint
    • Region: American English name of the state
    • CountryCode: ISO 3166-1 alpha-2 code of the country
    • CountryName: American English name of the country
    • Confirmed: total number of cases confirmed after positive test
    • Deaths: total number of deaths from a positive COVID-19 case
    • Tested: total number of tests performed to determine COVID-19 case
    • Latitude: floating point value representing the geographic coordinate
    • Longitude: floating point value representing the geographic coordinate
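
As a minimal sketch of reading a file with the World columns listed above, the snippet below parses a small inline CSV with pandas. The rows are made up for illustration and are not real figures, and `load_world` is a hypothetical helper, not part of this repo. One detail worth noting: pandas treats the string `NA` as a missing value by default, which would silently blank out the ISO 3166-1 alpha-2 code for Namibia, so the default missing-value markers are disabled here.

```python
import io

import pandas as pd

# Made-up rows for illustration only; these are NOT real figures.
SAMPLE_CSV = """Date,CountryCode,CountryName,Confirmed,Deaths,Latitude,Longitude
2020-03-01,NA,Namibia,0,0,-22.0,17.0
2020-03-02,NA,Namibia,2,0,-22.0,17.0
2020-03-01,IT,Italy,10,1,41.9,12.5
2020-03-02,IT,Italy,15,2,41.9,12.5
"""


def load_world(source):
    """Hypothetical helper: read a CSV that follows the World column layout."""
    return pd.read_csv(
        source,
        parse_dates=["Date"],          # ISO 8601 dates become datetime64
        keep_default_na=False,         # do not treat "NA" (Namibia) as missing
        na_values=[""],                # only empty cells count as missing
        dtype={"CountryCode": str, "CountryName": str},
    )


world = load_world(io.StringIO(SAMPLE_CSV))
print(world.dtypes)
print(world["CountryCode"].unique())
```

The same function should work with a file path or URL in place of the `io.StringIO` object, assuming the file uses the column names listed above.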

Analyze the data

You can find Jupyter Notebooks in the analysis folder with examples of how to load and analyze the data. If you want to run your analysis without installing anything on your computer, you can use Google Colab by going to this URL: https://colab.research.google.com/github/open-covid-19/data/
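
Because Confirmed and Deaths are cumulative totals, a common first analysis step is to derive daily increments. The following is a rough sketch of that step, not taken from the notebooks in the analysis folder; the values are made up and `add_daily_new` is a hypothetical helper name.

```python
import pandas as pd

# Made-up cumulative totals for illustration only; NOT real figures.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2020-03-01", "2020-03-02", "2020-03-03"] * 2
        ),
        "CountryCode": ["IT", "IT", "IT", "ES", "ES", "ES"],
        "Confirmed": [10, 15, 25, 3, 3, 8],
    }
)


def add_daily_new(frame, key="CountryCode", column="Confirmed"):
    """Hypothetical helper: add a per-group day-over-day difference column."""
    # Sort so that each group's rows are in chronological order before diffing.
    out = frame.sort_values([key, "Date"]).copy()
    # The first row of each group has no previous day, so its value is NaN.
    out["New" + column] = out.groupby(key)[column].diff()
    return out


daily = add_daily_new(df)
print(daily)
```

For the USA and China datasets, the grouping key would presumably be `Region` rather than `CountryCode`. If a source amends an earlier report downward, the difference can be negative, so it is worth checking for that before plotting.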

Why another dataset?

This dataset is heavily inspired by the dataset maintained by Johns Hopkins University. Unfortunately, that dataset is currently experiencing maintenance issues and a lot of applications depend on this critical data being available in a timely manner. Further, the true sources of data for that dataset are still unclear.

Source of data

The world data comes from the daily reports at the ECDC portal. The XLS file is downloaded and parsed using scrapy and pandas.

Data for Chinese regions comes from the daily WHO situation reports, which are automatically parsed from their PDF source using scrapy and ghostscript.

The data is automatically crawled and parsed using the scripts found in the input folder. This is done daily, and as part of the processing some additional columns are added, like country-level coordinates.
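
To illustrate the kind of post-processing described above, here is a minimal sketch of attaching country-level coordinates with a left join in pandas. It is not the code from the input folder; the lookup table, its placeholder coordinate values, and the `add_coordinates` helper are all assumptions made for this example.

```python
import pandas as pd

# Made-up case rows for illustration only; "XX" is a deliberately unknown code.
cases = pd.DataFrame(
    {
        "Date": ["2020-03-01", "2020-03-01", "2020-03-01"],
        "CountryCode": ["IT", "ES", "XX"],
        "Confirmed": [10, 3, 1],
    }
)

# Hypothetical lookup table with rough placeholder coordinates.
coords = pd.DataFrame(
    {
        "CountryCode": ["IT", "ES"],
        "Latitude": [41.9, 40.4],
        "Longitude": [12.5, -3.7],
    }
)


def add_coordinates(cases, coords):
    """Hypothetical helper: add Latitude/Longitude columns by country code."""
    # how="left" keeps every case row even when no coordinates are known.
    # validate="many_to_one" raises if the lookup has duplicate country codes,
    # which would otherwise silently duplicate case rows.
    return cases.merge(
        coords, on="CountryCode", how="left", validate="many_to_one"
    )


merged = add_coordinates(cases, coords)
print(merged)
```

Rows whose country code is missing from the lookup keep their case counts and get empty coordinates, which makes gaps in the lookup table easy to spot.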

Update the data

To update the contents of the output folder, run the following:

# Install dependencies
pip install -r requirements.txt
# Update world data
sh input/update_world_data.sh
# Update China data
sh input/update_china_data.sh
# Update USA data
sh input/update_usa_data.sh

Note that this will only fetch the latest report from the WHO and ECDC sources. If a report is skipped or amended, manual intervention will be required.
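
One way to notice a skipped report is to look for gaps in the Date column of an output file. The sketch below is an assumption about how such a check could look, not a script that ships with this repo; `missing_dates` is a hypothetical helper and the dates are invented.

```python
import pandas as pd


def missing_dates(dates):
    """Hypothetical helper: list calendar days absent between first and last date."""
    observed = pd.DatetimeIndex(pd.to_datetime(list(dates))).normalize().unique()
    if observed.empty:
        return pd.DatetimeIndex([])
    # Every calendar day from the earliest to the latest observed date.
    expected = pd.date_range(observed.min(), observed.max(), freq="D")
    return expected.difference(observed)


# Invented example: the report for 2020-03-03 was skipped.
gaps = missing_dates(["2020-03-01", "2020-03-02", "2020-03-04"])
print(list(gaps.strftime("%Y-%m-%d")))
```

This only detects missing days; an amended report with an unchanged date would need a comparison against the previously saved values instead.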
