coronavirus

The coronavirus package provides a tidy format dataset of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) epidemic. The raw data pulled from the Johns Hopkins University Center for Systems Science and Engineering (JHU CCSE) Coronavirus repository.

More details available here, and a csv format of the package dataset available here

Source: Centers for Disease Control and Prevention’s Public Health Image Library

Important Note

As this an ongoing situation, frequent changes in the data format may occur, please visit the package news to get updates about those changes

Installation

Install the CRAN version:

install.packages("coronavirus")

Install the Github version (refreshed on a daily bases):

# install.packages("devtools")
devtools::install_github("RamiKrispin/coronavirus")

Data refresh

While the coronavirus CRAN version is updated every month or two, the Github (Dev) version is updated on a daily bases. The update_dataset function enables to overcome this gap and keep the installed version with the most recent data available on the Github version:

library(coronavirus)
update_dataset()

Note: must restart the R session to have the updates available

Alternatively, you can pull the data using the Covid19R project data standard format with the refresh_coronavirus_jhu function:

covid19_df <- refresh_coronavirus_jhu()
head(covid19_df)
#>         date    location location_type location_code location_code_type     data_type value      lat      long
#> 1 2021-05-03 Afghanistan       country            AF         iso_3166_2    deaths_new     5 33.93911 67.709953
#> 2 2020-08-13 Afghanistan       country            AF         iso_3166_2     cases_new    86 33.93911 67.709953
#> 3 2020-10-24 Afghanistan       country            AF         iso_3166_2 recovered_new    13 33.93911 67.709953
#> 4 2021-03-14 Afghanistan       country            AF         iso_3166_2    deaths_new     3 33.93911 67.709953
#> 5 2020-02-28 Afghanistan       country            AF         iso_3166_2     cases_new     0 33.93911 67.709953
#> 6 2020-07-19 Afghanistan       country            AF         iso_3166_2    deaths_new    17 33.93911 67.709953

Dashboard

A supporting dashboard is available here

Usage

data("coronavirus")

This coronavirus dataset has the following fields:

date - The date of the summary
province - The province or state, when applicable
country - The country or region name
lat - Latitude point
long - Longitude point
type - the type of case (i.e., confirmed, death)
cases - the number of daily cases (corresponding to the case type)

head(coronavirus)
#>         date province             country       lat       long      type cases
#> 1 2020-01-22                  Afghanistan  33.93911  67.709953 confirmed     0
#> 2 2020-01-22                      Albania  41.15330  20.168300 confirmed     0
#> 3 2020-01-22                      Algeria  28.03390   1.659600 confirmed     0
#> 4 2020-01-22                      Andorra  42.50630   1.521800 confirmed     0
#> 5 2020-01-22                       Angola -11.20270  17.873900 confirmed     0
#> 6 2020-01-22          Antigua and Barbuda  17.06080 -61.796400 confirmed     0

Summary of the total confrimed cases by country (top 20):

library(dplyr)

summary_df <- coronavirus %>% 
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarise(total_cases = sum(cases)) %>%
  arrange(-total_cases)

summary_df %>% head(20) 
#> # A tibble: 20 x 2
#>    country        total_cases
#>    <chr>                <dbl>
#>  1 US                33166418
#>  2 India             27157795
#>  3 Brazil            16194209
#>  4 France             5670486
#>  5 Turkey             5203385
#>  6 Russia             4960174
#>  7 United Kingdom     4483177
#>  8 Italy              4197892
#>  9 Germany            3662568
#> 10 Spain              3652879
#> 11 Argentina          3586736
#> 12 Colombia           3270614
#> 13 Poland             2867187
#> 14 Iran               2855396
#> 15 Mexico             2399790
#> 16 Ukraine            2244084
#> 17 Peru               1932255
#> 18 Indonesia          1786187
#> 19 Czechia            1658778
#> 20 Netherlands        1658587

Summary of new cases during the past 24 hours by country and type (as of 2021-05-25):

library(tidyr)

coronavirus %>% 
  filter(date == max(date)) %>%
  select(country, type, cases) %>%
  group_by(country, type) %>%
  summarise(total_cases = sum(cases)) %>%
  pivot_wider(names_from = type,
              values_from = total_cases) %>%
  arrange(-confirmed)
#> # A tibble: 192 x 4
#> # Groups:   country [192]
#>    country              confirmed death recovered
#>    <chr>                    <dbl> <dbl>     <dbl>
#>  1 India                   208921  4157    295955
#>  2 Brazil                   73453  2173     41347
#>  3 Argentina                24601   576     24477
#>  4 US                       22756   621         0
#>  5 Colombia                 21181   459     17183
#>  6 Iran                     11873   208     14676
#>  7 Turkey                    9375   175     11192
#>  8 Nepal                     8387   169      6404
#>  9 Russia                    7762   385      8579
#> 10 Malaysia                  7289    60      3789
#> 11 Peru                      6966   417     11883
#> 12 Sweden                    6034    30         0
#> 13 Bolivia                   5696   159      3160
#> 14 Spain                     5359    90         0
#> 15 Indonesia                 5060   172      3795
#> 16 Iraq                      4938    27      4279
#> 17 Chile                     4160    37      5394
#> 18 Uruguay                   3971    51      2988
#> 19 Philippines               3966    36      4646
#> 20 Japan                     3918   106      5270
#> 21 Canada                    3700    38      7389
#> 22 Thailand                  3226    26         0
#> 23 Paraguay                  3223   117      2215
#> 24 Italy                     3220   166     11348
#> 25 France                    3155   221       837
#> 26 Switzerland               2770     7         0
#> 27 Bahrain                   2766    18      1535
#> 28 Ukraine                   2730   257     17667
#> 29 Sri Lanka                 2728    26      1228
#> 30 Pakistan                  2724    65      4686
#> 31 Germany                   2578   272     14190
#> 32 Netherlands               2497    13        48
#> 33 Mexico                    2483   265      1814
#> 34 United Kingdom            2417    15         8
#> 35 Greece                    2402    50         0
#> 36 Costa Rica                2370    28       684
#> 37 Kazakhstan                1860     4      2850
#> 38 Bangladesh                1675    40      1279
#> 39 United Arab Emirates      1672     4      1630
#> 40 Kuwait                    1408     5      1158
#> # … with 152 more rows

Plotting the total cases by type worldwide:

library(plotly)

coronavirus %>% 
  group_by(type, date) %>%
  summarise(total_cases = sum(cases)) %>%
  pivot_wider(names_from = type, values_from = total_cases) %>%
  arrange(date) %>%
  mutate(active = confirmed - death - recovered) %>%
  mutate(active_total = cumsum(active),
                recovered_total = cumsum(recovered),
                death_total = cumsum(death)) %>%
  plot_ly(x = ~ date,
                  y = ~ active_total,
                  name = 'Active', 
                  fillcolor = '#1f77b4',
                  type = 'scatter',
                  mode = 'none', 
                  stackgroup = 'one') %>%
  add_trace(y = ~ death_total, 
             name = "Death",
             fillcolor = '#E41317') %>%
  add_trace(y = ~recovered_total, 
            name = 'Recovered', 
            fillcolor = 'forestgreen') %>%
  layout(title = "Distribution of Covid19 Cases Worldwide",
         legend = list(x = 0.1, y = 0.9),
         yaxis = list(title = "Number of Cases"),
         xaxis = list(title = "Source: Johns Hopkins University Center for Systems Science and Engineering"))

Plot the confirmed cases distribution by counrty with treemap plot:

conf_df <- coronavirus %>% 
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarise(total_cases = sum(cases)) %>%
  arrange(-total_cases) %>%
  mutate(parents = "Confirmed") %>%
  ungroup() 
  
  plot_ly(data = conf_df,
          type= "treemap",
          values = ~total_cases,
          labels= ~ country,
          parents=  ~parents,
          domain = list(column=0),
          name = "Confirmed",
          textinfo="label+value+percent parent")

Data Sources

The raw data pulled and arranged by the Johns Hopkins University Center for Systems Science and Engineering (JHU CCSE) from the following resources:

World Health Organization (WHO): https://www.who.int/
DXY.cn. Pneumonia. 2020. https://ncov.dxy.cn/ncovh5/view/pneumonia.
BNO News: https://bnonews.com/index.php/2020/04/the-latest-coronavirus-cases/
National Health Commission of the People’s Republic of China (NHC):
http:://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml
China CDC (CCDC): http:://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm
Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html
Macau Government: https://www.ssm.gov.mo/portal/
Taiwan CDC: https://sites.google.com/cdc.gov.tw/2019ncov/taiwan?authuser=0
US CDC: https://www.cdc.gov/coronavirus/2019-ncov/index.html
Government of Canada: https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection/symptoms.html
Australia Government Department of Health:https://www.health.gov.au/news/health-alerts/novel-coronavirus-2019-ncov-health-alert
European Centre for Disease Prevention and Control (ECDC): https://www.ecdc.europa.eu/en/geographical-distribution-2019-ncov-cases
Ministry of Health Singapore (MOH): https://www.moh.gov.sg/covid-19
Italy Ministry of Health: http://www.salute.gov.it/nuovocoronavirus
1Point3Arces: https://coronavirus.1point3acres.com/en
WorldoMeters: https://www.worldometers.info/coronavirus/
COVID Tracking Project: https://covidtracking.com/data. (US Testing and Hospitalization Data. We use the maximum reported value from “Currently” and “Cumulative” Hospitalized for our hospitalization number reported for each state.)
French Government: https://dashboard.covid19.data.gouv.fr/
COVID Live (Australia): https://covidlive.com.au/
Washington State Department of Health:https://www.doh.wa.gov/Emergencies/COVID19
Maryland Department of Health: https://coronavirus.maryland.gov/
New York State Department of Health: https://health.data.ny.gov/Health/New-York-State-Statewide-COVID-19-Testing/xdss-u53e/data
NYC Department of Health and Mental Hygiene: https://www1.nyc.gov/site/doh/covid/covid-19-data.page and https://github.com/nychealth/coronavirus-data
Florida Department of Health Dashboard: https://services1.arcgis.com/CY1LXxl9zlJeBuRZ/arcgis/rest/services/Florida_COVID19_Cases/FeatureServer/0 and https://fdoh.maps.arcgis.com/apps/opsdashboard/index.html#/8d0de33f260d444c852a615dc7837c86
Palestine (West Bank and Gaza): https://corona.ps/details
Israel: https://govextra.gov.il/ministry-of-health/corona/corona-virus/
Colorado: https://covid19.colorado.gov/data)

jcmartinmu / coronavirus Goto Github PK

coronavirus's Introduction

coronavirus

Important Note

Installation

Data refresh

Dashboard

Usage

Data Sources

coronavirus's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

Jobs