openZH / covid_19

COVID19 case numbers of Cantons of Switzerland and Principality of Liechtenstein (FL). The data is updated at best once a day (times of collection and update may vary). Start with the README.

Home Page: https://www.zh.ch/de/gesundheit/coronavirus/zahlen-fakten-covid-19.zhweb-noredirect.zhweb-cache.html?keywords=covid19&keyword=covid19#/

License: Creative Commons Attribution 4.0 International

Shell 3.78% JavaScript 1.52% Jupyter Notebook 32.34% Python 62.19% Ruby 0.17%

covid_19's Introduction

SARS-CoV-2 open government data reported by the Swiss Cantons and the Principality of Liechtenstein

Important note

Find and use high quality data published by our data colleagues of the FOPH for all Cantons and FL:

  1. visualized (Dashboard): https://www.covid19.admin.ch/en/overview
  2. published as 'open government data': https://opendata.swiss/en/dataset/covid-19-schweiz

Aim of this repository

The aim of this repository is to provide open government datasets for SARS-CoV-2 related data reported by the Swiss Cantons and the Principality of Liechtenstein. Since Jun 8, 2020 most cantons report case numbers at least once or twice a week. Updates of cantonal case numbers during weekends are infrequent.

If you have any questions, please don't hesitate to contact us:

List of open government datasets published in this repository

Swiss Cantons and Principality of Liechtenstein

Canton Zurich

Don't forget to take a look at the community contributions.

Swiss Cantons and Principality of Liechtenstein: Unified dataset

General description
This data is generated and validated daily using manual and automated procedures. Note that we only publish data that are reported by the Swiss Cantons and the Principality of Liechtenstein. Thus, gaps result if Swiss Cantons or the Principality of Liechtenstein do not report data for the specific date.

Data

https://github.com/openZH/covid_19/tree/master/fallzahlen_kanton_total_csv_v2
Description: Case numbers for each spatial unit separately
Spatial unit: Swiss cantons and Principality of Liechtenstein
Format: csv
Additional remark: Link to deprecated dataset (data structure has changed)

https://github.com/openZH/covid_19/blob/master/COVID19_Fallzahlen_CH_total_v2.csv
Description: Case numbers for all spatial units in one single file.
Spatial unit: Swiss cantons and Principality of Liechtenstein
Format: csv
Additional remark: Link to deprecated dataset (data structure has changed)

Metadata

Field Name | Description | Format | Note
date | Date of notification | YYYY-MM-DD |
time | Time of notification | HH:MM |
abbreviation_canton_and_fl | Abbreviation of the reporting canton | Text |
ncumul_tested | Reported number of tests performed as of date | Number | Irrespective of canton of residence
ncumul_conf | Reported number of confirmed cases as of date | Number | Only cases that reside in the current canton
new_hosp | New hospitalisations since last date | Number | Irrespective of canton of residence
current_hosp | Reported number of hospitalised patients on date | Number | Irrespective of canton of residence
current_icu | Reported number of hospitalised patients in ICUs on date | Number | Irrespective of canton of residence
current_vent | Reported number of patients requiring invasive ventilation on date | Number | Irrespective of canton of residence
ncumul_released | Reported number of patients released from hospitals or reported recovered as of date | Number | Irrespective of canton of residence
ncumul_deceased | Reported number of deceased as of date | Number | Only cases that reside in the current canton
source | Source of the information | href |
current_isolated | Reported number of isolated persons on date | Number | Infected persons, who are not hospitalised
current_quarantined | Reported number of quarantined persons on date | Number | Persons, who were in 'close contact' with an infected person, while that person was infectious, and are not hospitalised themselves
current_quarantined_riskareatravel | Reported number of quarantined persons on date | Number | People arriving in Switzerland from certain countries and areas, who are required to go into quarantine.

Empty values vs. 0

Value | Meaning
0 | Zero cases are reported.
empty | No value is reported.
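
A minimal sketch of how a consumer of the CSV files might keep this distinction when parsing, using only Python's standard library. The sample rows are made up for illustration; only the column names come from the metadata table above.

```python
import csv
import io

# Made-up sample rows; only the column names follow the metadata above.
SAMPLE = """date,time,abbreviation_canton_and_fl,ncumul_conf,ncumul_deceased
2020-03-20,16:30,ZH,773,0
2020-03-21,,ZH,,
"""

def parse_count(raw):
    """Return None for an empty cell (not reported), else the integer value."""
    raw = raw.strip()
    return None if raw == "" else int(raw)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
confirmed = [parse_count(r["ncumul_conf"]) for r in rows]
deceased = [parse_count(r["ncumul_deceased"]) for r in rows]

# A reported zero stays 0; a missing report becomes None.
print(confirmed)
print(deceased)
```

Mapping empty cells to None rather than 0 keeps gaps from being mistaken for days with zero cases when the values are later summed or plotted.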

Latest updates

The latest updates are visualized here.

Canton / FL Last update (of any variable) Important notes
FL Last update on 2023-04-04
AG Last update on 2022-04-06 Since 2022-03-04 AG is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
AI Last update on 2022-03-31 Since 2022-03-31 AI is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
AR Last update on 2022-03-24 Since 2021-01-22 AR is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
BE Last update on 2022-03-31 Since 2022-03-31 BE is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
BL Last update on 2023-09-24
BS Last update on 2023-07-05
FR Last update on 2023-01-08
GE Last update on 2024-05-05
GL Last update on 2022-02-09 Since 2022-02-09 GL is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
GR Last update on 2022-03-31 Since 2022-03-31 GR is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
JU Last update on 2022-06-03
LU Last update on 2022-12-29
NE Last update on 2024-02-19
NW Last update on 2023-03-30
OW Last update on 2022-12-29
SG Last update on 2023-03-28
SH Last update on 2023-01-29
SO Last update on 2023-03-30
SZ Last update on 2022-03-07 Since 2022-03-07 SZ is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
TG Last update on 2023-01-09 Since 2022-05-16 TG is updating data only once per week (on Mondays).
TI Last update on 2023-04-26
UR Last update on 2022-03-31 Since 2022-03-31 UR is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
VD Last update on 2023-01-29
VS Last update on 2022-05-29 Since 2022-05-31 VS is not publishing updated case numbers on its website anymore, but referencing to FOPH. You find respective data via FOPH's API: https://www.covid19.admin.ch/api/data/context
ZG Last update on 2023-01-04
ZH Last update on 2023-05-02 Since 2022-03-14 ZH is updating data only once per week (on Tuesdays). Since 2023-01-03 ZH is not publishing updated case numbers.

Swiss Cantons and Principality of Liechtenstein: More detailed dataset

Data

https://github.com/openZH/covid_19/tree/master/fallzahlen_kanton_alter_geschlecht_csv
Description: Selected cantons publish more detailed datasets.
Spatial unit: Swiss cantons and Principality of Liechtenstein
Format: csv
Additional remark: Not all datasets are maintained.

Maintained datasets

Unmaintained datasets

Metadata for unmaintained datasets

Field Name | Description | Format | Reporting Cantons
Date | ZH = Date of test result (NewConfCases) / Date of death (NewDeaths); BL = Date of death; BS = Date of notification | YYYY-MM-DD |
Area | Abbreviation of the reporting canton | |
AgeYear | | Number | ZH,BS,BL
Gender | | Text | ZH,BS,BL
NewConfCases | Number of Confirmed Cases | Number | ZH
NewDeaths | Number of Deceased | Number | ZH,BS,BL
PreExistingCond | Pre-Existing Conditions | Text | BL,BS

Canton Zürich: Unified dataset

Data

https://github.com/openZH/covid_19#swiss-cantons-and-principality-of-liechtenstein-unified-dataset
Description: open data swiss: COVID_19 Fallzahlen Kanton Zürich Total

Canton Zürich: More detailed dataset

Data

https://github.com/openZH/covid_19/blob/master/fallzahlen_kanton_alter_geschlecht_csv/COVID19_Fallzahlen_Kanton_ZH_altersklassen_geschlecht.csv
Description: open data swiss: COVID_19 Verteilung der Fälle im Kanton Zürich nach Altersklasse, Geschlecht und Kalenderwoche
Spatial unit: Canton Zürich
Format: csv
Additional remark: Comparable data for the canton of Thurgau is published at opendata.swiss.

Metadata

Spaltenname / Fieldname Beschreibung (DE) Description (EN) Format
Week Kalenderwoche des Befundes (NewConfCases) / Todesdatums (NewDeaths) Calendar week of test result (NewConfCases) / Date of death (NewDeaths) Zahl
Year Jahr des Befundes (NewConfCases) / Todesdatums (NewDeaths) Year of test result (NewConfCases) / Date of death (NewDeaths) Zahl
Area Kanton Abbreviation of the reporting canton Text
AgeYearCat 10-Jahres Altersklassen Age groups (10 year steps) Text
Gender Geschlecht Gender Text
NewConfCases Neue bestätigte Fälle Newly confirmed number of cases Zahl
NewDeaths Neue Todesfälle Newly confirmed number of deaths Zahl

Data

https://github.com/openZH/covid_19/blob/master/fallzahlen_kanton_alter_geschlecht_csv/COVID19_Einwohner_Kanton_ZH_altersklassen_geschlecht.csv
Description: Inhabitants per age category and gender.
Spatial unit: Canton Zürich
Format: csv

Metadata

Spaltenname / Fieldname Beschreibung (DE) Description (EN) Format
Year Stichtag ist jeweils der 31.12 des angegebenen Jahres The reporting date is the 31.12 of the indicated year Zahl
Area Kanton Abbreviation of the reporting canton Text
AgeYearCat 10-Jahres Altersklassen Age groups (10 year steps) Text
Gender Geschlecht Gender Text
Inhabitants Anzahl Einwohner Number of inhabitants Zahl

Data

https://raw.githubusercontent.com/openZH/covid_19/master/fallzahlen_kanton_zh/COVID19_Anteil_positiver_Test_pro_KW.csv
Description: opendata.swiss: COVID_19 Anteil der positiven SARS-CoV-2 Tests im Kanton Zürich nach Kalenderwoche
Spatial unit: Canton Zürich
Format: csv
Additional remark:

Metadata

Spaltenname / Fieldname Beschreibung (DE) Description (EN) Format
Woche_von Beginn der Kalenderwoche (Datum) Start of the calendar week (Date) YYYY-MM-DD
Woche_bis Ende der Kalenderwoche (Datum) End of the calendar week (Date) YYYY-MM-DD
Kalenderwoche Kalenderwoche Calendar week Zahl
Anzahl_positiv Anzahl positiver Tests Number of positive tests Text
Anzahl_negativ Anzahl negativer Tests Number of negative tests Text
Anteil_positiv Anteil der positiven Tests an allen Tests Share of positive tests Zahl

Canton Zürich: Postal codes (Postleitzahl)

Data

https://github.com/openZH/covid_19/blob/master/fallzahlen_plz/fallzahlen_kanton_ZH_plz.csv
Description: opendata.swiss: COVID_19 Fallzahlen Kanton Zürich nach Bezirk und Kalenderwoche
Spatial unit: Canton Zürich
Format: csv
Additional remark:

Metadata

Fieldname Beschreibung (DE) Description (EN) Format
PLZ Postleitzahl* Postalcode* Zahl
Date Datum des Befundes Date of test result (NewConfCases) Zahl
Population Einwohner mit Hauptwohnsitz Inhabitants with main residency Zahl
NewConfCases_7days Neue bestätigte Fälle in den letzten sieben Tagen (Kategorien) Newly confirmed cases (Categories) Text

Geodata

https://github.com/openZH/covid_19/blob/master/fallzahlen_plz/PLZ_gen_epsg4326_F_KTZH_2020.json

https://github.com/openZH/covid_19/blob/master/fallzahlen_plz/PLZ_gen_epsg2056_F_KTZH_2020.json

Canton Zurich: Districts (Bezirk)

Data

https://github.com/openZH/covid_19/blob/master/fallzahlen_bezirke/fallzahlen_kanton_ZH_bezirk.csv
Description: opendata.swiss: COVID_19 Verteilung der Fälle im Kanton Zürich nach Postleitzahl
Spatial unit: Canton Zürich
Format: csv

Metadata

Fieldname Beschreibung (DE) Description (EN) Format
DistrictId Bezirks-ID (BFS-Nummer)* District (BFS-Id)* Zahl
District Bezirksname* District name* Text
Population Wohnbevölkerung Population Zahl
Week Kalenderwoche des Befundes (NewConfCases) / Todesdatums (NewDeaths) Calendar week of test result (NewConfCases) / Date of death (NewDeaths) Zahl
Year Jahr des Befundes (NewConfCases) / Todesdatums (NewDeaths) Year of test result (NewConfCases) / Date of death (NewDeaths) Zahl
NewConfCases Neue bestätigte Fälle Newly confirmed number of cases Zahl
NewDeaths Neue Todesfälle Newly confirmed number of deaths Zahl
TotalConfCases Total der bestätigten Fälle (kumuliert) Total of confirmed cases (cumulated) Zahl
TotalDeaths Total der Todesfälle (kumuliert) Total of confirmed deaths (cumulated) Zahl

Geodata

https://github.com/openZH/covid_19/blob/master/fallzahlen_bezirke/BezirkeAlleSee_gen_epsg4326_F_KTZH_2020.json

https://github.com/openZH/covid_19/blob/master/fallzahlen_bezirke/BezirkeAlleSee_gen_epsg2056_F_KTZH_2020.json

Canton Zurich: Travel self quarantine

Data

https://github.com/openZH/covid_19/blob/master/fallzahlen_kanton_zh/COVID19_Einreisequarantaene_pro_KW.csv
Description: opendata.swiss: COVID_19 Einreisequarantäne im Kanton Zürich
Spatial unit: Canton Zürich
Format: csv

Metadata

Fieldname Beschreibung (DE) Description (EN) Format
Kalenderwoche Kalenderwoche Calendar week Zahl
Einreiseland Aufenthaltsland vor der Einreise (Risikogebiete gemäss BAG-Liste) Country of stay before entry (risk areas) Text
Anzahl_Einreisende Anzahl Einreisende aus Risikogebiet Number of people returning from risk area Zahl

Canton Zurich: Intensive care occupancy

Data

https://github.com/openZH/covid_19/blob/master/fallzahlen_kanton_zh/COVID19_Belegung_Intensivpflege.csv
Description: opendata.swiss: COVID_19 Belegung Intensivpflege Kanton Zürich
Spatial unit: Canton Zürich
Format: csv

Metadata

Fieldname Description (EN) Format
date Date of notification YYYY-MM-DD
time Time of notification HH:MM
abbreviation_canton_and_fl Abbreviation of the reporting canton Text
hospital_name Full name of the hospital Text
current_icu_service_certified Reported number of certified 'Intensive Care Unit' (ICU) beds on date and time Number
current_icu_target_covid Target number of Covid19 patients in whose treatment a hospital would currently have to participate. (Target values are defined by the Health Department Canton Zurich together with the hospitals.) Number
current_hosp_covid Reported number of hospitalised Covid19 patients on date. (These data are communicated by the Health Department Canton Zurich on weekdays, and available here: https://github.com/openZH/covid_19/tree/master#swiss-cantons-and-principality-of-liechtenstein-unified-dataset) Number
current_icu_covid Reported number of hospitalised Covid19 patients in ICU on date. Number
current_vent_covid Reported number of hospitalised Covid19 patients requiring invasive ventilation on date. (These data are communicated by the Health Department Canton Zurich on weekdays, and available here: https://github.com/openZH/covid_19/tree/master#swiss-cantons-and-principality-of-liechtenstein-unified-dataset) Number
current_icu_not_covid Reported number of hospitalised non-Covid19 patients in ICU on date. Number
current_icu_service_certified_operated Reported number of currently operated certified 'Intensive Care Unit' (ICU) beds on date and time Number
source Source URL of the data reported. String

Canton Zurich: Variants of Concern

Note: ZH data is deprecated (2021-02-12) - this resource will not be updated further from 2021-02-12 as the week of 2021-02-15 will be used to analyse the completeness of the collection of available data and adjust the approach to which VOCs are tested.
Since 2021-02-19 FOPH publishes data for all Cantons ("virusVariants", https://www.covid19.admin.ch/api/data/context).
Variants of concern ('VOC') cannot be detected by 'rapid' tests, but can be detected by PCR tests. Virus mutations are classified as being of concern because, among other things, they are more infectious than the wild type of the virus.

Data

https://github.com/openZH/covid_19/blob/master/fallzahlen_kanton_zh/COVID19_VOC_Kanton_ZH.csv
Description: Ressource: "COVID_19 PCR-Tests und besorgniserregende Virusmutationen im Kanton Zürich"
Spatial unit: Canton Zürich
Format: csv

Metadata

Fieldname Description (EN) Format
date Date of notification YYYY-MM-DD
new_pcr_pos Number of newly positive PCR tests Number
new_voc Number of newly detected variants of concern ('VOC') Number

Community Contributions

Visualization of Swiss and Cantonal Case Numbers over Time

ArcGIS Dashboard

corona-data.ch

Interactive Small Multiples of Case Numbers by Canton

shellyBits Interactive Dashboard

REST-API

Estimated reproduction number by Canton

Data for Basel-Stadt

COVID-19 Data Hub

Visualization of Covid-19 cases in Switzerland

Many thanks for the great work!

covid_19's People

Contributors

amslera, baryluk, borisdjakovic, calmyournerves, claudia013, corinnehuegli, davidezollino, dkoltg, dominikgehl, ebeusch, fab-benz, fabian, gaberoo, gd-zh, gmacauda, janetzkoa, jb3-2, je1982, judithbouman2412, kalakaru, kks-pmt, maekke, metaodi, mmznrstat, sarahnadeau, simgraworldwide, statovsky, tlorusso, viktoria023, zukunft

covid_19's Issues

Primitive scraper for ZH

echo ZH; d=$(curl --silent https://gd.zh.ch/internet/gesundheitsdirektion/de/themen/coronavirus.html | egrep "Im Kanton Zürich sind zurzeit|\\(Stand"); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | sed -E -e 's/.*Stand (.+) Uhr.*/\1/'; echo -n "Confirmed cases: "; echo "$d" | sed -e 's/ /\n/g' | egrep '[0-9]+' | head -1
ZH
Scraped at 2020-03-21T15:25:55+00:00
Date and time: 20.3.2020, 16.30
Confirmed cases: 773

Primitive scraper for NE

echo NE; d=$(curl --silent "https://www.ne.ch/autorites/DFS/SCSP/medecin-cantonal/maladies-vaccinations/Pages/Coronavirus.aspx" | grep 'Nombre de cas confirmés'); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | sed -E -e 's/^.*>Neuchâtel(&#160;)* +([^<]+)<\/span>.*$/\2/'; echo -n "Confirmed cases: "; echo "$d" | sed -E -e 's/<br>/\n/g' | grep -A 3 ">Neuchâtel" | egrep "Nombre de .* confirmés" | sed -E -e 's/^.*[^0-9]+([0-9]+) pers.*$/\1/'; echo -n "Deaths: "; echo "$d" | sed -E -e 's/<br>/\n/g' | grep -A 3 ">Neuchâtel" | egrep "Nombre.* décès" | head -1 | sed -E -e 's/^.*[^0-9]+ ([0-9]+)( pers.*|<\/strong).*$/\1/'
NE
Scraped at 2020-03-21T17:11:14+00:00
Date and time: 21.03.2020, 15h30
Confirmed cases: 177
Deaths: 2

Primitive scraper for BS

echo BS; URL=$(curl --silent https://www.gd.bs.ch/ | egrep 'Tagesbulletin.*Corona' | grep href | head -1 | awk -F '"' '{print $2;}'); d=$(curl --silent "https://www.gd.bs.ch/${URL}" | grep "positive Fälle"); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | sed -E -e 's/^.*Stand [A-Za-z]*,? (.+), insgesamt.*$/\1/'; echo -n "Confirmed cases: "; echo "$d" | sed -E -e 's/^.*insgesamt ([0-9]+) positive.*$/\1/'
BS
Scraped at 2020-03-21T16:47:22+00:00
Date and time: 21. März 2020, 10 Uhr
Confirmed cases: 299

Very sloppy. It goes to the gd.bs.ch homepage, takes the first bulletin, follows it, and tries to parse it. Over the last 6 days the text has been relatively consistent in form, so the script should work fine.

Enable Esri Data Format

It would be good if the data were prepared in this format (the same as for Italy and China):

One row per day for all cantons:
https://github.com/zdavatz/covid19_ch/blob/master/data-cantons-csv/dd-covid19-ch-cantons-20200318-example.csv

One row per day for the whole of Switzerland:
https://github.com/zdavatz/covid19_ch/blob/master/data-switzerland-csv/dd-covid19-ch-switzerland-20200318-example.csv

The data can then be pulled directly into Esri. See e.g. Italy, Johns Hopkins.

Open Source Helps!

Thanks for your work to help the people in need! Your site has been added! I currently maintain the Open-Source-COVID-19 page, which collects all open source projects related to COVID-19, including maps, data, news, api, analysis, medical and supply information, etc. Please share to anyone who might need the information in the list, or will possibly contribute to some of those projects. You are also welcome to recommend more projects.

http://open-source-covid-19.weileizeng.com/

Cheers!

Uniform data files for the cantons

Is it possible to get a uniform CSV format for all cantons? Ideally there would be a tool that validates the format before upload.
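
A minimal sketch of what such a pre-upload check could look like in Python. The field names and the YYYY-MM-DD / HH:MM formats are taken from the metadata table of the unified dataset; the function name `validate_row` and the exact rules (for example, treating every `ncumul_`, `current_` and `new_` column as a non-negative integer or empty) are assumptions for illustration, not the repository's actual validator.

```python
import re
from datetime import datetime

def validate_row(row):
    """Return a list of problems found in one CSV row given as a dict."""
    problems = []

    # date must be strictly YYYY-MM-DD and a real calendar date.
    date_value = row.get("date") or ""
    try:
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date_value):
            raise ValueError(date_value)
        datetime.strptime(date_value, "%Y-%m-%d")
    except ValueError:
        problems.append("date is not YYYY-MM-DD")

    # time is optional, but if present it must be HH:MM.
    time_value = row.get("time") or ""
    if time_value and not re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d", time_value):
        problems.append("time is not HH:MM")

    # Count columns may be empty (not reported) or a non-negative integer.
    for key, value in row.items():
        if key.startswith(("ncumul_", "current_", "new_")) and value:
            if not value.isdigit():
                problems.append(f"{key} is not a non-negative integer")
    return problems

good = {"date": "2020-03-21", "time": "15:30", "ncumul_conf": "177"}
bad = {"date": "21.03.2020", "time": "15h30", "ncumul_conf": "n/a"}
print(validate_row(good))
print(validate_row(bad))
```

Returning a list of problems instead of raising on the first one lets a contributor see every issue in a row at once.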

Primitive scraper for TG

echo TG; d=$(curl --silent https://www.tg.ch/news/fachdossier-coronavirus.html/10552 | egrep "<li>Anzahl bestätigter|<em>Stand"); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | sed -E -e 's/^.*Stand ([^<]+)<.*$/\1/'; echo -n "Confirmed cases: "; echo "$d" | grep 'Anzahl' | sed -E -e 's/.* ([0-9]+)<.*$/\1/'
TG
Scraped at 2020-03-21T15:30:42+00:00
Date and time: 21.3.20
Confirmed cases: 56

GE, SZ, VD failed

@baryluk

I ran it again, now I get:

GE - - - FAILED
SZ - - - FAILED
VD - - - FAILED

Can you add a verbose option, to show where it fails? Does that make sense?

NW scraper fails

NW changed the wording on their site.

New:

Bisher sind 42 Personen im Kanton Nidwalden positiv auf das Coronavirus getestet worden ("So far, 42 people in the canton of Nidwalden have tested positive for the coronavirus")
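
A minimal sketch of how the count could be pulled out of the new wording with a regular expression. The pattern is an assumption based only on the single sentence quoted above, and the existing NW scraper may be structured differently.

```python
import re

# The sentence quoted above.
NEW_WORDING = ("Bisher sind 42 Personen im Kanton Nidwalden positiv "
               "auf das Coronavirus getestet worden")

# Assumed pattern; it would need updating if NW rephrases the text again.
PATTERN = re.compile(r"Bisher sind (\d+) Personen im Kanton Nidwalden positiv")

def extract_confirmed(text):
    """Return the confirmed-case count, or None if the wording does not match."""
    match = PATTERN.search(text)
    return int(match.group(1)) if match else None

print(extract_confirmed(NEW_WORDING))
```

Returning None on a mismatch, rather than a default number, makes a future wording change show up as a visible failure.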

Primitive scraper for JU

echo JU; d=$(curl --silent --user-agent "Mozilla Firefox Mozilla/5.0; openZH covid_19 at github" "https://www.jura.ch/fr/Autorites/Coronavirus/Accueil/Coronavirus-Informations-officielles-a-la-population-jurassienne.html" | egrep -B 2 'Situation .*2020'); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | grep Situation | sed -E -e 's/^.*Situation (.+)<\/em.*$/\1/'; echo -n "Confirmed cases: "; echo "$d" | egrep "<p.*<strong>[0-9]+" | sed -E -e 's/^.*>([0-9]+)<.*$/\1/'
JU
Scraped at 2020-03-21T17:21:17+00:00
Date and time: 20 mars 2020 (17h)
Confirmed cases: 29

better monitoring of the updating process

As more and more people are involved (yay!) in updating the data and building scrapers, we might have to start thinking about how to monitor the updating process, and how to ensure that the data is updated even if one of the scrapers fails or someone forgets to check for new data.

Do you have suggestions for how we could manage this @metaodi @baryluk @ebeusch @herrstucki @andreasamsler @zdavatz ?

A table in the Readme (or somewhere else?) which is refreshed automatically after each push to a single file might help, similar to what @baryluk has created here:

#61

The table we have now is built by hand.
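
One way such a table could be generated automatically is sketched below. It assumes the per-canton data has the `date` and `abbreviation_canton_and_fl` columns of the unified dataset; the sample rows, the function names and the one-day staleness threshold are made up for illustration.

```python
import csv
import io
from datetime import date

# Made-up rows in the layout of the unified dataset (assumption).
SAMPLE = """date,abbreviation_canton_and_fl,ncumul_conf
2020-03-20,ZH,773
2020-03-21,NE,177
2020-03-19,NE,159
"""

def last_updates(csv_text):
    """Map each canton abbreviation to the most recent date it reported."""
    latest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        canton = row["abbreviation_canton_and_fl"]
        day = date.fromisoformat(row["date"])
        if canton not in latest or day > latest[canton]:
            latest[canton] = day
    return latest

def status_lines(latest, today, max_age_days=1):
    """Format one line per canton and flag entries older than max_age_days."""
    lines = []
    for canton in sorted(latest):
        age = (today - latest[canton]).days
        flag = "" if age <= max_age_days else " (stale)"
        lines.append(f"{canton} Last update on {latest[canton]}{flag}")
    return lines

for line in status_lines(last_updates(SAMPLE), today=date(2020, 3, 22)):
    print(line)
```

Run after each push, a script along these lines could rewrite the status table and flag cantons whose data has not been refreshed recently.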

Add neighbouring countries' regions

As the borders are not closed (and even if closed, wouldn't be foolproof), it would be interesting to collect and display the contamination rates of neighbouring France/Italy/Germany/Austria; ideally not the country-wide average, but the border regions only.

For France, one can find the numbers for Ain and Haute-Savoie fairly clearly in "point de situation"-titled PDFs:
https://www.auvergne-rhone-alpes.ars.sante.fr/liste-communiques-presse
Jura, Belfort, Doubs don't have clear separate numbers so far:
https://www.bourgogne-franche-comte.ars.sante.fr/liste-communiques-presse
Haut-Rhin has again clear numbers in PDFs here:
https://www.grand-est.ars.sante.fr/liste-communiques-presse?field_archive_ars_value=0

An official repository here:
https://www.data.gouv.fr/fr/datasets/donnees-relatives-a-lepidemie-du-covid-19/#_
Otherwise an unofficial data repository is here:
https://github.com/opencovid19-fr/data

Italy has a fairly complete official repository from which one could pull numbers:
https://github.com/pcm-dpc/COVID-19

Found this for Germany, at Landkreise level:
https://experience.arcgis.com/experience/478220a4c454480e823b17327b2bf1d4/page/page_1/
Here is an unofficial repository, but I can't vouch for it:
https://github.com/marlon360/rki-covid-api

And this for Austria:
https://info.gesundheitsministerium.at/
Again, maybe scraping from Wikipedia is more feasible, with the risk it may be unreliable or defaced.

Terminology clarification

Dear all,

thanks a lot for keeping this repo up-to-date! There seems to be some confusion regarding the terminology: the infection rates refer to the infection with SARS-CoV-2, whereas the disease caused by the virus is called Covid-19.

Best wishes,
-Filippo

Scraper for VD

Here is a python scraper for Canton VD.

Hopefully it will do the job ongoing, but it's not perfect as I've had to extract the data provided in text format as I didn't find out how to download the data from datawrapper (javascript).
Also, the data provided there doesn't include data prior to 10/03/2020.

Anyways, hope this helps.

# -*- coding: utf-8 -*-
from selenium import webdriver
from bs4 import BeautifulSoup
import numpy as np
import pandas as pd

### Watch out: using Selenium with Firefox requires geckodriver; it may be easier to configure it with Chrome
geckk=r'C:\Program Files (x86)\Mozilla Firefox\firefox.exe'

'''Documentation & resources to help set-up & use selenium:  
    https://www.tutorialspoint.com/python_web_scraping/python_web_scraping_dynamic_websites.htm
    https://www.selenium.dev/documentation/en/webdriver/web_element/
    https://realpython.com/modern-web-automation-with-python-and-selenium/
    https://stackoverflow.com/questions/7861775/python-selenium-accessing-html-source
    https://stackoverflow.com/questions/51273995/selenium-python-dynamic-table
'''

### Set options for Selenium 
options = webdriver.FirefoxOptions()
options.headless = True
options.add_argument("disable-gpu")
options.add_argument("headless")
options.add_argument("no-default-browser-check")
options.add_argument("no-first-run")
options.add_argument("no-sandbox")
options.add_argument("marionette=True")
options.add_argument("--test-type")
options.set_preference('browser.download.manager.showWhenStarting', False)
options.set_preference('browser.helperApps.neverAsk.saveToDisk', 'text/csv')
options.set_preference("browser.download.folderList",2)
options.set_preference("browser.download.manager.showWhenStarting",False)
options.set_preference("browser.download.dir","c:\\downloads")

profile = webdriver.FirefoxProfile()
profile.accept_untrusted_certs = True

driver = webdriver.Firefox(firefox_binary=geckk,options=options, firefox_profile=profile)

### Download
driver.get("https://datawrapper.dwcdn.net/tr5bJ/16/")
soup=BeautifulSoup(driver.page_source, 'html.parser')

### Get the data required (didn't manage to do differently than browsing across text)
data_cursor_start=soup.text.find('chartData: "')
data_cursor_stop=data_cursor_start+soup.text[data_cursor_start:].find('",')
zoom=str(soup.text)[data_cursor_start:data_cursor_stop]

### Create DataFrame
table_lines=zoom.split('\\n')
line_array=[]
for each_line in table_lines[1:]:
    line=each_line.split('\\t')
    line_array += line
line_matrix= [line_array[x:x+5] for x in range(0, len(line_array),5)]

df=pd.DataFrame(line_matrix, columns=['date', 'ncumul_hosp','ncumul_released','ncumul_deceased','ncumul_conf'])
df['date']=pd.to_datetime(df['date'],yearfirst=True)
df['abbreviation_canton_and_fl']='VD'
# np.nan is used instead of np.NaN, which was removed in NumPy 2.0
df['time']=np.nan
df['ncumul_tested']=np.nan
df['ncumul_ICU']=np.nan
df['ncumul_vent']=np.nan
df['source']='https://datawrapper.dwcdn.net/tr5bJ/16/'

### Format & Save CSV
new_order=['date','time','abbreviation_canton_and_fl','ncumul_tested','ncumul_conf',\
           'ncumul_hosp','ncumul_ICU','ncumul_vent','ncumul_released','ncumul_deceased','source']
df=df.T.reindex(new_order).T.set_index('date')

df.to_csv('COVID19_Fallzahlen_Kanton_VD_total.csv')

Fix UR scraper

It just broke, because the website changed its format a bit.

The data is now in the form of a table. Should be easy to fix.

On it.

Cases for Solothurn

Is there data for Solothurn as well? Otherwise all cantons would be covered.

That is, COVID19_Fallzahlen_Kanton_SO_total.csv

Data from SZ and FR missing

The cantons of Schwyz (77 infected, no deaths so far) and Fribourg (202 cases, 4 deceased) provide their numbers only on request. Because no comprehensive testing is being done, the case numbers are "not a relevant figure", Fribourg says to justify its passive information policy.

see: https://www.blick.ch/news/politik/aktuelle-coronavirus-zahlen-der-schweiz-so-informieren-kantone-ueber-die-corona-fallzahlen-id15810865.html

@baryluk all other Cantons should now be available on the website.

Primitive scraper for GR

echo GR; d=$(curl --silent "https://www.gr.ch/DE/institutionen/verwaltung/djsg/ga/coronavirus/info/Seiten/Start.aspx" | egrep ">Fallzahlen|Best(ä|&auml;)tigte F(ä|&auml;)lle|Personen in Spitalpflege|Verstorbene Personen"); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | grep Fallzahlen | sed -E -e 's/.*Fallzahlen ([^<]+)<.*/\1/';  echo -n "Confirmed cases: "; echo "$d" | egrep "Best(ä|&auml;)tigte F(ä|&auml;)lle" | sed -E -e 's/( |<)/\n/g' | egrep '[0-9]+' | head -1; echo -n "Deaths: "; echo "$d" | grep "Verstorbene" | sed -E -e 's/( |<)/\n/g' | egrep '[0-9]+' | head -1
GR
Scraped at 2020-03-21T15:43:48+00:00
Date and time: 20.03.2020
Confirmed cases: 213
Deaths: 3

Please separate the files into folders

Could you please collect the files in separate folders? Suggestion:

  • fallzahlen_kanton_total_csv -> for all updates. One file per canton.
  • fallzahlen_kanton_alter_geschlecht_csv -> for more detailed information. One file per canton.

The files in each folder should be validated, i.e. same number of columns, same column titles.
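
A minimal sketch of the header check described above: every file in a folder is compared against the header of the first one. The file names and contents are hypothetical, and the CSV text is passed in directly so the example runs without touching the file system.

```python
import csv
import io

def header_of(csv_text):
    """Return the first row of a CSV document as a list of column titles."""
    return next(csv.reader(io.StringIO(csv_text)))

def inconsistent_files(files):
    """Given {file name: CSV text}, return the names whose header differs
    from the header of the first file (in sorted name order)."""
    names = sorted(files)
    reference = header_of(files[names[0]])
    return [n for n in names[1:] if header_of(files[n]) != reference]

# Hypothetical file names and contents, for illustration only.
files = {
    "A_total.csv": "date,time,ncumul_conf\n2020-03-20,16:30,773\n",
    "B_total.csv": "date,time,ncumul_conf\n2020-03-21,15:30,177\n",
    "C_total.csv": "date,cases\n2020-03-21,56\n",
}
print(inconsistent_files(files))
```

Comparing whole header lists covers both requirements at once: a different number of columns or a different column title makes the lists unequal.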

Primitive scraper for GE

echo GE; d=$(curl --silent "https://www.ge.ch/document/point-coronavirus-maladie-covid-19/telecharger" | pdftotext - - | egrep "Dans le canton de Genève|Actuellement.*cas ont|décédées"); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | grep "Dans le" | sed -E -e 's/.*\((.*)\).*$/\1/'; echo -n "Confirmed cases: "; echo "$d" | grep "cas ont" | sed -E -e 's/( |<)/\n/g' | egrep '[0-9]+' | head -1; echo -n "Deaths: "; echo "$d" | grep "décédées" | sed -E -e 's/^.*([0-9]+) [^,]* décédées.*$/\1/'
GE
Scraped at 2020-03-21T15:58:51+00:00
Date and time: 20.03 à 8h00
Confirmed cases: 873
Deaths: 7

using pdftotext from poppler-utils (using version 0.71.0).

There are some extra data in pdf, like number of hospitalized cases, number of cases with care, and number of cases with intensive care.

There is also this pdf, with a nice table and two day history (yesterday and today)
https://www.ge.ch/document/covid-19-situation-epidemiologique-geneve/telecharger
but the table and graphs are raster images, so really not conducive to parsing. It can be done, but it is better to ask them to improve the website instead.

Primitive scraper for SH

echo SH; d=$(curl --silent 'https://sh.ch/CMS/content.jsp?contentid=3209198&language=DE&_=1584807070095' | grep data_post_content | sed -E -e 's/\\n/\n/g'); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | grep "Im Kanton Schaffhausen gibt es" | sed -E -e 's/^.*\(([0-9.]+)\).*$/\1/'; echo -n "Confirmed cases: "; echo "$d" | grep "best&auml;tige" | sed -E -e 's/^.*strong>([0-9]+)[^0-9]*$/\1/'
SH
Scraped at 2020-03-21T16:19:46+00:00
Date and time: 20.03.2020
Confirmed cases: 14

This is really hacky, because sh.ch is absolutely abhorrent in the amount of JavaScript it uses and the content it loads dynamically. But it works.

The URL that generates the JSON content was reverse engineered by looking at the network traffic when loading this page: https://sh.ch/CMS/Webseite/Kanton-Schaffhausen/Beh-rde/Verwaltung/Departement-des-Innern/Gesundheitsamt-3209198-DE.html

I have no idea what the _ parameter is. Some kind of timestamp, I think, but I am not sure. It could be for caching, and the URL might stop working. To be seen.
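
A quick way to test the timestamp guess, sketched in Python: treat the value from the URL above as milliseconds since the Unix epoch and see what date comes out. The helper name `decode_ms_timestamp` is made up for this example, and the millisecond interpretation is an assumption, not something sh.ch documents.

```python
from datetime import datetime, timedelta, timezone

def decode_ms_timestamp(value):
    """Interpret `value` as milliseconds since the Unix epoch (assumption)."""
    seconds, millis = divmod(int(value), 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)

# The `_` value copied from the scraper URL.
stamp = decode_ms_timestamp("1584807070095")
print(stamp.isoformat())  # 2020-03-21T16:11:10.095000+00:00
```

The result falls a few minutes before the "Scraped at" time in the output above, which is consistent with a timestamp taken when the page was loaded in the browser. Whether the server actually uses it (for cache busting or anything else) is still unverified.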

Primitive scraper for LU

echo LU; d=$(curl --silent 'https://gesundheit.lu.ch/themen/Humanmedizin/Infektionskrankheiten/Coronavirus' | grep "Im Kanton Luzern gibt es" | awk -F '>' '{print $3;}'); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | sed -E -e 's/^.*Stand: (.+)(Uhr)?\).+$/\1/'; echo -n "Confirmed cases: "; echo "$d" | sed -e 's/ /\n/g' | egrep '[0-9]+' | head -1;
LU
Scraped at 2020-03-21T15:21:08+00:00
Date and time: 21. M&auml;rz 2020, 11:00 Uhr
Confirmed cases: 109

A quick script for computing latest totals

In the directory with data:

for f in *.csv; do awk -F , '{if ($5) { print $1, $3, $5; }}' "$f" | tail -1; done | awk 'BEGIN { sum = 0; } { sum += $3; } END { print sum; }'

6262

Getting latest confirmed cases by subdivision:

for f in *.csv; do awk -F , '{if ($5) { print $1, $3, $5; }}' "$f" | tail -1; done |  sort -r -n -k 3
2020-03-20 VD 1432
2020-03-21 TI 918
2020-03-20 GE 873
2020-03-20 ZH 773
2020-03-20 BE 377
2020-03-21 BS 299
2020-03-21 BL 282
2020-03-20 VS 282
2020-03-20 GR 213
2020-03-20 AG 165
2020-03-20 NE 159
2020-03-21 LU 109
2020-03-20 SG 98
2020-03-21 TG 56
2020-03-20 ZG 48
2020-03-20 FL 40
2020-03-20 JU 29
2020-03-20 NW 28
2020-03-19 GL 17
2020-03-20 SH 14
2020-03-15 SZ 13
2020-03-21 UR 12
2020-03-18 AR 11
2020-03-09 FR 11
2020-03-14 AI 2
2020-03-13 OW 1

Feel free to include it in the repo, or readme. Public Domain. Signed off, Witold Baryluk.
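
For anyone who prefers Python, here is a rough equivalent of the two awk pipelines above. It is only a sketch under the same assumption the awk code makes: column 1 is the date, column 3 the canton abbreviation, and column 5 the cumulative confirmed count. The function names and the header in the sample rows are made up for illustration. One difference from the awk version: rows whose fifth column is not a positive integer (including the header) are skipped outright.

```python
import csv
import io

def latest_confirmed(csv_text):
    """Return (date, canton, confirmed) from the last row with a count, or None."""
    latest = None
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip the header and rows without a positive confirmed count.
        if len(row) >= 5 and row[4].strip().isdigit() and int(row[4]) > 0:
            latest = (row[0], row[2], int(row[4]))
    return latest

def total_confirmed(csv_texts):
    """Return per-file latest rows (largest count first) and their sum."""
    rows = [r for r in map(latest_confirmed, csv_texts) if r is not None]
    rows.sort(key=lambda r: r[2], reverse=True)
    return rows, sum(r[2] for r in rows)

# Illustrative layout; the counts are the ZH and GE values from the listing above.
zh = "date,time,canton,tested,confirmed\n2020-03-20,,ZH,,773\n2020-03-21,,ZH,,\n"
ge = "date,time,canton,tested,confirmed\n2020-03-20,,GE,,873\n"

rows, total = total_confirmed([zh, ge])
for date, canton, confirmed in rows:
    print(date, canton, confirmed)
print(total)
```

In the data directory, the list of texts could come from reading each `*.csv` file instead of the inline samples.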

Scraper FR

Received an email answer from FR that they will start putting the numbers on their website as of today (time unclear) and keep updating them from today onwards.

Their email was:
Hello Madam,
We have received your request.
On this subject, as of today the figures for the canton of Fribourg will be published on the State of Fribourg website (www.fr.ch) and will be updated regularly.

This serves as a reminder to myself or others to check the website periodically today.

Rmd or Jupyter notebooks

Does anyone know of Rmd or Jupyter notebooks that read and visualise the data?

If so, it would be nice to include one or two of them here. We can also add a mybinder.org link to let people run them.

Great work!

UR scraper fails

Currently the UR scraper fails:

Run the scraper...
Traceback (most recent call last):
  File "/home/runner/work/covid_19/covid_19/scrapers/parse_scrape_output.py", line 170, in <module>
    print("{:2} {:<16} {:>7} {:>7} OK {}".format(abbr, date, cases, deaths if not deaths is None else "-", scrape_time))
TypeError: unsupported format string passed to NoneType.__format__
Export database to CSV...

cc @baryluk
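
The traceback is what Python raises when `None` is passed to a format field that has a width or alignment spec, such as `{:<16}` or `{:>7}`. The format call only guards `deaths`, so presumably `date` or `cases` came back as `None` for UR. A minimal sketch of one possible guard follows; `format_result` and `shown` are hypothetical names, not functions from `parse_scrape_output.py`, and whether such a row should still be reported as OK is a separate decision.

```python
def format_result(abbr, date, cases, deaths, scrape_time):
    """Format one result line, showing "-" for any missing value."""
    def shown(value):
        return "-" if value is None else value

    return "{:2} {:<16} {:>7} {:>7} OK {}".format(
        abbr, shown(date), shown(cases), shown(deaths), scrape_time
    )

# Reproduce the original failure in isolation.
try:
    "{:<16}".format(None)
except TypeError as err:
    print("TypeError:", err)

# With the guard, a missing date no longer crashes the formatter.
print(format_result("UR", None, 12, None, "2020-03-21T16:57:49+00:00"))
```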

Primitive scraper for UR

echo UR; d=$(curl --silent "https://www.ur.ch/themen/2920" | grep "Personen gestiegen"); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | sed -E -e 's/^.*\(Stand[A-Za-z ]*, ([^\)]+)\).*$/\1/' ; echo -n "Confirmed cases: "; echo "$d" | sed -E -e 's/^.* ([0-9]+) Personen gestiegen.*$/\1/'
UR
Scraped at 2020-03-21T16:57:49+00:00
Date and time: 21. März 2020, 8.00 Uhr
Confirmed cases: 12

Update Readme

Currently the readme doesn't reflect that most cantons are scraped by @baryluk's army of scrapers.

Primitive scraper for AG

echo AG; URL=$(curl --silent 'https://www.ag.ch/de/themen_1/coronavirus_2/lagebulletins/lagebulletins_1.jsp' | sed -E -e 's/<li>/\n<li>/g' | grep Bulletin | grep pdf | grep href | awk -F '"' '{print $6;}' | head -1); d=$(curl --silent "https://www.ag.ch/${URL}" | pdftotext - - | egrep -A 2 "(Aarau, .+Uhr|Stand [A-Za-z]*, [0-9]+)"); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | grep Aarau, | sed -E -e 's/.*, (.+)/\1/'; echo -n "Confirmed cases: "; echo "$d" | egrep '^[0-9]+$'
AG
Scraped at 2020-03-21T17:36:17+00:00
Date and time: 20. März 2020 15.00 Uhr
Confirmed cases: 168

totals file vs cantonal files

Thanks for aggregating these data. We are using the file
https://github.com/openZH/covid_19/blob/master/COVID19_Cases_Cantons_CH_total.csv
to feed data into neherlab.org/covid19/

However, it seems the file we are using is not in sync with the files in

https://github.com/openZH/covid_19/tree/master/fallzahlen_kanton_total_csv

Please advise which files we should be using and which are kept up to date.

Our parser is here:
https://github.com/neherlab/covid19_scenarios_data/blob/master/parsers/switzerland.py

Primitive scraper for BE

echo BE; d=$(curl --silent 'https://www.besondere-lage.sites.be.ch/besondere-lage_sites/de/index/corona/index.html' | grep -A 20 'table cellspacing="0" summary="Laufend aktualisierte Zahlen'); echo "Scraped at $(date --iso-8601=seconds)"; echo -n "Date and time: "; echo "$d" | grep "Stand:" | sed -E -e 's/^.*Stand: (.+)\).*$/\1/'; echo -n "Confirmed cases: "; echo "$d" | egrep '<td .*<strong>[0-9]+<' | sed -E -e 's/.*>([0-9]+)<.*/\1/'; echo -n "Deaths: "; echo "$d" | egrep '<td[^<>]*>[0-9]+</td>' | sed -E -e 's/.*>([0-9]+)<.*/\1/';
BE
Scraped at 2020-03-21T16:34:20+00:00
Date and time: 21. März 2020
Confirmed cases: 377
Deaths: 3
