Comments (12)
To calculate the number of currently infected people worldwide (active cases), the only data needed are the daily new cases and the daily deaths.
active = sum(daily_new[today - 2.weeks: today])
Recoveries and deaths lag active cases by two weeks.
deaths = daily_deaths[today - 2.weeks: today]
recoveries = daily_new[today - 4.weeks: today - 2.weeks] - deaths
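A minimal Python sketch of the pseudocode above, assuming `daily_new` and `daily_deaths` are plain lists of daily counts whose last entry is today. The function names, the 14-day `WINDOW` constant, and the sample numbers are illustrative assumptions, and the windows are summed here to give one figure per quantity.

```python
from typing import Sequence

WINDOW = 14  # days; the two-week window assumed in the comment above


def estimate_active(daily_new: Sequence[int], window: int = WINDOW) -> int:
    """Active cases: everyone reported within the last `window` days."""
    return sum(daily_new[-window:])


def estimate_recoveries(
    daily_new: Sequence[int],
    daily_deaths: Sequence[int],
    window: int = WINDOW,
) -> int:
    """Cases reported 2-4 weeks ago, minus deaths over the last 2 weeks."""
    if len(daily_new) < 2 * window or len(daily_deaths) < window:
        raise ValueError("need at least two full windows of case data")
    resolved = sum(daily_new[-2 * window:-window])  # reported 2-4 weeks ago
    deaths = sum(daily_deaths[-window:])            # died in the last 2 weeks
    return resolved - deaths


# Made-up sample series: 28 days of data, oldest first.
daily_new = [10] * 14 + [20] * 14
daily_deaths = [1] * 28

print(estimate_active(daily_new))                    # 280
print(estimate_recoveries(daily_new, daily_deaths))  # 126
```

The result can go negative when deaths outnumber the cases reported in the earlier window; the sketch leaves that visible rather than clamping it.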
from covid-19-data.
The problem is that most countries provide no recovery data or only incomplete data (e.g. Belgium did have recovery data during the first wave, but nowadays there seems to be no new data). In the end I went with the inference suggested by @javierconcha and plotted it over the provided data where available. For quite a few countries that do have recovery data, it looks like they calculated it in a similar fashion. So I stopped using the JHU recovery data and went with my own estimate.
from covid-19-data.
Our current data source on this is the European CDC (https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide), which does not currently provide figures on recoveries.
For now we will focus on confirmed cases and deaths, but may think about extending this work in the future.
from covid-19-data.
It would still be nice to see several data sources for cross-checking. I have no idea where all these institutions get the data for my country, because there is no official data publication source, even for citizens.
from covid-19-data.
I came here while looking at the visualization tool on this page:
https://www.visualcapitalist.com/infection-trajectory-flattening-the-covid19-curve/
First, thanks for the great work.
Second, I agree with @abitrolly that having recovered cases, and not only cumulative data, would be very useful as a way to infer active cases (active = total - recovered - deaths).
I would argue that showing the active-cases time series would better convey the "flattening the curve" concept, since the typical figure shown to the general public to explain it plots active cases, not cumulative cases. I am referring to this curve (also shown in the link I mentioned at the beginning):
Flattening the curve figure
Lastly, I tried to find a visualization tool like the one used in the first link I mentioned but for active cases, and surprisingly I did not find any. Therefore, a similar visualization for active cases would make this repo even more relevant.
from covid-19-data.
For those looking for code to parse active cases from the JHU data sets: JHUData.cs
This C# file is part of the COV2CON project on GitHub.
from covid-19-data.
@pegasone by making it in Python you can submit a PR.
from covid-19-data.
@toxyl This is a plot of recovered cases in Belgium as of yesterday, 2020-09-28. The curve doesn't plateau out so any stretches where data are missing must be quite short.
The first step is obtaining reported data, however incomplete or inconsistent. My sample code pulls active cases from the JHU data sets only for those dates where total cases, recovered cases, and deaths are available.
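As a rough Python illustration of that filtering step (not the C# code in JHUData.cs), the sketch below assumes three hypothetical dictionaries keyed by ISO date string, where a missing key or `None` means the value was not reported. It computes active cases only for dates where all three values are present.

```python
from typing import Dict, Optional

Series = Dict[str, Optional[int]]  # ISO date -> cumulative count, or None


def active_by_date(confirmed: Series, recovered: Series, deaths: Series) -> Dict[str, int]:
    """Return active = confirmed - recovered - deaths for fully reported dates."""
    result: Dict[str, int] = {}
    for date in sorted(confirmed):
        c = confirmed.get(date)
        r = recovered.get(date)
        d = deaths.get(date)
        if c is None or r is None or d is None:
            continue  # skip dates with any missing value
        result[date] = c - r - d
    return result


# Made-up sample values, for illustration only.
confirmed = {"2020-09-26": 100, "2020-09-27": 120, "2020-09-28": 150}
recovered = {"2020-09-26": 20, "2020-09-27": None, "2020-09-28": 40}
deaths = {"2020-09-26": 5, "2020-09-27": 6}

print(active_by_date(confirmed, recovered, deaths))  # {'2020-09-26': 75}
```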
from covid-19-data.
@pegasone You should compare the recovered cases (as reported by JHU) with the total cases and the deaths to see if they make sense. What you are looking at are the total numbers, so by today (2020-10-01) there are 118,223 cases of which 10,014 have died. Your graph shows about 19,000 recoveries, so if we simplify the calculation we get 120,000 cases - 10,000 deaths - 20,000 recoveries = 90,000 active cases. Is that realistic? Not quite, which becomes pretty clear in your second graph: active cases should have gone down after the first wave and started going back up when the second wave hit. But what your graph shows is that active cases are only going up.
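The same sanity check with the unrounded figures quoted above, as a short Python calculation. The recovery count is only the approximate value read off the graph, so the result is a ballpark figure.

```python
confirmed = 118_223  # total cases as of 2020-10-01 (from the comment above)
deaths = 10_014      # total deaths (from the comment above)
recovered = 19_000   # approximate, read off the graph

active = confirmed - deaths - recovered
share = active / confirmed

print(active)          # 89209
print(f"{share:.0%}")  # 75%
```

In other words, the reported recoveries would imply that roughly three quarters of all cases ever confirmed were still active.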
To be more realistic you have to take the new infections X days ago (my best estimate was around 13 days) and subtract the current deaths (i.e. 13 days after being infected one either recovers or dies). This approach also works with the total numbers and then it looks roughly like this:
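A minimal Python sketch of this estimate applied to cumulative totals, assuming two equal-length lists of daily cumulative counts (oldest first). The function name and sample data are hypothetical, and clamping the recovery estimate at zero is an added assumption to avoid negative values early in an outbreak.

```python
from typing import List, Sequence, Tuple

LAG_DAYS = 13  # the commenter's best estimate of days from report to outcome


def estimate_from_totals(
    total_cases: Sequence[int],
    total_deaths: Sequence[int],
    lag: int = LAG_DAYS,
) -> List[Tuple[int, int]]:
    """Return (estimated recovered, estimated active) for each day."""
    estimates = []
    for t in range(len(total_cases)):
        # Everyone reported at least `lag` days ago is treated as resolved,
        # i.e. either recovered or dead.
        resolved = total_cases[t - lag] if t >= lag else 0
        recovered = max(resolved - total_deaths[t], 0)  # clamp: assumption
        active = total_cases[t] - total_deaths[t] - recovered
        estimates.append((recovered, active))
    return estimates


# Made-up cumulative series with a short lag so the effect is visible.
cases = [10, 20, 30, 40, 50]
died = [0, 0, 1, 1, 2]
print(estimate_from_totals(cases, died, lag=3)[-1])  # (18, 30)
```

With this approach the active count falls once new cases slow down, because older cases drop out of the window after `lag` days.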
from covid-19-data.
@toxyl "Cases" actually means PCR-positive cases. These may continue to accrue until potentially all the population is tested, especially in countries that have the means to do it. Therefore, I am not surprised to see "active cases" increasing with time at this stage. As a PCR-positive individual is not necessarily sick (i.e., has no signs and symptoms), this molecular testing method seems to be a poor choice for evaluating the COVID-19 status in a region, and unfortunately it drives important decisions such as when and where to initiate lockdowns, quarantines etc. If only patients presenting with COVID-19 signs and symptoms were reported as confirmed cases then I would agree with your rationale. Besides, technically the duration of the disease is from 2 to 6 weeks and the distribution may vary with the region (e.g., reported median incubation period of 5 days vs 8-14 days depending on publication).
from covid-19-data.
@pegasone What you are referring to is called "confirmed" in the JHU dataset, which is not the same as active cases. At least here in the Netherlands there wasn't much testing in the beginning: only individuals with COVID-19 signs were tested, and therefore only those appeared in the "confirmed" data of the JHU. This probably applies to most other countries as well, because all of them first had to get hold of enough tests before they could start testing people without signs of infection.
To get definitions straight: an active case is someone who has the disease at the given point in time. Consequently, that active case has to disappear from the record after 2-6 weeks, because the person has either recovered or died. This is not affected by whether a case is asymptomatic, as asymptomatic cases also recover. Following this definition, it is not logical for active cases to only increase, because that would imply that infected people never recover.
And yes, I know that the disease technically can take longer than 2 weeks, but that isn't easily accounted for without actual recovery data. Furthermore, a part of the incubation period has already passed when someone appears as a confirmed case in the data, i.e. my estimate of roughly 13 days is the time after the case has been reported, so before I consider an infected individual recovered or dead about 3-4 weeks since infection have passed. And like I said, it is an estimate, which I have purely based on the average duration of the disease and visually matching the curve to deaths, overall infections (when they [almost] plateau you can assume that people must have started getting better 1-2 weeks prior) and, where available, recovery data.
from covid-19-data.
I'll close this issue, as there is no reliable method to calculate recovered/active cases for all countries (and no aggregated data source that makes this data available).
from covid-19-data.