dkobak / excess-mortality
Excess mortality during COVID-19 pandemic
License: GNU General Public License v3.0
Hi, I greatly appreciate your efforts; they were very useful in the first days of the pandemic for understanding its full impact. However, I have noticed that people are now getting confused about excess deaths in countries whose population profile has changed significantly in the last few years. For example, Japan has far more people (> 40%) aged 70 to 80 than it did during the reference range.
This is now distorting the conversation, and prominent influencers are using it to spread misunderstanding. For example, you report a high rate of excess deaths in Japan, but I believe that, once adjusted for changes in the population profile, the death rate is in fact lower than expected.
Maybe you're already adjusting for this, but I think you are not?
I believe the right technique would be to multiply the expected deaths in each age group by the change in that group's population since the middle of the reference range. For example, if Japan has 40% more people aged 70 to 80 and 30% more deaths in that age range than in the reference range, then its death rate is in fact lower than expected.
Maybe I did the math wrong, but I think that in the four years since this project started, most countries' populations have aged enough to need adjustment.
Cheers!
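A minimal sketch of the adjustment proposed above, with made-up numbers that mirror the 40% / 30% example. The function name `age_adjusted_expected` and the single age group are illustrative assumptions; this is not how the repository computes its baseline.

```python
def age_adjusted_expected(ref_deaths, ref_pop, cur_pop):
    """Scale reference-period deaths in each age group by that group's population change."""
    return {
        age: ref_deaths[age] * cur_pop[age] / ref_pop[age]
        for age in ref_deaths
    }

# Hypothetical figures for one age group (not real data for Japan).
ref_deaths = {"70-79": 1000.0}    # deaths in the reference range
ref_pop = {"70-79": 100_000.0}    # population at the middle of the reference range
cur_pop = {"70-79": 140_000.0}    # 40% more people now
observed = {"70-79": 1300.0}      # 30% more deaths now

expected = age_adjusted_expected(ref_deaths, ref_pop, cur_pop)
excess = {age: observed[age] - expected[age] for age in observed}
print(expected, excess)
```

With these numbers the adjusted expectation exceeds the observed deaths, so the excess is negative, which is the situation described in the post.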
What's the license for code in this repo?
In the readme:
Top-10 countries in the World Mortality Dataset according to different metrics (only countries with over 50,000 population are shown):
I guess it should be 50,000,000
The Excess per 100k and Excess as % of annual baseline columns are missing for some countries, for instance Finland and Norway, among others.
I suggest putting 'placeholder' chunks inside an if() switch, like

    if 1:  # set to 0 to deactivate this chunk
        country = 'Sweden'
        X = allcountries[country][0]
        baseline = allcountries[country][1]
        ...

That's quicker and easier (for me at least) than commenting and uncommenting all the relevant lines, or even changing the cell type (Code, Markdown, Raw NBConvert in Jupyter), to activate or deactivate the chunk as needed.
In the paper you state:
Moscow and St. Petersburg, two regions with arguably the most reliable reporting of Covid‐19 mortality.
However, the data for both of those cities exhibits the following peculiarity:
This is not characteristic of the other regions. SPb’s 192-day streak and Moscow’s 176 are followed by a very distant third, Nizhegorodskaya oblast where the figures haven’t repeated for 34 days. The median is 2 days.
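One plausible reading of the "streak" above is the longest run of days in which the reported figure never equals the previous day's figure. A minimal sketch of that count, assuming `values` is a plain list of daily reported figures (the function name is hypothetical):

```python
def longest_no_repeat_streak(values):
    """Longest run of consecutive days in which no value equals the previous day's value."""
    if not values:
        return 0
    best = current = 1
    for prev, cur in zip(values, values[1:]):
        # A repeat of the previous day's value restarts the run at the current day.
        current = 1 if cur == prev else current + 1
        best = max(best, current)
    return best

# Toy series, not real data: the run 5, 6, 7 is the longest without a repeat.
print(longest_no_repeat_streak([5, 5, 6, 7, 7]))
```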
A back-of-the-envelope calculation: assuming that a particular day's figure could have been anything within the range given by the data reported within ±3 days of the given date (distributed uniformly), what's the probability that it never matches the previous day's value? This probability is 5×10⁻⁴ for SPb and 5×10⁻¹² for Moscow (the latter includes 60 consecutive days, 2020-11-15 through 2021-01-13, where all the data fell into the [70, 77] interval yet, despite that narrowness, which is in itself uncharacteristic of a random process, never repeated and never exceeded the previous maximum, the 2020-05-30 value of 78).
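The calculation can be sketched roughly as follows. This is an interpretation, not the original poster's script: it assumes integer daily figures, takes the range from the neighbours within ±3 days excluding the day itself, and treats days as independent. The function name `prob_never_repeats` is hypothetical.

```python
def prob_never_repeats(values, window=3):
    """Probability that no day's value equals the previous day's value,
    if each day were uniform on the integer range spanned by its neighbours."""
    p = 1.0
    for i in range(1, len(values)):
        # Neighbours within +/- window days, excluding day i itself (an assumption).
        nearby = values[max(0, i - window):i] + values[i + 1:i + 1 + window]
        lo, hi = min(nearby), max(nearby)
        n_outcomes = hi - lo + 1
        # The previous day's value is among the neighbours, so it lies in [lo, hi];
        # the chance of missing it is 1 - 1/n_outcomes.
        p *= 1.0 - 1.0 / n_outcomes
    return p

# Toy series alternating between the ends of the [70, 77] interval (not real data).
toy = [70, 77] * 5
print(prob_never_repeats(toy))
```

In the toy series every day sees a range of 8 possible values, so each day contributes a factor of 7/8.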
A histogram of the Moscow data is below.
Perhaps the statement needs revision?
0 3 |||
1 3 |||
2 3 |||
3 1 |
4 0
5 2 ||
6 0
7 3 |||
8 2 ||
9 3 |||
10 9 |||||||||
11 15 |||||||||||||||
12 18 ||||||||||||||||||
13 12 ||||||||||||
14 14 ||||||||||||||
15 3 |||
16 3 |||
17 2 ||
18 1 |
19 1 |
20 2 ||
21 1 |
22 1 |
23 2 ||
24 5 |||||
25 3 |||
26 1 |
27 4 ||||
28 6 ||||||
29 4 ||||
30 1 |
31 2 ||
32 3 |||
33 1 |
34 4 ||||
35 4 ||||
36 0
37 3 |||
38 1 |
39 2 ||
40 0
41 3 |||
42 0
43 0
44 2 ||
45 0
46 0
47 1 |
48 2 ||
49 3 |||
50 2 ||
51 3 |||
52 5 |||||
53 4 ||||
54 2 ||
55 4 ||||
56 3 |||
57 2 ||
58 4 ||||
59 2 ||
60 1 |
61 4 ||||
62 3 |||
63 4 ||||
64 2 ||
65 3 |||
66 3 |||
67 5 |||||
68 7 |||||||
69 5 |||||
70 3 |||
71 11 |||||||||||
72 9 |||||||||
73 11 |||||||||||
74 13 |||||||||||||
75 14 ||||||||||||||
76 15 |||||||||||||||
77 10 ||||||||||
78 1 |
79 1 |
80 0
81 1 |
82 0
83 0
84 1 |
I'm running the code in a local instance of Jupyter and at some point got an error for df_official. I see now there's a hack in the code as chunk 3, but my fix was simply
df_official = pd.read_csv('https://github.com/owid/covid-19-data/blob/master/public/data/owid-covid-data.csv?raw=true')
It seems OWID wants to move data traffic onto GitHub. I just checked this URL and the original from OWID, and they're in sync. Indeed, http://covid.ourworldindata.org now redirects to the above GitHub account, so the old original URL should also just work without the hack.
In my local copy I adjusted the caption according to the code / preprint.
Use or adjust it as you see fit.
--- figtext 2021-02-22 13:15:30.275003847 +0100
+++ figtextmp 2021-02-23 12:05:00.018288814 +0100
@@ -2 +2 @@
-'Data: World Mortality Dataset, github.com/akarlinsky/world_mortality. '
+'Data: World Mortality Dataset, github.com/akarlinsky/world_mortality, github.com/datasets/, github.com/owid/ . '
@@ -4,5 +4,6 @@
-'Excess mortality is computed relative to the baseline extrapolated from 2015–19. '
-'Red number: excess mortality starting from the first officially reported covid19 death.\n'
-'Gray: as a % of baseline yearly deaths. '
-'Black: per 100,000 population. '
-'Blue: ratio to the daily reported covid19 deaths over the same period. '
+'Excess mortality is computed relative to the baseline extrapolated from 2015–2019. '
+'Lines: black: baseline, gray: 2015-2019, red: 2020, magenta: 2021\n'
+'Numbers: red: estimated excess mortality starting from the first officially reported covid19 deaths up to last available official mortality data,\n'
+'gray: as a % of baseline yearly deaths. black: per 100,000 population, '
+'blue: ratio to the daily reported covid19 deaths over the same period.\n'
+'(*) less war / heatwave excess deaths.\n'
Be cautious, though, about defining the blue ratio as an undercount in general, although there are some glaring cases. E.g. for Italy it is documented (also in your ref. Beaney, 2020) that ca. 1/3 of excess deaths are non-covid19, due to missed care caused by fear of contagion, lockdown measures, or overburdened care facilities.
And e.g. for the USA, "For 6% of the deaths, COVID-19 was the only cause mentioned" among the covid19 counts, while for the rest it is a matter of choosing the most convenient label (often more politically than medically convenient) https://www.cdc.gov/nchs/nvss/vsrr/covid_weekly/index.htm#Comorbidities with a number of collateral damages such as an increase in drug abuse https://jamanetwork.com/journals/jama/fullarticle/2776212 and psychological issues, e.g. https://www.bmj.com/content/371/bmj.m4352.short
hi,
trying the 'run in browser' link from https://github.com/dkobak/excess-mortality failed with this notice:
Notebook loading error
There was an error loading this notebook. Ensure that the file is accessible and try again.
An invalid or illegal string was specified
https://github.com/dkobak/excess-mortality/blob/main/all-countries.ipynb
An invalid or illegal string was specified
GA@https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20210128-085606-RC00_354297656:1310:69
d/<@https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20210128-085606-RC00_354297656:2199:97
Fa@https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20210128-085606-RC00_354297656:19:336
Da.prototype.next_@https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20210128-085606-RC00_354297656:17:503
Ia/this.next@https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20210128-085606-RC00_354297656:20:206
f@https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20210128-085606-RC00_354297656:62:101
For the picture to be realistic, you need to take into account the age groups and their mortality rates in previous years. Otherwise you get the wrong picture when certain age groups of the population suddenly spike. The same goes for the different demographics of countries.
https://www.cebm.net/covid-19/excess-mortality-across-countries-in-2020/
The country image is sorted by excess percent, not by name, while the CSV is sorted by name. This makes it hard to find a specific country in the image.
Hi, it seems that the excess percent in the file excess-mortality.csv uses yearly mortality as its base, while the other data, covid deaths and excess deaths, are counted from the very beginning of the epidemic.
For instance, according to excess-mortality.csv, Israel on 12/09/21 has 7416 covid deaths (which is correct) and 6514 total excess deaths, yielding a 0.88 undercount ratio. But the excess is 13.9%, as if the baseline were 46863, a number that corresponds to one year, not to more than 1.5 years.
The same goes for Sweden: it looks like the baseline for 1.5 years is 91762, but that is Sweden's regular yearly mortality.
Or did I miss something?
Anyway, it would be nice to maintain separate data for 2020 (already done), 2021, and from the very beginning.
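The arithmetic behind the Israel figures can be checked with a short sketch. The variable names are illustrative, and the inputs are the numbers quoted in the post rather than values read from the CSV.

```python
# Figures for Israel as quoted in the post above.
excess_deaths = 6514
covid_deaths = 7416
excess_pct = 13.9  # excess as % of baseline, as reported

# Ratio of excess deaths to officially reported covid deaths.
undercount_ratio = excess_deaths / covid_deaths

# Baseline implied by the reported percentage: excess / (pct / 100).
implied_baseline = excess_deaths / (excess_pct / 100)

print(round(undercount_ratio, 2), round(implied_baseline))
```

The implied baseline matches the 46863 mentioned in the post, which is what suggests the percentage is taken relative to one year of baseline deaths.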