The Hypothesis of Testing: Paradoxes arising out of reported coronavirus case-counts

Many statisticians, epidemiologists, economists, and data scientists have registered serious reservations about the reported coronavirus case-counts. Comparing countries and states using those case-counts seems inappropriate when every nation/state has adopted different testing strategies and protocols. Estimating the prevalence of COVID-19 from these data is a hopeless exercise, and several groups have recently argued for estimating the number of truly infected cases from mortality rates instead. In this project, we aim to (a) posit a conceptual mathematical framework that characterizes the simultaneous effects of sampling bias, misclassification/imperfection of the test, and heterogeneity in the reproductive number on the estimation of the prevalence rate, (b) review current testing strategies in some of the countries for which we have testing data, and (c) provide guidelines for testing strategy/disease surveillance that may help track the pulse of the epidemic, identify disease-free areas, and identify disease outbreaks.
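To make the interplay between sampling bias and test misclassification concrete, here is a minimal sketch. It is not the framework developed in the paper. All numbers (prevalence, testing probabilities, sensitivity, specificity) and all function names are illustrative assumptions. The correction used is the standard Rogan-Gladen adjustment for an imperfect test.

```python
# Illustrative sketch only: every number and function name below is an assumption,
# not a value or API taken from this repository.

def prevalence_among_tested(true_prev, test_prob_infected, test_prob_uninfected):
    """Share of infected people among those who get tested (selection bias)."""
    infected = true_prev * test_prob_infected
    uninfected = (1 - true_prev) * test_prob_uninfected
    return infected / (infected + uninfected)

def expected_positivity(prev_tested, sensitivity, specificity):
    """Expected fraction of positive results among tested people (imperfect test)."""
    return sensitivity * prev_tested + (1 - specificity) * (1 - prev_tested)

def rogan_gladen(positivity, sensitivity, specificity):
    """Standard Rogan-Gladen correction of an apparent prevalence."""
    return (positivity + specificity - 1) / (sensitivity + specificity - 1)

true_prev = 0.02  # assumed true population prevalence
# Assumed: infected people are 15 times more likely to be tested (0.30 vs 0.02).
prev_tested = prevalence_among_tested(true_prev, 0.30, 0.02)
positivity = expected_positivity(prev_tested, sensitivity=0.70, specificity=0.99)
corrected = rogan_gladen(positivity, sensitivity=0.70, specificity=0.99)

print(f"true prevalence:          {true_prev:.4f}")
print(f"prevalence among tested:  {prev_tested:.4f}")
print(f"observed test positivity: {positivity:.4f}")
print(f"misclassification-corrected estimate: {corrected:.4f}")
```

Under these assumptions, correcting for test error recovers the prevalence among tested people, which remains far above the population prevalence. Adjusting for misclassification alone does not remove the selection bias.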

Project Description

This project includes the code needed to reproduce the results: (A) sourcing both US and World testing data, (B) algorithmic development, and (C) application of the models to the cleaned datasets. If you use this code, please cite the paper using the following BibTeX entry:

@article{dempsey:2020,
author = {Du, Jiacong and Dempsey, Walter and Mukherjee, Bhramar},
title = {The Hypothesis of Testing: Paradoxes arising out of reported coronavirus case-counts},
journal = {arXiv},
year = {2020}}

Code Description

The steps to run the code are as follows:

Dependencies: all code is developed in Python using Anaconda.

The Anaconda environment can be installed using covid.yml. See here for instructions on creating the environment. Open an Anaconda shell, navigate to the cloned GitHub repository, and run:

conda env create -f covid.yml

Datasets and exploratory data analysis

World testing data are accessed here and country population totals are accessed here.
US testing data are accessed here and US population totals are accessed here. Population totals for AS, GU, MP, and VI are extracted from here.
Exploratory data analysis is presented as a set of IPython notebooks. Descriptive statistics are used to inform the priors of the measurement-error models used in the analysis phase.
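As a rough illustration of the kind of descriptive statistics such notebooks might compute, the sketch below derives tests per 1,000 people, test positivity, and reported cases per 1,000 people. The two regions, their counts, and the field names are hypothetical and do not come from the datasets above.

```python
# Hypothetical records: region names, counts, and field names are assumptions
# made for illustration, not rows from the repository's datasets.
records = [
    {"region": "A", "population": 1_000_000, "tests": 20_000, "positives": 2_000},
    {"region": "B", "population": 5_000_000, "tests": 25_000, "positives": 5_000},
]

def summarize(record):
    """Per-capita testing intensity, test positivity, and reported case rate."""
    population = record["population"]
    tests = record["tests"]
    positives = record["positives"]
    return {
        "region": record["region"],
        "tests_per_1000": 1000 * tests / population,
        "positivity": positives / tests,
        "cases_per_1000": 1000 * positives / population,
    }

summaries = [summarize(r) for r in records]
for s in summaries:
    print(
        f"{s['region']}: {s['tests_per_1000']:.1f} tests per 1,000, "
        f"positivity {s['positivity']:.2f}, "
        f"{s['cases_per_1000']:.1f} reported cases per 1,000"
    )
```

In this made-up example, region B reports fewer cases per capita than region A, yet it tests less and has a higher positivity. That pattern is why raw case-counts are hard to compare across regions.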

The methods directory contains all algorithms for estimation under selection bias, measurement error, and heterogeneity. Algorithms are developed in pymc3.
All evaluation functions can be found in the evaluation directory. In particular, we perform posterior predictive checks to confirm model fit to the data.
The final report can be found in the write-up directory.
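For readers unfamiliar with posterior predictive checks, the sketch below shows the idea on a deliberately simple Beta-Binomial model, using only the Python standard library. It is a stand-in for illustration, not one of the pymc3 models in the methods directory. The function name, the flat Beta(1, 1) prior, and the counts (30 positives out of 200 tests) are assumptions.

```python
import random

def posterior_predictive_pvalue(positives, tests, n_draws=2000, seed=0):
    """Posterior predictive p-value for the number of positive tests.

    Assumes a Binomial likelihood with a flat Beta(1, 1) prior on the positive
    rate, so the posterior is Beta(1 + positives, 1 + tests - positives).
    """
    rng = random.Random(seed)
    a = 1 + positives
    b = 1 + tests - positives
    at_least_as_large = 0
    for _ in range(n_draws):
        rate = rng.betavariate(a, b)  # draw a rate from the posterior
        # Simulate a replicated dataset of the same size under that rate.
        replicated = sum(rng.random() < rate for _ in range(tests))
        if replicated >= positives:
            at_least_as_large += 1
    return at_least_as_large / n_draws

# Hypothetical observed data: 30 positives out of 200 tests.
p_value = posterior_predictive_pvalue(positives=30, tests=200)
print(f"posterior predictive p-value: {p_value:.3f}")
```

A p-value near 0 or 1 would suggest that the replicated data rarely resemble the observed data, which indicates poor model fit. Values away from the extremes are consistent with an adequate fit for the chosen statistic.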
