adultcoverage's Introduction

AdultCoverage

This repository contains R code for a technical paper in progress, provisionally titled "R implementations of three growth balance methods for estimating adult mortality coverage", with Everton Lima and Bernardo Queiroz. It is likely too early to cite; however, you are free to see what we're up to and to use the code (with attribution):

Creative Commons License
"R implementations of three growth balance methods for estimating adult mortality coverage" by Everton Lima, Bernardo Queiroz, and Timothy L. M. Riffe is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

DDM R package

This project has produced a small R package that implements three methods for indirect estimation of death registration coverage (Generalized Growth Balance, Synthetic Extinct Generations, and a hybrid of the two).

A short tutorial

To install from the central R repository (current version 1.0-0):

install.packages("DDM")

The development version (current version 1.0-0), hosted here on GitHub, is always the most up-to-date version. There are two ways to install it.

Download the zip ball or tar ball, decompress it, and run R CMD INSTALL on the subfolder called R/DDM from the terminal command line, or (easier) use the devtools package to install the development version:

# install.packages("devtools")

library(devtools)
install_github("timriffe/AdultCoverage/AdultCoverage/R/DDM")

Then you can load the package using:

library(DDM)

Be aware that if you report a bug and we fix it, you'll need to reinstall (from GitHub) to get the changes.

Your data need to be in this kind of shape:

head(Moz)
cod    pop1    pop2 deaths age sex year1 year2
  1 1388350 1963660  88248   0   f  1997  2007
  1 1113675 1615244  11424   5   f  1997  2007
  1  878429 1183939   5677  10   f  1997  2007
  1  854078  991323   6123  15   f  1997  2007
  1  827614  986526   7280  20   f  1997  2007
  1  654465  841416   7212  25   f  1997  2007

Here cod identifies the group to be tested: a single combination of year, sex, and region. pop1 and pop2 are the population counts from the first and second census, respectively. deaths can contain either the average number of deaths in each age group over the intercensal period or the sum of the deaths in each age group over the intercensal period. If you give the sum, then specify deaths.summed = TRUE in the arguments to any of the estimation functions. Otherwise the default is to treat deaths as the average. This could be a straight arithmetic average, or simply the average of the deaths observed around census 1 or census 2. In this latter case, you'll need to average yourself beforehand, as deaths.summed = TRUE will only do the right thing if deaths over the whole intercensal period are given.
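A minimal sketch of the relationship between the two conventions, on made-up numbers. It assumes that "average" means deaths per year, so a summed count is divided by the length of the intercensal interval in years. The helper name summed_to_average() is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of DDM): convert deaths summed over the
# intercensal period into an annual average, assuming "average" means
# deaths per year of the interval.
summed_to_average <- function(deaths_summed, date1, date2) {
  # Interval length in years, using 365.25 days per year
  interval_years <- as.numeric(difftime(date2, date1, units = "days")) / 365.25
  deaths_summed / interval_years
}

# Made-up example: deaths summed over a ten-year interval, by age group
deaths_summed <- c(882480, 114240, 56770)
date1 <- as.Date("1997-08-01")
date2 <- as.Date("2007-08-01")

deaths_avg <- summed_to_average(deaths_summed, date1, date2)
print(round(deaths_avg))
```

Either pass the summed column together with deaths.summed = TRUE, or pass an averaged column and leave the default; doing both would average twice.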

age should be the lower bound of five-year age groups (incl. age 0-4!). If you give standard abridged data (0,1,5), then the pop1, pop2, and deaths from ages 0 and 1 are automatically summed together into the infant category. Don't give single-age data at this time. We hope to add an abridgement function soon, though, to handle such data automatically. sex is character, either "f" or "m". Census dates can be conveyed in a variety of ways. If only year1 and year2 are given, we assume Jan 1. It is best to specify proper date classes and use date1, date2 as column names instead:

cod    pop1    pop2 deaths age sex      date1      date2
  1 1388350 1963660  88248   0   f 1997-08-01 2007-08-01
  1 1113675 1615244  11424   5   f 1997-08-01 2007-08-01
  1  878429 1183939   5677  10   f 1997-08-01 2007-08-01
  1  854078  991323   6123  15   f 1997-08-01 2007-08-01
  1  827614  986526   7280  20   f 1997-08-01 2007-08-01
  1  654465  841416   7212  25   f 1997-08-01 2007-08-01
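A minimal sketch of preparing input in this shape from single-age counts, using base R only. The data frame single and its values are made up for illustration; the package itself does not yet handle single-age data, so the grouping is done by hand here.

```r
# Made-up single-age input for one group (ages 0-14 only, for brevity)
single <- data.frame(
  cod    = 1,
  age    = 0:14,
  pop1   = rep(100, 15),
  pop2   = rep(120, 15),
  deaths = rep(2, 15),
  sex    = "f",
  stringsAsFactors = FALSE
)

# Lower bound of the five-year age group each single age falls in
single$age5 <- single$age - single$age %% 5

# Sum counts within group, sex, and five-year age group
five <- aggregate(
  cbind(pop1, pop2, deaths) ~ cod + sex + age5,
  data = single,
  FUN  = sum
)
names(five)[names(five) == "age5"] <- "age"

# Census dates as Date class, in columns named date1 and date2
five$date1 <- as.Date("1997-08-01")
five$date2 <- as.Date("2007-08-01")

# Sort by age and use the same column order as the table above
five <- five[order(five$cod, five$sex, five$age),
             c("cod", "pop1", "pop2", "deaths", "age", "sex", "date1", "date2")]
print(five)
```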

Results depend on the age range over which each method is evaluated. In spreadsheets this range is typically chosen visually, with a plot referenced to some cell range that the user can manipulate. Here, we have a function that works similarly, but you need to use it for just one data grouping (cod) at a time:

# select the rows of a single group (cod) before choosing ages
my_ages <- ggbChooseAges(Moz[Moz$cod == 1, ])

This will open a graphics device, where you can interactively select age ranges by clicking on ages. When you are done, click in the margin to close the device, and it returns the vector of ages. You can use these, or any other vector of ages, to manually specify the age range that each method should use:

ggb(Moz, exact.ages = my_ages)
seg(Moz, exact.ages = my_ages)
ggbseg(Moz, exact.ages = my_ages)

By default these functions will pick a decent age-range on their own:

ggb(Moz)
seg(Moz)
ggbseg(Moz)

And the result will depend on the age-range chosen. If left to automatically choose age-ranges, the evaluation methods will pick one independently for each data grouping (cod). Let's say your data has a large number of groupings (regions, countries, intercensal periods, whatever). You can get a messy overview of results by running:

Results <- ddm(my.huge.data)
ddmplot(Results)

This overview plot also gives the harmonic mean of the coverage estimates from the three methods.
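For reference, the harmonic mean of n values is n divided by the sum of their reciprocals. A minimal sketch on made-up coverage values follows; the function harmonic_mean() is defined here for illustration and is not taken from the package.

```r
# Harmonic mean: n divided by the sum of reciprocals
harmonic_mean <- function(x) {
  length(x) / sum(1 / x)
}

# Made-up coverage estimates from the three methods for one group
coverage <- c(ggb = 0.80, seg = 0.90, ggbseg = 0.85)

hm <- harmonic_mean(coverage)
am <- mean(coverage)

# The harmonic mean is never larger than the arithmetic mean
print(c(harmonic = hm, arithmetic = am))
```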

What's missing?

This code is newish, and changes will certainly be made as users report issues or make requests. The next steps include improvements to graphical diagnostics and an easier way to return the interim data object used to calculate results. In the future we may add more methods, such as DDM methods that adjust for migration.

adultcoverage's People

Contributors

converge-unfpa, timriffe

adultcoverage's Issues

Confusion about coverage returned by ggb

The item called "coverage" returned by ggb is the coverage relative to census 2 or 1/b. The TDE workbook (http://demographicestimation.iussp.org/content/generalized-growth-balance-method) gives the completeness relative to the average (geometric) coverage of the two censuses. This is the completeness of the intercensal death rates.

The completeness listed in the graph title of ggbChooseAges does not match either of these values. I haven't looked at the code for that.

Adjustment for net migration

Hi Tim,

Thank you so much for developing this tool.

I was wondering if it were possible to make an adjustment for net migration before estimating completeness?

Thanks,
Nick

Error in slopeint function: intercept is wrong

When I looked at the slopeint function, I found that the formula for the intercept is wrong:

(1) intercept <- with(codi,mean(leftterm[age %in% agesfit]) * (1/slope) - mean(rightterm[age %in% agesfit]))

I think it should be:

(2) intercept <- with(codi,mean(leftterm[age %in% agesfit]) - mean(rightterm[age %in% agesfit])) * slope
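For reference when reading the two expressions above: a straight line y = a + b*x that passes through the point of means satisfies a = mean(y) - b * mean(x). The sketch below evaluates that textbook expression alongside expressions (1) and (2) on made-up numbers. It assumes that leftterm is the y variable, that rightterm is the x variable, and that slope is the ratio of their standard deviations. These are assumptions for illustration and may not match the conventions used inside slopeint.

```r
# Made-up values standing in for the ages used in the fit
rightterm <- c(0.010, 0.015, 0.022, 0.030, 0.041)  # assumed x variable
leftterm  <- c(0.018, 0.024, 0.033, 0.042, 0.056)  # assumed y variable

# Assumed slope: ratio of standard deviations (y over x)
slope <- sd(leftterm) / sd(rightterm)

# Expression (1) from the report
intercept_1 <- mean(leftterm) * (1 / slope) - mean(rightterm)
# Expression (2) from the report
intercept_2 <- (mean(leftterm) - mean(rightterm)) * slope
# Textbook intercept of a line through the point of means
intercept_ref <- mean(leftterm) - slope * mean(rightterm)

print(c(expr1 = intercept_1, expr2 = intercept_2, reference = intercept_ref))

# The reference line reproduces mean(leftterm) at mean(rightterm)
fitted_at_mean <- intercept_ref + slope * mean(rightterm)
```

Under these assumptions, expression (1) is algebraically the reference intercept divided by the slope. Which quantity the function is meant to return depends on which axis each term is plotted on.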

checklist

  • add RMSE of age selection to output (mig branch)
  • add r2 of line fit to ggb() output (mig branch)
  • implement migration in ggb() (mig branch)
  • implement migration in seg() (mig branch)
  • ensure valid use of mig in ggbseg() (both stages should use mig and in the same way) (mig branch)
  • use mig.summed flag (mig branch)
  • warn, if total deaths < 25% of population change then either the deaths.summed flag is being used wrong or the user should consider not using the method. Use either message() or warning() (mig branch)
  • add more example data (PJ will send spreadsheets)
  • implement seg delta method, but think of a better name for it; see RD spreadsheet AM_SEG_South Africa_males_3_1.xlsx, and calibrate to spirit, not values. To iterate or not to iterate?
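The warning item in the checklist above could be sketched roughly as follows. This is one possible reading only: it assumes "total deaths" means deaths summed over all ages and the whole intercensal period, and that "population change" means the absolute difference between the two census totals. The function check_deaths_total() and its threshold argument are hypothetical and are not part of the package.

```r
# Hypothetical check (not part of DDM): compare total intercensal deaths
# with the absolute change in total population between the two censuses.
check_deaths_total <- function(pop1, pop2, deaths_total, threshold = 0.25) {
  pop_change <- abs(sum(pop2) - sum(pop1))
  ratio <- sum(deaths_total) / pop_change
  if (ratio < threshold) {
    warning("Total deaths are less than ", 100 * threshold,
            "% of population change; check deaths.summed ",
            "or reconsider using the method.")
  }
  invisible(ratio)
}

# Made-up counts by age group
pop1 <- c(1000, 900)
pop2 <- c(1300, 1100)

# Deaths large enough relative to population change: no warning
ratio_ok <- check_deaths_total(pop1, pop2, deaths_total = c(120, 80))

# Deaths too small: the warning is caught here so the script keeps running
outcome <- tryCatch(
  {
    check_deaths_total(pop1, pop2, deaths_total = c(60, 40))
    "no warning"
  },
  warning = function(w) "warned"
)
print(outcome)
```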
