ohdsi / selfcontrolledcohort

An R package for performing self-controlled cohort analyses, a method to estimate risk by comparing time exposed with time unexposed among the exposed cohort.

Home Page: http://ohdsi.github.io/SelfControlledCohort

Languages: R 98.57%, Perl 0.60%, Shell 0.83%
Topics: hades

selfcontrolledcohort's Introduction

SelfControlledCohort

SelfControlledCohort is part of HADES.

Introduction

This package provides a method to estimate risk by comparing time exposed with time unexposed among the exposed cohort.
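As a rough illustration of the comparison described above, the incidence rate ratio (IRR) divides the outcome rate during exposed time by the outcome rate during unexposed time. The sketch below uses made-up counts and base R's poisson.test for an exact interval. The computeIrr code quoted in the issues section of this page calls rateratio.test::rateratio.test instead, so poisson.test is a dependency-free stand-in here, not the package's own routine.

```r
# Hypothetical counts for one exposure-outcome pair (not real data)
numOutcomesExposed   <- 30
numOutcomesUnexposed <- 20
timeAtRiskExposed    <- 1000  # e.g. person-days while exposed
timeAtRiskUnexposed  <- 2000  # e.g. person-days while unexposed

# Incidence rate ratio: rate while exposed divided by rate while unexposed
irr <- (numOutcomesExposed / timeAtRiskExposed) /
  (numOutcomesUnexposed / timeAtRiskUnexposed)

# Exact two-sample Poisson comparison from base R (stats package),
# used here as a stand-in for rateratio.test
test <- poisson.test(x = c(numOutcomesExposed, numOutcomesUnexposed),
                     T = c(timeAtRiskExposed, timeAtRiskUnexposed))

irr                    # point estimate computed by hand
unname(test$estimate)  # rate ratio reported by poisson.test
test$conf.int          # 95% confidence interval for the rate ratio
```

An IRR above 1 suggests the outcome occurs more often during exposed time than during unexposed time in the same people.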

Features

  • Extracts the necessary data from a database in OMOP Common Data Model format.
  • Supports stratification by age, gender, and index year.

Example

library(SelfControlledCohort)

connectionDetails <- createConnectionDetails(dbms = "postgresql",
                                             user = "joe",
                                             password = "secret",
                                             server = "myserver")
                                             
sccResults <- runSelfControlledCohort(connectionDetails,
                                     cdmDatabaseSchema = "cdm_data",
                                     exposureIds = c(767410, 1314924, 907879),
                                     outcomeIds = 444382,
                                     outcomeTable = "condition_era")

summary(sccResults)

Technology

SelfControlledCohort is an R package.

System Requirements

Requires R. Libraries used in SelfControlledCohort require Java.

Getting Started

  1. See the instructions here for configuring your R environment, including Java.

  2. In R, use the following commands to download and install SelfControlledCohort:

install.packages("remotes")
remotes::install_github("ohdsi/SelfControlledCohort")

User Documentation

Documentation can be found on the package website.

PDF versions of the documentation are also available:

Support

Contributing

Read here how you can contribute to this package.

License

SelfControlledCohort is licensed under Apache License 2.0.

Development

SelfControlledCohort is being developed in RStudio.

Development status

Beta

selfcontrolledcohort's People

Contributors

ablack3, azimov, msuchard, schuemie

selfcontrolledcohort's Issues

Remove i argument from computeIrr function

It appears that the argument i in the computeIrr function is never assigned a value, unless I'm missing something. In any case, the code seems to give identical results if the i argument is removed.

library(SelfControlledCohort)
#> Loading required package: DatabaseConnector

computeIrrs <- function(estimates) {
  computeIrr <- function(i, numOutcomesExposed, numOutcomesUnexposed, timeAtRiskExposed, timeAtRiskUnexposed) {
    test <- rateratio.test::rateratio.test(x = c(numOutcomesExposed[i],
                                                 numOutcomesUnexposed[i]),
                                           n = c(timeAtRiskExposed[i],
                                                 timeAtRiskUnexposed[i]))
    return(c(test$estimate[1], test$conf.int))
  }
  irrs <- mapply(computeIrr,
                 numOutcomesExposed = estimates$numOutcomesExposed,
                 numOutcomesUnexposed = estimates$numOutcomesUnexposed,
                 timeAtRiskExposed = estimates$timeAtRiskExposed,
                 timeAtRiskUnexposed = estimates$timeAtRiskUnexposed)
  estimates$irr <- irrs[1, ]
  estimates$irrLb95 <- irrs[2, ]
  estimates$irrUb95 <- irrs[3, ]
  return(estimates)
}


computeIrrs2 <- function(estimates) {
  computeIrr <- function(numOutcomesExposed, numOutcomesUnexposed, timeAtRiskExposed, timeAtRiskUnexposed) {
    test <- rateratio.test::rateratio.test(x = c(numOutcomesExposed,
                                                 numOutcomesUnexposed),
                                           n = c(timeAtRiskExposed,
                                                 timeAtRiskUnexposed))
    return(c(test$estimate[1], test$conf.int))
  }
  irrs <- mapply(computeIrr,
                 numOutcomesExposed = estimates$numOutcomesExposed,
                 numOutcomesUnexposed = estimates$numOutcomesUnexposed,
                 timeAtRiskExposed = estimates$timeAtRiskExposed,
                 timeAtRiskUnexposed = estimates$timeAtRiskUnexposed)
  estimates$irr <- irrs[1, ]
  estimates$irrLb95 <- irrs[2, ]
  estimates$irrUb95 <- irrs[3, ]
  return(estimates)
}



# test difference on some simulated data
res <- replicate(10, {
  lambdas <- rgamma(4, max(1, rnorm(1, 10, 2)), scale = runif(1, .5, 3))
  estimates <- data.frame(numOutcomesExposed   = pmax(1, rpois(1000, lambdas[1])),
                          numOutcomesUnexposed = pmax(1, rpois(1000, lambdas[2])),
                          timeAtRiskExposed    = pmax(1, rpois(1000, lambdas[3])),
                          timeAtRiskUnexposed  = pmax(1, rpois(1000, lambdas[4])))

  identical(computeIrrs(estimates), computeIrrs2(estimates))
})

all(res)
#> [1] TRUE


suppressPackageStartupMessages(library(dplyr))
# test difference on Eunomia
estimates <- runSelfControlledCohort(Eunomia::getEunomiaConnectionDetails(),
               cdmDatabaseSchema = "main",
               exposureIds = '',
               outcomeIds = '') %>%
  {.$estimates}
#> Connecting using SQLite driver
#> Retrieving counts from database
#> Executing SQL took 0.898 secs
#> Computing incidence rate ratios and exact confidence intervals
#> Performing SCC analysis took 2.2 secs

estimates <- estimates %>%
  select(numOutcomesExposed, numOutcomesUnexposed, timeAtRiskExposed, timeAtRiskUnexposed)


identical(computeIrrs(estimates), computeIrrs2(estimates))
#> [1] TRUE

Created on 2021-01-08 by the reprex package (v0.3.0)
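A smaller sketch of why the two versions above agree, under the assumption that R treats a never-supplied argument used as a subscript like an empty subscript (so x[i] behaves like x[]). The function names ratioWithI and ratioWithoutI are hypothetical simplifications of computeIrr, not code from the package.

```r
# Simplified stand-ins for the two versions of computeIrr (hypothetical names)
ratioWithI <- function(i, exposed, unexposed) exposed[i] / unexposed[i]
ratioWithoutI <- function(exposed, unexposed) exposed / unexposed

# mapply passes one element per call by name; `i` is never supplied, so
# `exposed[i]` acts like `exposed[]` and returns the single value unchanged
a <- mapply(ratioWithI, exposed = c(2, 6, 9), unexposed = c(1, 3, 3))
b <- mapply(ratioWithoutI, exposed = c(2, 6, 9), unexposed = c(1, 3, 3))

identical(a, b)
```

Because mapply hands each call a length-one value, indexing with the missing i has nothing to select and the argument can be dropped without changing the result.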

Addition of time at risk distributions for exposure/outcome cohorts

For some time I have needed to compute summary statistics for the time at risk of subjects in the exposure/outcome cohorts computed in the SCC analysis. Essentially, these are the distributions of exposure times. For example, this is informative for comparing the length of exposure between drug users with the outcome of interest and drug users without it.

The code also computes the distribution of the absolute difference between exposure start and outcome start. I'm not sure whether this is a particularly useful or informative statistic, but I was asked to compute and include it.
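To make the two statistics concrete, here is a minimal base R sketch on made-up data. The data frame riskWindows and all of its column names are hypothetical; this is not the code from the branch or PR, only an illustration of the kind of summaries described.

```r
# Hypothetical per-subject risk windows (dates are made up)
riskWindows <- data.frame(
  subjectId     = 1:6,
  exposureStart = as.Date(c("2020-01-01", "2020-02-10", "2020-03-05",
                            "2020-04-20", "2020-05-15", "2020-06-01")),
  exposureEnd   = as.Date(c("2020-01-31", "2020-03-10", "2020-03-25",
                            "2020-06-19", "2020-05-30", "2020-07-31")),
  outcomeStart  = as.Date(c("2020-01-15", NA, "2020-02-20",
                            NA, "2020-05-20", NA))
)

# Exposure length in days, and whether the subject had the outcome
riskWindows$exposureDays <-
  as.numeric(riskWindows$exposureEnd - riskWindows$exposureStart)
riskWindows$hasOutcome <- !is.na(riskWindows$outcomeStart)

# Distribution of exposure time, split by outcome status
exposureDist <- tapply(riskWindows$exposureDays,
                       riskWindows$hasOutcome, summary)

# Absolute difference in days between exposure start and outcome start,
# summarised only for subjects who had the outcome
absDiffDays <-
  abs(as.numeric(riskWindows$outcomeStart - riskWindows$exposureStart))
absDiffDist <- summary(absDiffDays[riskWindows$hasOutcome])

exposureDist
absDiffDist
```

In a real analysis these summaries would be computed from the risk windows the SCC analysis already derives, which is the stated reason for keeping the code in this package.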

This code lives elsewhere in my own project but I have made a branch and PR that may be of use here. The primary benefit for me of having it live in this package is that I won't have to compute the risk windows twice.

@schuemie Let me know if this is useful and whether the PR should live here, or if there are any additions or suggestions that should go along with it.
