yqzhong7 / aipw


R Package: Augmented Inverse Probability Weighted (AIPW) Estimation for Average Causal Effect

Home Page: https://yqzhong7.github.io/AIPW/

License: GNU General Public License v3.0

R 100.00%

causal-inference machine-learning r robust-estimators

aipw's People

ainaimi gconzuelo borishouenou ehsanx muntasirtiash hopewordian drchinmay25

aipw's Issues

Add a new class that allows user-supplied nuisance functions

Inherit from AIPW_base class
Something like AIPW_manual$new(A, Y, mu0, mu1, mu, raw_p_score, verbose)
Warning about sample splitting
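The calculation such a class would wrap can be sketched in base R. This is a minimal illustration, not the package's implementation: `aipw_from_nuisance` and `g_bound` are hypothetical names, the `mu` argument from the proposed signature is omitted, and the nuisance predictions come from in-sample GLM fits, which is exactly the situation the sample-splitting warning is meant to flag.

```r
# Hypothetical helper (not part of the AIPW package): AIPW risk difference
# from user-supplied nuisance predictions.
aipw_from_nuisance <- function(A, Y, mu0, mu1, raw_p_score, g_bound = 0.025) {
  # Truncate the propensity score symmetrically to avoid extreme weights
  p <- pmin(pmax(raw_p_score, g_bound), 1 - g_bound)
  # Influence-function values for E[Y(1)] and E[Y(0)]
  eif1 <- A / p * (Y - mu1) + mu1
  eif0 <- (1 - A) / (1 - p) * (Y - mu0) + mu0
  n <- length(Y)
  rd <- mean(eif1) - mean(eif0)
  # Standard error from the empirical variance of the influence function
  se <- sd(eif1 - eif0) / sqrt(n)
  c(estimate = rd, se = se, lcl = rd - 1.96 * se, ucl = rd + 1.96 * se)
}

set.seed(1)
n <- 500
W <- rnorm(n)
A <- rbinom(n, 1, plogis(0.5 * W))
Y <- rbinom(n, 1, plogis(-0.5 + A + W))

# In-sample GLM fits, used only for illustration (no sample splitting)
g_fit <- glm(A ~ W, family = binomial())
q_fit <- glm(Y ~ A + W, family = binomial())

res <- aipw_from_nuisance(
  A, Y,
  mu0 = predict(q_fit, newdata = data.frame(A = 0, W = W), type = "response"),
  mu1 = predict(q_fit, newdata = data.frame(A = 1, W = W), type = "response"),
  raw_p_score = fitted(g_fit)
)
print(res)
```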

Categorical Exposure

Support categorical exposures by using the missing-outcome mechanism. (Chapter 6, Gruber, S. and Van der Laan, M.J., 2011. tmle: An R package for targeted maximum likelihood estimation.)

Support the conditional average treatment effects on the treated and controls (ATT and ATC)

ATT
ATC
Output the risks by exposure status

Kennedy, E.H., Sjölander, A. and Small, D.S., 2015. Semiparametric causal inference in matched cohort studies. Biometrika, 102(3), pp.739-746.

Warning messages

Missing covariates or exposure
k_split
- k_split = 1: no sample splitting is used
- k_split = 2: not allowed

2 tests fail on PowerPC: `Error in `is.nan(A)`: default method not implemented for type 'list'`; `Error in `trim_logit(X)`: 'list' object cannot be coerced to type 'double'`

@yqzhong7 Could this be addressed, please?

R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.0.0d2 (32-bit)

> library(testthat)
> library(AIPW)
> 
> test_check("AIPW")
[ FAIL 2 | WARN 0 | SKIP 0 | PASS 206 ]

══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test-stratified_fit.R:75:3'): AIPW stratified_fit: sl3 & k_split ────
Error in `is.nan(A)`: default method not implemented for type 'list'
Backtrace:
     ▆
  1. └─aipw$stratified_fit() at test-stratified_fit.R:75:3
  2.   └─private$.f_lapply(...)
  3.     └─future.apply::future_lapply(...)
  4.       └─future.apply:::future_xapply(...)
  5.         ├─future::value(fs)
  6.         └─future:::value.list(fs)
  7.           ├─future::resolve(...)
  8.           └─future:::resolve.list(...)
  9.             └─future (local) signalConditionsASAP(obj, resignal = FALSE, pos = ii)
 10.               └─future:::signalConditions(...)
── Error ('test-tmle_support.R:53:3'): AIPW_tmle class: tmle3 ──────────────────
Error in `trim_logit(X)`: 'list' object cannot be coerced to type 'double'
Backtrace:
    ▆
 1. └─tmle3::tmle3(or_spec, data = df, node_list, learner_list) at test-tmle_support.R:53:3
 2.   └─tmle_spec$make_initial_likelihood(tmle_task, learner_list)
 3.     └─tmle3::point_tx_likelihood(tmle_task, learner_list)
 4.       └─likelihood_def$train(tmle_task)
 5.         └─delayed_fit$compute(job_type = sl3_delayed_job_type(), progress = verbose)
 6.           └─scheduler$compute()
 7.             └─self$compute_step()

[ FAIL 2 | WARN 0 | SKIP 0 | PASS 206 ]
Error: Test failures
Execution halted

Balance summary / Variance

How can we get a balance summary for each covariate under the propensity score (for example, to calculate the standardized mean difference)?
Which estimator is used to compute the variance (and thus the 95% CI)?
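One way to check balance outside the package is to compute standardized mean differences before and after inverse probability weighting. The sketch below uses base R only. The helper `smd`, the simulated data, and the logistic propensity model are assumptions for illustration and are not part of the AIPW API.

```r
# Hypothetical helper: standardized mean difference for one covariate,
# optionally weighted (w defaults to equal weights).
smd <- function(x, A, w = rep(1, length(x))) {
  m1 <- weighted.mean(x[A == 1], w[A == 1])
  m0 <- weighted.mean(x[A == 0], w[A == 0])
  # Pooled SD taken from the unweighted groups, a common convention
  s <- sqrt((var(x[A == 1]) + var(x[A == 0])) / 2)
  (m1 - m0) / s
}

set.seed(2)
n <- 2000
W <- data.frame(w1 = rnorm(n), w2 = rbinom(n, 1, 0.4))
A <- rbinom(n, 1, plogis(0.8 * W$w1 - 0.5 * W$w2))

# Propensity scores from a logistic model, then inverse probability weights
p_score <- fitted(glm(A ~ w1 + w2, data = W, family = binomial()))
ipw <- ifelse(A == 1, 1 / p_score, 1 / (1 - p_score))

balance <- data.frame(
  covariate = names(W),
  smd_unweighted = sapply(W, smd, A = A),
  smd_weighted = sapply(W, smd, A = A, w = ipw)
)
print(balance)
```

Weighted values close to zero suggest the propensity model balances that covariate.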

Print IP-weights in addition to propensity scores

Output summary statistics of IP-weights
ggplot of IP-weights
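As a rough sketch of the requested summary, IP-weights can be derived from propensity scores and summarised by exposure group in base R. The uniform draws below are only a stand-in for estimated propensity scores, and the variable names are assumptions rather than package fields.

```r
set.seed(3)
n <- 1000
A <- rbinom(n, 1, 0.4)
p_score <- runif(n, 0.1, 0.9)  # stand-in for estimated propensity scores

# Weight is 1 / P(A = 1 | W) for the exposed and 1 / P(A = 0 | W) otherwise
ip_weights <- ifelse(A == 1, 1 / p_score, 1 / (1 - p_score))

# Summary statistics of the weights within each exposure group
ipw_summary <- tapply(ip_weights, A, summary)
print(ipw_summary)
```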

Support missing outcome

Missing outcome:

Add warning message: Missing outcome is detected, assuming missing at random (MAR)
Outcome model: Q(A,W) = E(Y | W, A = a, delta = 1)
Exposure model: g(W) = P(A | W) P(delta | A, W)
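A minimal sketch of how these two models could enter the estimator, assuming MAR and simple GLM fits: the outcome model is fitted among observed outcomes only, and the weight denominator multiplies the exposure and missingness probabilities. All names and the simulated data are illustrative, and `message()` stands in for the proposed warning.

```r
set.seed(5)
n <- 2000
W <- rnorm(n)
A <- rbinom(n, 1, plogis(0.4 * W))
Y_full <- rbinom(n, 1, plogis(-0.5 + A + 0.5 * W))
delta <- rbinom(n, 1, plogis(1.5 - 0.5 * A + 0.3 * W))  # 1 = outcome observed
Y <- ifelse(delta == 1, Y_full, NA)
if (anyNA(Y)) message("Missing outcome is detected, assuming missing at random (MAR)")

dat <- data.frame(W, A, Y, delta)

# Outcome model among observed outcomes: Q(A, W) = E(Y | W, A, delta = 1)
q_fit <- glm(Y ~ A + W, family = binomial(), data = dat[dat$delta == 1, ])
mu1 <- predict(q_fit, newdata = transform(dat, A = 1), type = "response")
mu0 <- predict(q_fit, newdata = transform(dat, A = 0), type = "response")

# Exposure mechanism P(A | W) and missingness mechanism P(delta | A, W)
g_A <- fitted(glm(A ~ W, family = binomial(), data = dat))
d_fit <- glm(delta ~ A + W, family = binomial(), data = dat)
g_d1 <- predict(d_fit, newdata = transform(dat, A = 1), type = "response")
g_d0 <- predict(d_fit, newdata = transform(dat, A = 0), type = "response")

# Residuals are set to 0 where Y is missing so that NA does not propagate
r1 <- ifelse(delta == 1, Y - mu1, 0)
r0 <- ifelse(delta == 1, Y - mu0, 0)

eif1 <- A * delta / (g_A * g_d1) * r1 + mu1
eif0 <- (1 - A) * delta / ((1 - g_A) * g_d0) * r0 + mu0

rd <- mean(eif1) - mean(eif0)
se <- sd(eif1 - eif0) / sqrt(n)
print(c(estimate = rd, se = se))
```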

Submit to CRAN

Packages on CRAN require that their dependencies are also on CRAN. Since sl3 and tmle3 are not on CRAN at this time, we are planning to:

Create a separate branch without supporting sl3 and tmle3
Submit this package to CRAN
Revise per CRAN's comments if any
Resubmit

Repeated cross-fitting to reduce randomness

Use different cross-fitting splits for $\psi$ estimation
Summarise $\psi$ across the repetitions with the median (Chernozhukov 2018, Section 3.4).
Newey, W. K., & Robins, J. R. (2018). Cross-fitting and fast remainder rates for semiparametric estimation. arXiv preprint arXiv:1801.09138.
Chernozhukov V, Chetverikov D, Demirer M, et al. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal 2018;21(1):C1–C68. doi:10.1111/ectj.12097. Publisher: Oxford Academic.
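The median summary can be sketched as follows. The variance rule used here, the median of each run's squared standard error plus its squared deviation from the median estimate, follows the median method described by Chernozhukov et al. Applying it on the standard-error scale is an assumption of this sketch, and `median_aggregate` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: combine estimates from repeated cross-fitting runs
median_aggregate <- function(psi, se) {
  psi_med <- median(psi)
  # Add each run's squared deviation from the median to its variance,
  # then take the median across runs
  var_med <- median(se^2 + (psi - psi_med)^2)
  c(estimate = psi_med, se = sqrt(var_med))
}

# Illustrative estimates and standard errors from five different splits
psi_s <- c(0.10, 0.12, 0.08, 0.11, 0.30)
se_s <- c(0.05, 0.05, 0.06, 0.05, 0.07)

agg <- median_aggregate(psi_s, se_s)
print(agg)
```

The outlying fifth run (0.30) has little influence on either the point estimate or the standard error, which is the motivation for using the median rather than the mean.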

Asymmetric propensity score truncation

If only one g.bound value is provided, a value >0.5 is not allowed (provide an error message)
If two g.bound values are provided, asymmetric propensity score truncation is supported (e.g., g.bound = c(0.05, 0.95))
Do not allow three or more g.bound values
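The three rules above could be implemented along these lines. `truncate_ps` is a hypothetical helper written for illustration; the error messages and the default bound are assumptions, not the package's actual behaviour.

```r
# Hypothetical helper illustrating the proposed g.bound rules
truncate_ps <- function(p_score, g.bound = 0.025) {
  if (!is.numeric(g.bound) || length(g.bound) < 1 || length(g.bound) > 2) {
    stop("g.bound must be a numeric vector of length 1 or 2")
  }
  if (any(g.bound < 0 | g.bound > 1)) {
    stop("g.bound values must lie in [0, 1]")
  }
  if (length(g.bound) == 1) {
    # A single bound is applied symmetrically, so it cannot exceed 0.5
    if (g.bound > 0.5) stop("a single g.bound must not exceed 0.5")
    bounds <- c(g.bound, 1 - g.bound)
  } else {
    # Two bounds give the lower and upper truncation limits
    bounds <- sort(g.bound)
  }
  pmin(pmax(p_score, bounds[1]), bounds[2])
}

p <- c(0.01, 0.30, 0.99)
sym <- truncate_ps(p, 0.025)           # truncated to [0.025, 0.975]
asym <- truncate_ps(p, c(0.05, 0.90))  # truncated to [0.05, 0.90]
print(sym)
print(asym)
```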

std error of RR(Risk Ratio)

In line 278 of AIPW/R/AIPW_base.R, the standard error of the RR appears to subtract the term (2*sigma_covar[1,2]/(mean(aipw_eif1)*mean(aipw_eif0))) twice, which produces NaNs in the sqrt function.
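For reference, a delta-method standard error for log(RR) includes the covariance term exactly once: Var(log RR) is approximately Var(psi1)/psi1^2 + Var(psi0)/psi0^2 - 2*Cov(psi1, psi0)/(psi1*psi0). The sketch below is a standalone illustration on simulated influence-function values. `se_log_rr` is a hypothetical helper and is not the code in AIPW_base.R.

```r
# Hypothetical helper: delta-method SE of log(RR) from influence-function values
se_log_rr <- function(eif1, eif0) {
  n <- length(eif1)
  psi1 <- mean(eif1)
  psi0 <- mean(eif0)
  sigma_covar <- cov(cbind(eif1, eif0))
  # The covariance term is subtracted once
  var_log_rr <- (sigma_covar[1, 1] / psi1^2 +
                   sigma_covar[2, 2] / psi0^2 -
                   2 * sigma_covar[1, 2] / (psi1 * psi0)) / n
  sqrt(var_log_rr)
}

set.seed(4)
eif1 <- rnorm(200, mean = 0.4, sd = 0.5)
eif0 <- 0.5 * eif1 + rnorm(200, mean = 0.1, sd = 0.3)

se_rr <- se_log_rr(eif1, eif0)
# Equivalent form: the sample variance of a difference is never negative,
# so sqrt() cannot return NaN
se_check <- sd(eif1 / mean(eif1) - eif0 / mean(eif0)) / sqrt(length(eif1))
print(c(se_rr = se_rr, se_check = se_check))
```

Because the expression equals the variance of `eif1 / psi1 - eif0 / psi0` divided by n, it stays non-negative. Subtracting the covariance term a second time breaks that identity and can push the value below zero.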

Presentation of results for continuous outcomes

Hi, thanks for the fantastic package! I have a small suggestion. Would you be able to change the text included with the results output to something like 'exposure mean', 'control mean' and 'mean difference' when the outcome is continuous? Otherwise it can be a bit confusing seeing 'risk difference' etc for a continuous outcome.

Here is a quick reprex to demonstrate the issue:

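library(AIPW)
library(SuperLearner)  # provides SL.mean

A <- rbinom(100, 1, 0.5)  # binary exposure
W <- rnorm(100)           # single covariate
Y <- rnorm(100)           # continuous outcome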

aipw_out <- aipw_wrapper(
  Y = Y,
  A = A,
  W = W,
  Q.SL.library = "SL.mean",
  g.SL.library = "SL.mean",
  k = 1
)

                 Estimate    SE 95% LCL 95% UCL   N
Risk of exposure  -0.2493 0.133  -0.511  0.0124  51
Risk of control   -0.0778 0.126  -0.325  0.1697  49
Risk Difference   -0.1714 0.183  -0.531  0.1879 100
