Comments (4)
Hey Aayush, these separate CSVs are random draws from the same data distribution. They are used to compute the standard error of the ATE estimate. If they are combined, we won't be able to do that.
from dragonnet.
If I understand correctly, since separate csvs are draws from the same distribution, the combined csv would also belong to the same distribution right? If yes, could you help and elaborate why we won't be able to compute standard error on ATE when csvs are combined?
EDIT: As a clarification, I am referring to "combining" as appending the original csvs one below the other thus creating one big csv.
To be clear, the IHDP dataset is a semi-synthetic dataset based on a real dataset. The real dataset has 747 observations. It includes covariate information such as age, gender, and socioeconomic status. This paper originally introduced and explained the dataset: https://www.tandfonline.com/doi/abs/10.1198/jcgs.2010.08162
The NPCI package uses these fixed observations and simulates treatments and outcomes, e.g., y = f(x) + epsilon. Each replication of the IHDP dataset essentially has different epsilons.
There are a few reasons we don't want to pool the datasets. A practical reason is that we would only have an X file with 747 observations, and a Y file with 747*50 observations.
We could replicate the X files as well, but then the data points won't be independent.
Lastly, the ATE is an average causal effect: there is one ATE for a population. If we only have one dataset, we get only one estimate of it, so we can't compute a standard error.
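To make the last point concrete, here is a minimal sketch of the idea, not the dragonnet or NPCI code. It assumes a toy outcome model y = 1 + 0.5*x + ATE*t + epsilon with a hypothetical true ATE of 4.0, a single made-up covariate, a randomly assigned treatment that is held fixed across replications (a simplification), and a plain difference-in-means estimator. Only the sizes 747 and 50 come from the discussion above.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs, n_reps = 747, 50   # sizes mentioned in the thread
true_ate = 4.0            # hypothetical value, for illustration only

# Covariate and treatment are drawn once and reused for every replication
# (assumption: a toy stand-in for the fixed IHDP observations).
x = rng.normal(size=n_obs)
t = rng.integers(0, 2, size=n_obs)

def simulate_outcomes(rng):
    """One replication: same x and t, fresh noise epsilon."""
    eps = rng.normal(size=n_obs)
    return 1.0 + 0.5 * x + true_ate * t + eps

def estimate_ate(y, t):
    """Difference in mean outcomes between treated and control units."""
    return y[t == 1].mean() - y[t == 0].mean()

# One ATE estimate per replication, kept separate rather than pooled.
ate_hats = np.array(
    [estimate_ate(simulate_outcomes(rng), t) for _ in range(n_reps)]
)

# Spread of the estimates across replications gives the standard error
# of their mean.
mean_ate = ate_hats.mean()
std_err = ate_hats.std(ddof=1) / np.sqrt(n_reps)

print(f"mean ATE estimate: {mean_ate:.3f}, standard error: {std_err:.3f}")
```

With a single pooled dataset, `ate_hats` would hold one number, and the sample standard deviation with `ddof=1` is undefined for a single value, which is the point made above.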
That makes sense! Thank you!