ruoqi-liu / deepipw

Code for paper "A deep learning framework for drug repurposing via emulating clinical trials on real world patient data" (Accepted to Nature Machine Intelligence).

License: MIT License

Python 98.82% Shell 1.18%

deepipw's People

Contributors

ruoqi-liu

deepipw's Issues

Question about synthetic data

Hi Ruoqi, I found the paper interesting, and thank you for making the code available. Before applying the model to our real data problem, I tried running your code with the simulation data you provided. Although I followed the directions provided (including using the same versions of the libraries), I have been getting several errors when running the command "bash run_lstm.sh" (I also tried changing the pkl files in FILES, but did not have any luck). Do you get any errors when you run it? My other question: if I understood your code and paper properly, it is okay to add and modify demographic information, and changing pre_demo.py accordingly would make the code work properly. Is this correct, or would the model only work in this particular form for real data?

Thanks very much for your response and help!

Questions about the user/non-user cohort generation step and the bootstrapping step

Hi Ruoqi,

I have a couple of questions about the user/non-user cohort generation step and the bootstrapping step. I would appreciate it if you could help me understand your paper better.

In the user/non-user cohort generation step, you start from 1,353 unique drugs and look at the ingredients of each one. Say a drug has ingredients A, B, and C. If other drugs contain at least one of these ingredients, does the drug become a potential drug for the user cohort, with all those other drugs serving as the corresponding alternative drugs for the non-user cohort? And if no other drug contains at least one of A, B, and C, is the drug excluded as a potential drug for the user cohort? Basically, I'm wondering whether this logic was one of the criteria when you narrowed your search space from 1,353 drugs to 55 drugs. I know you have other criteria: the CAD initialization date is before the first prescription date, or index date; there is at least one more prescription in the follow-up period after the drug's index date; two prescriptions are at least 30 days apart; patients have at least 1 year of history before the index date and 2 years of history after it; and both cohort sizes must be larger than 500. I hope I have all of these right.
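The patient-level criteria listed above can be written down as a small check. This is only a sketch of the criteria as the question states them, not the repository's code: `is_eligible`, `MIN_COHORT_SIZE`, and all argument names are hypothetical, the index date is taken to be the first prescription date, and the "30 days apart" rule is read as "some later prescription falls at least 30 days after the index date".

```python
from datetime import date, timedelta

MIN_COHORT_SIZE = 500  # "both cohort sizes must be larger than 500"


def is_eligible(cad_date, prescription_dates, first_record, last_record):
    """Check one patient against the criteria as listed in the question."""
    dates = sorted(prescription_dates)
    if not dates:
        return False
    index_date = dates[0]  # assumption: index date = first prescription date
    # Assumption: one later prescription, at least 30 days after the index date.
    has_followup = any(d - index_date >= timedelta(days=30) for d in dates[1:])
    return (
        cad_date < index_date                                  # CAD precedes the index date
        and has_followup
        and index_date - first_record >= timedelta(days=365)   # >= 1 year of history before
        and last_record - index_date >= timedelta(days=730)    # >= 2 years of history after
    )


def cohort_is_large_enough(patients):
    # A cohort is kept only if it has more than MIN_COHORT_SIZE patients.
    return len(patients) > MIN_COHORT_SIZE


# Made-up example patient.
ok = is_eligible(
    cad_date=date(2010, 1, 1),
    prescription_dates=[date(2011, 6, 1), date(2011, 8, 1)],
    first_record=date(2009, 1, 1),
    last_record=date(2014, 1, 1),
)
print(ok)
```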

In the bootstrapping step, the paper says: "For each candidate ingredient, we repeatedly generate multiple different control drugs via random sampling with replacement, and the analysis is repeated in each bootstrap sample." I'm confused here. Could you please help me understand what the bootstrapping does? Again, assume the user cohort drug has 3 ingredients, A, B, and C, and that 3 other drugs contain ingredient A, 5 other drugs contain ingredient B, and 10 other drugs contain ingredient C. Would all patients who took one of the 3+5+10=18 drugs after the CAD initialization date be placed in the corresponding non-user cohorts? And in each bootstrapping iteration, do you randomly draw x, y, and z drugs with replacement from the 3, 5, and 10 drugs, together with their associated patients, to form a new non-user cohort sample and calculate the ATE? If so, how do you decide x, y, and z? I'm not sure exactly how you obtained each bootstrap sample and its size.

Sorry for the many questions. Thank you in advance!
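One possible reading of the quoted sentence is sketched below: in each iteration a fixed number of control drugs is drawn with replacement from the candidate pool, their patients are pooled into a non-user cohort, and the effect estimate is recomputed. This is an illustration of sampling with replacement under that reading, not the authors' confirmed procedure. The function names, the sample size of 3, the toy outcomes, and the unadjusted difference in means (a stand-in for the adjusted ATE) are all assumptions.

```python
import random
from statistics import mean


def bootstrap_control_drugs(candidate_controls, n_controls, n_boot, seed=0):
    """Yield one list of control drugs per bootstrap iteration.

    Drugs are drawn with replacement, so a drug may repeat within a sample.
    """
    rng = random.Random(seed)
    for _ in range(n_boot):
        yield rng.choices(candidate_controls, k=n_controls)


def naive_effect(treated_outcomes, control_outcomes):
    # Placeholder estimator: unadjusted difference in mean outcomes.
    # It only stands in for the adjusted estimate used in the paper.
    return mean(treated_outcomes) - mean(control_outcomes)


# Toy data (made up): binary outcomes per patient, keyed by control drug.
treated = [0, 1, 0, 0, 1]
outcomes_by_control = {
    'drug_a': [1, 1, 0, 1],
    'drug_b': [0, 1, 1],
    'drug_c': [1, 0, 1, 1, 0],
}

estimates = []
for sample in bootstrap_control_drugs(list(outcomes_by_control), n_controls=3, n_boot=50):
    # Pool the patients of every sampled control drug into one non-user cohort.
    pooled = [y for drug in sample for y in outcomes_by_control[drug]]
    estimates.append(naive_effect(treated, pooled))

print(f'mean effect over {len(estimates)} bootstrap samples: {mean(estimates):.3f}')
```

Under this reading, the spread of `estimates` across iterations is what gives an interval around the effect; how x, y, and z are actually chosen in the paper remains the open question.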

Best,
Hao

Error while running the main Python code

Hello Ruoqi,
I was trying to run the DeepIPW model using the python command, but I keep getting this error:

args: Namespace(data_dir='../user_cohort/', pickles_dir='pickles', treated_drug_file=None, controlled_drug='random', controlled_drug_ratio=3, random_seed=128, batch_size=50, diag_emb_size=128, med_emb_size=128, med_hidden_size=64, diag_hidden_size=64, learning_rate=0.001, weight_decay=1e-06, epochs=10, save_model_filename='tmp/1346823.pt', outputs_lstm=None, outputs_lr=None, save_db=None, cuda=False, device=device(type='cpu'))
Traceback (most recent call last):
  File "C:\Users\nabil\Study\Research\code\DeepIPW-master\deep-ipw\main.py", line 326, in <module>
    main(args=parse_args())
  File "C:\Users\nabil\Study\Research\code\DeepIPW-master\deep-ipw\main.py", line 47, in main
    output_lstm = open(args.outputs_lstm, 'a')
TypeError: expected str, bytes or os.PathLike object, not NoneType

I'm not sure what's wrong.

Thank you very much.
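The traceback shows `open(args.outputs_lstm, 'a')` receiving `None`, and the Namespace printout lists `outputs_lstm=None`, `outputs_lr=None`, and `treated_drug_file=None`, which suggests these options were not supplied on the command line. A minimal sketch that reproduces the message, assuming the flags are spelled `--outputs_lstm` and `--outputs_lr` (inferred from the printout, not checked against the repository's `main.py`); `build_parser` and `open_output` are hypothetical helpers:

```python
import argparse
import os
import tempfile


def build_parser():
    # Flag names are assumptions inferred from the Namespace printout.
    parser = argparse.ArgumentParser()
    parser.add_argument('--outputs_lstm', type=str, default=None)
    parser.add_argument('--outputs_lr', type=str, default=None)
    return parser


def open_output(path, flag):
    # Hypothetical guard: fail with a readable message instead of a TypeError.
    if path is None:
        raise SystemExit(f'missing required option {flag} <output file>')
    return open(path, 'a')


# Without the flag, argparse leaves the attribute as None ...
args = build_parser().parse_args([])
try:
    open(args.outputs_lstm, 'a')
except TypeError as err:
    reproduced = str(err)  # the TypeError raised for a None path

# ... while passing a path makes the open() call succeed.
out_path = os.path.join(tempfile.mkdtemp(), 'lstm.txt')
args = build_parser().parse_args(['--outputs_lstm', out_path])
with open_output(args.outputs_lstm, '--outputs_lstm') as fh:
    fh.write('ok\n')
```

If this reading is right, passing file paths for the output options when invoking `main.py` should get past line 47.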

How to understand the causal inference in this method?

I am confused about one point. As far as I know, an LSTM can only capture correlations. How should I understand the causal inference in this method? Quote from the paper: "Building upon well-established causal inference and deep learning methods, our framework emulates randomized clinical trials for drugs present in a large-scale medical claims database."
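For context on this question: in inverse probability weighting (the "IPW" in the repository name), a predictive model is typically used only to estimate the propensity score, the probability of receiving the treatment given a patient's history. The causal reading then comes from reweighting the two cohorts by those scores, under assumptions such as no unmeasured confounding, rather than from the network itself. The sketch below shows that reweighting step with a normalized (Hajek) estimator. The function name `ipw_ate` and the toy numbers are made up, and the propensity scores are given directly where a fitted model would supply them.

```python
def ipw_ate(treatment, outcome, propensity):
    """Normalized (Hajek) IPW estimate of the average treatment effect.

    treatment:  1 for treated (user cohort), 0 for control (non-user cohort)
    outcome:    observed outcome per patient
    propensity: estimated P(treatment = 1 | covariates), strictly in (0, 1)
    """
    if not all(0.0 < p < 1.0 for p in propensity):
        raise ValueError('propensity scores must lie strictly between 0 and 1')
    # Treated patients are weighted by 1/p, controls by 1/(1-p).
    w1 = [t / p for t, p in zip(treatment, propensity)]
    w0 = [(1 - t) / (1 - p) for t, p in zip(treatment, propensity)]
    mu1 = sum(w * y for w, y in zip(w1, outcome)) / sum(w1)
    mu0 = sum(w * y for w, y in zip(w0, outcome)) / sum(w0)
    return mu1 - mu0


# Toy data (made up). In practice the scores would come from a fitted model.
treatment = [1, 1, 1, 0, 0, 0]
outcome = [1, 1, 0, 1, 0, 0]
propensity = [0.8, 0.6, 0.5, 0.5, 0.4, 0.2]

print(f'IPW estimate of the ATE: {ipw_ate(treatment, outcome, propensity):.3f}')
```

With constant propensity scores the estimator reduces to the plain difference in group means; the weights matter only when treatment assignment depends on the covariates.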
