mattlewissf / mimic Goto Github PK



Python 100.00%

mimic's People

Contributors

Watchers

Forkers

jcrudy

mimic's Issues

Create project blog

I recommend the following steps, but there are many alternatives.

Create a new GitHub repository to host the blog
Install the Nikola package (requires Python 3, I think)
Create the blog following the Nikola manual
Deploy to GitHub

Assigning to @RobertShultz. Bobby, feel free to reassign back to me if you'd rather I do the initial setup.

Build and validate 30-day readmission model

Extract features and fit a model to predict 30-day readmission. For validation, I suggest using k-fold cross-validation and creating ROC plots and calibration plots. There is helper code in sklearntools that might be of use, but it is perhaps better to do it by hand in scikit-learn to start.
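As a rough starting point, the validation suggested above could look something like the sketch below. It assumes scikit-learn and NumPy are installed, and it uses a synthetic dataset from `make_classification` as a stand-in for the real feature matrix and readmission labels; the logistic regression model, the 5 folds, and the 5 calibration bins are arbitrary choices for illustration, not decisions made in this project.

```python
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for extracted features (X) and 30-day readmission labels (y).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)

model = LogisticRegression(max_iter=1000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Each person's probability comes from a model that never saw that person.
proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]

# Points for the ROC plot, plus the area under the curve.
fpr, tpr, _ = roc_curve(y, proba)
auc = roc_auc_score(y, proba)

# Points for the calibration plot: observed event fraction vs. mean prediction per bin.
frac_positive, mean_predicted = calibration_curve(y, proba, n_bins=5)

print(f"Out-of-fold AUC: {auc:.3f}")
```

The arrays `fpr`/`tpr` and `mean_predicted`/`frac_positive` can be passed straight to a plotting library to draw the ROC and calibration plots.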

Figure out how to deal with censoring

@mattlewissf, you know the most about this issue. The question is: how do we know the left and right censoring dates for each person? This may be an unsolvable issue if that information simply isn't available. If so, we'll have to come up with some workarounds.

Change from sqlalchemy to oreader

This change will be almost entirely in mapper.py. You can find oreader here. I would suggest looking at this test file to get started with oreader. In particular, look at the test_read_write function. It shows a basic example.

Create event rate feature extractor

It should take some definition of an event (such as one based on a grouper), a time period, and a person. It should return a non-negative number that indicates the number of such events per unit of time (probably per day or per year would be best; make sure to be consistent). To fit the design of other features, this will look like a function that takes an event definition (which is itself a function) and returns a function that operates on a member and a time period (which is an intervalset of dates).
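A minimal sketch of that shape, under assumptions that are not taken from the existing code: a member is represented here as a dict with an `"events"` list, each event is a dict with `"date"` and `"code"` keys, and the time period is a list of non-overlapping `(start, end)` date pairs with an exclusive end, standing in for a real intervalset. The rate is expressed per day.

```python
from datetime import date


def event_rate(is_event):
    """Build a feature function that counts matching events per day.

    `is_event` is the event definition: a function from an event to bool.
    """
    def feature(member, period):
        # Total length of the period in days, summed over its intervals.
        total_days = sum((end - start).days for start, end in period)
        if total_days == 0:
            return 0.0
        # Count events that match the definition and fall inside the period.
        count = sum(
            1
            for event in member["events"]
            if is_event(event)
            and any(start <= event["date"] < end for start, end in period)
        )
        return count / total_days
    return feature


# Hypothetical event definition: any code starting with "428".
heart_failure_rate = event_rate(lambda event: event["code"].startswith("428"))

member = {
    "events": [
        {"date": date(2020, 1, 2), "code": "4280"},
        {"date": date(2020, 1, 5), "code": "4280"},
        {"date": date(2020, 1, 6), "code": "25000"},  # does not match
        {"date": date(2020, 2, 1), "code": "4280"},   # outside the period
    ]
}
period = [(date(2020, 1, 1), date(2020, 1, 11))]  # 10 days
rate = heart_failure_rate(member, period)
```

Returning 0.0 for an empty period is one possible convention; raising an error or returning a missing value would be equally defensible.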

@mattlewissf, am I correctly describing how feature extraction currently works in your system (or at least how you intend it to work)?

Use unsupervised learning to develop new groupers

This is probably a sparse matrix problem of some kind. There are some methods in scikit-learn that accept sparse data. There are also methods that accept precomputed distance matrices. I had some success using topic modeling with gensim.

To get started, you'll have to iterate over people and, for each one, get all of the ICD-9 codes and whatever other data are relevant. You'll then transform these data into whatever form your chosen method accepts, apply the method, and validate the results. One way to validate the results is to fit a 30-day readmission model (or a model of some other outcome, such as death, admission in the next year, heart attack, or spending; there are many options) and see whether prediction improves (the AUC is a good metric here). There are also many other ways to assess the quality of a clustering or dimension reduction, which you can look into. I'm not an expert on that topic.
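The first two steps (collect codes per person, then build a sparse matrix and apply a method) might be sketched as follows. This uses scikit-learn's `LatentDirichletAllocation` as a topic model in place of gensim, purely to keep the example to one library; the `codes_by_person` dict, its codes, and the choice of 2 components are made up for illustration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical input: the ICD-9 codes recorded for each person.
codes_by_person = {
    "p1": ["4280", "41071", "4280"],
    "p2": ["4280", "41401"],
    "p3": ["25000", "5849"],
    "p4": ["25000", "5849", "5849"],
}


def identity(codes):
    """Each 'document' is already a list of tokens (codes)."""
    return codes


# Sparse person-by-code count matrix (rows follow the dict's insertion order).
vectorizer = CountVectorizer(analyzer=identity)
counts = vectorizer.fit_transform(list(codes_by_person.values()))

# Topic model over codes; each topic is a candidate grouper.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
person_topics = lda.fit_transform(counts)

# person_topics has one row per person and one column per topic; these columns
# could be added as features to an outcome model to see whether the AUC improves.
```

With real data the matrix would be far larger and sparser, and the number of components would need tuning against whichever validation outcome is chosen.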

Implement Charlson comorbidity index

It will have to work with a concept of events from issue #2. See this paper.
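For orientation, the core of the index is a weighted sum over comorbidity categories, each counted at most once per person. The sketch below assumes the input is a plain list of ICD-9 code strings rather than the project's event objects, and the `CHARLSON_CATEGORIES` table is a deliberately incomplete, illustrative subset: a real implementation needs the full category list and code mappings from the paper, and this sketch also ignores any rule for choosing between mild and severe forms of the same condition.

```python
# category -> (weight, ICD-9 prefixes). Illustrative subset only, NOT the full index.
CHARLSON_CATEGORIES = {
    "myocardial_infarction": (1, ("410", "412")),
    "congestive_heart_failure": (1, ("428",)),
    "dementia": (1, ("290",)),
    "metastatic_solid_tumor": (6, ("196", "197", "198")),
    "aids": (6, ("042",)),
}


def charlson_index(icd9_codes, categories=CHARLSON_CATEGORIES):
    """Sum the weights of every category matched by at least one code."""
    # Accept codes with or without the decimal point ("428.0" or "4280").
    normalized = [code.replace(".", "") for code in icd9_codes]
    score = 0
    for weight, prefixes in categories.values():
        # str.startswith accepts a tuple of prefixes.
        if any(code.startswith(prefixes) for code in normalized):
            score += weight
    return score


example_score = charlson_index(["410.71", "428.0", "V58.61"])
```

Passing the category table as a parameter makes it easy to swap in a complete mapping, or to derive the matched categories from grouper-based events instead of raw codes.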

