
mimic's People

Contributors

jcrudy, mattlewissf


Forkers

jcrudy

mimic's Issues

Create project blog

I recommend the following steps, but there are many alternatives.

  1. Create a new GitHub repository to host the blog
  2. Install the Nikola package (requires Python 3, I think)
  3. Create the blog following the Nikola manual
  4. Deploy to GitHub

Assigning to @RobertShultz. Bobby, feel free to reassign back to me if you'd rather I do the initial setup.

Figure out how to deal with censoring

@mattlewissf, you know the most about this issue. The question is, how do we know the left and right censoring dates for each person? This may be an unsolvable issue if that information simply isn't available. If so, we'll have to come up with some workarounds.

Create event rate feature extractor

It should take a definition of an event (for example, one based on a grouper), a time period, and a person. It should return a non-negative number giving the number of such events per unit of time (per day or per year would probably be best; just make sure to be consistent). To fit the design of other features, this will look like a function that takes an event definition (which is itself a function) and returns a function that operates on a member and a time period (which is an intervalset of dates).

@mattlewissf, am I correctly describing how feature extraction currently works in your system (or at least how you intend it to work)?
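
For discussion, here is a minimal sketch of the higher-order design described in this issue. Everything concrete in it is an assumption rather than the project's actual API: the name `event_rate`, the member as a dict with an `"events"` list, each event as a dict with `"date"` and `"code"` keys, and the time period as a list of non-overlapping half-open `(start, end)` date pairs standing in for the intervalset.

```python
from datetime import date


def event_rate(is_event, per_days=365.25):
    """Return a feature extractor that counts matching events per `per_days` days.

    `is_event` is the event definition: a function mapping an event to True/False.
    """
    def extractor(member, period):
        # `period` is a hypothetical stand-in for an intervalset: a list of
        # non-overlapping (start, end) date pairs, start inclusive, end exclusive.
        total_days = sum((end - start).days for start, end in period)
        if total_days <= 0:
            return 0.0  # no observed time, so no meaningful rate
        count = sum(
            1
            for event in member["events"]
            if is_event(event)
            and any(start <= event["date"] < end for start, end in period)
        )
        return count * per_days / total_days

    return extractor


# Hypothetical event definition: any code starting with "428".
def is_heart_failure(event):
    return event["code"].startswith("428")


# Made-up member record, for illustration only.
member = {
    "events": [
        {"date": date(2015, 1, 10), "code": "428.0"},
        {"date": date(2015, 3, 2), "code": "428.0"},
        {"date": date(2015, 6, 1), "code": "250.00"},
        {"date": date(2016, 2, 1), "code": "428.0"},
    ]
}

heart_failure_per_year = event_rate(is_heart_failure, per_days=365)
period = [(date(2015, 1, 1), date(2016, 1, 1))]
rate = heart_failure_per_year(member, period)
```

Fixing the time unit once, in `per_days`, when the extractor is built is one way to keep the units consistent across features.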

Use unsupervised learning to develop new groupers

This is probably a sparse matrix problem of some kind. There are some methods in scikit-learn that accept sparse data. There are also methods that accept precomputed distance matrices. I had some success using topic modeling with gensim.
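
To make the sparse-matrix framing concrete, here is a rough sketch that turns per-person code lists into a person-by-code count matrix in coordinate (row, column, value) form. The function name `build_coo`, the input shape (a dict from person id to a list of code strings), and the sample codes are all assumptions for illustration, not anything from the existing system.

```python
from collections import Counter


def build_coo(codes_by_person):
    """Build coordinate-format triplets for a person-by-code count matrix.

    `codes_by_person` (hypothetical input shape) maps a person id to the list
    of code strings observed for that person; repeated codes are counted.
    """
    person_index = {p: i for i, p in enumerate(sorted(codes_by_person))}
    all_codes = sorted({c for codes in codes_by_person.values() for c in codes})
    code_index = {c: j for j, c in enumerate(all_codes)}

    rows, cols, counts = [], [], []
    for person, codes in sorted(codes_by_person.items()):
        # Only nonzero entries are stored, which is what keeps the matrix sparse.
        for code, n in sorted(Counter(codes).items()):
            rows.append(person_index[person])
            cols.append(code_index[code])
            counts.append(n)
    return rows, cols, counts, person_index, code_index


# Made-up data, for illustration only.
codes_by_person = {
    "p1": ["428.0", "428.0", "250.00"],
    "p2": ["250.00"],
    "p3": ["401.9", "428.0"],
}
rows, cols, counts, person_index, code_index = build_coo(codes_by_person)
```

If SciPy is available, triplets like these should be suitable input for `scipy.sparse.coo_matrix((counts, (rows, cols)), shape=(len(person_index), len(code_index)))`, which can then be converted to CSR for the scikit-learn estimators that accept sparse input.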

To get started, you'll have to iterate over people and, for each one, get all of the ICD-9 codes and whatever other data are relevant. You'll then transform these data into whatever form the method you're trying to use accepts, apply the method, and validate the results. One way to validate the results is to fit a 30-day readmission model (or a model for some other outcome, such as death, admission in the next year, heart attack, or spending; there are many options) and see whether prediction improves (the AUC is a good metric here). There are also many other ways to assess the quality of a clustering or dimension reduction, which you can look into. I'm not an expert on that topic.
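
As a small illustration of the validation step, the sketch below computes the AUC directly from its pairwise definition: the probability that a randomly chosen positive case gets a higher score than a randomly chosen negative one, with ties counted as half. The function name `auc` and the labels and scores are made up; in practice `sklearn.metrics.roc_auc_score` would be the usual choice, and the scores would come from models fit with and without the new grouper features.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), which is
    fine for a sketch but slow for large data sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy numbers only: 1 = readmitted within 30 days, 0 = not readmitted.
labels = [1, 0, 1, 0, 0]
baseline_scores = [0.6, 0.5, 0.4, 0.3, 0.7]       # hypothetical model without new groupers
with_grouper_scores = [0.8, 0.5, 0.6, 0.3, 0.4]   # hypothetical model with new groupers

baseline_auc = auc(labels, baseline_scores)
with_grouper_auc = auc(labels, with_grouper_scores)
improved = with_grouper_auc > baseline_auc
```

For a fair comparison, both sets of scores should be predictions on held-out people who were not used to fit either the groupers or the outcome model.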
