
This project forked from ibm/example-health-machine-learning


This code pattern shows you how to train a machine learning model to predict type 2 diabetes using synthesized patient health records.

Home Page: https://developer.ibm.com/patterns/machine-learning-using-synthesized-patient-health-records/

License: Apache License 2.0

Jupyter Notebook 100.00%


DISCLAIMER: This notebook is used for demonstrative and illustrative purposes only and does not constitute an offering that has gone through regulatory review. It is not intended to serve as a medical application. There is no representation as to the accuracy of the output of this application and it is presented without warranty.

Machine learning using synthesized patient health records

This notebook explores how to train a machine learning model to predict type 2 diabetes using synthesized patient health records. The use of synthesized data allows us to learn about building a model without any concern about the privacy issues surrounding the use of real patient health records.

When the reader has completed this Code Pattern, they will understand how to:

  • Prepare data using Apache Spark.
  • Visualize data relationships using PixieDust.
  • Train a machine learning model and publish it in the Watson Machine Learning (WML) repository.
  • Deploy the model as a web service and use it to make predictions.

Flow


  1. Log in to IBM Watson Studio
  2. Load the provided notebook into Watson Studio
  3. Load data in the notebook
  4. Transform the data with Apache Spark
  5. Create charts with PixieDust
  6. Publish and deploy model with Watson Machine Learning
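Once the model is deployed as a web service (step 6), a prediction request is a JSON body sent to the scoring endpoint. The sketch below only builds such a body. It assumes a payload with a list of field names and one row of values per patient; the feature names and the `build_scoring_payload` helper are hypothetical, so check them against the columns the notebook actually trains on and against the format your deployment expects.

```python
import json

# Hypothetical feature names; the notebook's actual columns may differ.
FEATURE_FIELDS = ["hdl", "ldl", "systolic", "diastolic", "bmi"]


def build_scoring_payload(patients):
    """Build a JSON body with the field names and one row of values per patient."""
    rows = []
    for patient in patients:
        # Refuse to score a patient with incomplete readings.
        missing = [name for name in FEATURE_FIELDS if name not in patient]
        if missing:
            raise ValueError(f"missing features: {missing}")
        rows.append([patient[name] for name in FEATURE_FIELDS])
    return json.dumps({"fields": FEATURE_FIELDS, "values": rows})


# Illustrative readings only, not taken from the synthesized records.
payload = build_scoring_payload([
    {"hdl": 45.0, "ldl": 120.0, "systolic": 130.0, "diastolic": 85.0, "bmi": 27.5},
])
print(payload)
```

Keeping the field order in one constant means every row is serialized in the same column order as the field list, which is easy to get wrong when rows are assembled by hand.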

Prerequisites

This project is part of a series of code patterns about a fictional health care company called Summit Health. The company stores electronic health records in a database on a z/OS server. Before you run the notebook, the synthesized health records must be created and loaded into this database. Another project, https://github.com/IBM/summit-health-synthea, provides the steps for doing this. The records are created with a tool called Synthea, then transformed and loaded into the database.

If required, set up the Secure Gateway service to give you a secure way to access your on-premises data source.

Steps

Sign up for Watson Studio

Sign up for IBM Watson Studio.

Create a project

  • Click the Create a project tile.
  • A list of project types appears. Click the Data Science project type.
  • Provide a name for the project (e.g. "diabetes-prediction") and click the Create button.
  • The project is saved in a lite object storage instance in your account.

Create a Watson Machine Learning instance

  • Click on the Settings tab of your project.
  • Scroll down to Associated Services.
  • Click Add service and select Watson from the drop-down menu.
  • Click Add on the Machine Learning tile.
  • Select the lite plan and click the Create button.

Add the notebook to your project

Run the notebook

  • Click on Cell in the menu bar and select All Output > Clear to clear out the existing notebook output.

  • Move your cursor to each code cell and run the code in it. Read the comments for each cell to understand what the code is doing. When the code in a cell is still running, the label to the left changes to In [*]:. Do not continue to the next cell until the code is finished running.

  • Two cells must be updated with your credentials.

    • At the top of the notebook is a cell for your database credentials.
    • Further on you will encounter a cell for your Watson Machine Learning credentials. In order to find these, click on the hamburger menu at the top left of the screen and select Watson Services. Click on your machine learning instance and then click on the Service Credentials tab. Click on View Credentials.
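A credentials cell usually amounts to a dictionary that you fill in by hand. The sketch below is one way to catch a cell that was run before the placeholders were replaced. The key names (`apikey`, `instance_id`, `url`) and the `unfilled_keys` helper are assumptions for illustration; use whatever keys View Credentials shows for your service instance.

```python
# Placeholder values only; replace them with the values shown under View Credentials.
# The key names are assumptions, so match them to what your service instance shows.
wml_credentials = {
    "apikey": "<your-api-key>",
    "instance_id": "<your-instance-id>",
    "url": "<your-service-url>",
}


def unfilled_keys(credentials):
    """Return the keys whose values are empty or still look like <placeholders>."""
    unfilled = []
    for key, value in credentials.items():
        text = str(value).strip()
        if not text or (text.startswith("<") and text.endswith(">")):
            unfilled.append(key)
    return sorted(unfilled)


missing = unfilled_keys(wml_credentials)
if missing:
    print("Fill in these credential fields before continuing:", ", ".join(missing))
```

Running a check like this right after the credentials cell fails fast, instead of surfacing as an authentication error several cells later.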

Sample output

The notebook uses PixieDust to visualize relationships in the data. Here are examples of scatter plots that it can produce.

  • HDL/LDL cholesterol for diabetics vs non-diabetics. The diabetes simulation in Synthea uses a distinct range of HDL readings for diabetic vs. non-diabetic patients. This makes the correlation of cholesterol readings to diabetes abnormally high.

cholesterol-chart

  • Systolic/diastolic blood pressure for diabetics vs non-diabetics. The diabetes simulation in Synthea increases the chance of high blood pressure (hypertension) for diabetics, but non-diabetic patients can also have high blood pressure. The correlation between high blood pressure and diabetes is therefore weak.

bloodpressure-chart

  • Body mass index for diabetics vs non-diabetics. The diabetes simulation in Synthea does not change the weight of diabetic patients, so BMI shows no correlation with diabetes.

bmi-chart
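The difference between the cholesterol and BMI charts can be reproduced numerically. The sketch below computes a Pearson correlation between a reading and a 0/1 diabetes label in plain Python. All values are made up for illustration and are not taken from the Synthea data: the HDL readings fall in separate ranges per group, while the BMI values are identical in both groups.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only.
diabetic = [1, 1, 1, 1, 0, 0, 0, 0]
hdl = [35, 38, 40, 42, 60, 62, 65, 68]  # separate ranges per group
bmi = [24, 28, 31, 27, 24, 28, 31, 27]  # same values in both groups

print("HDL vs diabetes:", round(pearson(hdl, diabetic), 2))
print("BMI vs diabetes:", round(pearson(bmi, diabetic), 2))
```

With separated ranges the coefficient is strongly negative (low HDL goes with the diabetic label), and with identical distributions it is zero, which mirrors what the scatter plots show.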

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.

Apache License FAQ
