isabella232 / healthcare-data-harmonization

This project forked from googlecloudplatform/healthcare-data-harmonization

This is an engine that converts data of one structure to another, based on a configuration file which describes how. There is an accompanying syntax to make writing mappings easier and more robust.

Home Page: https://cloud.google.com/solutions/healthcare-life-sciences

License: Apache License 2.0

Shell 1.28% Go 70.31% ANTLR 0.51% Dockerfile 0.51% JavaScript 0.55% Python 7.74% TypeScript 8.88% CSS 0.01% Jupyter Notebook 3.53% Java 6.69%

healthcare-data-harmonization's Introduction

Google HCLS Data Harmonization

Summary

This is an engine that converts data from one structure to another, based on a configuration file that describes the mapping.

The configuration file can be written either in the native protobuf format or in the more concise Whistle Data Transformation Language, which is transpiled to protobuf configs for you.

The engine accepts data in JSON format and outputs it in JSON format. For information on the mapping configuration, look at the protobuf files in the proto directory.
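To make the idea of a configuration-driven JSON-to-JSON conversion concrete, here is a minimal sketch in Python. It is not Whistle and not the engine's protobuf format: the `mapping` dictionary, the dotted-path notation, the field names, and the helper functions are all hypothetical and exist only to illustrate the general concept.

```python
import json


def get_path(data, path):
    """Follow a dotted path such as 'patient.mrn' through nested dicts."""
    for key in path.split("."):
        data = data[key]
    return data


def set_path(data, path, value):
    """Write value at a dotted path, creating intermediate dicts as needed."""
    keys = path.split(".")
    for key in keys[:-1]:
        data = data.setdefault(key, {})
    data[keys[-1]] = value


def apply_mapping(mapping, source):
    """Build a new structure by copying each source path to its target path."""
    target = {}
    for target_path, source_path in mapping.items():
        set_path(target, target_path, get_path(source, source_path))
    return target


# Hypothetical configuration: target path -> source path.
mapping = {
    "id": "patient.mrn",
    "name.family": "patient.last_name",
    "name.given": "patient.first_name",
}

# Hypothetical input record, parsed from JSON as the engine's input would be.
source = json.loads(
    '{"patient": {"first_name": "Ada", "last_name": "Lovelace", "mrn": "12345"}}'
)

result = apply_mapping(mapping, source)
print(json.dumps(result, sort_keys=True))
# prints {"id": "12345", "name": {"family": "Lovelace", "given": "Ada"}}
```

The real engine supports far more than field-to-field copies; the protobuf files in the proto directory are the authoritative description of what a mapping configuration can express.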

Overview

This repository is organized into several packages that together enable you to author Whistle configs, extend existing mapping configurations, and test configs within a Jupyter notebook environment.

Getting Started

We highly recommend that you start by setting up your Jupyter Notebook environment using the published docker images and running the example notebook. Once set up, work through the Whistle Data Transformation Language Codelab to familiarize yourself with Whistle. As you author more Whistle configs, use the Whistle Data Transformation Language Reference to deepen your understanding of the language.

Details

This project consists of three components: the mapping engine, the mapping language, and the Jupyter notebook UI extensions and magic commands. To build the mapping engine and mapping language packages, first make sure the following are installed and on your PATH:

  1. Golang (>= 1.13)
  2. Java JDK (>= 8)
  3. Protobuf Compiler protoc (>= 3.11.4)

Then run build_all.sh.

This script builds the above packages and runs their tests. In addition, a set of JupyterLab UI extensions and magic commands simplifies the authoring workflow. The extensions are packaged into pre-built, published docker images that contain the Jupyter notebook extensions and magic commands, so using them does not require you to build the mapping engine and mapping library packages. For more details about each package, refer to its individual README.
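Before running build_all.sh, it can help to confirm that the prerequisites listed above are on your PATH. The following is a minimal sketch, not part of the repository. The executable names `go`, `javac`, and `protoc` are assumptions about how the tools are installed on your system, and the script only checks that they can be found; it does not verify the minimum versions.

```python
import shutil

# Executable name -> prerequisite as listed in this README.
# The executable names are assumptions; adjust them for your installation.
REQUIRED_TOOLS = {
    "go": "Golang (>= 1.13)",
    "javac": "Java JDK (>= 8)",
    "protoc": "Protobuf Compiler protoc (>= 3.11.4)",
}


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return the descriptions of required tools that are not found on PATH."""
    return [desc for exe, desc in required.items() if which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        for desc in missing:
            print(f"missing from PATH: {desc}")
    else:
        print("all prerequisites found on PATH (versions not checked)")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` searches the current PATH.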

Language Reference

A language reference is available: Whistle Data Transformation Language Reference

Codelab

Please refer to the Whistle Data Transformation Language Codelab for instructions on how to run the mapping engine and to become familiar with the mapping language.

Sample pipelines

Whistle configs can be executed in Apache Beam. Please refer to the Whistle Dataflow Pipelines Repo for sample pipelines.

Feedback

Want to help the Google Cloud Healthcare and Life Sciences team improve Whistle? Please email [email protected] to connect with the Whistle team and discuss your experience with Whistle.

License

Apache License, Version 2.0

healthcare-data-harmonization's People

Contributors

bebinu, floraliuyf, lastomato, rpolyano, sujy-name-is-su, toby-hu, yeweidaniel, ygupta89, yinyanghu
