This project is forked from bircatmcri/bignn.

bigNN: an open-source big data toolkit focused on biomedical sentence classification

Every day, medical data sources such as scientific literature, medical web pages, health-related social media posts, clinical notes, and drug reviews generate a large amount of text. Processing this data efficiently is a daunting task without suitable computational strategies, which makes text classification an essential operation in big data text analytics. In this contribution, we developed bigNN, an open-source software package for big data text classification. It implements a word2vec neural network model on top of Apache Spark to classify sentences at big data scale in a timely fashion. The software offers a graphical user interface and facilitates reproducible research in sentence analysis by allowing users to configure different sets of Apache Spark and word2vec neural network parameters. Furthermore, we introduce an application of bigNN in the medical informatics domain. bigNN is fully documented and is publicly and freely available at https://github.com/bircatmcri/bigNN.
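To make the idea of classifying sentences from word vectors concrete, here is a minimal, self-contained sketch in plain Java. It is not bigNN's implementation and does not use Apache Spark: it assumes a tiny hand-written table of 2-dimensional word vectors standing in for a trained word2vec model, averages the vectors of the known words in a sentence, and assigns the label whose centroid is closest by cosine similarity. The class name, method names, toy vectors, and the labels `ADE` and `OTHER` are all hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; none of these names come from the bigNN code base.
final class SentenceClassifierSketch {

    // Lowercase the sentence and split on anything that is not a letter.
    // A stand-in for real text pre-processing.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Average the vectors of the words found in the table; unknown words are skipped.
    static double[] sentenceVector(List<String> tokens, Map<String, double[]> wordVectors, int dim) {
        double[] sum = new double[dim];
        int known = 0;
        for (String t : tokens) {
            double[] v = wordVectors.get(t);
            if (v == null) {
                continue;
            }
            for (int i = 0; i < dim; i++) {
                sum[i] += v[i];
            }
            known++;
        }
        if (known > 0) {
            for (int i = 0; i < dim; i++) {
                sum[i] /= known;
            }
        }
        return sum;
    }

    // Cosine similarity; returns 0 when either vector has zero length.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, na = 0.0, nb = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(na * nb);
    }

    // Pick the label whose centroid is most similar to the sentence vector.
    static String classify(String sentence, Map<String, double[]> wordVectors,
                           Map<String, double[]> centroids, int dim) {
        double[] v = sentenceVector(tokenize(sentence), wordVectors, dim);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double score = cosine(v, e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    // Toy model: hand-written vectors and centroids (assumptions, not trained values).
    static String classifyWithToyModel(String sentence) {
        Map<String, double[]> wordVectors = new LinkedHashMap<>();
        wordVectors.put("rash", new double[] {1.0, 0.0});
        wordVectors.put("nausea", new double[] {0.9, 0.1});
        wordVectors.put("caused", new double[] {0.8, 0.2});
        wordVectors.put("dose", new double[] {0.1, 0.9});
        wordVectors.put("daily", new double[] {0.0, 1.0});
        wordVectors.put("tablet", new double[] {0.2, 0.8});

        Map<String, double[]> centroids = new LinkedHashMap<>();
        centroids.put("ADE", new double[] {1.0, 0.0});
        centroids.put("OTHER", new double[] {0.0, 1.0});

        return classify(sentence, wordVectors, centroids, 2);
    }

    public static void main(String[] args) {
        // prints ADE
        System.out.println(classifyWithToyModel("The drug caused severe rash and nausea."));
        // prints OTHER
        System.out.println(classifyWithToyModel("Take one tablet daily at the same dose."));
    }
}
```

In a real pipeline the word vectors would come from a word2vec model trained on a large corpus and the classifier would be learned from labeled sentences; the nearest-centroid rule here is only the simplest stand-in that keeps the example runnable without external libraries.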

bigNN includes the following packages:

| Package Name | Description |
| --- | --- |
| edu.mfldclin.mcrf.bignn.gui | Implementation of the graphical user interface |
| edu.mfldclin.mcrf.bignn.setting | Implementation of the pre-defined and user-defined settings required by the system |
| edu.mfldclin.mcrf.bignn.learning | Implementation of text pre-processing and the neural network learning model |
| edu.mfldclin.mcrf.bignn.evaluation | Evaluation of the neural network predictive model |
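
As an illustration of what evaluating a predictive model can involve, the sketch below computes precision, recall, and F1 for one positive label from gold and predicted labels. This README does not state which metrics the evaluation package reports, so the choice of metrics is an assumption, and the class name, method names, and the labels `ADE` and `OTHER` are hypothetical.

```java
// Illustrative sketch only; not taken from edu.mfldclin.mcrf.bignn.evaluation.
final class EvaluationSketch {

    // Returns counts in the order {tp, fp, fn, tn} for the given positive label.
    static int[] confusion(String[] gold, String[] predicted, String positive) {
        if (gold.length != predicted.length) {
            throw new IllegalArgumentException("gold and predicted must have the same length");
        }
        int tp = 0, fp = 0, fn = 0, tn = 0;
        for (int i = 0; i < gold.length; i++) {
            boolean isPositive = positive.equals(gold[i]);
            boolean saidPositive = positive.equals(predicted[i]);
            if (isPositive && saidPositive) {
                tp++;
            } else if (!isPositive && saidPositive) {
                fp++;
            } else if (isPositive) {
                fn++;
            } else {
                tn++;
            }
        }
        return new int[] {tp, fp, fn, tn};
    }

    // Fraction of predicted positives that are correct; 0 when nothing was predicted positive.
    static double precision(int tp, int fp) {
        return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    // Fraction of actual positives that were found; 0 when there are no actual positives.
    static double recall(int tp, int fn) {
        return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
    }

    // Harmonic mean of precision and recall; 0 when both are 0.
    static double f1(double precision, double recall) {
        return precision + recall == 0.0 ? 0.0 : 2.0 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        String[] gold = {"ADE", "ADE", "OTHER", "OTHER", "ADE"};
        String[] predicted = {"ADE", "OTHER", "OTHER", "ADE", "ADE"};

        int[] c = confusion(gold, predicted, "ADE"); // tp=2, fp=1, fn=1, tn=1
        double p = precision(c[0], c[1]);
        double r = recall(c[0], c[2]);
        System.out.printf("precision=%.3f recall=%.3f f1=%.3f%n", p, r, f1(p, r));
    }
}
```

Guarding the zero-denominator cases keeps the helpers usable on small or skewed test sets, where a label may never be predicted at all.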

Requirements:

  • Apache Spark 2.10
  • Java SE 8

bigNN software architectural model:

The bigNN software architectural model is shown in the following figure.

[Figure: bigNN software architectural model]


Collaborators:

  1. Ahmad P. Tafti (Marshfield Clinic Research Institute)
  2. Ehsun Behravesh (IEEE Member)
  3. Mehdi Assefi (University of Georgia)
  4. Eric LaRose (Marshfield Clinic Research Institute)
  5. Jonathan Badger (Marshfield Clinic Research Institute)
  6. John Mayer (Marshfield Clinic Research Institute)
  7. AnHai Doan (University of Wisconsin-Madison)
  8. David Page (University of Wisconsin-Madison)
  9. Peggy Peissig (Marshfield Clinic Research Institute)

Acknowledgment:

The project described was supported by the Clinical and Translational Science Award (CTSA) program, through the NIH National Center for Advancing Translational Sciences (NCATS), grant UL1TR000427. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Publications:

The workflow and architectural model of bigNN are fully explained in [1]. Publications that use bigNN are encouraged to cite the following two papers. Thanks!

[1] Tafti, A.P., Behravesh, E., Assefi, M., LaRose, E., Badger, J., Mayer, J., Doan, A., Page, D. and Peissig, P., 2017. bigNN: an open-source big data toolkit focused on biomedical sentence classification. IEEE Big Data 2017.

[2] Tafti, A.P., Badger, J., LaRose, E., Shirzadi, E., Mahnke, A., Mayer, J., Ye, Z., Page, D. and Peissig, P., 2017. Adverse Drug Event Discovery Using Biomedical Literature: A Big Data Neural Network Adventure. JMIR Medical Informatics, 5(4), p.e51.
