mit-lcp / mimic-iii-paper

Repository for the paper describing MIMIC-III

Home Page: http://www.nature.com/articles/sdata201635

Jupyter Notebook 97.56% TeX 2.44%
mimic-iii physionet icu

mimic-iii-paper's Introduction

Paper describing the MIMIC-III critical care database

This repository contains the content (LaTeX, code) used to create the official citation for the MIMIC-III Critical Care Database. The citation is:

MIMIC-III, a freely accessible critical care database. Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG. Scientific Data 3:160035 doi: 10.1038/sdata.2016.35 (2016). http://www.nature.com/articles/sdata201635

For more information on MIMIC-III, see: http://mimic.physionet.org/

mimic-iii-paper's People

Contributors

alistairewj, tompollard

mimic-iii-paper's Issues

Modify the title of your submission

We must ask that you modify the title of your submission in order to bring it in line with the style and standards of the journal. We do not permit the use of colons or parentheses in titles as these are reserved to denote specific article types (e.g. corrigenda) and for the purposes of clarity we prefer to avoid the use of acronyms. With these guidelines in mind we would like to suggest the following alternative:

"MIMIC-III, a public access database of critical medical care from the Beth Israel Deaconess Medical Center"

The full definition of the acronym can be included in the abstract. Please feel free, though, to suggest another alternative that you feel well-represents this dataset and fits our formatting standards.

Discuss efforts towards common data models

More broadly re: the above, there have been efforts to convert MIMIC to other data models (e.g. https://github.com/shamsbayzid/mimic-cdm). This would seem to be a valuable aspect that could be touted in this paper. Perhaps in the future, MIMIC could offer a version of the dataset in such a format, so as to leverage predictive models, analytics tools, etc. generated by organizations such as OHDSI.

The collaborative research approaches described in the conclusion are appreciated. MIMIC-III will undoubtedly be a highly valuable asset to the research community.

Range

My research involves exploring computational architectures, in particular neural networks, to find, express and extract information from ICU variables in order to improve clinical decision making and patient outcomes.

I am currently working on a subset of the medical data (MIMIC III) and would like to know if you have the average normal ranges for the variables used in the database. For clarity, the ranges I am referring to are the normal ranges (that is for patients not in ICU), for example, the normal range for temperature is 97°F (36.1°C) to 99°F (37.2°C).

I am unable to obtain reliable ranges for all the variables from one source and I am hoping to obtain the ranges directly from your team as you are the source of the data and this reduces any discrepancies.
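
The Fahrenheit and Celsius bounds quoted in this issue can be cross-checked with the standard conversion formula. The following is a minimal Python sketch; the function name is illustrative, and the values are only the ones given in the issue text, not an authoritative reference range.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Bounds quoted in the issue text (not an authoritative clinical range).
low_f, high_f = 97.0, 99.0

# Round to one decimal place, matching the precision used in the issue.
low_c = round(fahrenheit_to_celsius(low_f), 1)
high_c = round(fahrenheit_to_celsius(high_f), 1)

print(f"{low_f}°F = {low_c}°C, {high_f}°F = {high_c}°C")
```

Rounded to one decimal place, the converted values agree with the Celsius figures stated in the question.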

Submission of a 'cover image' to highlight article on the Scientific Data homepage

We have the option of submitting a cover image for the article:

We would welcome the submission of a 'cover image' to highlight your Data Descriptor on the Scientific Data homepage. Images should relate to the content of your Data Descriptor, but need not be contained within the paper. Images should be at least 1400px x 400px in size and of as high resolution as possible (ideally 300dpi), with the focus of the image (if applicable) to the right of centre. Unfortunately, we cannot promise that your suggestion will be used.

Does anyone have a suggestion for an image that might be suitable (e.g. a non-sensitive photo from the ICU or an existing data visualisation)? If not then I'll submit Figure 1 (https://github.com/MIT-LCP/mimic-iii-paper/blob/master/MIMICData.png).

Provide tables detailing Data Source -> Class of data -> Table name for each Table

As you may know, Scientific Data provides full structured metadata records to document the provenance, manner of generation, and location in public repositories of the datasets linked to our Data Descriptors.

I have begun to look at your article, but in order to proceed I require the following information.

Please provide tables detailing Data Source -> Class of data -> Table name for each table in the PhysioNet archive.

Data Source will be one of the sources listed in 'database development', i.e. one of:

  1. archives from critical care information systems
  2. hospital electronic health record databases
  3. Social Security Administration Death Master File

'Class of data' will be as listed in Table 3 and 'Table name' as listed in Table 4.

Each 'class of data' may have several 'Data Sources' and each 'Table name' may contain several 'classes of data'.
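
Because the relationship is many-to-many in both directions, one way to hold the requested information is a flat list of (data source, class of data, table name) rows that can be grouped either way. The sketch below assumes that representation. The three data sources are the ones listed in this issue; the class and table names are hypothetical placeholders, not the actual contents of Table 3 or Table 4.

```python
from collections import defaultdict

# Data sources as listed in the issue text.
SOURCES = (
    "archives from critical care information systems",
    "hospital electronic health record databases",
    "Social Security Administration Death Master File",
)

# Each row is (data source, class of data, table name).
# The class and table names are hypothetical placeholders only.
ROWS = [
    (SOURCES[0], "class_a", "TABLE_X"),
    (SOURCES[1], "class_a", "TABLE_X"),
    (SOURCES[1], "class_b", "TABLE_X"),
    (SOURCES[2], "class_b", "TABLE_Y"),
]


def sources_by_class(rows):
    """Group rows so each class of data maps to the set of its data sources."""
    grouped = defaultdict(set)
    for source, data_class, _table in rows:
        grouped[data_class].add(source)
    return dict(grouped)


def classes_by_table(rows):
    """Group rows so each table name maps to the set of classes it contains."""
    grouped = defaultdict(set)
    for _source, data_class, table in rows:
        grouped[table].add(data_class)
    return dict(grouped)


for table, classes in sorted(classes_by_table(ROWS).items()):
    print(table, sorted(classes))
```

Keeping the rows flat avoids committing to one nesting order, so the same data can be printed per table or per class as the journal requires.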

Details of deidentification process

The de-identification process referenced may have been rigorously evaluated (ref to 2008 study), but has this validation been repeated? Were additional efforts made to confirm de-identification of the current dataset? If there have been additional automated advances or manual effort in the de-identification process, they should be noted.

Given the frequent use of MIMIC for research, references to some of these studies is recommended

In this paper, the authors report on the MIMIC-III database, providing characteristics of its generation and content. As MIMIC has served as an important and frequently used research database, the authors' ongoing work is appreciated and of great value to the community. I do, however, have some suggestions regarding the paper and the database.

Page 2: Given the frequent use of MIMIC for research, references to some of these studies are recommended in the background section to convey the value of this dataset.

List of data tables may be more appropriately displayed in a Table

The authors describe an updated release of the MIMIC data. MIMIC and MIMIC-II have been an essential and singular source of electronic health record data to support informatics, data science, and healthcare research.

MIMIC-III is well described. I have three comments. First, the list of data tables may be more appropriately displayed in a Table rather than a list in the manuscript.

Mapping of medication concepts to RxNorm or other standard terminology

The authors do not mention any mapping of medication concepts to RxNorm or other standard terminology. Similarly, they do not mention the mapping of observations to LOINC codes. My supposition is that this is not mentioned because these mappings do not exist within MIMIC, but the authors should be explicit if this is the case. Given that other researchers have mapped MIMIC to standard terminologies (e.g., https://lhncbc.nlm.nih.gov/project/discoveries-clinical-data), it would be valuable to incorporate such mappings directly.

Figure numbers were missing from the in-text figure references in the PDF

Please note that figure numbers were missing from the in-text figure references in the provided PDF. In any case, the figures themselves will need to be removed from the LaTeX document and uploaded separately with your revised manuscript, so you may find that you need to hard code the figure references into the LaTeX. Figure legends should be provided at the end of the document.

At present Data Citation 1 is not referenced

At present Data Citation 1 is not referenced. To maximise the visibility of your dataset please ensure that this is cited in at least one place within the text. An ideal location for this may be the Data Records section.

Check ISA (Investigation, Study, Assay) formatted metadata records

I have now completed the metadata records for your article. These records follow the ISA (Investigation, Study, Assay) format (www.isa-tools.org), and in accordance with the Scientific Data specification of these guidelines I have generated Investigation, Study, and Assay files to accompany your manuscript (please see the attached ZIP archive for the complete file package).

The investigation file provides an overview of the work. The study file describes the samples and subjects assayed in each work, and the assay file the manipulations performed on a set of samples to generate one or more related datasets. Please note that many of the fields in the investigation file are currently empty, as these will be automatically filled in during the publication process.

I have been able to obtain most of the necessary information from your manuscript and the data records on PhysioNet. However, the data source table you sent had two additional classes that are not defined in Table 3. Therefore, please could you add 'Interventions' and 'Dictionary' classes to Table 3 and send an updated Table 3 back to us.

Please could you also perform a sanity check of the metadata files, in case I have mistaken any details of the experimental workflow or if any fields contain erroneous information.

In addition, you may be aware that Scientific Data articles are published with a structured summary table which appears after the article abstract. This table is generated from the machine readable metadata files, and therefore uses standardized structured vocabulary terms. The table for your article is below, please confirm whether the terms used here are factually correct.

Design Type(s): data integration objective
Measurement Type(s): Demographics • clinical measurement • intervention • Billing • Medical History Dictionary • Pharmacotherapy • clinical laboratory test • medical data
Technology Type(s): Electronic Medical Record • Medical Record • Electronic Billing System • Medical Coding Process Document • Free Text Format
Factor Type(s):
Sample Characteristic(s): Homo sapiens

ISA-Tab_Pollard_20160503_1462277300_1.zip

Expire flag ?

When was the expire flag most recently updated: 2012, 2013, 2014, or some number of years after that?
Is it updated now for all patients?
As for the Social Security deaths, is it safe to assume they are all updated to at least 3 years after the MIMIC-III data migration was closed?

Another question:
How do ITEMIDs 225811 and 225059 get into the database?
Who provides this information?
Is it part of the EMR, added by physicians?
Or is it produced by post-processing of the H&P?
Or does it perhaps come from nursing notes?
These ITEMIDs are past medical history.

Also, does HEMO PD under ITEMIDs 225811 and 225059 mean the patient has end-stage renal disease and is on hemodialysis?
I would assume so.

Figures are not of sufficient quality to be used for final publication

Thank you again for submitting your Data Descriptor, "MIMIC-III, a freely accessible critical care database", for consideration at Scientific Data. Your manuscript is now entering the final evaluation stage, and in anticipation that this work could be accepted for publication, we are performing a series of final checks to ensure that the production process may proceed as rapidly as possible post-acceptance.

Unfortunately, the current figures are not of sufficient quality to be used for final publication.

  • Line-art figures of this kind must be provided in a manner that retains the text and lines as vector-graphics. You may need to remake these figures in an appropriate vector-graphics program like Illustrator or Inkscape. Please then save the images directly in the EPS or PDF formats.
  • Final figures must be formatted to fit easily on a single portrait-oriented page, and the text must be easily readable when printed. We feel that the text of Figure 2, in particular, will be too small to read when reduced to fit the width of a single page (accounting for margins).

Please submit your finalised figures by email to the editorial office at [email protected], clearly stating the manuscript tracking number (SDATA-16-00042A) in the subject header. We hope to be able to render a decision on your paper in the very near future.

SDATA-16-00042A Initial Quality Check

Thank you for submitting the revised version of your manuscript to Scientific Data. Before passing your submission on to the editor, it was checked for formatting and compliance with journal style. We ask that you please address the following issues before we proceed:

  1. The MIMIC-III data citation should be cited as Data Citation 1 throughout, not as reference 15. Please update these citations.
  2. Please remove the title from Data Citation 1.
  3. The second link in the Code Availability section leads to a 404 page. Could you please either correct this link or confirm that the page will be made live upon publication, in the event of acceptance.

Past Medical History ITEMIDs

How do ITEMIDs 225811 and 225059 get into the database?
Who provides this information?
Is it part of the EMR, added by physicians?
Or is it produced by post-processing of the H&P?
Or does it perhaps come from nursing notes?
These ITEMIDs are past medical history.

Also, does HEMO PD under ITEMIDs 225811 and 225059 mean the patient has end-stage renal disease and is on hemodialysis?
I would assume so.
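
A possible first step towards answering this is to look the ITEMIDs up in the D_ITEMS dictionary table. This assumes the schema of the public MIMIC-III release, in which DBSOURCE names the originating system and LINKSTO names the events table holding the values. The Python sketch below runs the lookup against an in-memory SQLite table so that it is self-contained. The inserted rows are placeholders so the query has something to return; they do not reflect the real labels or sources of these ITEMIDs, and `lookup_items` is a hypothetical helper.

```python
import sqlite3

# ITEMIDs asked about in this issue.
ITEM_IDS = (225811, 225059)

# In-memory stand-in for the D_ITEMS dictionary table (subset of columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE d_items ("
    "itemid INTEGER, label TEXT, dbsource TEXT, linksto TEXT)"
)

# Placeholder rows only: the label, dbsource and linksto values are NOT real.
conn.executemany(
    "INSERT INTO d_items VALUES (?, ?, ?, ?)",
    [
        (225811, "<placeholder label>", "<placeholder source>", "<placeholder table>"),
        (225059, "<placeholder label>", "<placeholder source>", "<placeholder table>"),
        (1, "<unrelated placeholder>", "<placeholder source>", "<placeholder table>"),
    ],
)


def lookup_items(connection, item_ids):
    """Return (itemid, label, dbsource, linksto) rows for the given ITEMIDs."""
    placeholders = ", ".join("?" for _ in item_ids)
    query = (
        "SELECT itemid, label, dbsource, linksto FROM d_items "
        f"WHERE itemid IN ({placeholders}) ORDER BY itemid"
    )
    return connection.execute(query, tuple(item_ids)).fetchall()


rows = lookup_items(conn, ITEM_IDS)
for row in rows:
    print(row)
```

Against a real MIMIC-III installation the same query would be issued through the database's own driver; the dictionary row does not by itself say who entered the value, which still needs an answer from the maintainers.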

Unification of tables would be in line with evolution of shared dataset models

The MIMIC-III data structure is obviously a reflection of the underlying data sources, and of an effort to maintain some consistency with MIMIC-II. But ideally such a dataset would unify procedures, for example, into a single table rather than having a CPT events table and an ICD procedure table. Such unification would require a more robust vocabulary approach but would be more in line with the evolution of shared dataset models (e.g., OMOP CDM). Similarly, DATETIMEEVENTS seems like a complicating table whose constituent data may have better logical homes elsewhere.
