biosimulators / biosimulators_utils

Utilities for building standardized command-line interfaces for biosimulation software packages

Home Page: https://docs.biosimulators.org/Biosimulators_utils

License: MIT License

Python 99.85% CSS 0.15%
systems-biology computational-biology mathematical-modeling simulation sed-ml combine omex biosimulators kisao python

biosimulators_utils's Introduction

BioSimulators utils

Command-line application and high-level utilities for reading, writing, validating, and executing COMBINE/OMEX format files that contain descriptions of simulations in Simulation Experiment Description Markup Language (SED-ML) format with models in formats such as the BioNetGen Language (BNGL) and the Systems Biology Markup Language (SBML).

Installation

Requirements

  • Python >= 3.7
  • pip (latest)

Optional requirements

  • Docker: required to execute containerized simulation tools
  • Java: required to parse and validate NeuroML/LEMS files
  • Perl: required to parse and validate BioNetGen files
  • RBApy: required to parse and validate RBA files
  • XPP: required to parse and validate XPP files

Install latest release from PyPI

pip install biosimulators-utils

Install latest revision from GitHub

pip install git+https://github.com/biosimulators/Biosimulators_utils.git#biosimulators_utils

Install optional features

To use BioSimulators utils to validate BNGL models, install BioSimulators utils with the bgnl option:

pip install biosimulators-utils[bgnl]

To use BioSimulators utils to validate CellML models, install BioSimulators utils with the cellml option:

pip install biosimulators-utils[cellml]

To use BioSimulators utils to validate LEMS models, install Java and then install BioSimulators utils with the lems option:

pip install biosimulators-utils[lems]

To use BioSimulators utils to validate NeuroML models, install BioSimulators utils with the neuroml option:

pip install biosimulators-utils[neuroml]

To use BioSimulators utils to validate SBML models, install BioSimulators utils with the sbml option:

pip install biosimulators-utils[sbml]

To use BioSimulators utils to validate Smoldyn models, install BioSimulators utils with the smoldyn option:

pip install biosimulators-utils[smoldyn]

To use BioSimulators utils to convert Escher metabolic maps to Vega flux data visualizations, install BioSimulators utils with the escher option:

pip install biosimulators-utils[escher]

To use BioSimulators utils to execute containerized simulation tools, install BioSimulators utils with the containers option:

pip install biosimulators-utils[containers]

To use BioSimulators utils to log the standard output and error produced by simulation tools, install BioSimulators utils with the logging option:

pip install biosimulators-utils[logging]

Dockerfile and Docker image

This package is available in the ghcr.io/biosimulators/biosimulators Docker image. This image includes all of the optional dependencies and installation options.

To install and run this image, run the following commands:

docker pull ghcr.io/biosimulators/biosimulators
docker run -it --rm ghcr.io/biosimulators/biosimulators

This image includes this package, as well as standardized Python APIs for the simulation tools validated by BioSimulators. Because this image aims to incorporate as many simulation tools as possible within a single Python environment, this image may sometimes lag behind the latest version of this package.

The Dockerfile for this image is available here.

Tutorials

Command-line interface

A tutorial for the command-line interface is available here.

Python API

Interactive tutorials for using BioSimulators-utils and Python APIs for simulation tools to execute simulations are available online from Binder here. The Jupyter notebooks for these tutorials are also available here.

API documentation

API documentation is available here.

License

This package is released under the MIT license.

Development team

This package was developed by the Karr Lab at the Icahn School of Medicine at Mount Sinai in New York and the Center for Reproducible Biomedical Modeling with assistance from the contributors listed here.

Contributing to BioSimulators utils

We enthusiastically welcome contributions to BioSimulators utils! Please see the guide to contributing and the developer's code of conduct.

Funding

This work was supported by National Institutes of Health award P41EB023912.

Questions and comments

Please contact the BioSimulators Team with any questions or comments.

biosimulators_utils's People

Contributors

alexpatrie, allcontributors[bot], b-tao, bilalshaikh42, codebydrescher, eagmon, jcschaff, jonrkarr, luciansmith, trellixvulnteam

biosimulators_utils's Issues

Latest version does not install on Ubuntu 20.04 due to missing `python-libcombine==0.2.9`

I just tried to install the latest version on Ubuntu, but the python-libcombine==0.2.9 package does not exist for Ubuntu.

(sbmlsim) mkoenig@trip3:~/git/sbmlsim$ pip install biosimulators-utils==0.1.76
Collecting biosimulators-utils==0.1.76
  Using cached biosimulators_utils-0.1.76-py2.py3-none-any.whl (125 kB)
Requirement already satisfied: requests in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (2.25.1)
Requirement already satisfied: python-dateutil in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (2.8.1)
Collecting termcolor
  Downloading termcolor-1.1.0.tar.gz (3.9 kB)
Collecting h5py
  Downloading h5py-3.2.1-cp38-cp38-manylinux1_x86_64.whl (4.5 MB)
     |████████████████████████████████| 4.5 MB 1.8 MB/s 
Requirement already satisfied: cement in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (3.0.4)
Requirement already satisfied: mpmath in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (1.2.1)
Requirement already satisfied: matplotlib in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (3.3.4)
Requirement already satisfied: lxml in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (4.6.3)
Requirement already satisfied: pandas in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (1.2.2)
Collecting networkx
  Using cached networkx-2.5.1-py3-none-any.whl (1.6 MB)
Collecting evalidate
  Using cached evalidate-0.7.7.tar.gz (6.3 kB)
Requirement already satisfied: appdirs in /home/mkoenig/envs/sbmlsim/lib/python3.8/site-packages (from biosimulators-utils==0.1.76) (1.4.4)
ERROR: Could not find a version that satisfies the requirement python-libcombine>=0.2.9 (from biosimulators-utils)
ERROR: No matching distribution found for python-libcombine>=0.2.9

Best, Matthias

Allow datasets to have different shapes through padding with nan

Motivation: stochastic simulations can have lengths that are themselves stochastic (e.g., because they terminate when a condition is reached, rather than at a predetermined time point). It should be possible to build a report that includes predictions from multiple simulations that have different lengths.
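As a rough illustration of the padding idea, the sketch below pads one-dimensional results to a common length with NaN. It is not the package's implementation; the function name pad_with_nan and the example run labels are hypothetical, and plain Python lists stand in for whatever array type the report writer actually uses.

```python
import math


def pad_with_nan(datasets):
    """Pad 1-D result lists to a common length with NaN (hypothetical helper)."""
    max_len = max((len(values) for values in datasets.values()), default=0)
    return {
        label: list(values) + [math.nan] * (max_len - len(values))
        for label, values in datasets.items()
    }


# Two made-up stochastic runs that terminated after different numbers of steps
results = {
    "run_1": [0.0, 1.0, 2.0, 3.0],
    "run_2": [0.0, 1.5],
}
padded = pad_with_nan(results)
```

After padding, every data set has the same length, so the results can be stacked into a single rectangular report, and NaN marks the points that a shorter simulation never produced.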

Incorporate validation of libOmexMeta files

@jhgennari @CiaranWelsh I'd like to incorporate validation of OMEX meta files here as part of comprehensive validation of COMBINE archives. Once integrated here, it will become part of validation deployed at https://run.biosimulations.org/validate and it will be integrated into each simulation tool.

  • Can you provide a Python snippet to use pyomexmeta to validate a file?
  • What is the format URL that you're using in conjunction with manifests of COMBINE archives?

Support new features introduced in SED-ML L1V4

  • Combinations of targets and symbols (formerly handled with implicit XPATHs)
  • Simple repeated task
  • Remaining dimensions
  • Plots styling
  • New types of plots
  • References from variables for specific models involved in repeated tasks

Add commandline output for licence

According to GPL:

If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:

{project}  Copyright (C) {year}  {fullname}
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".

We can add a field to __main__.py that contains any needed licence text, along with options for displaying the licence.
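One way this could look is sketched below: the notice text is kept in a single template and filled in at start-up. The names NOTICE_TEMPLATE and format_notice, and the example project, year, and holder values, are hypothetical and are not part of the package.

```python
# Template copied from the GPL guidance quoted above
NOTICE_TEMPLATE = (
    "{project}  Copyright (C) {year}  {fullname}\n"
    "This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.\n"
    "This is free software, and you are welcome to redistribute it\n"
    "under certain conditions; type `show c' for details."
)


def format_notice(project, year, fullname):
    """Fill in the start-up notice (hypothetical helper)."""
    return NOTICE_TEMPLATE.format(project=project, year=year, fullname=fullname)


# Placeholder values for illustration only
notice = format_notice("example-tool", 2021, "Example Author")
```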

Incorporate validation of CellML files

Can be done once there's a libCellML distribution for Python3.9/Linux (could also be done by compiling from source, but a precompiled distribution is easier).

Annotate axes of HDF5 outputs

Use SIO to annotate semantic meaning of each implicit dimension (e.g., time, space, replicates)

  • Dataset: SIO_000921 (dependent variable)
  • Time: SIO_000418 (time instant)
  • X-axis: SIO_000452 (x-axis)
  • Y-axis: SIO_000453 (y-axis)
  • Z-axis: SIO_000454 (z-axis)
  • Replicates: SIO_001419 (collection of replicates)

This requires some discussion

  • Reports can combine results from multiple simulations. In this case, the implicit dimensions don't have clear meanings. Should we limit reports to only have the results of a single simulation?

To implement this

  • Upon collection of the results of each data generator/dataset, collect the meanings of the implicit dimensions from the simulators
    - UniformTimeCourse/non-spatial: time
    - UniformTimeCourse/spatial: time; x, y, z coordinates
  • Pass this information to the ReportWriter
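The mapping from implicit dimensions to SIO terms could be sketched as follows. The SIO identifiers are the ones listed above; the function annotate_dimensions and the dictionary layout are assumptions made for illustration, not the ReportWriter interface.

```python
# SIO terms taken from the list above
SIO_TERMS = {
    "dataset": "SIO_000921",
    "time": "SIO_000418",
    "x": "SIO_000452",
    "y": "SIO_000453",
    "z": "SIO_000454",
    "replicates": "SIO_001419",
}


def annotate_dimensions(is_spatial):
    """Return the SIO term for each implicit dimension of a uniform time course
    result (hypothetical helper)."""
    dims = ["time"] + (["x", "y", "z"] if is_spatial else [])
    return [{"dimension": name, "sio_id": SIO_TERMS[name]} for name in dims]


non_spatial = annotate_dimensions(is_spatial=False)
spatial = annotate_dimensions(is_spatial=True)
```

A simulator would produce such a list alongside each data set, and the writer could then store it as metadata on the corresponding output.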

Support fully not resetting models during repeated tasks

  • When RepeatedTask.resetModel=False and the model of the first task of an iteration is the same as the model of the last task of the previous iteration, execute the simulation starting from the final state of the previous simulation (i.e., copy the final state of the previous simulation to the initial conditions of the next simulation).
  • When consecutive sub-tasks reference the same model, execute the second simulation starting from the final state of the previous simulation.
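The difference between resetting and not resetting can be shown with a toy loop. This is a sketch only: run_iterations and double_x are hypothetical names, a dictionary stands in for the model state, and real simulators would need their own way to copy a final state into initial conditions.

```python
def run_iterations(simulate, initial_state, n_iterations, reset_model):
    """Run a simulation repeatedly, optionally carrying state between iterations."""
    state = dict(initial_state)
    final_states = []
    for _ in range(n_iterations):
        if reset_model:
            # Start every iteration from the original model
            state = dict(initial_state)
        # Otherwise the simulation continues from the previous final state
        state = simulate(state)
        final_states.append(dict(state))
    return final_states


def double_x(state):
    """Toy stand-in for a simulation: doubles the value of x."""
    return {"x": state["x"] * 2}


with_reset = run_iterations(double_x, {"x": 1}, 3, reset_model=True)
without_reset = run_iterations(double_x, {"x": 1}, 3, reset_model=False)
```

With resetting, every iteration ends at x = 2; without it, the iterations end at x = 2, 4, and 8 because each one starts where the previous one finished.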

Allow model.source to be an id of another model

SED-ML allows the following additional ways of defining the sources of models

For compatibility with the spirit of COMBINE archives, I think SED-ML files in COMBINE archives should be self-contained, and not reference external entities via URLs and identifiers.

Improve error messaging

  • Add SBML validation to BoolNet, COPASI
  • More flexible recognition of SED URNs for model languages
  • Stricter validation of unique ids

Allow subtasks of a repeated task to use different models

This is mostly allowed, with a couple of exceptions:

  • Validation of XPath targets of variables of data generators needs to be appropriately handled for instances of Task
    • Check that targets are valid with the language of at least one subtask
  • Handle errors in task executer not being able to generate all variables
  • Add test for this to test suite

Expand capabilities of execution status logging

BioSimulators utils

  • Execute each SED document, task, and output
    • Catch exceptions so that the execution doesn't stop on the first failure
    • Collect all exceptions and raise them at the end of the execution of the entire archive
    • When there's at least one exception, terminate with non-zero exit code
    • Generate as much of each output as possible
  • For each archive, task, and output capture:
    • Exception
      • Type
      • Message
    • Reason for skip
    • Stdout/err
      • Stdout/err merged together
      • ANSI formatting codes
    • Walltime (duration)
  • For each task capture
    • The KiSAO id for the algorithm that was executed
    • Additional simulator-specific details of the execution such as the method and its arguments
  • Make CLI print errors in red
  • Print warnings in yellow

tellurium

  • Apply changes to exec_sed_doc in biosimulators_tellurium
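The "catch, collect, and report at the end" behaviour described above could be sketched like this. The function run_all, the status strings, and the log layout are assumptions for illustration and do not reflect the package's actual log schema.

```python
import time


def run_all(steps):
    """Run every step, recording status, exception details, and wall time."""
    log = {}
    errors = []
    for name, func in steps.items():
        start = time.perf_counter()
        entry = {"status": "SUCCEEDED", "exception": None}  # hypothetical status names
        try:
            func()
        except Exception as exc:
            # Keep going so that later steps still run
            entry["status"] = "FAILED"
            entry["exception"] = {"type": type(exc).__name__, "message": str(exc)}
            errors.append(exc)
        entry["duration"] = time.perf_counter() - start
        log[name] = entry
    return log, errors


def failing_task():
    """Toy task that always fails."""
    raise ValueError("unsupported algorithm")


log, errors = run_all({"task_1": failing_task, "task_2": lambda: None})
# A non-zero exit code signals that at least one step failed
exit_code = 1 if errors else 0
```

Here task_2 still runs even though task_1 fails, and the collected errors determine the exit code once the whole archive has been processed.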

Add warnings

  • Inconsistent curve/surface axes
  • Duplicate data set labels
  • Tasks that don't contribute to outputs
  • Data generators that don't contribute to outputs
  • Data generators for reports/plots may have inconsistent shapes because they come from different kinds of tasks
    • Different kinds of simulations (e.g., time course vs steady-state)
    • Different kinds of tasks (basic vs. repeated)
  • Data sets of reports, x/y of curves, and x/y/z of surfaces that do not have consistent shapes
    • Different kinds of simulations (e.g., time course vs steady-state)
    • Different kinds of tasks (basic vs. repeated)
  • Subtasks have different shapes
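Two of these checks (duplicate data set labels and tasks that feed no output) could be sketched with the standard warnings module. The function check_report and its arguments are hypothetical; the real validation would operate on SED-ML document objects rather than plain lists.

```python
import warnings
from collections import Counter


def check_report(data_set_labels, task_ids, tasks_used_by_outputs):
    """Warn about duplicate data set labels and tasks that feed no output."""
    messages = []
    for label, count in Counter(data_set_labels).items():
        if count > 1:
            messages.append(f"Data set label '{label}' is used {count} times")
    for task_id in task_ids:
        if task_id not in tasks_used_by_outputs:
            messages.append(f"Task '{task_id}' does not contribute to any output")
    for message in messages:
        warnings.warn(message, UserWarning)
    return messages


with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # keep the demonstration quiet
    found = check_report(["time", "A", "A"], ["task_1", "task_2"], {"task_1"})
```

Issuing warnings rather than raising exceptions lets execution continue while still surfacing likely mistakes to the user.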

Annotate the dimensions and slice of reports

It would be helpful to be able to capture information about the semantic meaning of each dimension and slice of a report. Presently, this is challenging to do in a general way because SED reports can mix results from multiple tasks of multiple simulations of multiple models. This would be possible if a report was restricted to contain the outputs of a single task (or repeated task).

Add method of executing simulations with Singularity images

Example:
singularity run -B out:/root image.sif -i /root/Lotka-Volterra.omex -o /root

Incorporate as options of

  • biosimulators_utils.simulator.exec.exec_sedml_docs_in_archive_with_containerized_simulator
  • biosimulators_utils.__main__.ExecuteModelingProjectController
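A helper that assembles the example command above might look like the sketch below. The function build_singularity_command and its parameters are hypothetical; only the command-line layout comes from the example, and the command is built here without being executed.

```python
import shlex


def build_singularity_command(image, archive_name, host_out_dir, container_dir="/root"):
    """Build the argument list for the `singularity run` example (hypothetical helper)."""
    return [
        "singularity", "run",
        "-B", f"{host_out_dir}:{container_dir}",  # bind the host directory into the container
        image,
        "-i", f"{container_dir}/{archive_name}",  # COMBINE/OMEX archive inside the container
        "-o", container_dir,                      # output directory inside the container
    ]


command = build_singularity_command("image.sif", "Lotka-Volterra.omex", "out")
# Quote each argument so the string is safe to display or paste into a shell
command_line = " ".join(shlex.quote(arg) for arg in command)
```

Keeping the command as a list of arguments means it could be passed directly to subprocess.run without shell quoting concerns.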

Should reports be disallowed from containing datasets with different time scales?

Example: A simulation experiment involves two tasks, one in minutes and one in seconds. Both tasks have the same number_of_points. Their results are combined into separate datasets within a single report. Doing this obscures the meaning of the second (time) dimension of the report.

Should BioSimulators utils continue to support this, or raise an exception?
