csdms / bmi

The Basic Model Interface is a standardized set of functions allowing coupling of models to models and models to data

Home Page: https://bmi.readthedocs.io

License: MIT License

TeX 63.36% Python 36.64%

bmi csdms geosciences c fortran python cxx java interface javascript

bmi's Introduction

The Basic Model Interface

The Basic Model Interface (BMI) is a standardized set of functions that allows coupling of models to models and models to data.

The Basic Model Interface (BMI), developed by the Community Surface Dynamics Modeling System (CSDMS), is a standardized set of control and query functions that, when added to a software element such as a model or a dataset, makes that software easier to couple with other software that also exposes a BMI.

A BMI makes a model self-describing and fully controllable by a modeling framework or application. By design, the BMI functions are straightforward to implement in any language, using basic data types from standard language libraries. Also by design, the BMI functions are noninvasive. This means that a model's BMI does not make calls to other components or tools and is not modified to use any framework-specific data structures. A BMI, therefore, introduces no dependencies into a model, so the model can still be used in a stand-alone manner.
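
As a rough illustration of what "self-describing and fully controllable" can look like, here is a minimal sketch of a toy model that exposes a handful of BMI-style functions using only built-in Python types. The class name `HeatModel`, its variable, and its update rule are hypothetical, and the class does not inherit from the official Python specification; it only borrows the BMI function names.

```python
class HeatModel:
    """Toy model exposing a few BMI-style control and query functions."""

    def initialize(self, config_file=None):
        # A real BMI would read settings from config_file; this toy hard-codes them.
        self._time = 0.0
        self._dt = 1.0
        self._values = {"temperature": [280.0, 290.0, 300.0]}

    def update(self):
        # Advance one time step: relax every value halfway toward the mean.
        temps = self._values["temperature"]
        mean = sum(temps) / len(temps)
        self._values["temperature"] = [t + 0.5 * (mean - t) for t in temps]
        self._time += self._dt

    def finalize(self):
        # Release model resources.
        self._values = {}

    def get_output_var_names(self):
        return tuple(self._values)

    def get_var_units(self, name):
        return {"temperature": "K"}[name]

    def get_current_time(self):
        return self._time

    def get_value(self, name, dest):
        # Copy values into a caller-supplied buffer.
        dest[:] = self._values[name]
        return dest


# A caller drives the model without knowing anything about its internals.
model = HeatModel()
model.initialize()
names = model.get_output_var_names()
units = model.get_var_units("temperature")
model.update()
buffer = [0.0] * 3
model.get_value("temperature", buffer)
model.finalize()
```

Because the class uses no framework-specific types, the same object can still be run on its own, which is the point of the noninvasive design described above.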

The BMI is expressed in the Scientific Interface Definition Language (SIDL). From bmi.sidl, CSDMS has derived BMI specifications for five languages--C, C++, Fortran, Java, and Python. For each language, links to the specification and an example implementation are listed in the table below.

BMI languages

Language	Specification	Example implementation
C	bmi-c	bmi-example-c
C++	bmi-cxx	bmi-example-cxx
Fortran	bmi-fortran	bmi-example-fortran
Java	bmi-java	bmi-example-java
Python	bmi-python	bmi-example-python

Detailed instructions for building the specifications and examples are given at each link above. Alternatively, the specifications can be installed through conda (C, C++, Fortran, Python) or Maven (Java). See the links above for details.

While CSDMS currently supports the languages listed above, a BMI specification can be written for any language. BMI is a community-driven standard; contributions that follow the contributor code of conduct are welcomed, and are acknowledged.

The table below lists community-contributed language specifications and examples for two languages, Javascript and Julia.

Community-contributed BMI languages

Language	Specification	Example implementation
Javascript	bmi-js	bmi-example-js
Julia	bmi-julia	bmi-example-julia

The default branch of this repository reflects the current state of development for the BMI. When implementing a BMI, please use the latest release listed in the right sidebar; currently this is Basic Model Interface 2.0. For more information on implementing a BMI, see the documentation.

BMI is open source software released under the MIT License. BMI is an element of the CSDMS Workbench, an integrated system of software tools, technologies, and standards for building and coupling models.

The Community Surface Dynamics Modeling System is supported by the National Science Foundation.

bmi's People

Contributors

Stargazers

Watchers

Forkers

siggyf platipodium cmshobe r-barnes connectedsystems deltaresprojects mwtoews kbarnhart mgalloy mshemuni ajkhattak lj-cug rolfhut philmiller scrasmussen mdpiper

bmi's Issues

how to run ctest in bmi-example-fortran?

The README refers to ctest for running the unit tests and examples, but I'm not familiar with how to run it. Perhaps the command line for running it would be a useful addition to the README.
I did learn something new: ctest is something that comes with cmake.

Disambiguate get_grid_x, y, and z

The get_grid_x, get_grid_y, and get_grid_z BMI functions return the location of grid nodes for rectilinear, structured quadrilateral, and unstructured grids. The behavior of these functions changes depending on the grid type. Take, for example, get_grid_x: for a structured quad or unstructured grid, this function returns a vector with length equal to the number of nodes in the grid, whereas for a rectilinear grid, the return has a length equal to the number of columns in the grid.

It may be confusing to a user that these functions have different behaviors depending on the grid type. We should consider using different functions for the two cases (1=rectilinear, 2=structured quad + unstructured).
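
To make the two behaviors concrete, here is a small sketch of how the length of the get_grid_x result depends on the grid type, as described in this issue. The helper `grid_x_length` is hypothetical and is not part of the BMI; the shape is assumed to be ordered as (rows, columns).

```python
def grid_x_length(grid_type, shape=None, node_count=None):
    """Number of elements get_grid_x fills, per the behavior described above.

    shape is (n_rows, n_cols) for a 2D grid; node_count is the total node count.
    """
    if grid_type == "rectilinear":
        return shape[1]  # one x value per column
    if grid_type in ("structured_quadrilateral", "unstructured"):
        return node_count  # one x value per node
    raise ValueError(f"get_grid_x is not described for grid type {grid_type!r}")


rectilinear_length = grid_x_length("rectilinear", shape=(4, 5))
unstructured_length = grid_x_length("unstructured", node_count=20)
```

For a grid with 4 rows and 5 columns of nodes, the same function name yields 5 values in one case and 20 in the other, which is the ambiguity this issue raises.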

There can be only one (Fortran BMI file)

Follow the pattern set by @mcflugen in the Python bindings of including the BMI file (here, bmi.f90) from the bmi repo as a submodule in the bmi-fortran repo. Currently, there's a copy of this file in each repo, which could lead to inconsistencies.

Mapping of variables to standardnames

eWatercycle II: When used for model coupling, BMI expects variable names of the corresponding models to be mapped to standard names. In practice, modelers are not using these standard names and refer to the original variable names. Is the mapping of variables desirable? If not, how could model coupling, i.e., selecting the right variables, be supported otherwise?

Instead of mapping variable names, we could annotate variables with metadata that specifies the context and meaning of the variable.
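
The two options discussed here can be sketched side by side. The internal names (`temp`, `z`), the standard-name strings, and the metadata keys are illustrative assumptions, not a verified list and not an agreed BMI extension.

```python
# Option 1: map each internal model name to a standard name.
NAME_MAP = {
    "temp": "plate_surface__temperature",  # illustrative standard name
    "z": "land_surface__elevation",        # illustrative standard name
}


def to_standard_name(model_name):
    return NAME_MAP[model_name]


def to_model_name(standard_name):
    # Reverse lookup, so a framework can address the model by standard name.
    for model_name, candidate in NAME_MAP.items():
        if candidate == standard_name:
            return model_name
    raise KeyError(standard_name)


# Option 2: keep the original names and attach descriptive metadata instead.
VAR_METADATA = {
    "temp": {"quantity": "temperature", "object": "plate surface", "units": "K"},
    "z": {"quantity": "elevation", "object": "land surface", "units": "m"},
}


def find_variables(quantity):
    # Select variables by meaning rather than by name.
    return [name for name, meta in VAR_METADATA.items() if meta["quantity"] == quantity]
```

With the second option a coupler would select variables by querying the metadata, so the modeler's original names never need to change.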

Allow variable units to vary with context

Currently, get_var_units returns a static unit for a variable, like "m" or "kg". However, some models have variables with units that vary with context; see, e.g., discussion here.

This is a request to explore how to include context-varying units.

Add function to write+read restart file

eWatercycle II: Add a function that writes a restart file at the current model time, and a function that reads the file to perform the restart. Starting from a restart should still be handled by feeding the appropriate configuration file to the initialize function.
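
One way this request could look in practice is sketched below. The names `write_restart` and `read_restart`, the JSON file layout, and the `restart_file` configuration key are all hypothetical; none of them are existing BMI functions.

```python
import json
import os
import tempfile


class CounterModel:
    """Toy model whose state can be saved to and restored from a restart file."""

    def initialize(self, config_file=None):
        self.time = 0.0
        self.state = [0.0, 0.0]
        if config_file is not None:
            with open(config_file) as f:
                config = json.load(f)
            # The restart is requested through the configuration file.
            if config.get("restart_file"):
                self.read_restart(config["restart_file"])

    def update(self):
        self.state = [value + 1.0 for value in self.state]
        self.time += 1.0

    def write_restart(self, path):  # hypothetical function
        with open(path, "w") as f:
            json.dump({"time": self.time, "state": self.state}, f)

    def read_restart(self, path):  # hypothetical function
        with open(path) as f:
            saved = json.load(f)
        self.time = saved["time"]
        self.state = saved["state"]


with tempfile.TemporaryDirectory() as tmp:
    restart_path = os.path.join(tmp, "restart.json")
    config_path = os.path.join(tmp, "config.json")

    first = CounterModel()
    first.initialize()
    first.update()
    first.update()
    first.write_restart(restart_path)

    # A second run resumes by pointing its configuration at the restart file.
    with open(config_path, "w") as f:
        json.dump({"restart_file": restart_path}, f)
    resumed = CounterModel()
    resumed.initialize(config_path)
```

Keeping the restart path in the configuration file means the initialize signature stays unchanged, which matches the second sentence of the request.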

Define model states

eWatercycle II: Currently it is not clear at what point e.g. querying the grid or model variables should happen (before or after initialization). Guidance is needed about what is allowed in the different states (before init/after init/ after run/ after finalize). Also important for online model initialization or model changes.

Variables associated to shapes

eWatercycle II: Lumped hydrological models, ungridded surface dynamics models and hydraulic structure control models often have no grid but rather geospatial shapes associated with certain variables. How should this be dealt with within BMI? Should we seek extensions?

Allow negative values for grid spacing

This is a request to allow grid spacing to have negative values.

The current documentation for the get_grid_spacing function suggests that grid spacing is a distance between nodes of a grid, so a positive value.

However, it may be helpful to allow a negative value for a grid spacing. For example, geospatial imagery typically defines the top left corner as the origin. A negative y value for the grid spacing would indicate the grid is built from this origin in the negative y direction.
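
A short sketch of the imagery case: node coordinates are built from an origin and a spacing, and a negative spacing walks downward from a top-left origin. The helper `node_coordinates` and the numbers are illustrative assumptions.

```python
def node_coordinates(origin, spacing, count):
    """Coordinates of `count` nodes starting at `origin`, `spacing` apart."""
    return [origin + i * spacing for i in range(count)]


# Top-left origin at y = 100.0 with rows 10.0 units apart, going down.
y = node_coordinates(origin=100.0, spacing=-10.0, count=4)
```

Here `y` decreases from the origin, which a positive-only spacing could not express without moving the origin to the bottom-left corner.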

get_var_location doesn't represent a scalar grid

The possible return values from the get_var_location function:

node
edge
face

don't apply to the scalar grid type.

Currently, for scalar grids, the return from this function is ignored. We should add a new return, such as scalar or none.

How to test compatibility with BMI API?

Merge R version

eWatercycle II: S. Verhoeven has created an R version and we would like to see this merged back.

Merge Matlab version

eWatercycle II: R Hut has created a Matlab version and we would like to see this merged back.

Read the docs is broken on https://github.com/csdms/bmi-python

Package BMI examples

The BMI examples (C, C++, Fortran, and Python) should be packaged and distributed; e.g., through conda-forge, or trivially through the csdms-stack conda channel.

From @mcflugen:

This would be particularly helpful for testing and demonstrating the babelizer.

Grid ID

@mdpiper and @mcflugen another question motivated by making the terrainbento bmi:

I see that there are a number of functions in the bmi.py file that indicate needing a grid identifier. The best/most relevant part of the documentation that I can find was this section on Variables[sic?] Grids. It does not give me enough information.

Reading between the lines I think that what this is saying is that I'm allowed to define multiple grids within my model. If I do this, I need to give them unique IDs that are integers and allow them to be accessed in this way. Correct?
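
The reading described above can be sketched as follows: each grid gets a unique integer ID, and each variable reports the ID of the grid it lives on. The two grids and the variable names are hypothetical examples, and this is one interpretation of the question rather than a confirmed answer.

```python
# Grid descriptions keyed by integer grid ID.
GRIDS = {
    0: {"type": "uniform_rectilinear", "shape": (4, 5)},
    1: {"type": "scalar", "shape": ()},
}

# Each variable is defined on exactly one grid.
VAR_TO_GRID = {
    "temperature": 0,
    "thermal_diffusivity": 1,
}


def get_var_grid(name):
    """Return the integer ID of the grid a variable is defined on."""
    return VAR_TO_GRID[name]


def get_grid_type(grid):
    """Return the type of the grid with the given ID."""
    return GRIDS[grid]["type"]


diffusivity_grid_type = get_grid_type(get_var_grid("thermal_diffusivity"))
```

A caller first asks which grid a variable uses, then passes that ID to the grid functions.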

Pass metadata from/to a model

eWatercycle II: We would like some way to pass metadata from/to a model, such as Author, settings, etc. We prototyped an extension called “attributes”, and would love feedback from the community.

PyMT has this functionality. This may be implemented/extended in BMI

Add a "none" grid type

This is a request to add a new BMI grid type for a variable that could have multiple values, but doesn't have grid information associated with it.

For example, I'd like to list a GDAL GeoTransform as an output variable. This is a fairly well-defined format--it's an array of six float values. But what grid type should be used? It seems like "vector" or perhaps "unstructured" could be used, but the GeoTransform doesn't have any grid associated with it, it's just an array of values; so, e.g., get_grid_x wouldn't be defined, but get_grid_rank and get_grid_shape could be.
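
A sketch of how the proposed type might behave for the GeoTransform example. The grid type string "none" is the proposal in this issue, not an existing BMI grid type, and the coefficient values are made up.

```python
# GDAL GeoTransform: x origin, pixel width, row rotation,
# y origin, column rotation, pixel height (negative for north-up images).
GEOTRANSFORM = [500000.0, 30.0, 0.0, 4600000.0, 0.0, -30.0]


def get_grid_type(grid):
    return "none"  # proposed type: values without node locations


def get_grid_rank(grid):
    return 1  # a flat array of values


def get_grid_shape(grid):
    return (len(GEOTRANSFORM),)


def get_grid_x(grid):
    # No node coordinates exist for this kind of variable.
    raise NotImplementedError("grid type 'none' has no node locations")
```

Rank and shape remain meaningful for sizing buffers, while the coordinate functions are left undefined.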

Formalize N-step initialization

Use cases:

Set MPI_COMMUNICATOR before initialize
Change model settings (e.g. add sources/sinks, simulation quantities) and then initialize those arrays and subsequently set the values

package bmic not found (in a non-standard prefix install)

I first installed bmic:

mkdir _build && cd _build
cmake .. -DCMAKE_INSTALL_PREFIX=/tmp/opt
make install

and then the example-C in the same fashion:

mkdir _build && cd _build
cmake .. -DCMAKE_INSTALL_PREFIX=/tmp/opt

when I get "No package 'bmic' found"

Is there a flag one gives cmake to say that bmic is installed in a non-standard place, or did I not follow all the install instructions?

openjournals/joss-reviews#2317

Pointers in C and Fortran versions

eWatercycle II: The BMI descriptions for these languages have a C pointer or custom type in each function that is supposed to hold all model data. This places a great burden on the modeler, who usually has all model data globally defined in these languages. We should discuss whether a change in the API is necessary, or which workarounds are preferred.

BMI for Octave

We should define a standard BMI for Octave and include it here along with an example. Lucky for us, @RolfHut and @wknoben have already done this for MARRMoT. 🎉

Add Java language specification

Currently, there are BMI language specifications for C, C++, Fortran, and Python. This is a request to add a Java specification to this list.

Adding Java would be the first step in supporting environments for agent-based modeling such as NetLogo and Repast Simphony, allowing coupling between agent-based and physical models.

Checklist for BMI release

Adding new functions in a BMI release sets off a series of updates for downstream products. This issue lists those updates.

Complete all tasks for the current milestone
Update documentation
Update existing language specifications (C, C++, Fortran, Python, Java), including conda-forge recipes
Update existing example implementations (C, C++, Fortran, Python, Java), including new tests
Create (a specification and) an example implementation for any newly supported languages
Add new BMI functions to the language templates in the babelizer (this may also require a new --bmi-version option)
Add new BMI functions to the bmi-tester (and update --bmi-version option)
Add new BMI functions to pymt

Wireformat

eWatercycle II: Although too soon for standardizing anything, we would like a discussion on if it makes sense to collectively think about a network/wire/serialization format for BMI.

Formalize the mechanism to support extensions to the core BMI

As an extension to version numbering (#8), formalize the definition of a core BMI (most Basic MI) and an extended BMI (less Basic MI).

Handle geospatial data in the BMI

The BMI should handle data with a coordinate reference system (CRS), allowing such data to be exchanged between models with a BMI. The BMI should handle geospatial data in the form of vectors (points, lines, polygons) and raster grids.

How to deal with values at e.g. boundaries, hydraulic structures?

Alternative 1:

set_value("boundaries:hoekvanholland:water_level", 2.0)

Alternative 2:

get_value("boundaries")
identify boundary index
set_value_at_index("boundary_water_level", 2.0)
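
Both alternatives can be sketched against a toy model to compare how a caller would use them. The class `ToyModel`, the second boundary name `maassluis`, and the internal storage are hypothetical. The second alternative is written with the BMI-style set_value_at_indices(name, inds, src) signature so that the index is passed explicitly.

```python
class ToyModel:
    def __init__(self):
        self._boundaries = ["maassluis", "hoekvanholland"]
        self._boundary_water_level = [0.0, 0.0]

    def set_value(self, name, src):
        # Alternative 1: the location is encoded in the variable name.
        prefix, boundary, variable = name.split(":")
        if prefix != "boundaries" or variable != "water_level":
            raise KeyError(name)
        self._boundary_water_level[self._boundaries.index(boundary)] = src

    def get_value(self, name):
        if name == "boundaries":
            return list(self._boundaries)
        if name == "boundary_water_level":
            return list(self._boundary_water_level)
        raise KeyError(name)

    def set_value_at_indices(self, name, inds, src):
        # Alternative 2: the caller looks up the index, then sets by index.
        if name != "boundary_water_level":
            raise KeyError(name)
        for index, value in zip(inds, src):
            self._boundary_water_level[index] = value


# Alternative 1
model_a = ToyModel()
model_a.set_value("boundaries:hoekvanholland:water_level", 2.0)

# Alternative 2
model_b = ToyModel()
index = model_b.get_value("boundaries").index("hoekvanholland")
model_b.set_value_at_indices("boundary_water_level", [index], [2.0])
```

Both runs leave the model in the same state. The first puts parsing rules into variable names, while the second needs an extra query but uses only plain names and indices.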

Programmatically generate language specifications

The language specification files for C, C++, Fortran, and Python should be programmatically generated from the same source (such as the BMI SIDL file). They are currently created manually.

SIDL reminders in generated files?

If certain files (e.g. bmi-c/bmi.h) are generated via the bmi.sidl and some procedure, this fact should be a comment in those files, as to yet again remind authors of the proper workflow.

openjournals/joss-reviews#2317

Harmonize MODFLOW 6 and CSDMS BMI implementation approaches

I'm opening this issue to facilitate discussion of how we can harmonize the BMI implementation techniques used by the MODFLOW development team and CSDMS.

The MODFLOW development team of @jdhughes-usgs and @langevin-usgs at USGS, working with @mjr-deltares at Deltares, has developed a BMI implementation for MODFLOW 6 that includes extensions (the AMI) that allow internal MODFLOW components to be tightly coupled within a time step as they iterate to a solution. The goal of this interstitial coupling is to eventually allow other models, such as PRMS or MetaSWAP, to be coupled with MODFLOW. Further, this team has developed a Python package, amipy, that wraps both BMI and AMI and allows MODFLOW 6 to be called from and run within Python (example).

The CSDMS development team of @mcflugen and @mdpiper has a slightly different approach that yields similar results. A model developer implements the Fortran BMI specification for their model. At CSDMS, we then wrap the BMI with an interoperability layer (similar to what's integrated into the BMI/AMI above), then compile and link the Fortran code into a C library through Cython, creating a Python package that can be called standalone or from pymt. This process has been templated and automated, so that once a model with a BMI is provided to CSDMS, it can be quickly processed into a Python package. The PRMS Surface component provides an example of this process. Here are the key points:

Here is @rmcd-mscb's BMI implementation (see the src directory). Once Rich wrote his BMI, he was able to hand it over to CSDMS.
The tool we call the cookiecutter provides Cython templates for BMIs written in C, C++, and Fortran. It also provides the Fortran interoperability layer. All Fortran BMIs get this same interoperability layer. All of these are in this directory of the repo.
Here is the resulting PRMS Surface Python package output from the cookiecutter. See especially the Cython class at its core and the setup.py file which compiles and links the Cython extension.
In this repo I've included simple examples of using PRMS Surface as a standalone Python package and as a pymt component. Rich has also created a Jupyter Notebook example.

Add version number to API

On behalf of eWaterCycle II

Fix error in docs for get_grid_[xyz] functions

In the docs for each of the BMI functions get_grid_[xyz], there's this pair of sentences:

The length of the resulting one-dimensional array depends on the grid type. (It will have either get_grid_rank or get_grid_size elements.)

The part about get_grid_rank is wrong--for rectilinear grids, the length will be a value from get_grid_shape. This needs to be fixed here, as well as in the Model grids page.

Add support for Julia

The BMI should support Julia by including a language specification, a sample implementation, and documentation.

Add function to get grid units

The BMI should have a function (or functions) to get the units of a grid.

Add project documents to repository

In particular, we should include the following:

AUTHORS
CITATION
CODE_OF_CONDUCT
CONTRIBUTING

I'm basing these documents on what, e.g., AstroPy and pymt have.

Match SIDL file with docs

The SIDL file is a useful resource because it gives a compact, yet complete, overview of the BMI. It's out of sync with the docs, though. Here are three items we should fix:

Match types between SIDL and docs (see below)
Add BMI version number to SIDL file (see also #8)
Add the get_var_location method for BMI v1.1

Here are the type mismatches:

BMI function	type in SIDL	type in docs	type in bmi-tester
`get_grid_spacing`	double	int	float
`get_grid_origin`	double	float
`get_grid_x`	double	float	float
`get_grid_y`	double	float
`get_grid_z`	double	float

Update CONTRIBUTING document to describe RFC process

The CONTRIBUTING document should be updated to describe the Request For Comment (RFC) process that will be used to propose and approve substantive (non-bugfix) changes to the BMI.

Add a changelog

The BMI repository should have a changelog that's updated with each pull request and release.

Examples folder empty when cloned

😢

I was going to post the BMI template I made for terrainbento before I put the terrainbento specific things in it... then I realized you guys have a great examples folder.

But something about the symlinks doesn't persist through cloning.

I can see examples online and cloning creates the folder structure but no files.

Support grids of rank > 3

Currently, to obtain the nodes of rectilinear, structured quadrilateral, and unstructured grids, the BMI has functions

int get_grid_x(in int grid, in array<double, 1> x);
int get_grid_y(in int grid, in array<double, 1> y);
int get_grid_z(in int grid, in array<double, 1> z);

representing grids of up to three dimensions.

The BMI should also support grids of these types with rank greater than three.

Global grid administration for partitioned domains

eWatercycle II: MPI parallel BMI: provide global grid administration through BMI for partitioned domains. This is necessary for the caller to reconstruct the global grid more easily.

Reconstruction is model specific, so that would require a tool set

Add function denoting steady state model or data

Some models generate steady-state solutions. Similarly, some datasets represent a single time, or an averaged time period (e.g., ISRIC SoilGrids or USGS NLCD). In each case, the purpose of the BMI update function, as well as all the BMI time functions, becomes ambiguous. (How do you update a steady-state solution? What's the time step of the solution?)

This is a request to provide a BMI function that flags a model or dataset that doesn't change with time. When this flag is set we can set, or at least suggest, policies for how to implement the BMI update and time functions.

This issue arose from a recent discussion between @gantian127, @gregtucker, and @mcflugen.

Questions related to making the terrainbento BMI

@mdpiper @mcflugen

I'm working on the PyMT compatible BMI for terrainbento (PR 137 in that repo). I'm basing it off of

conversations with @mcflugen
the bmi.py file in this repo (based on recommendations from @mdpiper )

I have a couple of questions...

Is it reasonable to expect that eventually this repository will be packaged and distributed so that I can include it as a dependency and ensure I am always using an up-to-date bmi.py via something like from bmi import BMI (where bmi is the installed package)?
Based on conversations with @mcflugen I was under the impression that it was necessary to expose information about the input parameters (as opposed to state variables) to each model (as well as default values, reasonable ranges, etc). But I don't see anything like this in bmi.py. Can I get a recommendation?
If I create a BMI that inherits from bmi.py, is it by definition compatible with PyMT? Or are there additional things I should expect to do?
- If the answer involved additional things, should I have been able to find this in documentation?

Pre-submission comments on BMI docs

A few comments and suggested edits on BMI docs:

"If the model’s state variables don’t change in time, then they can be computed by the initialize function and this function can just return without doing anything": Would an appropriate alternative design pattern, for situations where there is no time-stepping but a user might conceivably want to re-run a component (for example, after changing one or more variables using set_value) be to have update() re-run the calculations? If so, it's probably worth stating that after the above sentence in the docs.
get_input_item_count(): might be worth adding that this is the # of variables that can be set via set_value (if that's correct); similarly for get_value on get_output_item_count
"A model may have no input [or output] variables": could be mis-read as meaning "may not" equals "not allowed"; suggest "might" instead of "may", and/or adding something like "in which case the return value is empty"
get_var_units: is there an identifier for units that vary depending on context? If so, might be worth a mention (even just to note that this is also covered in UDUNITS, if it is)
get_var_location: the 3 possible returns don't include a scalar (i.e., single value), which makes me think that the variable information functions apply only to grid-based variables. So you could not, for example, use get_value to query the value of a single-valued parameter. I could imagine users getting confused about this, so the working definition of variable is probably worth explaining in the intro to the Variable Information Functions section.
"If the model doesn’t define an end time, a large number (e.g., 1.0e6) is typically chosen." - This makes sense, but I can imagine situations where this could cause issues---e.g., if a model's time units are seconds and it runs for more than a year (~pi x 10^7 sec). Would it make sense to suggest a default equal to the highest floating point number or something like that? Or maybe just an even bigger number: IEEE double-precision max exponent, according to wikipedia, is 308, so could suggest 1.0e308.
"Avoid using years as a unit, if possible, since a year is difficult to define precisely." - a lot of geologic-time models use years... Does UDUNITS define different forms of year, e.g., 365 days, 365.25 days, or a certain number of seconds for a proper astronomical year? If so, could suggest that here as an alternative for those models that insist on using years.
get_value: people might be confused by the reference to "buffer"; call it "the array parameter" instead?
"One grid could be a uniform rectilinear grid on which temperature is defined. A second grid could be a scalar, on which a constant thermal diffusivity is defined." - oh, that's cool! So maybe that's the answer to the question about scalar parameters: define a second grid that contains such-and-such variables. Not sure whether that's a practice we want to recommend, though...?
get_grid_type and following: "This function is needed for every grid type." - not sure what this means
get_grid_rank: is this the same as the # of dimensions in the grid arrays, ie 1d, 2d, or 3d? is it zero for a scalar grid?
get_grid_origin: what if your grid is 2d but its arrays are flat? Seems like it should be rank 1, but the origin should still have two values. I expect this would happen with unstructured grids, for example; might be worth a note somewhere about how this would normally be handled.
The reader has no way of knowing that the cool material under "Additional Topics" exists until they get to the bottom of the function description section. Suggest moving these links, or copying them, under the heading Basic Model Interface, between the end of the text here and Table 2. One way to organize would be a sub-head "BMI Functions" for the existing text, and then "Additional topics" with the links.
Grids: "The grid shape is the number of nodes..." - suggest "refers to the number of nodes", since the shape itself would be # rows and cols (eg). Or specify "number of rows and columns of nodes, as opposed to other types of element (such as cells or faces)"
Great illustrations in the grid section!
CSDMS Modeling Framework: great to refer to this, but I suspect if someone searched for what this is and where to try it out, they wouldn't find anything, because they don't know that CMF basically means "pymt" (these days). How about a sentence pointing them to pymt and its tutorials?
Also, suggest adding a "How to contact us for help incorporating a component" type of thing at the end of the CMF page.
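
The arithmetic behind the end-time comment in the list above can be checked quickly. This sketch assumes a year of 365.25 days and compares it with the 1.0e6 value quoted from the docs and with the largest finite double.

```python
import sys

SECONDS_PER_DAY = 86400.0
seconds_per_year = 365.25 * SECONDS_PER_DAY  # about pi x 10^7 seconds

suggested_end_time = 1.0e6  # value quoted from the docs
year_exceeds_suggestion = seconds_per_year > suggested_end_time

# Largest finite IEEE double-precision value available in Python.
largest_double = sys.float_info.max
```

With time units of seconds, a single year already exceeds 1.0e6 by more than an order of magnitude, so a model running that long would pass the suggested end time.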

Add link to bmi-forum.rtfd.org from csdms/bmi

There should be a link to the bmi docs in the README on csdms/bmi.

preprocessor conditionals missing in C++ header

To prevent multiple definitions of the same symbols when a header is included more than once in a translation unit, it is standard practice to wrap every C/C++ header file in its own preprocessor conditionals (an include guard). For bmi.hxx, this means we have to use something like

#ifndef BMI_HXX_INCLUDED
#define BMI_HXX_INCLUDED
...contents of current bmi.hxx...
#endif
