escomp / cam-sima

Community Atmosphere Model - System for Integrated Modeling of the Atmosphere

cam-sima's Introduction

CAM-SIMA

NOTE: Only developmental code exists at the moment. This README will be updated once production code becomes available.

How to check out and use CAM-SIMA:

The instructions below assume you have cloned this repository and are in the repository directory. For example:

git clone https://github.com/ESCOMP/CAM-SIMA.git
cd CAM-SIMA

To use unsupported CAM-SIMA development code:

NOTE: This is unsupported development code and is subject to the CESM developer's agreement.

git checkout development
bin/git-fleximod update

Good luck, and have a great day!

cam-sima's People

Contributors

nusbaume gold2718 peverwhee mwaxmonsky mattldawson cacraigucar katetc kuanchihwang gdicker1 jtruesdal briandobbins jimmielin

cam-sima's Issues

Add user_nl_cam file, or replace with something else?

Has there been any discussion on modifying the "user_nl_XXX" method for changing namelist variables (say, replacing it with an XML file)? If so, should we think about implementing that now?

Otherwise, we will need to add the "user_nl_cam" file to the "cime_config" directory in order for it to be properly copied to one's case after running the "case_setup" script.

Set "Filepath" directly in buildlib script

In the current version of CAM, the "buildcpp" script creates a "Filepath" text file, which is then read in by "buildlib" to create another "Filepath" file in the actual build directory (/bld/atm/obj).

However, I believe the Filepath file can be created internally in "buildlib" and written only once to the build directory. This is especially true given that with creation of the data object as outlined in issue #8, there would be no information that buildcpp would have that buildlib wouldn't. Given this, I would personally recommend making this change for two main reasons:

It reduces the total number of files being created, particularly files that are just copies of other files.
Creating the "Filepath" file internally in buildlib would match what most (all?) of the other CESM component models are doing.

Of course, if there are any concerns or objections to this plan please let me know. Thanks!
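
As a rough illustration of the proposal above, the following sketch writes a "Filepath" file once, directly into the build directory. The function name write_filepath and the source directory names are hypothetical; this is not the actual buildlib code.

```python
import os
import tempfile


def write_filepath(bld_obj_dir, source_dirs):
    """Write a 'Filepath' file listing one source directory per line."""
    os.makedirs(bld_obj_dir, exist_ok=True)
    filepath = os.path.join(bld_obj_dir, "Filepath")
    with open(filepath, "w") as fil:
        for src_dir in source_dirs:
            fil.write(src_dir + "\n")
    return filepath


# Demonstration in a temporary directory standing in for <build>/atm/obj.
# The source directory names below are made up for the example.
with tempfile.TemporaryDirectory() as tmp:
    path = write_filepath(os.path.join(tmp, "atm", "obj"),
                          ["src/control", "src/physics"])
    with open(path) as fil:
        written = fil.read().splitlines()
```

With this approach the file is written a single time, so there is no intermediate copy to keep in sync.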

Implement diagnostic capability

Implement diagnostic (history) capability in CAM. This capability requires a re-implementation of CAM's history infrastructure because physics, the source of many diagnostic output calls, is now CCPP-compliant and cannot require any CAM-specific calls. This task requires several steps:

Split CAM history configuration, restart, and output from output processing (i.e., outfld)
Implement output processing functionality as independent library (possibly as part of CCPP Framework)
Implement configuration (i.e., addfld equivalent) of diagnostic variables in CCPP metadata
Implement new build-time infrastructure to turn CCPP diagnostic information into code and data to be able to process namelist input and set up the necessary buffers.

CAM needs registry

Need to create a registry for CAM to contain all state, tendency, and surface flux information. Each field is to be accompanied by complete metadata.
The registry can be organized into hierarchical sections (e.g., file, DDT).
Part of the registry is a pre-build script (to be called from buildcpp) which will be responsible for:

Creating Fortran files for defining the data along with allocation, deallocation, and any other required interfaces.
Creating an associated CCPP metadata file for each Fortran file

Need to change standard names of "zi" and "zm" in CAMDEN registry

The variables zi and zm in CAMDEN's registry (registry.xml) are geopotential height, not geopotential. Thus the standard names must be modified to reflect this, as otherwise the CCPP-framework will be unable to properly generate the interface routines for physics suites that use geopotential height (such as Kessler).

CAMDEN doesn't run when built with NAG

Although CAMDEN compiles successfully with NAG (on Izumi), attempting to run the model with the kessler physics suite results in the following error:

ERROR: cam_ccpp_physics_initialize: pres_to_density_dry_init: attempt to set cpair to inconsistent value

This error seems to appear only with NAG (Intel and PGI run without an issue).

CAMDEN doesn't work "out of the box" with Python 3.7 or 3.8

For Python 3.7, attempting to generate the registry results in this error:

 File "/home/runner/work/repoForTestingCAM/repoForTestingCAM/src/data/generate_registry_data.py", line 1502, in gen_registry
    error_on_noxmllint=error_on_no_validate)
  File "/home/runner/work/repoForTestingCAM/repoForTestingCAM/ccpp_framework/scripts/parse_tools/xml_tools.py", line 200, in validate_xml_file
    result = call_command(cmd, logger)
  File "/home/runner/work/repoForTestingCAM/repoForTestingCAM/ccpp_framework/scripts/parse_tools/xml_tools.py", line 53, in call_command
    stderr=subprocess.STDOUT)
  File "/opt/hostedtoolcache/Python/3.7.6/x64/lib/python3.7/subprocess.py", line 483, in run
    raise ValueError('stdout and stderr arguments may not be used '
ValueError: stdout and stderr arguments may not be used with capture_output.

which appears to be a bug in the CCPP framework.
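
For context, subprocess.run raises this ValueError whenever capture_output=True is combined with an explicit stdout or stderr argument. The sketch below reproduces the conflict and shows one way to capture combined output without it. The helper name run_command is hypothetical and is not the CCPP framework's call_command.

```python
import subprocess
import sys


def run_command(cmd):
    """Run cmd and return its combined stdout and stderr as text."""
    result = subprocess.run(cmd, check=True, universal_newlines=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    return result.stdout


# Combining capture_output=True with stderr raises the ValueError
# seen in the traceback, before any process is started.
try:
    subprocess.run([sys.executable, "-c", "print('hi')"],
                   capture_output=True, stderr=subprocess.STDOUT)
    conflict_raised = False
except ValueError:
    conflict_raised = True

# Using explicit pipes (and no capture_output) avoids the conflict.
output = run_command([sys.executable, "-c", "print('hi')"])
```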

For Python 3.8, running checkout_externals fails; manage_externals must be updated to "manic-v1.1.8" or greater in order to work as expected.

Implement required field functionality

In general, it will be helpful for CAM to know which physics fields need to be initialized so that it can perform a check and attempt to read in any field not already initialized.
For the FPHYStest compset, this is especially critical because all data is read in from an ncdata file.

This feature requires several interacting pieces:

The CCPP framework (capgen) must return a list of all required fields (standard names) for each suite.
The registry should provide a variable (initialized to .false.) for each field that indicates whether that field has been initialized.
The registry should provide a logical for every field that indicates whether it is required. This information should include a default value, which can be a literal or another standard name (so that default x = y means that x is required if y is). When no default is specified, the default value is .false.
The initialized flag for a field should be set in read_field (called from physics_read_data).
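
The "default can be a literal or another standard name" rule could be resolved along the following lines. This is only a sketch: the function name resolve_required, the dictionary layout, and the field names x, y, and z are all assumptions made for illustration.

```python
def resolve_required(defaults):
    """Return {name: bool} saying whether each field is required.

    defaults maps a field name to either a bool literal or the name of
    another field. A name with no entry defaults to False.
    """
    resolved = {}

    def lookup(name, seen):
        if name in resolved:
            return resolved[name]
        if name in seen:
            raise ValueError("circular default involving " + name)
        default = defaults.get(name, False)
        if isinstance(default, bool):
            value = default
        else:
            # The default is another field's name: follow the chain.
            value = lookup(default, seen | {name})
        resolved[name] = value
        return value

    for name in defaults:
        lookup(name, frozenset())
    return resolved


# x is required because its default points at y, which is required.
required = resolve_required({"y": True, "x": "y", "z": False})
```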

project test issue -JN

Some text to ignore

Null dycore needs to read state on every timestep

Dycore needs to read items in its "state" during init and then at run time (probably dyn_run1).
dyn_init should get ncdata and bnd_topo file pointers as inputs and store them as module variables.
Can null dycore use physgrid? Can it use infrastructure in physics_data (e.g., read_field)?

Add ability to specify meta-data file in registry

It would be beneficial if the registry could specify meta-data files, which could then be read in order to set variable name and initialization arrays with variables that are required, but which may not need to be included directly in the registry itself.

fifth issue

issue number 5

Generate input file variable initialization subroutine using registry

Although the list of variable initial condition input names and CCPP-required variables is generated automatically using the registry and CCPP physics suites, the actual subroutine (physics_read_data) that initializes these variables at run-time using the input files is entirely hard-coded. This would require constant modification in the future, and could result in a long and inefficient Fortran subroutine.

To alleviate this problem, the input file initialization subroutine could be generated directly using the registry at build-time. This would require future users to only modify the registry when adding new variables, and could also allow the subroutine to remain as small and efficient as required.

Along with this, during the code generation an additional variable array could be created which contains all previously-initialized variables, which will allow the subroutine to skip any variable that doesn't need to be initialized from the initial conditions file(s).
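
A minimal sketch of what such build-time generation might look like is shown below. The generator function, the Fortran argument list, and the is_initialized helper are hypothetical; only the names physics_read_data and read_field come from the issues in this list.

```python
def generate_read_subroutine(variables):
    """Return Fortran source that reads each listed variable.

    variables is a list of (standard_name, input_name) pairs.
    """
    lines = ["subroutine physics_read_data(file)",
             "   ! Auto-generated at build time from the registry"]
    for std_name, input_name in variables:
        # Skip any variable that has already been initialized.
        lines.append("   if (.not. is_initialized('{}')) then"
                     .format(std_name))
        lines.append("      call read_field(file, '{}')"
                     .format(input_name))
        lines.append("   end if")
    lines.append("end subroutine physics_read_data")
    return "\n".join(lines) + "\n"


# Example registry content (made up for illustration).
source = generate_read_subroutine([("air_temperature", "T")])
```

Adding a variable would then mean adding one registry entry rather than editing Fortran by hand.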

Need "definition.xml" file, or a replacement plan.

In the current version of CAM, much of the meta-data and default values of the CAM configuration options and variables are stored in a "definition.xml" file, currently located in the "bld/config_files" directory. It would be good to keep these meta-data and default values in CAMDEN, and I see at least three ways we can do so:

Keep the "definition.xml" file, and read it directly into buildcpp or buildnml. If we go this route, I would recommend moving it from "bld" to "cime_config", where it is actually being used. It might also make sense to add an XML Schema (xsd) file as well, so that "definition.xml" matches other CIME-based xml files.
Move the "definition.xml" data to a different xml file, like CAM's "config_component.xml" file, which would also be read directly into buildcpp and/or buildnml. This would allow us to reduce the total number of xml files, but might require us to modify CIME itself (as the schema file for the "config" xml files is under CIME, not CAM).
Set all of the meta-data and default values directly in the buildcpp and/or buildnml scripts. This would also reduce the number of xml files, and avoid modifications to CIME, but will likely make the overall process more opaque, and thus possibly harder to modify in the future.

What do you all think? Any thoughts or opinions anyone has would certainly be appreciated.

Port the Eulerian dycore from CAM

Port the Eulerian dycore from the CAM cam_development branch.

Bring in MUSICA chemistry

Add the MUSICA Chemistry as an external. The test for this functionality will be both a standalone terminator chemistry test and the CAM-SE Held-Suarez terminator chemistry test.

Buildcpp/Configure files need to be added, re-written, or modified.

The CAMDEN code is now at a point where it will run through "Buildcpp", but of course dies without any "configure" script present. So, I personally see at least three possible ways forward:

1. Copy the current CAM configure script into CAMDEN directly. This will be the quickest way to fix this issue, but will not improve this infrastructure in any meaningful way.
2. Write a new configure script, ideally in python. This will allow for the configure script to be cleaned-up and potentially improved (and reduce the reliance on perl). The disadvantage, of course, is that this will take time.
3. Add all the needed processes/actions into the "Buildcpp" script directly. This will provide the same benefits as option 2, but will also remove the "configure" script entirely. The disadvantage, of course, is that this will also take time, and could make "buildcpp" a very large script (which could make readability worse).

What do you all think? I am happy to work on whatever option is decided on, but I didn't want to assign myself just yet in case someone else wants to tackle it instead.

Namelist definitions and defaults

In the current version of CAM, namelist variable definitions and defaults are stored in two XML files (namelist_definition.xml and namelist_defaults_cam.xml) which are located in "cam/bld/namelist_files". These XML files are designed to be read using the "build-namelist" perl script, which is likely being removed (see issue #9). Plus there are arguably more efficient ways to store this information. Given this, I propose replacing this current set-up with one of two possible alternatives:

1. Replace the files with a CIME namelist definition XML file. This is what is done in several other CESM component models, such as CICE and RTM/MOSART, and would allow for CAM to use CIME's "NamelistGenerator", which would do much of the work of creating the actual namelist for us. Also the organization of the CIME-based XML file allows for the combination of the definitions and defaults into the same file, reducing the total number of files required.
2. Add the namelist definitions/defaults to the new CAM "configure" python data structure described in issue #8. This would remove all namelist XML files from the "cime_config" directory, and would potentially allow for a code-level unification with the buildcpp configure data. However, the disadvantage is that it could make the data structure substantially more complicated, and possibly harder to follow when reading the code itself.

Any thoughts or opinions? I am personally leaning towards option 1, but could be convinced otherwise if option 2 is preferred by others.

Thanks!

Set CAM physics configuration options via CPF XML file(s).

Much of the CAM physics configuration done in buildcpp and elsewhere could simply be set by the CPF physics suite file(s). This would help ensure that the actual CAM configuration is consistent with the user-specified physics suite(s), and would allow for additional physics configurations to be added with little to no increase in the complexity of the configuration (buildXXX) scripts themselves.

CAMDEN errors out when NAG non-debug is used

Need to track down why NAG non-debug is causing the program to error out

percent symbol (%) currently not allowed in CIME namelist definition ids

There appear to be several SILHS namelist variable names (ids) that contain a percent symbol (%), as can be found in the "namelist_definition_cam.xml" file. For example, the namelist variable:

subcol_silhs_hmp2_ip_on_hmm2_ip_slope%rr

The problem is that the regular expression used in the CIME namelist scripts doesn't recognize the % symbol, and thus causes the namelist build to fail. There are two long-term solutions I can see to this problem:

1. Re-name the variable ids so that they don't contain a percent symbol, for example:

subcol_silhs_hmp2_ip_on_hmm2_ip_slope_rr

2. Modify the CIME regular expression to allow for percent (%) symbols.

Option 1 is the easiest on my end, but given that I haven't worked with SILHS at all I am not sure what impact it might have on that code, or if we want to have % symbols in namelist variables in the future.

Any thoughts or opinions?
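
To make the two options concrete, here is a small sketch. Both patterns are hypothetical stand-ins and are not CIME's actual regular expressions. The helper rename_id is likewise made up for illustration.

```python
import re

# Hypothetical pattern that rejects '%' (stand-in for current behavior).
STRICT_ID = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

# Hypothetical relaxed pattern that allows '%'-separated components.
PERCENT_ID = re.compile(
    r"^[A-Za-z_][A-Za-z0-9_]*(%[A-Za-z_][A-Za-z0-9_]*)*$")


def rename_id(var_id):
    """Option 1: replace each percent symbol with an underscore."""
    return var_id.replace("%", "_")


name = "subcol_silhs_hmp2_ip_on_hmm2_ip_slope%rr"
strict_ok = STRICT_ID.match(name) is not None             # rejected
relaxed_ok = PERCENT_ID.match(name) is not None           # option 2
renamed_ok = STRICT_ID.match(rename_id(name)) is not None  # option 1
```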

Another test

another test to test automation

Implement online comparison tool

An online comparison tool will compare expected vs. computed tendencies and/or updated state.
The tool needs:

A namelist variable with a filename containing expected results
Some way to access a list of field names and associated computed fields for comparison
A read method to read in expected values
A comparison method which can check B4B or 'close' matches.
A way to generate local and/or global statistics on misses (similar to QNEG3 processing).
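
The comparison and statistics pieces might be sketched as follows. The function name, the tolerance parameter, and the returned statistics are assumptions. Exact float equality is used here as a simple stand-in for a true bit-for-bit (B4B) check.

```python
import math


def compare_field(expected, computed, rel_tol=0.0):
    """Compare two equal-length sequences of floats.

    rel_tol=0.0 requests an exact match; a positive rel_tol accepts
    'close' values. Returns simple statistics on the misses.
    """
    if len(expected) != len(computed):
        raise ValueError("field sizes differ")
    diffs = []
    for exp, comp in zip(expected, computed):
        if rel_tol == 0.0:
            match = exp == comp
        else:
            match = math.isclose(exp, comp, rel_tol=rel_tol)
        if not match:
            diffs.append(abs(exp - comp))
    return {"num_misses": len(diffs),
            "max_abs_diff": max(diffs) if diffs else 0.0}


expected = [1.0, 2.0, 3.0]
computed = [1.0, 2.0000001, 3.5]
exact_stats = compare_field(expected, computed)
close_stats = compare_field(expected, computed, rel_tol=1.0e-6)
```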

Need to add "null" grid option to CIME

The current strategy for CAMDEN is to set the dynamical core using the user-specified horizontal grid. For the physics testbed, this requires a "null" or "none" grid option, not only to set the dynamical core to "none", but to also allow for the grid in CAM to be defined by the meteorological input fields being used.

However, in order to do so, CIME must be modified to allow for a "null" grid option, which it currently rejects. This will also require a CAMDEN-specific CIME fork and/or branch to be created, at least until the modifications can be merged back into CIME itself.

buildlib logger doesn't write debug statements

When trying to do some debugging on Cheyenne, I found that any text within a logger.debug statement in CAM's buildlib script was not being written to any file I was aware of (I searched through every file in the case, build, and run directories). This is true even if "DEBUG" is set to "TRUE" in env_build.xml, or if the --debug flag is used when running case.build, or both.

However, all logger.info statements work just fine. Thus my assumption is that some log (or handler) level set somewhere else in CIME is blocking the debug statements. Given this, I imagine that the solution is either to determine where else the log level is being set and adjust it to allow for debug statements, or to simply use info-level statements, but within if-statements that check if the "DEBUG" variable is set to "TRUE" in the case.
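
The behavior described is consistent with how Python's standard logging module filters by level. The self-contained sketch below shows a debug message being dropped at INFO level, then both suggested workarounds. The logger name and the debug flag are hypothetical stand-ins, and where CIME actually sets the level is not shown here.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("buildlib_sketch")
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

logger.setLevel(logging.INFO)
logger.debug("dropped: below the logger's level")
logger.info("written at info level")

# Workaround 1: lower the level so debug statements pass through.
logger.setLevel(logging.DEBUG)
logger.debug("written once the level allows it")

# Workaround 2: use info-level statements guarded by a debug flag.
debug = True  # stand-in for the case's "DEBUG" variable
if debug:
    logger.info("debug-only message at info level")

lines = stream.getvalue().splitlines()
```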

Add horiz_grid.xml file, or derive cppdefs directly from gridname.

A number of horizontal grid-related cppdefs, like the number of latitudes and longitudes, are currently determined by matching the user-specified grid name with the equivalent variable in a "horiz_grid.xml" file, which currently isn't present in CAMDEN. So, I see two possible options for fixing this issue:

1. Add "horiz_grid.xml", which will then be read-in by the "buildcpp" script.
2. Have "buildcpp" derive the related CPP defs from the user-provided grid-name itself.

The advantage of option 2 is that it would remove an extra XML file which seems to only be used by CAM. The disadvantage, though, is that in the future if a user wants to add a new grid, or dycore, they may need to explicitly add code to buildcpp, as opposed to just adding some XML variables.

Any preferences, or other ideas?
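
Option 2 could look something like the sketch below, which assumes spectral-element grid names of the form "ne<N>np<M>". The function name, the returned keys, and the single supported pattern are all hypothetical. It also illustrates the stated drawback: each new grid family needs another code branch.

```python
import re


def cppdefs_from_grid(grid_name):
    """Derive grid settings from a grid name (hypothetical mapping)."""
    match = re.match(r"^ne(\d+)np(\d+)$", grid_name)
    if match:
        return {"dycore": "se",
                "ne": int(match.group(1)),
                "np": int(match.group(2))}
    # Any other grid family would need its own branch added here.
    raise ValueError("unrecognized grid name: " + grid_name)


settings = cppdefs_from_grid("ne30np4")
```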

Need CAM xml files required for "create_newcase"

Need to add xml files in order for a case to be created, which is also required for "buildcpp" to run. Files needed are:

config_component.xml
config_compsets.xml
config_pes.xml

which are all stored in CAM's "cime_config" directory. Ideally these files will be pared down to only include the physics testbed compsets, or any other compset that will be used first to test the new CAM infrastructure, with additional compsets added later.

Port the SE dycore from CAM

Port the SE dycore from the CAM cam_development branch. Some work items:

fix buildlib
Rewrite dyn_grid interfaces for new physgrid interface
Merge stepon code into dyn_comp
Need interfaces for addfld, add_default, and outfld.
Reimplement dp_coupling using new data structures.
Write generic dp_move and pd_move routines for phys_grid.
Separate out SE namelist and implement multi-namelist functionality in buildnml

Replace "config_cache.xml" with internal python data structure?

The file "config_cache.xml" appears to only be used by the "build-namelist" perl script to generate the CAM namelist. This could be done instead by simply having "buildcpp" output a python data structure (like an object or list) directly to "buildnml", which could then use that information as-is, by-passing the need to read/write an XML file.

Of course, the disadvantage is that the "config_cache.xml" file does potentially provide additional information/meta-data to users, which may be hidden if everything is passed in python internally. This will also require "buildcpp" to be run every time "buildnml" or "build-namelist" is run, which could make certain script calls by the user take longer.

Does anyone have any thoughts or opinions on this? I figured I would leave this as just a pure discussion for now, as it isn't "required" to get CAMDEN up and running, although it might be easier to implement sooner rather than later.
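
As a starting point for discussion, here is a sketch of passing an in-memory object instead of an XML file. The class, the function names, and the example settings are all hypothetical. The describe method is one possible way to keep the meta-data visible to users.

```python
class ConfigSketch:
    """Hold configuration values and their descriptions in memory."""

    def __init__(self):
        self._entries = {}

    def set_value(self, name, value, desc):
        self._entries[name] = (value, desc)

    def get_value(self, name):
        return self._entries[name][0]

    def describe(self):
        """Return one readable line per entry, sorted by name."""
        return ["{} = {}  ({})".format(name, value, desc)
                for name, (value, desc) in sorted(self._entries.items())]


def buildcpp_sketch():
    """Stand-in for buildcpp: build and return the config object."""
    config = ConfigSketch()
    config.set_value("dyn", "none", "dynamical core")
    config.set_value("nlev", 30, "number of vertical levels")
    return config


def buildnml_sketch(config):
    """Stand-in for buildnml: use the object directly, no XML file."""
    return "dycore = '{}'".format(config.get_value("dyn"))


config = buildcpp_sketch()
namelist_line = buildnml_sketch(config)
summary = config.describe()
```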

Implement chemistry interface

Implement a way to configure and run a chemistry suite as a separate step, i.e., after running a physics suite. Some required steps:

Add a configuration (--chem) to CAM_CONFIG_OPTS which will point to a chemistry SDF or SDFs.
Add a chemistry suite to the namelist
Add CCPP calls in phys_comp to call the chemistry suite, if configured.
Add the chemistry suite to the suites checked for input variable processing.

FPHYStest not handling lat/lon input data

CAMDEN is not correctly managing the FPHYStest compset for structured (lon/lat) input data files.

CAMDEN doesn't compile with NTHRDS > 1

CAMDEN currently fails during compilation with Intel (on Cheyenne at least) if "NTHRDS" is greater than 1 when using the Kessler physics suite, with the following error:

ccpp_kessler_cam_cap.F90(59): error #6404: This name does not have a type, and must have an explicit type. [OMP_GET_THREAD_NUM]

This may be an issue with the CCPP framework, but it is not completely clear at the moment.

Trouble reading lat / lon grids

CAMDEN sometimes has trouble reading input data (ncdata) from a lat/lon input file (as opposed to an unstructured grid input file).
This may be due to how latitude coordinates are read in.

Complete registry

Currently, the CAMDEN registry only has (most?) state and tendency variables. At the very least, these need to be supplemented with the surface flux fields (from cam_in, cam_out).
Note that some of these fields are conditionally allocated. For now, indicate these with comments which include the logical variable name used to control allocation -- we will work out formal syntax later.
These structures should probably go into a new module (cam_surface_exchange_types?).

Have "buildnml" completely replace "build-namelist"?

Just like the "buildcpp" and "configure" scripts discussed in issue #6, it looks like the "buildnml" script in the "cime_config" directory is mostly just a wrapper for the larger "build-namelist" perl script in CAM's "bld" directory.

Given this, is there any interest in eliminating the "build-namelist" script, and instead having the "buildnml" script do the work of actually creating the CAM namelist? Doing this would also potentially help implement issue #8 as well, if that is also of interest.

Create CAM Physics Testbed compset

Need to create a new "F" compset for the CAM physics testbed. The compset will be named:

FPHYStest6

And will default to cam6 physics and year 2000 boundary conditions (similar to the "PORT" compsets), unless there are any objections or preferred alternatives.

CAM registry unit tests have failures

While performing unit tests on CAMDEN, I found that there were five failures that exist in the current version of the repo when running test_registry.py on Cheyenne and Izumi that I am not sure how to fix, at least currently. These failures are:

======================================================================
ERROR: test_parameter (__main__.RegistryTest)
Test a registry with a parameter.

Traceback (most recent call last):
File "test_registry.py", line 244, in test_parameter
error_on_no_validate=True)
File "/glade/work/nusbaume/SE_projects/new_cam_sandbox/CAMDEN/src/data/generate_registry_data.py", line 1504, in gen_registry
write_registry_files(registry, dycore, config, outdir, indent, logger)
File "/glade/work/nusbaume/SE_projects/new_cam_sandbox/CAMDEN/src/data/generate_registry_data.py", line 1419, in write_registry_files
files.append(File(section, known_types, dycore, config, logger))
File "/glade/work/nusbaume/SE_projects/new_cam_sandbox/CAMDEN/src/data/generate_registry_data.py", line 1198, in __init__
logger)
File "/glade/work/nusbaume/SE_projects/new_cam_sandbox/CAMDEN/src/data/generate_registry_data.py", line 457, in __init__
raise CCPPError(emsg.format(local_name))
CCPPError: parameter, 'pver', does not have an initial value

======================================================================
FAIL: test_good_ddt_registry (__main__.RegistryTest)
Test code and metadata generation from a good registry with a DDT.

Traceback (most recent call last):
File "test_registry.py", line 160, in test_good_ddt_registry
shallow=False), msg=amsg)
AssertionError: /glade/work/nusbaume/SE_projects/new_cam_sandbox/CAMDEN/test/unit/tmp/physics_types_ddt_fv.F90 does not exist

======================================================================
FAIL: test_good_ddt_registry2 (__main__.RegistryTest)
Test code and metadata generation from a good registry with DDTs

Traceback (most recent call last):
File "test_registry.py", line 203, in test_good_ddt_registry2
shallow=False))
AssertionError: False is not true

======================================================================
FAIL: test_good_simple_registry (__main__.RegistryTest)
Test that a good registry with only variables validates.

Traceback (most recent call last):
File "test_registry.py", line 119, in test_good_simple_registry
self.assertTrue(filecmp.cmp(in_meta, out_meta, shallow=False))
AssertionError: False is not true

======================================================================
FAIL: test_unknown_dimensions (__main__.RegistryTest)
Test a registry with a variable with an unknown dimension.

Traceback (most recent call last):
File "test_registry.py", line 418, in test_unknown_dimensions
error_on_no_validate=True)
AssertionError: ValueError not raised

Ran 17 tests in 0.137s

FAILED (failures=4, errors=1)

Hopefully these tests will either be fixed or removed in a future PR.

Port the FV dycore from CAM

Port the FV dycore from the CAM cam_development branch.

test issue

Simply to test automation

Automatic unit-testing in CAMDEN via github actions

There are numerous python unit tests that currently exist in CAMDEN. Although ideally they would all be run manually before a push or pull request, it can be easy to forget. Thus it would be beneficial to have a GitHub Action set up for the CAMDEN repo that automatically runs these unit tests whenever a push or pull request occurs.

Doing so will help ensure that the tests are always run, and thus (hopefully) decrease the odds of a python bug slipping into the main CAMDEN code base (as well as other relevant branches and forks).

Add initialization logical array and subroutine in physics_types.F90

In order to check that a variable has already been initialized before attempting to read it in via a physics input file, an array is needed that is the same length as the variable standard name array, but which contains True/False values. This array can then be modified whenever a subroutine initializes a variable, so that the only variables that are "False" are the ones that do in fact need to be read from an input file.
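
The parallel-array idea can be sketched in a few lines. The actual implementation would live in Fortran in physics_types.F90, so the Python below only illustrates the logic. The standard names and helper names are hypothetical examples.

```python
# Standard name array and a same-length array of True/False flags.
std_names = ["air_temperature", "air_pressure", "surface_air_pressure"]
initialized = [False] * len(std_names)


def mark_initialized(name):
    """Record that a routine has initialized the named variable."""
    initialized[std_names.index(name)] = True


def names_to_read():
    """Return the variables that still must come from an input file."""
    return [name for name, done in zip(std_names, initialized)
            if not done]


mark_initialized("air_pressure")
remaining = names_to_read()
```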

third test

third test comment

Need to replace comma with another symbol in "--physics_suites" config option.

It was found that using a comma to separate different physics and chemistry suites in CAM_CONFIG_OPTS results in an error when a user runs CIME's xmlchange command like so:

./xmlchange CAM_CONFIG_OPTS="--physics_suites suite1,suite2"

Thus to avoid this error, the comma separator should be replaced with another symbol, like & or ;, in cam_config.py.
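
A sketch of parsing a semicolon-separated suite list is shown below. It uses Python's standard argparse and shlex modules and is not the actual cam_config.py code. The function name and its separator parameter are hypothetical.

```python
import argparse
import shlex


def parse_physics_suites(config_opts, separator=";"):
    """Return the list of suites named in a CAM_CONFIG_OPTS string."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--physics_suites", default="")
    # parse_known_args ignores any other options in the string.
    args, _ = parser.parse_known_args(shlex.split(config_opts))
    return [suite for suite in args.physics_suites.split(separator)
            if suite]


suites = parse_physics_suites("--physics_suites suite1;suite2")
```

Note that a semicolon would still need quoting on a shell command line, which is one reason to weigh the choice of separator carefully.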

Keep camconf namelist files?

There are several namelist files that are created and saved in a case's "Buildconf/camconf" directory whenever the namelist is built. These files include "namelist", "drv_flds_in", "docn_in", and "atm_in". The files "atm_in", "drv_flds_in", and "docn_in" have copies in the case's run directory, while the "namelist" file stores a few unique variables that could likely be output to the user via the build scripts themselves.

Given this, is there any reason to keep these files around? In other words, do you (or others you know of) look at these files in "Buildconf/camconf" when working with a case? If not, then I would probably vote to remove them, and keep everything either internally in the build scripts, or in the run directory copies. However, if there is a desire to keep these files in the "camconf" directory then I can just leave them as-is.

Thanks in advance for any thoughts or opinions!

Implement dycores

Port the dycores from the CAM cam_development branch.
This is probably best done after or in conjunction with development of the new dycore to physics grid interface (the interface currently used in the NULL dycore).

Need to add "none" option to "CAM_DYCORE" XML variable

In order to have a "none" option for a dynamical core (which is what we want for the physics testbed), we will need to add it as a valid value to the "CAM_DYCORE" XML variable in the "config_component.xml" file, located in CAM's "cime_config" directory. Otherwise the "buildXXX" scripts may not run properly, and the case-specific "env_build.xml" file will not be set properly.