
fesom / fesom2


Multi-resolution ocean general circulation model.

Home Page: http://fesom.de/

License: GNU General Public License v3.0

CMake 0.21% Shell 0.26% Makefile 0.11% Fortran 20.85% C 16.32% C++ 0.03% Batchfile 0.01% Python 4.17% Jupyter Notebook 58.01% Slice 0.01% Io 0.01% NASL 0.03%

fesom2's People

Contributors

cwekerle, dguibert, dsidoren, goord, hegish, helgegoessling, janstreffing, koldunovn, mandresm, ogurses, patrickscholz, pgierz, qiangclimate, rakowsk, suvarchal, trackow


fesom2's Issues

Character Lengths of Paths

Hi Fesom devs,

Several of the paths in namelist.config have predefined maximum lengths. I would recommend making them arbitrarily long so that users can do what they want, e.g. this

character(5)           :: runid='test1'       ! a model/setup name
...
character(100)         :: MeshPath='./mesh/'

could be replaced with something that just reads the entire string.

That all happens in gen_modules_config.F90
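A minimal sketch of the buffer-based alternative (illustrative code, not the actual gen_modules_config.F90; the namelist group name is an assumption): a namelist read cannot grow a deferred-length string, so the usual pragmatic fix is a generous fixed-size buffer that is trimmed at the point of use.

program path_buffer_demo
   implicit none
   integer, parameter :: MAX_PATH = 4096            ! generous upper bound
   character(len=MAX_PATH) :: runid    = 'test1'
   character(len=MAX_PATH) :: MeshPath = './mesh/'
   namelist /paths/ runid, MeshPath
   integer :: iounit

   open(newunit=iounit, file='namelist.config', status='old', action='read')
   read(iounit, nml=paths)
   close(iounit)

   ! trim() drops the trailing blanks of the oversized buffers
   print *, 'runid    = ', trim(runid)
   print *, 'MeshPath = ', trim(MeshPath)
end program path_buffer_demo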

FESOM2 data not fully reproducible on ollie

#146

Problem:

  • FESOM2.0 simulations on the AWI ollie HPC were not always reproducible: runs with identical namelists, init-conditions ... could lead to slightly different results. The bias produced by ollie looked like this:
    plot_ctrl-orig_cyl_sst_rec_i0_
  • shows the SST bias on ollie with CORE2 after 1 month of simulation
  • I was able to track this bias down to problems with the exponential function on ollie

Solution from Natalja Rakowsky:

  • replace the compiler flag -fast-transcendentals with -no-fast-transcendentals (slower, not vectorizable) or, better, -fimf-use-svml (should be faster, is vectorizable)

What is happening, from Natalja Rakowsky:
1.) For transcendental functions, IEEE does not pin down the result.

2.) Vector, scalar, and alternatively somewhat less accurate, faster implementations (permitted with -fast-transcendentals) can deliver three different results. On today's systems it is apparently not deterministic which path is taken.

3.) With "-no-fast-transcendentals" one enforces that the scalar implementation is always used. Loops with exp, sin, cos, ... are then not vectorizable and thus sometimes considerably slower.
Since version 18, Intel therefore alternatively offers "-fimf-use-svml", which always routes exp & co. deterministically through the vector implementation. This is hardly slower in scalar code, clearly faster when vectorized, and should, please, please, also deliver deterministic results.
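For illustration (toy code, not FESOM source): any vectorizable loop over a transcendental like the one below can be dispatched either to the SVML vector routine or to a scalar routine, and the two may differ in the last bits.

program transcendental_demo
   implicit none
   integer, parameter :: n = 1000000
   real(8), allocatable :: x(:), y(:)
   integer :: i

   allocate(x(n), y(n))
   do i = 1, n
      x(i) = real(i, 8) * 1.0d-6
   end do

   ! A vectorizable loop over exp(): with ifort -fast-transcendentals the
   ! vector (SVML) and scalar code paths can give bit-different results;
   ! -fimf-use-svml pins the vector path, -no-fast-transcendentals the
   ! scalar one.
   do i = 1, n
      y(i) = exp(-x(i))
   end do

   print *, 'checksum = ', sum(y)
end program transcendental_demo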

Impact of "recent" kpp_Kv0 update

As noted by @mandresm on slack, there was a "recent" change to the namelist.cvmix parameter kpp_Kv0 (36ea7ce).

The commit message reads:

change back to traditional value 0.05 --> 0.005 of kpp maximum diffusivity of shear driven mixing --> suggestion by Qiang --> reduced biases

I branched off for the AWICM3 DECK simulations before this commit and so used the old value. Could you illuminate what type and strength of bias reduction you saw? @qiangclimate @patrickscholz

async_threads_cpp: PGI requires `-D__GCC_HAVE_SYNC_COMPARE_AND_SWAP_{1,2,4}`

When I try to compile the master branch, the pgi-fixes branch, or #57, I get this error:

[...]
  --> found calendar attr. in time axis: |noleap|
[spartan416:57866] *** Process received signal ***
[spartan416:57866] Signal: Segmentation fault (11)
[spartan416:57866] Signal code:  (128)
[spartan416:57866] Failing at address: (nil)
[spartan416:57866] [ 0] /usr/lib64/libpthread.so.0(+0xf630)[0x2b38ed947630]
[spartan416:57866] [ 1] /usr/lib64/libstdc++.so.6(+0xb506d)[0x2b38eb76606d]
[spartan416:57866] [ 2] /usr/lib64/libpthread.so.0(+0x7ea5)[0x2b38ed93fea5]
[spartan416:57866] [ 3] /usr/lib64/libc.so.6(clone+0x6d)[0x2b38ee7da8cd]
[spartan416:57866] *** End of error message ***

The error disappears when this flag is added for PGI compilers in src/async_threads_cpp/CMakeLists.txt:

if(${CMAKE_CXX_COMPILER_ID} STREQUAL PGI)
   target_compile_options(${PROJECT_NAME} PRIVATE -D__GCC_HAVE_SYNC_COMPARE_AND_SWAP_{1,2,4})
endif()

@goord and @gemoulard can you confirm this is still required?
I can't figure out why.

conflict in fesom.mesh.diag.nc file

There is a conflict in your mesh.diag file.
The variable is called 'elem' and the dimension is called 'elem' as well.
It works when I set the dimension to 'elem_n'.
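If one only needs to fix an existing file, a sketch along these lines with netcdf-fortran should do it (assuming write access to the file; the proper fix would of course be at file-creation time in FESOM):

program rename_mesh_dim
   use netcdf
   implicit none
   integer :: ncid, dimid

   call check( nf90_open('fesom.mesh.diag.nc', NF90_WRITE, ncid) )
   call check( nf90_inq_dimid(ncid, 'elem', dimid) )
   call check( nf90_redef(ncid) )                        ! enter define mode
   call check( nf90_rename_dim(ncid, dimid, 'elem_n') )  ! dimension only; the
                                                         ! 'elem' variable
                                                         ! keeps its name
   call check( nf90_enddef(ncid) )
   call check( nf90_close(ncid) )
contains
   subroutine check(ierr)
      integer, intent(in) :: ierr
      if (ierr /= nf90_noerr) then
         print *, trim(nf90_strerror(ierr))
         stop 1
      end if
   end subroutine check
end program rename_mesh_dim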

Runoff interpolation issue when using constant runoff (CORE2)

Hello everyone,

I think I found a bug when using constant runoff, in my case CORE2. The interpolation onto the model grid produces some weird results that do not look like the original runoff data at all. Looking at the Kara Sea or the Amazon, the large runoff in the original runoff file vanishes in the interpolated field, and instead there is very weak, randomly distributed runoff all over the ocean.

runoff_amazon

runoff_kara_sea

Using runoff_data_source = 'CORE2' in namelist.forcing triggers a special case in gen_surface_forcing.F90 where read_other_netCDF is called to load and interpolate the runoff.

! runoff    
      if (runoff_data_source=='CORE1' .or. runoff_data_source=='CORE2' ) then
         ! runoff in CORE is constant in time
         ! Warning: For a global mesh, conservative scheme is to be updated!!
         call read_other_NetCDF(nm_runoff_file, 'Foxx_o_roff', 1, runoff, .false., mesh) 
         runoff=runoff/1000.0_WP  ! Kg/s/m2 --> m/s
      end if
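(For the unit conversion in the last line: the CORE runoff is a freshwater mass flux in kg/s/m2, and dividing by the density of fresh water, 1000 kg/m3, turns it into a volume flux in m/s, as the inline comment says.)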

Maybe there is something wrong in the interpolation routine? I initially tried to load regional wind anomaly fields using the subroutine read_other_netcdf, and those were randomly scattered all over the grid in the interpolated field as well.

I ran the model on the CORE grid for two months with daily output. The runoff output is constant in time.
In namelist.forcing I used:

&nam_sbc
...
   nm_runoff_file     ='/work/ollie/clidyn/forcing/JRA55-do-v1.4.0/CORE2_runoff.nc'
   runoff_data_source ='CORE2'	!Dai09, CORE2
   runoff_climatology =.false.
...
/

Any ideas whether I did something wrong or if this is a bug are very much appreciated.

Best,
Finn

Makefiles, can we clean them up?

Currently we have a lot of old Makefiles hanging around. In the root we have:

Makefile
Makefile.in
Makefile.in_dkrz
Makefile.in_hlrn3_crayintel
Makefile.in_ollie

In the src:

Makefile
Makefile_hlrn
Makefile_ollie

And also folders dkrz, hlrn and ollie.

Can we delete some of them? @rakowsk I know you are using Makefiles; maybe someone else does too?

FESOM2 too much noise

Dear FESOM2 developers,
The output velocity of FESOM2 seems to be too noisy even when averaged over 50 years (please check figure 1, the 30S Atlantic section, and figure 2, the 1000m horizontal velocity distribution, both averaged over 1960-2009). I have checked the plotting routines and did not find bugs in them.

This run is identical to all the settings in the repository and is run with COREII forcing from 1948-2009 (only one cycle).
namelist dir: /home/ollie/psong/esm-tools/esm-master/fesom-2.0/config_notide1
output dir: /work/ollie/psong/FESOM2/fesom2_NOTIDE_compare1
plot ipynb dir:/home/ollie/psong/post_spy_fesom2/ipynb_code/contour_v/test_linfs.ipynb

section
horizontal

Running sequentially (on 1 processor)

Is it possible to run FESOM on 1 processor?
In fvom_init it is not possible to generate a dist_1 directory. Can we disable this to run/profile on 1 processor? Will fesom be OK?

Missing nod_in_elem2D data in mesh diag

The variable nod_in_elem2D in fesom.mesh.diag.nc, which, I guess, is intended to provide the indices of the elements that contain a given node, is filled with just zeros.

I am coming from the recently updated pi-grid test data in pyfesom2, so I wonder if this might apply to other FESOM2 simulations as well?

To reproduce:

FESOM2 writing restart files slowly on some machines

After running AWI-CM3 on Aleph for the first time I found that we have an issue with writing restart files extremely slowly there. We are looking at a write speed of less than 3 MB/s. Some machines (e.g. Ollie and Mistral) apparently have their NetCDF IO layer configured differently and do this for us, but on Juwels and Aleph we need to order the dimensions correctly ourselves.

I had previously found and fixed this for Juwels with these two commits:
9bce28a
d710062

Shortly after I made these changes @hegish merged his substantial changes and improvements to regular io into the master branch. Merging what I did for Juwels became quite difficult thereafter and we never attempted it. Now I think we have to.

To recap from the old gitlab issue: https://gitlab.dkrz.de/FESOM/fesom2/-/issues/19

We are writing restart files like this (now outdated):

        do lev=1, size1
           laux=id%var(i)%pt2(lev,:)
           t0=MPI_Wtime()
           if (size1==nod2D  .or. size2==nod2D)  call gather_nod (laux, aux)
           if (size1==elem2D .or. size2==elem2D) call gather_elem(laux, aux)
           t1=MPI_Wtime()
           if (mype==0) then
              id%error_status(c)=nf_put_vara_double(id%ncid, id%var(i)%code, (/lev, 1, id%rec_count/), (/1, size2, 1/), aux, 1); c=c+1
           end if
           t2=MPI_Wtime()
           if (mype==0 .and. size2==nod2D) write(*,*) 'nvar: ', i, 'lev: ', lev, 'gather_nod: ', t1-t0
           if (mype==0 .and. size2==nod2D) write(*,*) 'nvar: ', i, 'lev: ', lev, 'nf_put_var: ', t2-t1
        end do

We are holding the first dimension, lev, fixed. Since we are writing from Fortran, NetCDF transposes the values in the output file compared to what we tell it to write here. Order as shown via ncdump:
double u(time, elem, nz_1) ;
During writing this means we hold fixed lev, the dimension that changes fastest in memory in the original data, and NetCDF has to start a new write access for each nod2D block. In 2D, imagine you want to write one row of 100000 values, but what you actually do is write 100000 columns of length one. This is confirmed by the darshan IO logging tool, which shows the number of seek and write accesses for writing a single restart file, as well as the size of the write accesses. You can find the full logfile attached.
Before: streffin_fesom.x_id2406075_6-29-66061-10726804095939863824_1.darshan_1_.pdf
Screenshot from 2021-04-19 17-06-48

As you can see, we are generating 84 million accesses with just a CORE2 mesh restart. On Juwels and Aleph this is painfully slow (~25 min for a CORE2 restart).

For the old routines I had found a working solution on Juwels by chunking the netcdf data in io_restart.F90:

     if (n==1) then
        id%error_status(c)=nf_def_var_chunking(id%ncid, id%var(j)%code, NF_CHUNKED, (/1/)); c=c+1
     elseif (n==2) then
        id%error_status(c)=nf_def_var_chunking(id%ncid, id%var(j)%code, NF_CHUNKED, (/1, id%dim(1)%size/)); c=c+1
     end if

and io_meandata.F90:

  if (entry%ndim==1) then
     entry%error_status(c) = nf_def_var_chunking(entry%ncid, entry%varID, NF_CHUNKED, (/1/)); c=c+1
  elseif (entry%ndim==2) then
     entry%error_status(c) = nf_def_var_chunking(entry%ncid, entry%varID, NF_CHUNKED, (/1,  entry%glsize(1)/)); c=c+1
  endif

After: streffin_fesom.x_id2412833_7-3-8403-13876901149393710826_1.darshan.pdf
Screenshot from 2021-04-19 17-09-45

This increased the output speed on a CORE2 restart on Juwels from 25 minutes to 40 seconds.

I think we have to revisit this issue and work on an implementation within the new IO scheme. @hegish @dsidoren

Mesh Rotation (or, our never-ending and very favorite "bug")

Hi guys,

I've been working on consolidating the various tools we have for mesh generation and subsequent input creation for ECHAM6/JSBACH. Part of the process runs the Triangle program with a few scripts that Sven Harig has provided for us.

Two questions arise from that:

  1. Are the meshes produced by the triangle program rotated or unrotated? How do I determine that?
  2. How do I rotate (or unrotate) them?

From my understanding, the new default should be unrotated (please correct me if I'm wrong)

Ghost values in the output when the timestep is changed

When we write the output, the timestamp is something like output_period - time_step. So for monthly mean output we get the following:

'1978-01-31T23:50:00.000000000', 
'1978-02-28T23:50:00.000000000',
'1978-03-31T23:50:00.000000000', 
'1978-04-30T23:50:00.000000000',
'1978-05-31T23:50:00.000000000'

If at some point we change the timestep and don't delete the output file (for very large simulations this is not really possible, since files are created on a yearly basis), we might end up with the following records:

'1978-01-31T23:50:00.000000000',
'1978-01-31T23:55:00.000000000',
'1978-02-28T23:55:00.000000000', 
'1978-03-31T23:55:00.000000000',
'1978-04-30T23:55:00.000000000',
'1978-05-31T23:55:00.000000000'

So there is an extra record from the previous run with the larger time step.

I am not sure how to deal with this without making things too complicated, but it is becoming a problem in data analysis, so some additional filtering of the output is required.
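(For reference, the stamps follow from output_period - time_step: a 10-minute step, i.e. 144 steps/day, puts the last January record at 1978-01-31T23:50:00, while a 5-minute step, 288 steps/day, puts it at 1978-01-31T23:55:00, which is why both January stamps coexist once the step is halved.)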

Corners for SCRIPR/CONSERV 1st order conservative remapping

Currently we are using global conservative flux-residual redistribution for AWICM (1, 2 & 3). Newer versions of OASIS3 (MCT3 & MCT4) offer a locally conservative remapping. I had a short discussion with the EC-Earth community and their idea is to switch from GAUSWGT remapping with conservative redistribution to CONSERV remapping. @helgegoessling was also interested.

  1. I think this would in principle also be of interest to us. Maybe @dsidoren, can comment.
  2. Similar to how we already write out the center lon/lat (coord_nod2D(1, i) & coord_nod2D(2, i)) for the current remapping schemes we would need corner lon/lats for CONSERV.

Would this from oce_mesh be the right vector? mesh%x_corners(n,mesh%nod_in_elem2D_num(n))

For reference, this is what the OASIS manual states:

If the SCRIPR/CONSERV remapping is specified, longitudes and latitudes for the source and target grid corners must also be available in the grids.nc file as double precision REAL arrays dimensioned (nx,ny,4) or (nbrpts,1,4) where 4 is the number of corners (in the counterclockwise sense, starting from any corner). The names of the arrays must be composed of the grid prefix and the suffix “.clo” or “.cla” for respectively the grid corner longitudes or latitudes. As for the other grid information, the corners can be provided in grids.nc before the run by the user or directly by the component code through specific calls (see section 2.2.4). Longitudes must be given in degrees East in the interval -360.0 to 720.0. Latitudes must be given in degrees North in the interval -90.0 to 90.0. Note that if some grid points overlap, it is recommended to define those points with the same number (e.g. 360.0 for both, not 450.0 for one and 90.0 for the other) to ensure automatic detection of overlap by OASIS3-MCT. The corners of a cell cannot be defined modulo 360 degrees. For example, a cell located over Greenwich will have to be defined with corners at -1.0 deg and 1.0 deg but not with corners at 359.0 deg and 1.0 deg.
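To make the layout concrete, here is a hedged sketch of writing such corner arrays into grids.nc with netcdf-fortran (the grid prefix 'feom', the dimension names, and the zero-filled corners are placeholders; the real corners would come from the mesh, and a real setup would append to the existing grids.nc rather than create a new one):

program write_corner_arrays
   use netcdf
   implicit none
   integer, parameter :: nbr_pts = 126859        ! e.g. nod2D (placeholder)
   real(8), allocatable :: clo(:,:,:), cla(:,:,:)
   integer :: ncid, vlo, vla, d1, d2, d3

   allocate(clo(nbr_pts,1,4), cla(nbr_pts,1,4))
   clo = 0.0d0                                   ! fill with counterclockwise
   cla = 0.0d0                                   ! corners from the mesh here

   call check( nf90_create('grids.nc', NF90_CLOBBER, ncid) )
   call check( nf90_def_dim(ncid, 'x_feom', nbr_pts, d1) )
   call check( nf90_def_dim(ncid, 'y_feom', 1, d2) )
   call check( nf90_def_dim(ncid, 'crn_feom', 4, d3) )
   ! names: grid prefix + '.clo'/'.cla', as the OASIS manual requires
   call check( nf90_def_var(ncid, 'feom.clo', NF90_DOUBLE, (/d1,d2,d3/), vlo) )
   call check( nf90_def_var(ncid, 'feom.cla', NF90_DOUBLE, (/d1,d2,d3/), vla) )
   call check( nf90_enddef(ncid) )
   call check( nf90_put_var(ncid, vlo, clo) )
   call check( nf90_put_var(ncid, vla, cla) )
   call check( nf90_close(ncid) )
contains
   subroutine check(ierr)
      integer, intent(in) :: ierr
      if (ierr /= nf90_noerr) then
         print *, trim(nf90_strerror(ierr))
         stop 1
      end if
   end subroutine check
end program write_corner_arrays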

Mistral: ("/usr/include/c++/4.4.7/bits/unique_ptr.h") cannot be referenced -- it is a deleted function"

Hello everyone, I wanted to test out the output throughput improvements with the async output. I ran into this error trying to compile the latest master branch (4f7ec3e) on mistral. I assume there is some mistake in the environment?

[  0%] Building CXX object src/async_threads_cpp/CMakeFiles/async_threads_cpp.dir/ThreadsManager.cpp.o
/pf/a/a270092/frontiers/awicm-3.1/fesom-2.0/src/async_threads_cpp/ThreadsManager.cpp(33): error: more than one instance of overloaded function "std::to_string" matches the argument list:
            function "std::to_string(long long)"
            function "std::to_string(unsigned long long)"
            function "std::to_string(long double)"
            argument types are: (const int)
        string name = std::to_string(index_id); // todo: we do not seem to need a string here, use int in the map
                      ^

/pf/a/a270092/frontiers/awicm-3.1/fesom-2.0/src/async_threads_cpp/ThreadsManager.cpp(48): error: more than one instance of overloaded function "std::to_string" matches the argument list:
            function "std::to_string(long long)"
            function "std::to_string(unsigned long long)"
            function "std::to_string(long double)"
            argument types are: (const int)
        string name = std::to_string(index_id);
                      ^

/pf/a/a270092/frontiers/awicm-3.1/fesom-2.0/src/async_threads_cpp/ThreadsManager.cpp(66): error: more than one instance of overloaded function "std::to_string" matches the argument list:
            function "std::to_string(long long)"
            function "std::to_string(unsigned long long)"
            function "std::to_string(long double)"
            argument types are: (const int)
        string name = std::to_string(index_id);
                      ^

/usr/include/c++/4.4.7/bits/stl_pair.h(73): error: function "std::unique_ptr<_Tp, _Tp_Deleter>::unique_ptr(const std::unique_ptr<_Tp, _Tp_Deleter> &) [with _Tp=AWI::FortranCallback, _Tp_Deleter=std::default_delete<AWI::FortranCallback>]" (declared at line 214 of "/usr/include/c++/4.4.7/bits/unique_ptr.h") cannot be referenced -- it is a deleted function
        _T2 second;                ///< @c second is a copy of the second object
            ^
          detected during:
            implicit generation of "std::pair<_T1, _T2>::pair(const std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>> &) [with _T1=const std::string, _T2=std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>]" at line 136 of "/usr/include/c++/4.4.7/bits/stl_tree.h"
            instantiation of class "std::pair<_T1, _T2> [with _T1=const std::string, _T2=std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>]" at line 136 of "/usr/include/c++/4.4.7/bits/stl_tree.h"
            instantiation of "std::_Rb_tree_node<_Val>::_Rb_tree_node(_Args &&...) [with _Val=std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>, _Args=<const std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>> &>]" at line 111 of "/usr/include/c++/4.4.7/ext/new_allocator.h"
            instantiation of "void __gnu_cxx::new_allocator<_Tp>::construct(__gnu_cxx::new_allocator<_Tp>::pointer, _Args &&...) [with _Tp=std::_Rb_tree_node<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>, _Args=<const std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>> &>]" at line 395 of "/usr/include/c++/4.4.7/bits/stl_tree.h"
            instantiation of "std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::_Link_type std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::_M_create_node(_Args &&...) [with _Key=std::string, _Val=std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>, _KeyOfValue=std::_Select1st<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>, _Compare=std::less<std::string>,
                      _Alloc=std::allocator<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>, _Args=<const std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>> &>]" at line 881 of "/usr/include/c++/4.4.7/bits/stl_tree.h"
            instantiation of "std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::iterator std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::_M_insert_(std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::_Const_Base_ptr, std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::_Const_Base_ptr, const std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::value_type &) [with _Key=std::string, _Val=std::pair<const std::string, std::unique_ptr<AWI::FortranCallback,
                      std::default_delete<AWI::FortranCallback>>>, _KeyOfValue=std::_Select1st<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>, _Compare=std::less<std::string>, _Alloc=std::allocator<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>]" at line 1215 of "/usr/include/c++/4.4.7/bits/stl_tree.h"
            instantiation of "std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::iterator std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::_M_insert_unique_(std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::const_iterator, const std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::value_type &) [with _Key=std::string, _Val=std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>,
                      _KeyOfValue=std::_Select1st<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>, _Compare=std::less<std::string>, _Alloc=std::allocator<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>]" at line 540 of "/usr/include/c++/4.4.7/bits/stl_map.h"
            instantiation of "std::map<_Key, _Tp, _Compare, _Alloc>::iterator std::map<_Key, _Tp, _Compare, _Alloc>::insert(std::map<_Key, _Tp, _Compare, _Alloc>::iterator, const std::map<_Key, _Tp, _Compare, _Alloc>::value_type &) [with _Key=std::string, _Tp=std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>, _Compare=std::less<std::string>, _Alloc=std::allocator<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback,
                      std::default_delete<AWI::FortranCallback>>>>]" at line 450 of "/usr/include/c++/4.4.7/bits/stl_map.h"
            instantiation of "std::map<_Key, _Tp, _Compare, _Alloc>::mapped_type &std::map<_Key, _Tp, _Compare, _Alloc>::operator[](const std::map<_Key, _Tp, _Compare, _Alloc>::key_type &) [with _Key=std::string, _Tp=std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>, _Compare=std::less<std::string>, _Alloc=std::allocator<std::pair<const std::string, std::unique_ptr<AWI::FortranCallback, std::default_delete<AWI::FortranCallback>>>>]" at line 42 of
                      "/pf/a/a270092/frontiers/awicm-3.1/fesom-2.0/src/async_threads_cpp/ThreadsManager.cpp"

compilation aborted for /pf/a/a270092/frontiers/awicm-3.1/fesom-2.0/src/async_threads_cpp/ThreadsManager.cpp (code 2)
make[4]: *** [src/async_threads_cpp/CMakeFiles/async_threads_cpp.dir/ThreadsManager.cpp.o] Error 2

PATH:

/sw/rhel6-x64/hdf5/hdf5-1.8.18-intel14/bin/:/sw/rhel6-x64/gcc/binutils-2.24-gccsys/bin:/sw/rhel6-x64/devtools/cmake-3.5.2-gcc48/bin:/sw/rhel6-x64/devtools/autoconf-2.69-gccsys/bin:/sw/rhel6-x64/intel/impi/2018.1.163/compilers_and_libraries/linux/mpi/bin64:/sw/rhel6-x64/intel/intel-18.0.1/bin:/sw/rhel6-x64/jdk-1.8.0_20/bin:/sw/rhel6-x64/netcdf/netcdf_c-4.3.2-gcc48/bin:/sw/rhel6-x64/nco/nco-4.7.5-gcc64/bin:/sw/rhel6-x64/cdo/cdo-1.9.0-gcc48/bin:/pf/a/a270092/.local/bin:/pf/a/a270092/miniconda2/envs/pyn_env_py2/bin:/pf/a/a270092/miniconda2/condabin:/pf/a/a270092/miniconda2/bin:/usr/lib64/qt-3.3/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/pf/a/a270092/ecmwf/grib-api/bin:/mnt/lustre01/sw/rhel6-x64/devtools/fcm-2017.10.0/bin/

Modules

Currently Loaded Modulefiles:
  1) cdo/1.9.0-gcc48        3) netcdf_c/4.3.2-gcc48   5) intel/18.0.1           7) autoconf/2.69
  2) nco/4.7.5-gcc64        4) jdk/1.8.0_20           6) intelmpi/2018.1.163    8) cmake/3.5.2

Does this look familiar? Anyone else have problems on mistral?

Default FESOM2 repository

During our weekly esm_tools meeting the question of the default FESOM2 repository came up. At the moment the default in esm_tools is gitlab, with github as an option.

  • Is this still reflective of the policy of the FESOM developers? Has the default changed to github?
  • Are there plans to automatically sync the two, such that we would not even have to make a choice?
  • Or are there branches on gitlab that are not supposed to go public?

MLD1 and MLD2 definitions

Quick question: I guess these two are different mixed layer depth definitions. There are dozens of those out there. Which ones are used in FESOM2 for these two fields?

Cheers, Jan

MPI routine used before initialization of MPI, for coupled setups on ALEPH

Hi FESOM developers,

@JanStreffing and I are trying to run AWICM3 on the ALEPH computer (a Cray system) and we are experiencing some issues involving MPI during run time.

Branches used

  • FESOM: fesom-2.0-frontiers
  • OASIS3MCT: awicm-3-frontiers

Machine details

Type: Cray
Batch system: PBS
Scheduler: ALPS (aprun)

Compilation details

We are using the ftn compiler, as it is the one recommended by the system admins for Cray. We are also using mpich.

Cases tested

  • FESOM standalone (passes MPI initialization)
  • FESOM coupled for AWICM3 has the error described below (with the coupled flags FESOM_COUPLED ON and OIFS_COUPLED ON set; OASIS libraries are compiled before FESOM and linked).

Detailed description of the problem

We are experiencing the following issue during runtime of FESOM coupled in AWICM3:

Attempting to use an MPI routine before initializing MPICH
Attempting to use an MPI routine before initializing MPICH
Attempting to use an MPI routine before initializing MPICH
...
Attempting to use an MPI routine before initializing MPICH
[NID 00150] 2021-04-16 20:00:33 Apid 423459: initiated application termination
Fri Apr 16 20:00:33 2021: [PE_120]:inet_recv:inet_recv: recv error (fd=7) Connection reset by peer
Fri Apr 16 20:00:33 2021: [PE_120]:_pmi_network_barrier:_pmi_inet_recv from target 0 failed pmi errno -1
Fri Apr 16 20:00:33 2021: [PE_120]:_pmi_barrier:network_barrier failed
[Fri Apr 16 20:00:33 2021] [c0-0c2s6n1] Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(537): 
MPID_Init(246).......: channel initialization failed
MPID_Init(638).......:  PMI2 init failed: 1 
Application 423459 exit codes: 1
Application 423459 exit signals: Killed
Application 423459 resources: utime ~3s, stime ~13s, Rss ~16864, inblocks ~51250, outblocks ~0

However, when we run the exact same version of FESOM-2.0 compiled for standalone, we seem to get further, as MPI reports initialization (we run standalone just to narrow down the problem; the crash that happens after MPI initialization is probably related to something we've done wrong in the configuration of the standalone test):

 MPI has been initialized, provided MPI thread support level: MPI_THREAD_SERIALIZED 2
 Running on  288  PEs
 
 FESOM2 git SHA: f3c62de
 ^[[32m____________________________________________________________^[[0m
 ^[[7;32m --> FESOM BUILDS UP MODEL CONFIGURATION                    ^[[0m

So it seems that the problem we are having is associated with something in the coupling of FESOM and/or in the interaction with OASIS. Has anyone experienced something similar? @JanStreffing reported similar problems on the Juwels system. Does anyone have an idea about what might be going on? @dsidoren

Kind regards

More information in netCDF output

In order to make it possible for cdo to work with our 3D output, it needs information on the depth of the vertical levels and an explicitly defined axis for our vertical coordinate variable. I have added this, plus decided to put some important information into global attributes. Granted, this information is only valid for the time when the file is created, but changing parameters in the middle of the year is rare, and usually in those cases we know what we are doing. Here is how the headers of the main types of variables look now:

netcdf temp.fesom.1948 {
dimensions:
	nz1 = 47 ;
	nod2 = 3140 ;
	time = UNLIMITED ; // (1 currently)
variables:
	double nz1(nz1) ;
		nz1:long_name = "depth at layer midpoint" ;
		nz1:units = "m" ;
		nz1:positive = "down" ;
		nz1:axis = "Z" ;
	double time(time) ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
		time:units = "seconds since 1948-01-01 0:0:0" ;
		time:axis = "T" ;
		time:stored_direction = "increasing" ;
	double temp(time, nod2, nz1) ;
		temp:description = "temperature" ;
		temp:long_name = "temperature" ;
		temp:units = "C" ;

// global attributes:
		:model = "FESOM2" ;
		:website = "fesom.de" ;
		:git_SHA = "bc0e302" ;
		:MeshPath = "/fesom/pi/" ;
		:ClimateDataPath = "/fesom/phc3/" ;
		:which_ALE = "linfs" ;
		:mix_scheme = "KPP" ;
		:tra_adv_hor = "MFCT" ;
		:tra_adv_ver = "QR4C" ;
		:tra_adv_lim = "FCT" ;
		:use_partial_cell = 0 ;
		:force_rotation = 0 ;
		:include_fleapyear = 0 ;
		:use_floatice = 0 ;
		:whichEVP = 0 ;
		:evp_rheol_steps = 150 ;
		:visc_option = 5 ;
		:w_split = 1 ;
}

netcdf u.fesom.1948 {
dimensions:
	nz1 = 47 ;
	elem = 5839 ;
	time = UNLIMITED ; // (1 currently)
variables:
	double nz1(nz1) ;
		nz1:long_name = "depth at layer midpoint" ;
		nz1:units = "m" ;
		nz1:positive = "down" ;
		nz1:axis = "Z" ;
	double time(time) ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
		time:units = "seconds since 1948-01-01 0:0:0" ;
		time:axis = "T" ;
		time:stored_direction = "increasing" ;
	double u(time, elem, nz1) ;
		u:description = "horizontal velocity" ;
		u:long_name = "horizontal velocity" ;
		u:units = "m/s" ;

// global attributes:
		:model = "FESOM2" ;
		:website = "fesom.de" ;
		:git_SHA = "bc0e302" ;
		:MeshPath = "/fesom/pi/" ;
		:ClimateDataPath = "/fesom/phc3/" ;
		:which_ALE = "linfs" ;
		:mix_scheme = "KPP" ;
		:tra_adv_hor = "MFCT" ;
		:tra_adv_ver = "QR4C" ;
		:tra_adv_lim = "FCT" ;
		:use_partial_cell = 0 ;
		:force_rotation = 0 ;
		:include_fleapyear = 0 ;
		:use_floatice = 0 ;
		:whichEVP = 0 ;
		:evp_rheol_steps = 150 ;
		:visc_option = 5 ;
		:w_split = 1 ;
}

netcdf sst.fesom.1948 {
dimensions:
	nod2 = 3140 ;
	time = UNLIMITED ; // (1 currently)
variables:
	double time(time) ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
		time:units = "seconds since 1948-01-01 0:0:0" ;
		time:axis = "T" ;
		time:stored_direction = "increasing" ;
	double sst(time, nod2) ;
		sst:description = "sea surface temperature" ;
		sst:long_name = "sea surface temperature" ;
		sst:units = "C" ;

// global attributes:
		:model = "FESOM2" ;
		:website = "fesom.de" ;
		:git_SHA = "bc0e302" ;
		:MeshPath = "/fesom/pi/" ;
		:ClimateDataPath = "/fesom/phc3/" ;
		:which_ALE = "linfs" ;
		:mix_scheme = "KPP" ;
		:tra_adv_hor = "MFCT" ;
		:tra_adv_ver = "QR4C" ;
		:tra_adv_lim = "FCT" ;
		:use_partial_cell = 0 ;
		:force_rotation = 0 ;
		:include_fleapyear = 0 ;
		:use_floatice = 0 ;
		:whichEVP = 0 ;
		:evp_rheol_steps = 150 ;
		:visc_option = 5 ;
		:w_split = 1 ;
}

netcdf w.fesom.1948 {
dimensions:
	nz = 48 ;
	nod2 = 3140 ;
	time = UNLIMITED ; // (1 currently)
variables:
	double nz(nz) ;
		nz:long_name = "depth at layer interface" ;
		nz:units = "m" ;
		nz:positive = "down" ;
		nz:axis = "Z" ;
	double time(time) ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
		time:units = "seconds since 1948-01-01 0:0:0" ;
		time:axis = "T" ;
		time:stored_direction = "increasing" ;
	float w(time, nod2, nz) ;
		w:description = "vertical velocity" ;
		w:long_name = "vertical velocity" ;
		w:units = "m/s" ;

// global attributes:
		:model = "FESOM2" ;
		:website = "fesom.de" ;
		:git_SHA = "bc0e302" ;
		:MeshPath = "/fesom/pi/" ;
		:ClimateDataPath = "/fesom/phc3/" ;
		:which_ALE = "linfs" ;
		:mix_scheme = "KPP" ;
		:tra_adv_hor = "MFCT" ;
		:tra_adv_ver = "QR4C" ;
		:tra_adv_lim = "FCT" ;
		:use_partial_cell = 0 ;
		:force_rotation = 0 ;
		:include_fleapyear = 0 ;
		:use_floatice = 0 ;
		:whichEVP = 0 ;
		:evp_rheol_steps = 150 ;
		:visc_option = 5 ;
		:w_split = 1 ;
}

Please have a look and let me know if you are happy with it, and whether you see potential problems with some of the options. Or maybe something should be added.

@dsidoren @patrickscholz @JanStreffing @pgierz

PR: #18

Model blowup due to high CFL_z values in higher-resolved configurations at very small timesteps

Symptoms: blowup due to high CFL_z values, negative layer thicknesses, or exceedingly high values in the ssh at very small time steps

Description: When using a higher-resolved model configuration (i.e. AO4, ~1.3M, min. resolution 4 km) in combination with partial cells, the model can tend to blow up in the zstar case due to exceedingly high values in the ssh, not immediately but after a couple of days, even at very small time steps of up to 30 sec. The critical switch here seems to be the partial cells. Partial cells tend to affect the flow velocity, especially in the boundary currents and at the bottom, and induce more velocity fluctuations, which are most likely more prone to blowups, especially after initialization.

Solution: In that case it can help to do a short 1-month spinup with a different w_max_cfl parameter. By default w_max_cfl in namelist.oce equals 1, meaning that vertical advection through the vertical velocity within a vertical CFL value of 1 is treated explicitly, while the rest above 1 is treated implicitly. The debugging of the AO4 setup revealed that a w_max_cfl parameter of 0.25 within the first month after initialization can help to overcome these initial fluctuations; afterwards the w_max_cfl parameter can be set back to 1, as in the namelist sketch below.
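A hedged namelist sketch of that spinup setting (the group name &oce_dyn and the surrounding entries are assumptions; check where w_max_cfl lives in your namelist.oce):

&oce_dyn
...
   w_max_cfl = 0.25    ! spinup month only; set back to 1.0 afterwards
...
/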

Sea ice temperature advection for AWICM3

@dsidoren pointed out that the way we have implemented sea ice temperature advection for the coupled AWICM3 setup does not look right to him. Rather than advecting the heat energy, we are advecting the temperature directly. In case of convergence the temperatures are added up; in case of divergence they are reduced.

I would like to invite comments from @lzampier @qiangclimate

Small index bug in vertical ppm

tv(1)=ttf(1,n)
! tracer at surface+1 layer
! tv(2)=-ttf(1,n)*min(sign(1.0, W(2,n)), 0._WP)+ttf(2,n)*max(sign(1.0, W(2,n)), 0._WP)
tv(2)=0.5*(ttf(1,n)+ttf(2,n))
! tracer at bottom-1 layer
!tv(nzmax-1)=-ttf(nzmax-2,n)*min(sign(1.0, W(nzmax-1,n)), 0._WP)+ttf(nzmax-1,n)*max(sign(1.0, W(nzmax-1,n)), 0._WP)
tv(nzmax-1)=0.5_WP*(ttf(nzmax-2,n)+ttf(nzmax-1,n))
! tracer at bottom layer
tv(nzmax)=ttf(nzmax-1,n)
!_______________________________________________________________________
! calc tracer for surface+2 until depth-2 layer
! see Colella and Woodward, JCP, 1984, 174-201 --> equation (1.9)
! loop over layers (segments)
do nz=3, nzmax-3

Hi Dima, is there a small index bug in the vertical ppm?

Nothing seems to happen for the layer tv(nzmax-2) = ...
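A toy reproduction of the suspected gap (illustrative only; the sentinel and the placeholder interior formula are assumptions, not FESOM code): every entry except tv(nzmax-2) gets assigned.

program ppm_index_gap
   implicit none
   integer, parameter :: WP = kind(1.0d0)
   integer, parameter :: nzmax = 10
   real(WP) :: ttf(nzmax), tv(nzmax)
   integer  :: nz

   ttf = 1.0_WP
   tv  = -999.0_WP                          ! sentinel for "never assigned"

   tv(1)       = ttf(1)                                 ! surface
   tv(2)       = 0.5_WP*(ttf(1)+ttf(2))                 ! surface+1
   tv(nzmax-1) = 0.5_WP*(ttf(nzmax-2)+ttf(nzmax-1))     ! bottom-1
   tv(nzmax)   = ttf(nzmax-1)                           ! bottom

   do nz = 3, nzmax-3                       ! interior PPM reconstruction
      tv(nz) = ttf(nz)                      ! placeholder for eq. (1.9)
   end do

   print *, 'tv(nzmax-2) = ', tv(nzmax-2)   ! prints the sentinel: the gap
end program ppm_index_gap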

can not compile with 'USE_ICEPACK' set to 'ON'

With gfortran 10.2.0 and cmake 3.20.3 I get this error:

[ 52%] Building Fortran object src/CMakeFiles/fesom.dir/icepack_drivers/icedrv_kinds.F90.o
.../src/icepack_drivers/icedrv_kinds.F90:11:11:

11 | use icepack_intfc, only: char_len => icepack_char_len
| 1
Fatal Error: Cannot open module file 'icepack_intfc.mod' for reading at (1): No such file or directory
compilation terminated.

Can't find fesom_version_info_module while compiling in single thread mode.

Related to #19

The problem was initially encountered by @JanStreffing. The current code can be compiled with make install -j nproc --all, but gives an error with a simple make install.

I have commented out all the code related to getting the SHA from io_meandata, and now the error is in fvom_main.

/home/ollie/nkolduno/t/ttt/src/fvom_main.F90(25): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [FESOM_VERSION_INFO_MODULE]
use fesom_version_info_module
----^
/home/ollie/nkolduno/t/ttt/src/fvom_main.F90(57): error #6404: This name does not have a type, and must have an explicit type.   [FESOM_GIT_SHA]
        print *,"FESOM2 git SHA: "//fesom_git_sha()
------------------------------------^
/home/ollie/nkolduno/t/ttt/src/fvom_main.F90(57): error #6054: A CHARACTER data type is required in this context.   [FESOM_GIT_SHA]
        print *,"FESOM2 git SHA: "//fesom_git_sha()
------------------------------------^
compilation aborted for /home/ollie/nkolduno/t/ttt/src/fvom_main.F90 (code 1)
make[2]: *** [src/CMakeFiles/fesom.dir/fvom_main.F90.o] Error 1
make[1]: *** [src/CMakeFiles/fesom.dir/all] Error 2
make: *** [all] Error 2

@hegish can you, please, have a look at it?

FESOM2 crashes depending on MESH directory

I have problems running AWI-CM-2.0 on mistral depending on which mesh directory I use for FESOM.

If I take /pool/data/AWICM/FESOM2/MESHES/CORE2/ I get the following error:

666:  ***********************************************************
666:  max. CFL_z =    1.44824733863823       mype =          234
666:  mstep      =            1
666:  glon, glat =   -35.5792148127440       -5.49627545561133     
666:  2D node    =        68061
666:  nz         =            2
666:  ***********************************************************
520:  ***********************************************************
520:  max. CFL_z =    1.82566339425423       mype =           88
520:  mstep      =            1
520:  glon, glat =    73.4124462298563       0.408295191792843     
520:  2D node    =        87261
520:  nz         =            2
520:  ***********************************************************
495:  ___CHECK FOR BLOW UP___________ --> mstep=           1
498:  ___CHECK FOR BLOW UP___________ --> mstep=           1
498:   --STOP--> found eta_n become NaN or <-10.0, >10.0
498:  mype        =           66
498:  mstep       =            1
498:  node        =            1
498:  
498:  eta_n(n)    =                      NaN
498:  d_eta(n)    =   0.000000000000000E+000

But if I run the same runscript with this mesh /mnt/lustre02/work/ab0995/a270029/fesom2.0/meshes/core2/ everything is working fine. Here are the links to the experiments:
/work/ba0989/a270124/esm-experiments/awicm_pism/ib_notworking01
/work/ba0989/a270124/esm-experiments/awicm_pism/ib_working02

I did not change anything other than the mesh directory and used the same esm-tools (bash version). Could this be related to #29?

I am using a FESOM2.0 version with the iceberg model included, but the error also occurs when I turn the iceberg model off. Could anybody clarify what is happening here and why /pool/data/AWICM/FESOM2/MESHES/CORE2/ is not working? I guess it should be the default mesh directory?

Vertical CFL problems when using partial cells in higher-resolved configurations

Symptoms: blowup due to high CFL_z values, negative layer thicknesses, or exceedingly high values in the ssh at very small time steps

Description: When using a higher-resolved model configuration (i.e. AO4, ~1.3M, min. resolution 4 km) in combination with partial cells, the model can tend to blow up in the zstar case due to exceedingly high values in the ssh, not immediately but after a couple of days, even at very small time steps of up to 30 sec. The critical switch here seems to be the partial cells. Partial cells tend to affect the flow velocity, especially in the boundary currents and at the bottom, and induce more velocity fluctuations, which are most likely more prone to blowups, especially after initialization.

Solution: In that case it can help to do a short 1-month spinup with a different w_max_cfl parameter. By default w_max_cfl in namelist.oce equals 1, meaning that vertical advection through the vertical velocity within a vertical CFL value of 1 is treated explicitly, while the rest above 1 is treated implicitly. The debugging of the AO4 setup revealed that a w_max_cfl parameter of 0.25 within the first month after initialization can help to overcome these initial fluctuations; afterwards the w_max_cfl parameter can be set back to 1.

missing lat, lon fields in fesom.mesh.diag.nc

To make fesom.mesh.diag.nc an efficient replacement for inferring mesh info in analysis, compared to the current text files (nod2d.out, ...), it should include lat(nod2) and lon(nod2) information.

Is zbar_n_bottom in fesom.mesh.diag.nc the same as the topography information in aux3d.out? If so, just having lat, lon info would make the mesh diag a complete replacement; otherwise the topo data should also be included.

Changing a field in the restart.oce file

Say I want to create an experiment where I reduce the temperature in the Labrador Sea to simulate a restart from a little ice age. I'm looking for the simplest solution for a master's student.

I found that with cdo setgrid,/work/ollie/jstreffi/input/fesom2/mesh_CORE2_finaltopo_mean/CORE2_finaltopo_mean.nc restartin.nc restartout.nc I can turn 2d generic files into unstructured files that can be used with a wide variety of cdo commands.

However, this works only with the 2d ice restarts, which contain only one grid. Is there a similar process & file for 3d data? Or should I forget trying to make cdo work and just do it in python?

PGI vs. GCC on JUWELS-Booster

PGI seems to be significantly slower than gfortran in the oce. mix,pres. runtime routines (for CORE2, STORM and D4.2), so it would be good to have an idea why; perhaps compile flags can be further optimized, or profiling and compiler reports could give us a clue.

Version info

Discussion related to #10 PR

@pgierz thanks for such a fast reaction with a PR :)

I gave it a deeper thought and now think that retrieving the version number dynamically is not the best way to do it:

  • I don't like the git dependency. It may be installed on all relevant machines now, but we don't know where fesom will be used in the future. I can imagine a situation where git is available on the login nodes but not on the compute nodes.
  • The idea is to be able to get the version of the model for debugging. If someone copied the fesom folder, then the .git folder will most probably be copied as well, so the person can get this information. If the .git folder is not present, there is no way we can get the info anyway.
  • Also, there are use cases where a fesom executable created with some particular version of the model is copied to the bin folder of another fesom "installation". So the version is something that should be inserted at the moment of compilation.

So I see two possible options: somehow insert the git version during compilation (in this case git should be available on any sane system), or just manually put in the tag version. The second option has the advantage of being simpler to communicate and of not having git as a dependency. But the first option is more precise and probably much more useful.

Would be nice to have @patrickscholz and @dsidoren opinions on this :)

clarify rotated/unrotated mesh files

Hi FESOM team

A wish: is there a way to clear up the rotated-versus-unrotated mesh mystery of the fesom1.4 meshes? For example:

$ head -3 /pool/data/AWICM/FESOM1/MESHES/core/nod2d.out
  126859
       1 110.883650498 -66.1483566126        1
       2 257.574958343 -74.2146210526        1

versus

$ head -3 /work/bm0944/input/CORE2_final/nod2d.out
 126859
 1 117.9772 -77.1266 1
 2 -33.0436 -63.7543 1

For the users it is unclear which mesh to use. The two options

rotated_grid=.true. 	  	!option only valid for coupled model case now
force_rotation=.true.		!set to .true. for some unrotated meshes

actually increase confusion.

I don't know if the FESOM2 meshes are organized in a different way, but if they are, I would be very happy if this were made clearer.

Thanks!
Chris

fesom release 2.1 via esm_tools

I would like to make simulations that compare fesom release 2.1 with the latest awicm3. Are there any plans to implement release 2.1 in the esm_tools? I'm quite used to the esm_tools now, but when installing fesom2 I get tag 2.0.2 from 2018.

Or do I need some branch of the esm_tools?

Cheers, Jan

Inconsistent dimension names in mesh diag and model output

Variable (dimension) names that represent the same fields in the model output and in the mesh diagnostics are named differently. Apart from consistency reasons, having the same naming convention will simplify post-processing and probably also make it easier for a new FESOM data user to understand the data.

I don't know of any plans to standardize names, but I think that, at a minimum, using the same dimension names in the mesh diag as in the model data is useful.

For instance, associations of names in the mesh diagnostics with those in the model data:

nod_n  -> nod2
elem_n -> elem
nl -> nz
nl1 -> nz1 

And I think these level variables could harmlessly be given the same names as the dimensions:

Zbar -> nz
Z -> nz1

I don't know if the mesh diag file is used for restarts, and whether renaming may pose other challenges?

Move to unrotated meshes and new defaults

This is a quite large PR (#50) that switches all work to standard unrotated meshes and adds the capability to create setups using a python module (installed separately). It also adds test meshes to the model.

One has to test all those things together, that's why I did it as one PR, sorry @dsidoren and @patrickscholz :)

Changes in defaults (config folder)

  • Switch to JRA55-do forcing by default.
  • Change yearnew to 1958 (start year of JRA55-do).
  • Change paths to meshes and forcing to standard ones, located in /work/ollie/projects/clidyn/FESOM2/
  • Turn zstar and partial cells on by default.
  • Change force_rotation to true, since we are going to use only unrotated meshes by default.
  • Turn on mEVP and reduce the number of cycles by default.
  • Remove a lot of additional output from namelist.io (this is something that needs to be discussed).
  • In namelist.oce fix wrong values of gamma0 and Leith_c.

Changes in testing:

  • New docker container for testing with updated mkrun python module.
  • Add channel test run with 20km resolution soufflet channel.
  • Test values can now be changed in the model repository itself, not in the outside python module as it was before.
  • Tests don't use any external data anymore (see below).

Standard model setups (new setups folder)

To use this functionality one has to use an additional python module (https://github.com/FESOM/mkfesom). I am going to make it pip-installable and write documentation in the next few days.

In short, executing:

mkrun core2_experiment4 core2

will create a work_core2_experiment4 folder with all the right paths for the current machine, based on the core2 experiment, and a results folder with fesom.clock inside. There are some additional options, like specifying a different build directory for your executable, using a different forcing, and setting the HPC account.

Setups are just collections of yaml files that define what should be changed in the standard namelists. This is not only useful for the automatic creation of model experiments, but also serves as a reference on what one should not forget to change for different meshes and configurations (if done by hand), and on where things are stored on different machines.

  • forcings.yaml - defines standard values for different forcings. Currently it has JRA55, CORE2, ERA5 and test_global (one day of CORE2 forcing, see below).
  • paths.yaml - defines standard paths for meshes, climatology and forcings on different machines.
  • core2 (10 years) and test_core2 (1 year, can check values of the output fields afterwards).
  • farc (10 years). I will update values when tests with farc are finished, for now it's mainly default values.
  • pi (10 years) and test_pi (1 day) - small test for global setup (input is included with fesom2, see below).
  • souf (10 years) and test_souf (1 day) - small test for channel setup (input is included with fesom2, see below).

Additional input files:

I have added some basic input files to the test directory. Now it is possible to do some simple test simulations without downloading any additional files:

  • folder test/input/global/ contains one day of CORE2 forcing, plus runoff and one step of salinity restoring file.
  • folder test/meshes/ contains pi and soufflet (20 km) meshes with partitionings for 2 and 8 cores.

The test_pi and test_souf setups can now be run without downloading any additional files.

restart problems with step_per_day=288

Dear all,

I get a restart problem with the standalone version of FESOM2 when setting step_per_day=288.

  0:  associating restart file /work/ollie/lackerma/awicm_pism_tests//fesom_esm_t33/run_18500201-18500228/work/fesom.1850.oce.restart.nc
  0:  WARNING: all dates in restart file are after the current date 
  0:  reading restart will not be possible !
  0:  the model attempted to start with the time stamp =      2676600
  0:  current restart counter =            1    
  0:  error counter=           9    
  0:  Error: Unknown Error                                                                        
  0:  Run finished unexpectedly!

ncdump -v time gives time = 2678100

A run with the same settings except step_per_day=48 is working fine and ncdump -v time gives time = 2676600
(/work/ollie/lackerma/awicm_pism_tests/fesom_esm_t32)
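(For reference, both stamps sit exactly one timestep before the end of January, 31*86400 = 2678400 s: 2678400 - 300 = 2678100 matches dt = 300 s, i.e. step_per_day=288, while the 2676600 the model asks for equals 2678400 - 1800, the stamp a step_per_day=48 run would write. So the expected restart time seems to be computed as if the timestep were still 1800 s.)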

I am using the fesom master branch on commit cc4c49bf71f62ca311954b1681a10bac636e98b6 and the esm_tools

+---------------------+-------------+---------------------------------------------------------+------------------+---------------------------+
| package_name        | version     | file                                                    | branch           | tags                      |
|---------------------+-------------+---------------------------------------------------------+------------------+---------------------------|
| esm_calendar        | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_database        | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_environment     | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_master          | 5.0.1       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_motd            | 5.0.1       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_parser          | 5.0.3       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_pism            | 0.0.1.dev12 | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_plugin_manager  | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_profile         | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_rcfile          | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_runscripts      | 5.0.14      | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_tools           | 5.0.11      | /home/ollie/lackerma/esm_tools                          | iceberg_coupling | v5.0.11-13-ge41ee3a-dirty |
| esm_version_checker | 5.1.1       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
+---------------------+-------------+---------------------------------------------------------+------------------+---------------------------+

I am not sure this issue is more FESOM2 or esm_tools related.

Latest FESOM2 not working with esm-tools

Dear developers,
I am running some cases using esm-tools but it is not working with the latest version.
The log says:
4: error in line 549 io_meandata.F90 NetCDF: File exists && NC_NOCLOBBER
4: error in line 97 async_threads_module.F90
Maybe esm-tools is not working with the async_threads_module?
Which package does this require? Or can I disable this approach in FESOM2?
Many thanks.

Separating calving from river runoff

I added a functionality to OIFS 43 that can distinguish river runoff (Precipitation - Evaporation - Soil storage - Snow layer change) from "calving". Calving in this context is a very simple process by which the accumulating snow that would exceed the maximum snow layer thickness (10m) of the HTESSEL hydrology model is sent to a simple routing algorithm and discharged into the ocean.

Think of it as solid instead of liquid river discharge. The idea is to add this mass flux to the sea ice thickness; thereby the latent heat of melting would be incorporated into the coupled system. My first, probably naive, thought would be to divide the mass flux (m³/s) of calving/ice discharge by the discharge area (m²), then multiply by the timestep (s) and add the resulting value (m) to the sea ice thickness here:

!---- temporary new ice thickness [m]
htmp = h + dhice + dhiow
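(Unit check: a volume flux in m³/s divided by the discharge area in m² gives m/s; multiplied by the timestep in s this yields a thickness increment in m, which can be added to h above. Strictly, if the flux arrives as mass in kg/s it first needs dividing by the ice or water density.)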

Does this sound plausible to you? Perhaps you can also illuminate how it's done in AWICM1 & 2? Does this all happen on the ECHAM6 side?

Automatically initializing missing fields when restarting with different configuration

When changing FESOM2 configurations between restarts, e.g.

FESOM2 standalone 0_layer -> AWICM3 0_layer
FESOM2 standalone 0_layer -> FESOM2 standalone Icepack
AWICM3 0_layer -> AWICM3 Icepack

We run into the situation that the restart files written by the previous configuration do not cover all the required restart fields of the new configuration. In the past I added the missing fields with some initial values via cdo. This is not good practice, since cdo will make new files with arbitrary dimension ordering, which can make reading the files very slow on some machines.

A much better solution would be to catch the missing restart file (talking about the new netcdf parallel restarts by @hegish here, where each field has its own file) and, if the restart file is not there, to initialize the field with appropriate values.

The first practical application would be @tsemmler05 using a D3 standalone restart for a TCO319L137-D3 coupled simulation. The missing fields would be ice_albedo=0.75 and ice_temp=273.15
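A minimal sketch of such a fallback (hypothetical routine and variable names, not the actual io_restart interface):

module restart_fallback
   use netcdf
   implicit none
contains
   subroutine read_or_init(fname, vname, default_val, field)
      ! Try the per-field restart file; if it is missing, fill the field
      ! with a configured default (e.g. ice_albedo=0.75, ice_temp=273.15)
      ! instead of aborting.
      character(len=*), intent(in)    :: fname, vname
      real(8),          intent(in)    :: default_val
      real(8),          intent(inout) :: field(:)
      integer :: ncid, varid, status

      status = nf90_open(fname, NF90_NOWRITE, ncid)
      if (status /= nf90_noerr) then
         field(:) = default_val      ! no restart written for this field yet
         return
      end if
      call handle( nf90_inq_varid(ncid, vname, varid) )
      call handle( nf90_get_var(ncid, varid, field) )  ! assumes a flat field
      call handle( nf90_close(ncid) )
   end subroutine read_or_init

   subroutine handle(ierr)
      integer, intent(in) :: ierr
      if (ierr /= nf90_noerr) then
         print *, trim(nf90_strerror(ierr))
         stop 1
      end if
   end subroutine handle
end module restart_fallback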

@hegish can you point towards an appropriate branch to develop this feature from?

Sea ice accumulation in Canadian Archipelago

Dear model developers,
I found my control run has sea ice thickness>40m over five COREII cycles.

In the first cycle, sea ice thickness reaches 20m at the red points below:
[figure 1]

The time series of SIC and SIT at these mesh nodes looks like this:
[figure 2]

The code is in:
/home/ollie/psong/bug_report/calc_seaice_series.ipynb

I think this is caused by the semi-closed area, low resolution, and shallow water depth (only 30m).
But I am still curious whether it is solvable on the CORE mesh.

Compile time error with cray ftn on [email protected]

Good morning,
I'm trying to compile the latest FESOM2 master 771dc26 on Aleph: https://ibsclimate.org/research/facilities/aleph-supercomputer/

They use Cray ftn. I got the following error messages:

[ 79%] Building Fortran object src/CMakeFiles/fesom.dir/oce_mesh.F90.o
[ 79%] Building Fortran object src/CMakeFiles/fesom.dir/oce_fer_gm.F90.o

ftn-954 crayftn: ERROR FIND_NEIGHBORS, File = ../../../mnt/lustre/home/awiiccp2/esm/model_codes/awicm3-1.0-deck/fesom-2.0/src/oce_mesh.F90, Line = 1741, Column = 12 
  Procedure "ELEM_CENTER", defined at line 59 (/home/awiiccp2/esm/model_codes/awicm3-1.0-deck/fesom-2.0/src/oce_mesh.F90) must have an explicit interface because one or more arguments have the TARGET attribute.

src/CMakeFiles/fesom.dir/build.make:1031: recipe for target 'src/CMakeFiles/fesom.dir/oce_mesh.F90.o' failed
make[2]: *** [src/CMakeFiles/fesom.dir/oce_mesh.F90.o] Error 1
make[2]: *** Waiting for unfinished jobs....


ftn-969 crayftn: WARNING CPL_DRIVER, File = ../../../mnt/lustre/home/awiiccp2/esm/model_codes/awicm3-1.0-deck/fesom-2.0/src/cpl_driver.F90, Line = 15, Column = 7 
  The compiler is looking for module "MOD_OASIS but could not find "mod_oasis.mod" so is using "MOD_OASIS.mod".


ftn-969 crayftn: WARNING CPL_OASIS3MCT_DEFINE_UNSTR, File = ../../../mnt/lustre/home/awiiccp2/esm/model_codes/awicm3-1.0-deck/fesom-2.0/src/cpl_driver.F90, Line = 164, Column = 9 
  The compiler is looking for module "MOD_OASIS_AUXILIARY_ROUTINES but could not find "mod_oasis_auxiliary_routines.mod" so is using "MOD_OASIS_AUXILIARY_ROUTINES.mod".

Cray Fortran : Version 8.7.5 (20180919174803_65a0fd5d93d142733c36203b4453f8bccc07a569)
Cray Fortran : Mon Mar 29, 2021  18:43:34
Cray Fortran : Compile time:  0.8800 seconds
Cray Fortran : 839 source lines
Cray Fortran : 0 errors, 2 warnings, 0 other messages, 0 ansi
Cray Fortran : "explain ftn-message number" gives more information about each message.


ftn-7212 crayftn: WARNING COMPUTE_VEL_RHS_VINV, File = ../../../mnt/lustre/home/awiiccp2/esm/model_codes/awicm3-1.0-deck/fesom-2.0/src/oce_vel_rhs_vinv.F90, Line = 270 
  Variable "w" is used before it is defined.

CMakeFiles/Makefile2:99: recipe for target 'src/CMakeFiles/fesom.dir/all' failed
make[1]: *** [src/CMakeFiles/fesom.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

@trackow have you encountered similar issues when compiling on [email protected]? If so, how did you fix them?
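
For what it's worth, the first message is a genuine standards issue rather than a Cray quirk: once a dummy argument carries the TARGET attribute, the procedure must have an explicit interface at every call site. A minimal sketch of the usual fix, with illustrative names (the real elem_center lives in oce_mesh.F90 and its actual signature may differ):

! Hedged sketch: hosting the procedure in a module gives all callers
! an explicit interface automatically; an INTERFACE block in the
! caller would work as well.
module mesh_utils
  implicit none
contains
  subroutine elem_center(elem_coords, x, y)
    real(kind=8), target, intent(in)  :: elem_coords(:,:)  ! TARGET triggers the requirement
    real(kind=8),         intent(out) :: x, y
    x = sum(elem_coords(1,:)) / real(size(elem_coords,2), 8)
    y = sum(elem_coords(2,:)) / real(size(elem_coords,2), 8)
  end subroutine elem_center
end module mesh_utils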

Problems with use_cavity in FESOM2 standalone

Dear all,

I have problems running FESOM2 standalone with use_cavity turned on. I want to simulate the effect of a huge ice shelf over the Arctic Ocean and generated all the necessary cavity files for this mesh: /home/ollie/lackerma/mesh_CORE2_GLAC1D_LGM. I tried different cavity depths. This is the error I get:

  0:  Ice is initialized
  0:  EVP scheme option=           0
 86:  #### MASS MATRIX PROBLEM          86         275   220949872.843201        447382614.025199     
128:  #### MASS MATRIX PROBLEM         128         140   368270116.712415        723321526.483467     
217:  #### MASS MATRIX PROBLEM         217         109   320498442.974243        657513300.304006     
118:  #### MASS MATRIX PROBLEM         118         324   340797393.697621        793265887.245412     
113:  #### MASS MATRIX PROBLEM         113         135   191781847.065723        439032509.434349     
117:  #### MASS MATRIX PROBLEM         117         284   294763974.980881        593329581.080091     
127:  #### MASS MATRIX PROBLEM         127          45   519442329.278821        785726409.024064     
129:  #### MASS MATRIX PROBLEM         129         316   295283387.430305        580452179.993668     
131:  #### MASS MATRIX PROBLEM         131         120   561584732.443697        787225801.515696     
126:  #### MASS MATRIX PROBLEM         126         121   475040104.445925        776672611.059842     
132:  #### MASS MATRIX PROBLEM         132         219   438745473.514910        869348169.476628     
130:  #### MASS MATRIX PROBLEM         130         240   294592961.281773        619380336.278628     
219:  #### MASS MATRIX PROBLEM         219         139   214048238.244584        379428561.288263     
229:  #### MASS MATRIX PROBLEM         229          75   410518011.910956        602887219.978014     
232:  #### MASS MATRIX PROBLEM         232         298   382827751.493645        723690895.488116     
231:  #### MASS MATRIX PROBLEM         231         304   188308859.350739        623007989.689819     
249:  #### MASS MATRIX PROBLEM         249         316   523288302.261941        801729937.804502     
251:  #### MASS MATRIX PROBLEM         251         274   604400331.784630        892249053.550444     
  0:  ==========================================
  0:  MODEL SETUP took on mype=0 [seconds]      
  0:  runtime setup total         2.558401    
  0:   > runtime setup mesh       1.334322    
  0:   > runtime setup ocean     0.3104689    
  0:   > runtime setup forcing   0.7446561    
  0:   > runtime setup ice       6.3586235E-04
  0:   > runtime setup restart   1.5401840E-04
  0:   > runtime setup other     0.1681640    
  0:  ============================================
  0:  FESOM start iteration before the barrier...
  0:  FESOM start iteration after the barrier...
  0:  
  0:  ^[[32m____________________________________________________________^[[0m
  0:  ^[[7;32m --> FESOM STARTS TIME LOOP                                 ^[[0m
125:   --> found NaN in smoothing
125:   mype =          125
125:   n    =           59
125:   nz,uln,nln      =            6           2          14
125:   arr(nz,n)       =                      NaN
125:   work_array(nz,n)=   0.000000000000000E+000
125:   vol(nz,n)       =                 Infinity

(/work/ollie/lackerma/awicm_pism_tests/cav_lgm_glac1d_06/scripts/cav_lgm_glac1d_06_compute_18500101-18501231_10247391.log)

A run with the same settings except for use_cavity=.false. runs fine (/work/ollie/lackerma/awicm_pism_tests/no-cav_lgm_glac1d_02).
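
Not a solution, but to localize where the bad volumes enter, a purely diagnostic scan of the cluster volumes before the smoothing step might help; the loop bounds and array names below are taken from the error printout and are assumptions about the surrounding code, not the actual FESOM2 smoothing routine:

! Hedged diagnostic sketch: report any cluster volume that is
! zero, negative, Inf or NaN before it poisons the smoothing.
do n = 1, myDim_nod2D
   do nz = ulevels_nod2D(n), nlevels_nod2D(n)-1
      if (.not. (vol(nz,n) > 0.0_8 .and. vol(nz,n) < huge(1.0_8))) then
         write(*,*) 'bad volume: mype, n, nz, vol =', mype, n, nz, vol(nz,n)
      end if
   end do
end do

The vol(nz,n) = Infinity in the log above would then show up before the smoothing turns it into a NaN.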

I use the FESOM2 master branch at commit e99820a0ef59f0453fa336e767a6402f7b520348 and the following esm_tools versions:

(base) [lackerma@ollie0:~/fesom-2.0]$ esm_versions check
+---------------------+-------------+---------------------------------------------------------+------------------+---------------------------+
| package_name        | version     | file                                                    | branch           | tags                      |
|---------------------+-------------+---------------------------------------------------------+------------------+---------------------------|
| esm_calendar        | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_database        | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_environment     | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_master          | 5.0.1       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_motd            | 5.0.1       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_parser          | 5.0.3       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_pism            | 0.0.1.dev12 | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_plugin_manager  | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_profile         | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_rcfile          | 5.0.0       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_runscripts      | 5.0.14      | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
| esm_tools           | 5.0.11      | /home/ollie/lackerma/esm_tools                          | iceberg_coupling | v5.0.11-13-ge41ee3a-dirty |
| esm_version_checker | 5.1.1       | /home/ollie/lackerma/.local/lib/python3.7/site-packages |                  |                           |
+---------------------+-------------+---------------------------------------------------------+------------------+---------------------------+

FESOM standalone error message

I got this error message when running FESOM standalone. The error was basically:

0: Reading restart: timestamps in restart and in clock files do not match
0: restart/ times are: 31534200.0000000 6.267617970012885E-319
0: the model will stop!
0: error counter= 2
0: Error:
0: NetCDF: Index exceeds dimension bound

The denormal second timestamp (6.267617970012885E-319) suggests that the time record being read does not actually exist, which fits the "Index exceeds dimension bound" error. The path to my experiment is "/work/ba1006/a270151/esm-experiments/core/scripts/"... The version is fesom2.0. cdo also fails on the missing restart file:

cdo settime: Open failed on >/work/ba1006/a270151/esm-experiments//core/restart/fesom//fesom.1947.oce.restart.nc<
No such file or directory
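
For context, the first message comes from a consistency check between the restart file and the clock file; an illustrative sketch of that kind of check (variable names are assumptions, not the actual FESOM2 code):

! Hedged sketch of the check behind the message above: the timestamp
! read from the restart file must match the one from the clock file.
if (abs(rtime_restart - rtime_clock) > 1.0e-3_8) then
   write(*,*) 'Reading restart: timestamps in restart and in clock files do not match'
   write(*,*) 'restart/clock times are:', rtime_restart, rtime_clock
   write(*,*) 'the model will stop!'
   error stop 1   ! abort the run
end if

Since the second value printed is denormal garbage, the likely cause is the missing fesom.1947.oce.restart.nc rather than a genuinely mismatched clock.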

PGI compile error

The compilation of fesom-2 rev. 8c6bf9b with PGF90 aborts with the following errors:

PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 68)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 69)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 70)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 71)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 72)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 73)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 75)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 76)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 79)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 85)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 86)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 93)
PGF90-S-0155-Illegal POINTER assignment - pointer target must be simply contiguous (/home/gijstest/FESOM2-CORE2/fesom2/src/associate_mesh.h: 94)
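
For reference, PGF90-S-0155 enforces a real constraint from the standard: a pointer with the CONTIGUOUS attribute may only be associated with a target that is simply contiguous in the standard's syntactic sense. A minimal sketch with made-up names (associate_mesh.h assigns mesh components to such pointers; the real declarations differ):

! Hedged reproduction of the diagnostic; names are illustrative only.
program contig_demo
  implicit none
  real, target :: a(10,10)
  real, pointer, contiguous :: p(:,:)

  p => a            ! OK: a whole array is simply contiguous
  p => a(:, 2:5)    ! OK: only the last dimension is sectioned
  ! p => a(2:5, :)  ! would be rejected: not simply contiguous -> PGF90-S-0155
end program contig_demo

So the likely fixes are either to drop the CONTIGUOUS attribute from the affected pointers in associate_mesh.h or to rewrite the right-hand sides as simply contiguous designators.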

Compile Problems on Mistral

Hi all,

I wanted to test the master branch with ECHAM-6.3.05p2 Concurrent Radiation in preparation for our AWI-ESM 2.1 PI tests. Unfortunately, I ran into problems when trying to compile on DKRZ:

/mnt/lustre02/work/ba1066/a270077/fesom-freezing-tests/model_codes/awiesm-2.1/fesom-2.0/src/cpl_driver.F90(164): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [MOD_OASIS_AUXILIARY_ROUTINES]
    use mod_oasis_auxiliary_routines, ONLY:	oasis_get_debug, oasis_set_debug
--------^
/mnt/lustre02/work/ba1066/a270077/fesom-freezing-tests/model_codes/awiesm-2.1/fesom-2.0/src/cpl_driver.F90(164): error #6580: Name in only-list does not exist or is not accessible.   [OASIS_GET_DEBUG]
    use mod_oasis_auxiliary_routines, ONLY:	oasis_get_debug, oasis_set_debug
------------------------------------------------^
/mnt/lustre02/work/ba1066/a270077/fesom-freezing-tests/model_codes/awiesm-2.1/fesom-2.0/src/cpl_driver.F90(164): error #6580: Name in only-list does not exist or is not accessible.   [OASIS_SET_DEBUG]
    use mod_oasis_auxiliary_routines, ONLY:	oasis_get_debug, oasis_set_debug
-----------------------------------------------------------------^
compilation aborted for /mnt/lustre02/work/ba1066/a270077/fesom-freezing-tests/model_codes/awiesm-2.1/fesom-2.0/src/cpl_driver.F90 (code 1)
make[2]: *** [src/CMakeFiles/fesom.dir/cpl_driver.F90.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [src/CMakeFiles/fesom.dir/all] Error 2
make: *** [all] Error 2
Traceback (most recent call last):
  File "/pf/a/a270077/.local/bin/esm_master", line 8, in <module>
    sys.exit(main())
  File "/pf/a/a270077/.local/lib/python3.7/site-packages/esm_master/cli.py", line 74, in main
    main_flow(parsed_args, target)
  File "/pf/a/a270077/.local/lib/python3.7/site-packages/esm_master/esm_master.py", line 50, in main_flow
    user_task.execute() #env)
  File "/pf/a/a270077/.local/lib/python3.7/site-packages/esm_master/task.py", line 430, in execute
    shell=(command.startswith("./") and command.endswith(".sh")),
  File "/sw/spack-rhel6/anaconda3-2020.02-dqbodz/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['./comp-fesom-2.0-paleodyn_script.sh']' returned non-zero exit status 2.
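
The failing line pulls in mod_oasis_auxiliary_routines, which not every OASIS3-MCT build installs as a separate .mod file. A hedged workaround sketch; OASIS_HAS_AUX_ROUTINES is a made-up flag the build system would have to define, not an existing option:

! Hedged sketch: guard the optional OASIS debug helpers behind a
! preprocessor flag so FESOM still builds against OASIS versions
! that do not ship this module.
#ifdef OASIS_HAS_AUX_ROUTINES
    use mod_oasis_auxiliary_routines, ONLY: oasis_get_debug, oasis_set_debug
#endif

Alternatively, rebuilding OASIS3-MCT from a version that ships the module, and making sure its build directory is on the compiler's module search path, avoids touching cpl_driver.F90 at all.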

To reproduce (the link may give you a 500 error; it is AWI internal only):

$ esm_master get-awiesm-2.1
$ cd awiesm-2.1/fesom-2.0
$ git checkout .  # remove changes made by esm-master to CMakeLists.txt
$ git remote set-url origin [email protected]:FESOM/fesom2.git
$ git fetch
$ git checkout master
$ git pull # probably not needed
$ cd ../../
$ esm_master comp-awiesm-2.1

Any hints @koldunovn? I know coupling isn't really your area of expertise. @JanStreffing, I remember reading an issue a while ago where you also had compile problems; might this be related?
