speedyweather / SpeedyWeather.jl
Play atmospheric modelling like it's LEGO.
Home Page: https://speedyweather.github.io/SpeedyWeather.jl/dev
License: MIT License
Currently, there is not a clear distinction between universal physical constants, which never change, and model parameters, which might.
Copying @milankl's comment from PR #82:
What about using "constants" for actual physical constants that nobody would want to change, "parameters" for everything that run_speedy may take as an argument and "constants_runtime" for every value that has to be precalculated, probably from parameters, but is constant at runtime. Meaning we should rename the constants.jl file and constants.jl should then be loaded before default_parameters?
This is an issue to collect ideas around the linear vs quadratic vs cubic truncation when looping over the zonal wavenumbers
for m in 1:min(nfreq, mmax+1)

with nfreq = nlon÷2 + 1 and nlon the number of longitude points at a given latitude ring. So for full grids the loop goes over all m, but for reduced grids it can be shortened. IFS seems to be doing the following (j, nlon, nfreq) at Tco79:
( 1 20 8) ( 2 24 10) ( 3 28 12) ( 4 32 14) ( 5 36 16) ( 6 40 18) ( 7 44 20) ( 8 48 22)
( 9 52 24) ( 10 56 26) ( 11 60 27) ( 12 64 29) ( 13 68 31) ( 14 72 33) ( 15 76 35) ( 16 80 36)
( 17 84 38) ( 18 88 40) ( 19 92 41) ( 20 96 43) ( 21 100 44) ( 22 104 46) ( 23 108 47) ( 24 112 49)
( 25 116 50) ( 26 120 52) ( 27 124 53) ( 28 128 55) ( 29 132 56) ( 30 136 57) ( 31 140 58) ( 32 144 60)
( 33 148 61) ( 34 152 62) ( 35 156 63) ( 36 160 64) ( 37 164 65) ( 38 168 67) ( 39 172 68) ( 40 176 69)
( 41 180 70) ( 42 184 71) ( 43 188 72) ( 44 192 73) ( 45 196 74) ( 46 200 75) ( 47 204 76) ( 48 208 77)
( 49 212 78) ( 50 216 79) ( 51 220 79) ( 52 224 79) ( 53 228 79) ( 54 232 79) ( 55 236 79) ( 56 240 79)
( 57 244 79) ( 58 248 79) ( 59 252 79) ( 60 256 79) ( 61 260 79) ( 62 264 79) ( 63 268 79) ( 64 272 79)
( 65 276 79) ( 66 280 79) ( 67 284 79) ( 68 288 79) ( 69 292 79) ( 70 296 79) ( 71 300 79) ( 72 304 79)
( 73 308 79) ( 74 312 79) ( 75 316 79) ( 76 320 79) ( 77 324 79) ( 78 328 79) ( 79 332 79) ( 80 336 79)
( 81 336 79) ( 82 332 79) ( 83 328 79) ( 84 324 79) ( 85 320 79) ( 86 316 79) ( 87 312 79) ( 88 308 79)
( 89 304 79) ( 90 300 79) ( 91 296 79) ( 92 292 79) ( 93 288 79) ( 94 284 79) ( 95 280 79) ( 96 276 79)
( 97 272 79) ( 98 268 79) ( 99 264 79) ( 100 260 79) ( 101 256 79) ( 102 252 79) ( 103 248 79) ( 104 244 79)
( 105 240 79) ( 106 236 79) ( 107 232 79) ( 108 228 79) ( 109 224 79) ( 110 220 79) ( 111 216 79) ( 112 212 78)
( 113 208 77) ( 114 204 76) ( 115 200 75) ( 116 196 74) ( 117 192 73) ( 118 188 72) ( 119 184 71) ( 120 180 70)
( 121 176 69) ( 122 172 68) ( 123 168 67) ( 124 164 65) ( 125 160 64) ( 126 156 63) ( 127 152 62) ( 128 148 61)
( 129 144 60) ( 130 140 58) ( 131 136 57) ( 132 132 56) ( 133 128 55) ( 134 124 53) ( 135 120 52) ( 136 116 50)
( 137 112 49) ( 138 108 47) ( 139 104 46) ( 140 100 44) ( 141 96 43) ( 142 92 41) ( 143 88 40) ( 144 84 38)
( 145 80 36) ( 146 76 35) ( 147 72 33) ( 148 68 31) ( 149 64 29) ( 150 60 27) ( 151 56 26) ( 152 52 24)
( 153 48 22) ( 154 44 20) ( 155 40 18) ( 156 36 16) ( 157 32 14) ( 158 28 12) ( 159 24 10) ( 160 20 8)
So that's about 3 fewer m near the poles and less than half at the Equator:
julia> cat(ifs[1:80],spd,spd-ifs[1:80],dims=2)
80×3 Matrix{Int64}:
8 10 2
10 12 2
12 14 2
14 16 2
16 18 2
18 20 2
20 22 2
22 24 2
24 26 2
26 28 2
27 30 3
29 32 3
31 34 3
33 36 3
35 38 3
36 40 4
38 42 4
40 44 4
41 46 5
43 48 5
44 50 6
46 52 6
47 54 7
49 56 7
50 58 8
52 60 8
⋮
and apparently follows the formula nfreq = floor((nlon - 1)/(2 + coslat^2)) - 1. At that resolution this is 4784 loops over m per hemisphere, compared to 6400 in total (80 rings with 80 orders each), meaning there's a good amount of performance to gain if we follow something similar. What needs to be checked, though, is how much this applies to other grids and whether we can formulate it directly as a function of the truncation order.
More information on this is apparently in Courtier and Naughton, 1994.
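To make the potential saving concrete, here is a small sketch that counts the loop iterations for the simple bound m in 1:min(nfreq, mmax+1) with nfreq = nlon÷2 + 1 per ring. The helper names are hypothetical and the ring lengths only mimic an octahedral-style grid like the table above; the IFS numbers are smaller still because its latitude-dependent formula cuts nfreq further.

```julia
# How many (ring, m) iterations does the Legendre loop take if we only
# loop over m in 1:min(nfreq, mmax+1) with nfreq = nlon÷2 + 1 per ring?
loop_length(nlon, mmax) = min(nlon ÷ 2 + 1, mmax + 1)
total_iterations(nlons, mmax) = sum(loop_length(nlon, mmax) for nlon in nlons)

mmax = 79                                  # T79-like truncation
full    = fill(320, 80)                    # full grid: same nlon on every ring
reduced = [20 + 4*(j - 1) for j in 1:80]   # octahedral-style rings: 20, 24, ..., 336

total_iterations(full, mmax)               # 6400: all 80 orders on every ring
total_iterations(reduced, mmax)            # 5175: already ~20% fewer iterations
```

Even this crude bound trims a fifth of the Legendre loop; the coslat-dependent formula above would trim more.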
Compute-intensive loops for which we'll define GPU kernels basically fall into one of the following categories (sorted from simple to complex):
1. for lm in eachharmonic. These kernels loop over the non-zero indices of one or several LowerTriangularMatrixs but only access/write the lm-th element on every iteration. No cross dependencies to other harmonics. May include scalar constants. All input arrays are of the same size.
2. for i,j in eachentry(::Matrix) with vec[j]. These kernels loop over all entries of a Matrix (or LowerTriangularMatrix) but also pull data from a vector at index j.
3. for l,m in eachharmonic with A[l+1,m] and A[l-1,m]. These kernels loop over the non-zero indices of LowerTriangularMatrixs and, for every harmonic, also access the neighbouring coefficients in l.
4. Spherical harmonic transforms. Like 2) but with signs depending on odd and even modes, and combined with Fourier transforms.
Examples
for lm in eachharmonic
a) The horizontal diffusion
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/diffusion.jl#L12-L21
b) The leapfrog time integration
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/time_integration.jl#L13-L43
for i,j in eachentry(::Matrix) with vec[j]
a) The vorticity fluxes (in grid-point space)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/tendencies_dynamics.jl#L287-L313
b) The Bernoulli potential (in grid-point space)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/tendencies_dynamics.jl#L349-L373
c) The Laplace operator (in spectral space)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_gradients.jl#L263-L289
for l,m in eachharmonic with A[l+1,m] and A[l-1,m]
a) The divergence/curl operator
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_gradients.jl#L68-L99
b)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_gradients.jl#L166-L210
spherical harmonic transforms
a) spectral to grid
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_transform.jl#L279-L335
b) grid to spectral
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_transform.jl#L384-L435
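As a minimal sketch of what a category-1 kernel looks like (names are hypothetical; eachindex over plain vectors stands in for eachharmonic over packed LowerTriangularMatrix data):

```julia
# One fused loop over same-size packed arrays, no cross-harmonic dependencies:
# every iteration reads/writes only the lm-th element, plus scalar constants.
function damp_harmonics!(tend::AbstractVector, alms::AbstractVector, damping::AbstractVector)
    length(tend) == length(alms) == length(damping) ||
        throw(DimensionMismatch("arrays must have equal length"))
    @inbounds for lm in eachindex(alms)        # stands in for `for lm in eachharmonic`
        tend[lm] = muladd(-damping[lm], alms[lm], tend[lm])
    end
    return tend
end
```

Because every iteration is independent, loops of this category map one-to-one onto GPU threads.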
Is it intended that both return L? One should return M, no?
#110 explores a custom matrix format that only stores the lower triangle of coefficient matrices explicitly. The idea is that many operations just loop over m = 1:mmax+1, l = m:lmax+1, which technically jumps over the explicitly stored zeros in the upper triangle of a Matrix. A LowerTriangularMatrix still allows access like [i,j], which needs to be translated to a single index [k] into the underlying vector of the LowerTriangularMatrix (and returns zero(T) for j > i).
The question remains how to best implement the getindex functions to avoid the boundscheck (or propagate an @inbounds). Currently this is just

ij2k(i::Integer, j::Integer, m::Integer) = i + (j-1)*m - fibonacci(j)
Base.getindex(L::LowerTriangularMatrix, k::Integer) = @inbounds L.v[k]
function Base.getindex(L::LowerTriangularMatrix{T}, i::Integer, j::Integer) where T
    @boundscheck (i > L.m || j > L.n) && throw(BoundsError(L, (i, j)))
    j > i && return zero(T)
    return getindex(L.v, ij2k(i, j, L.m))
end
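For reference, the index mapping can be checked against a brute-force enumeration of the lower triangle. A self-contained sketch, where triangle(j) = j*(j-1)÷2 plays the role of the subtracted helper term (a triangular number, whatever its name in the codebase):

```julia
# Column-major packing of the lower triangle of an m×n matrix: column j stores
# rows j..m, so the linear index is k = i + (j-1)*m - triangle(j), where
# triangle(j) = j*(j-1)÷2 counts the entries skipped in the upper triangle.
triangle(j) = j * (j - 1) ÷ 2
ij2k(i, j, m) = i + (j - 1) * m - triangle(j)

# brute-force check: enumerating the lower triangle column-major hits 1, 2, 3, ...
m, n = 5, 5
ks = [ij2k(i, j, m) for j in 1:n for i in j:m]
@assert ks == 1:binomial(n + 1, 2)    # 1:15 for a 5×5 lower triangle
```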
and needs some care. Anyway, changing alms[i,j] to alms[k] in gridded! (as the zero elements are never accessed anyway) gives similar speeds at half the memory:
julia> A = randn(ComplexF32,86,86);
julia> L = LowerTriangularMatrix(A);
julia> sizeof(A)/1000 # in KB
59.168
julia> sizeof(L)/1000 # in KB
29.928
julia> @btime SpeedyWeather.gridded!($map,$A,$m.geospectral.spectral_transform);
686.543 μs (194 allocations: 12.12 KiB)
julia> @btime SpeedyWeather.gridded!($map,$L,$m.geospectral.spectral_transform);
685.417 μs (194 allocations: 12.12 KiB)
For large grids a LowerTriangularMatrix is faster, presumably because of contiguous memory access:
julia> @btime SpeedyWeather.gridded!($map,$L,$m.geospectral.spectral_transform);
254.260 ms (1540 allocations: 112.22 KiB)
julia> @btime SpeedyWeather.gridded!($map,$A,$m.geospectral.spectral_transform);
274.818 ms (1540 allocations: 112.22 KiB)
The most up-to-date documentation of the original Speedy is here.
With respect to the physical parameterizations, a number of apparent inconsistencies exist between the documentation and the implementation in speedy.f90.
We will record them in this issue.
The time variable in the NetCDF files is always in Int32 precision with hours as unit. However, one can also have time steps at non-integer hours. I'd change the time variable to also be in output_NF, or is there anything speaking against that?
https://milankl.github.io/SpeedyWeather.jl/dev/spectral_transform/
Added in #89
There is https://upload.wikimedia.org/wikipedia/commons/1/12/Rotating_spherical_harmonics.gif available?
![Rotating spherical harmonics](https://upload.wikimedia.org/wikipedia/commons/1/12/Rotating_spherical_harmonics.gif)
I might misunderstand the way the netCDF is written, but at least for a FullGaussianGrid
I expected this to work:
SpeedyWeather.run_speedy(Float32, trunc=ntrunc, σ_levels_half=[0.,0.4,0.6,1.], n_days=5, nlev=3, output=true)
P, D, M = SpeedyWeather.initialize_speedy(NF, initial_conditions=:restart, restart_id=1) # output dt made to fit this ntrunc to save every time step
vor = ncread("run0001/output.nc","vor")
S = M.spectral_transform
G = M.geometry
Grid = G.Grid
Grid(vor[:,:,1,1]) # yields a 8192-element FullGaussianGrid
gridded(P.layers[1].leapfrog[1].vor, S) # yields a 4608-element FullGaussianGrid
So half the grid points. What am I missing there?
This also results in things like spectral(Grid(vor[:,:,1,1]), S) not working. It would be nice to have an easy way to load the netCDF data and then work with the transforms etc. from the library.
We currently use FFTW.jl, the Julia bindings to the C library of the same name. It implements functions like fft, ifft, rfft, irfft, plan_fft for Float32 and Float64, but there's currently no 16-bit FFT for Float16. Sure, we can always promote Float16 to Float32 and round back down after the (inverse) Fourier transform, but we should look out for 16-bit FFT implementations.
I once started the pure-Julia FFT package coolFFT.jl, which is just a very simple implementation of the Cooley-Tukey algorithm and therefore only works with arrays of length 2^n, but we could use a polished version of that as a fallback for arbitrary number formats?
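For illustration, a radix-2 Cooley-Tukey transform is only a few lines of generic Julia, and being type-generic it would also run (slowly) for Complex{Float16} or other formats. This is a sketch of the algorithm, not coolFFT.jl's actual code:

```julia
# Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two.
# Generic over the element type T, so Float16 and friends work too.
function ct_fft(x::AbstractVector{Complex{T}}) where T
    n = length(x)
    n == 1 && return [x[1]]
    ispow2(n) || throw(ArgumentError("length must be a power of 2"))
    even = ct_fft(x[1:2:end])                        # DFT of even-indexed samples
    odd  = ct_fft(x[2:2:end])                        # DFT of odd-indexed samples
    twiddled = [cis(-T(2π * k / n)) * odd[k+1] for k in 0:(n÷2 - 1)]
    return vcat(even .+ twiddled, even .- twiddled)  # butterfly combine
end
```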
At the moment the number of vertical levels nlev can be chosen for BarotropicModel and ShallowWaterModel, but the layers aren't actually coupled (BarotropicModel: independently executed; ShallowWaterModel: only the first layer is part of the time integration). I believe for the BarotropicModel we can do Charney and Phillips, 1953 for every layer. This is interesting because it suddenly couples the layers through the stream function, although vorticity remains the only prognostic variable in each layer. I believe that could be easily set up, even for the n-layer system. At the moment we don't calculate the stream function explicitly, as it's not used anywhere else, but thanks to #142 the operators are easily available. We could cut the algorithm to get from vorticity to stream function and from stream function to velocities into two parts to get the numerator. That looks like what's described in section 3 as method B.
The current initial conditions with implicit_alpha=0.5 produce after some days a crazy gravity wave that bounces from pole to pole, visible here in divergence, which kills the simulation. The problem is solved with implicit_alpha=1, but this means we should probably find better initial conditions for the shallow water equations.
At the moment Float64 seems to be considerably faster than Float32, especially at high resolution. I don't know why that is, but just to flag it already
julia> run_speedy(Float64,model=:shallowwater,n_days=10,trunc=85);
Weather is speedy: Time: 0:00:04 (521.07 years/day)
julia> run_speedy(Float32,model=:shallowwater,n_days=10,trunc=85);
Weather is speedy: Time: 0:00:05 (402.84 years/day)
julia> run_speedy(Float32,model=:shallowwater,n_days=2,trunc=170);
Weather is speedy: Time: 0:00:20 (22.93 years/day)
julia> run_speedy(Float64,model=:shallowwater,n_days=2,trunc=170);
Weather is speedy: Time: 0:00:13 (34.85 years/day)
julia> run_speedy(Float64,model=:shallowwater,n_days=0.25,trunc=341);
Weather is speedy: Time: 0:00:22 ( 2.61 years/day)
julia> run_speedy(Float32,model=:shallowwater,n_days=0.25,trunc=341);
Weather is speedy: Time: 0:00:39 (558.52 days/day)
We currently have a @simd annotation in the Legendre transform. However, following a suggestion from @hottad, I've just checked whether this actually makes a difference given that we already use muladd. Using Julia 1.8.2 and Float32:
julia> @btime SpeedyWeather.spectral!($alms,$map,$S);
43.376 μs (107 allocations: 5.55 KiB)
with @simd, and without:
julia> @btime SpeedyWeather.spectral!($alms,$map,$S);
36.357 μs (107 allocations: 5.55 KiB)
Similar for Float64. So at the moment @simd makes things somewhat slower. I remember having introduced it because I got slower performance with Float32 compared to Float64, which it reduced. But maybe in the meantime some of these compiler issues got addressed.
For testing, it would be useful to have lightweight and minimal factory methods in the test
directory which return a model initialised with realistic values (possibly from a previous run of the model), so you can just do, for example,
prog, diag, model = model_factory()
or
diag = diag_factory()
Much of this already exists, e.g. the initialize_from_rest method in prognostic_variables.jl, but it would be good to extend and formalise it.
At the moment we use a regular Gaussian grid for grid-point space. This comes with advantages of easy storage in a matrix internally and for output but with the disadvantage that a lot of points are put near the poles that aren't actually needed. Alternatives are
This is an issue to discuss and collect ideas and reasons to implement either of these grids at some point in the future.
Broadcasting with LowerTriangularMatrix still returns unexpected results. The biggest problem I found was dotted operations like .*=; see how suddenly a zero appears where it shouldn't:
julia> L = rand(LowerTriangularMatrix,3,3)
3×3 LowerTriangularMatrix{Float64}:
0.998746 0.0 0.0
0.185934 0.783976 0.0
0.181445 0.950264 0.970376
julia> L .*= 2
3×3 LowerTriangularMatrix{Float64}:
1.99749 0.0 0.0
0.371868 0.0 0.0
0.0 0.0 1.94075
This is (I think) because broadcasting loops over each index (i, j) with an @inbounds, such that the @boundscheck in setindex! is disabled, and then converts (i, j) to a single index k, which isn't safe for j > i. The problem is apparently that the broadcasting of * creates a Matrix, which is then fused with .= and writes the entries back into a LowerTriangularMatrix:
julia> L = rand(LowerTriangularMatrix,3,3)
3×3 LowerTriangularMatrix{Float64}:
0.0948286 0.0 0.0
0.10467 0.821121 0.0
0.971535 0.0221127 0.744497
julia> L .* 2
3×3 Matrix{Float64}:
0.189657 0.0 0.0
0.20934 1.64224 0.0
1.94307 0.0442254 1.48899
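One way to at least fail loudly instead of silently corrupting data is an unconditional guard in setindex!, i.e. a plain check rather than a @boundscheck block, so @inbounds cannot elide it. A minimal sketch with a hypothetical LowerTri type backed by dense storage (SpeedyWeather's type packs a vector instead):

```julia
# Minimal stand-in for LowerTriangularMatrix: zero upper triangle on read,
# refuse writes to the upper triangle with a check that survives @inbounds.
struct LowerTri{T} <: AbstractMatrix{T}
    data::Matrix{T}                          # dense storage for simplicity
end
Base.size(L::LowerTri) = size(L.data)
Base.getindex(L::LowerTri{T}, i::Int, j::Int) where {T} = j > i ? zero(T) : L.data[i, j]
function Base.setindex!(L::LowerTri, x, i::Int, j::Int)
    j > i && throw(BoundsError(L, (i, j)))   # plain check: not elided by @inbounds
    L.data[i, j] = x
end

L = LowerTri([1.0 0 0; 2 3 0; 4 5 6])
L .* 2        # broadcasting still falls back to a plain Matrix, as in the issue
# L .*= 2 would now throw a BoundsError instead of silently writing zeros
```

Whether throwing or defining a proper Broadcast.BroadcastStyle that skips the upper triangle is the better fix is exactly what this issue should decide.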
It may be useful to use the logging infrastructure for the feedback messages. One benefit is that you can actually check the content of logged messages with @test_logs
and this doesn't show up when running the tests. Also, you may consider using ProgressMeter.jl
for the progress bar.
Something is wrong with the spectral transform in Float32 at high resolutions (>=T682), if I start with
julia> alms = randn(ComplexF32,854,854); # T853
julia> map = gridded(alms); # = 2560x1280 grid points
then NaNs are created at mid-latitudes of both hemispheres. @jmert have you seen something like this before and could give us a hint what is going wrong?
In contrast, using Float64 everything seems to be fine.
I highly vote for getting this functionality built in
Base.show(io::IO, P::PrognosticVariables) = print(io, heatmap(gridded(P.temp[:,:,end])[:,end:-1:1]' .- 273.15))
such that we immediately get a nice plot every time the model runs in the REPL.
I noticed surface pressure pres is overloaded to also mean normalised surface pressure in the current convection parameterization
https://github.com/milankl/SpeedyWeather.jl/blob/4b3176764bdd4897978ce2312948bd0d2f03da79/src/convection.jl#L43
this is also confirmed in the tests, where a normalised pressure is used instead
https://github.com/milankl/SpeedyWeather.jl/blob/4b3176764bdd4897978ce2312948bd0d2f03da79/test/convection.jl#L10
I think this should be changed throughout to make clear this is not the surface pressure in hPa.
@milankl separate and perhaps for later, but I find pres meaning surface pressure unclear (i.e. when I first saw it I was expecting a vector of pressures in hPa!); can this be changed to surf_pres or something like that?
Following this document
we should be able to formulate a nice unit test for the spectral gradients.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
speedy.f90 relies on a number of climatological and boundary fields. These are available currently on the T30/96×48 grid but we would like them for all available resolutions.
I will begin with the orography field as I already have the data for this on a high-resolution TCO1279 (2560×1280, 9 km) grid, taken from the IFS.
The performance would be awful (because of emulated floating point), but it would be interesting to use https://github.com/milankl/SoftPosit.jl to test posit accuracy at 16 bits.
One easy way to parallelise Speedy might be to distribute the calculation of the spectral transform across n workers in the vertical, using SharedArrays (documentation here) from Julia's standard library. This limits us to n× speedups from parallelisation, which might be fine as SpeedyWeather.jl will probably run on small clusters only anyway. For T30 and n=8 levels we can (hopefully) efficiently run on 8 cores, and for n=48 or 64 levels we could get significant speedups for higher-resolution versions of SpeedyWeather.jl (T100-T500). Given the shared memory of this approach, we'll be limited by the 48 cores on A64FX, but that might be absolutely sufficient for now.
I’d need the inverse Laplacian to convert vorticity to streamfunction. I’ve seen that this is available in the code, but commented out in spectral_gradients! as ∇⁻²!. Why is that so? Could we make it functional again? SpectralTransform still has the eigenvalues⁻¹ field that is also only used by ∇⁻²!. UV_from_vor! also somehow inverts the Laplacian, but I don't understand how. Anyway, I need to compute the stream function, not the wind field.
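For reference, inverting the Laplacian in spectral space is just a division by its eigenvalues, since ∇²Y_lm = -l(l+1)/R² Y_lm; the l=0 mode has eigenvalue zero and is conventionally set to zero. A hedged sketch with hypothetical names, for the coefficients of a single order m:

```julia
# Streamfunction from vorticity in spectral space: ψ_l = ζ_l / (-l(l+1)/R²),
# with the undetermined l=0 mode set to zero.
function inverse_laplacian(vor::AbstractVector{<:Complex}, R)
    lmax = length(vor) - 1                  # coefficients for l = 0:lmax
    ψ = similar(vor)
    for (k, l) in enumerate(0:lmax)
        ψ[k] = l == 0 ? zero(eltype(vor)) : vor[k] / (-l * (l + 1) / R^2)
    end
    return ψ
end

# forward Laplacian, to check the inversion round-trips
laplacian(ψ, R) = [(-l * (l + 1) / R^2) * ψ[l+1] for l in 0:length(ψ)-1]
```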
This is perhaps not the best place to discuss, but I would like to better understand what general API you have in mind for SpeedyWeather.jl, especially for dynamics and parametrization schemes and the passing of variables and parameters. I would like to try to avoid pitfalls of other models.
Now as there are more and more parameterisations coming in, I would suggest to move everything parameterization-related to src/parameterizations. Eventually this could even become its own module, but I don't think this is needed now. At the moment I would move in there basically everything that is called within parameterization_tendencies! (but I would leave that one as it defines the interface on the dynamics side).
Variables (e.g. in column_variables.jl) are initialised with zero. Is there any reason for doing so? If not, why not use NaN for floats and typemax or typemin for ints?
While it might be nice to have a wide range of resolutions available, in practice this might be difficult for speedy. There might be stability constraints such that we may want to agree on a small subset (n=3...8?) of available resolutions, e.g. T30, T60, ... Speedy's default is
It seems that ECMWF used operationally in the past decades
Which sounds like a good subset of resolutions to aim for. @tomkimpson @samhatfield what do you think, and what are other constraints we should be worried about?
At the moment we have within the GeoSpectral struct a SpectralTransform struct called spectral, such that one can do
@unpack spectral = model_setup.geospectral
as well as the function spectral that does a spectral transform, i.e. double naming. @unpacking spectral overwrites the available function in local scope. In most cases this isn't an issue, as we only use the in-place function spectral! within the code, hence no conflict, but for clarity and bug prevention we should resolve that.
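The clash is easy to reproduce in isolation (illustrative names only, not the actual structs): destructuring a field named spectral shadows the function spectral for the rest of that scope.

```julia
spectral(x) = x + 1                      # stands in for the transform function

function transform_from(gs)
    (; spectral) = gs                    # like @unpack spectral = gs
    # from here on, `spectral` is the struct field, not the function
    return spectral
end

gs = (spectral = "a SpectralTransform", geometry = "a Geometry")
```

Calling spectral(x) after the destructuring inside transform_from would error, which is exactly the bug class worth preventing by renaming one of the two.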
Copying @milankl's comment from PR #82:
I mean rewriting the entire model such that temperature uses ˚C instead of Kelvin. I expect everything to work very similar, instead of probably radiation, where somewhere the Stefan-Boltzmann ~T^4 should appear. We could then convert from ˚C to Kelvin therein. But from a precision point-of-view using Kelvin is awful as you'll hit massive rounding errors in the time integration, e.g. T = 300K, tendency dT = 0.1K
julia> T = Float16(300)
Float16(300.0)
julia> dT = Float16(0.1)
Float16(0.1)
julia> T + dT
Float16(300.0)
julia> T2 = Float16(300 - 273.15)
Float16(26.84)
julia> T2 + dT
Float16(26.94)

In Kelvin you can't resolve the increment, but in ˚C you can.
Having said that, we obviously do the time stepping in spectral space, meaning that we'd only have that problem on the l=m=0 mode which we could also solve with compensated summation, but I think in general, we should try to use ˚C if possible. If it turns out that ˚C is a bad idea then we can still revert back.
While not needed for the octahedral grids, as the first longitude point on every ring sits right on the prime meridian, the HEALPix grids need a longitude offset rotation; see Justin's SphericalHarmonicTransforms.jl documentation. For example:
julia> S = SpectralTransform(Float64,HEALPixGrid,15,false);
julia> o = S.lon_offsets[2,:] # m=1
16-element Vector{ComplexF64}:
0.7071067811865476 + 0.7071067811865476im
0.9238795325112867 + 0.3826834323650898im
0.9659258262890683 + 0.25881904510252074im
0.9807852804032304 + 0.19509032201612828im
0.9876883405951378 + 0.15643446504023087im
0.9914448613738104 + 0.13052619222005157im
0.9937122098932426 + 0.11196447610330786im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
julia> @. atand(imag(o)/real(o)) # retrieve the angle in degree
16-element Vector{Float64}:
45.0
22.500000000000004
14.999999999999998
11.25
9.0
7.499999999999999
6.428571428571429
5.625
0.0
5.625
0.0
5.625
0.0
5.625
0.0
5.625
and
julia> [360/4j/2 for j in 1:8]
8-element Vector{Float64}:
45.0
22.5
15.0
11.25
9.0
7.5
6.428571428571429
5.625
However (and now coming to the actual issue) when enabling the longitude offset rotation for HEALPixGrid and HEALPix4Grid some tests fail, whereas they pass for no rotation. No rotation is equivalent to rings being rotated to start at the prime meridian instead.
Test Summary: | Pass Fail Total Time
Transform: Individual Legendre polynomials (inexact transforms) | 1993714 14 1993728 13.2s
trunc = 127 | 402422 10 402432 4.1s
NF = Float32 | 201211 5 201216 2.2s
Grid = HEALPixGrid | 50301 3 50304 0.9s
Grid = HEALPix4Grid | 50302 2 50304 0.5s
Grid = FullHEALPixGrid | 50304 50304 0.4s
Grid = FullHEALPix4Grid | 50304 50304 0.4s
NF = Float64 | 201211 5 201216 1.9s
Grid = HEALPixGrid | 50301 3 50304 0.6s
Grid = HEALPix4Grid | 50302 2 50304 0.5s
Grid = FullHEALPixGrid | 50304 50304 0.4s
Grid = FullHEALPix4Grid | 50304 50304 0.4s
trunc = 255 | 1591292 4 1591296 9.0s
NF = Float32 | 795646 2 795648 4.1s
Grid = HEALPixGrid | 198911 1 198912 1.0s
Grid = HEALPix4Grid | 198911 1 198912 1.0s
Grid = FullHEALPixGrid | 198912 198912 1.0s
Grid = FullHEALPix4Grid | 198912 198912 1.1s
NF = Float64 | 795646 2 795648 5.0s
Grid = HEALPixGrid | 198911 1 198912 1.1s
Grid = HEALPix4Grid | 198911 1 198912 1.3s
Grid = FullHEALPixGrid | 198912 198912 1.3s
Grid = FullHEALPix4Grid | 198912 198912 1.3s
ERROR: Some tests did not pass: 1993714 passed, 14 failed, 0 errored, 0 broken.
When starting with a random vorticity field in spectral space (randn(Complex{NF}) for l, m in 2:25, 2:25, or other lmax, mmax) one should be able to
And hence reobtain the original vorticity field. This works somehow:
However, it's unclear to me how exact this method should be, and whether there's another (simpler) loop one can do to check that all gradients work as they are supposed to. @white-alistair @maximilian-gelbrecht any idea?
Given that we now have the flexibility to use various grids, we need to think about how to output gridded data to netCDF. Several issues:
So I see several possibilities
- FullGaussianGrid (currently), but that's expensive and considerably slows down the simulation
- OctahedralGaussianGrid-FullGaussianGrid, OctahedralClenshawGrid-FullClenshawGrid, meaning we would need to define a full version of the HEALPix grid and create a S_output = SpectralTransform(S, output_grid=...) which changes parameters like nlon but points to the already precomputed Legendre polynomials
- an interpolate!(grid::Grid, lat, lon) function for every Grid <: AbstractGrid, which, however, we may need eventually anyway if we want semi-Lagrangian advection at some point

I don't like 1), what we currently do, hence this issue. I do like 2) and 3) and think that's the simplest to implement. 4) is possible but sounds more like a long-term solution. I reckon it's faster than 2) but not than 3) (which is just memory reordering).
@hottad do you agree?
As raised in #31 and #77, v0.3 is still not fully number format flexible. Main bottleneck is the Fourier Transform as FFTW.jl only supports Float32/64. For other formats we'd need to fall back to GenericFFT.jl. #136 addresses that, but there's more that needs to be done. This issue is to collect efforts to remove remaining type instabilities
julia> using SoftPosit
julia> run_speedy(Posit16)
ERROR: promotion of types Float64 and Posit16 failed to change any arguments
Stacktrace:
[1] error(::String, ::String, ::String)
@ Base ./error.jl:44
[2] sametype_error(input::Tuple{Float64, Posit16})
@ Base ./promotion.jl:383
[3] not_sametype(x::Tuple{Float64, Posit16}, y::Tuple{Float64, Posit16})
@ Base ./promotion.jl:377
[4] promote
@ ./promotion.jl:360 [inlined]
[5] -(x::Float64, y::Posit16)
@ Base ./promotion.jl:390
[6] broadcasted(#unused#::Base.Broadcast.DefaultArrayStyle{1}, #unused#::typeof(-), r::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, x::Posit16)
@ Base.Broadcast ./broadcast.jl:1110
[7] broadcasted
@ ./broadcast.jl:1304 [inlined]
[8] generalised_logistic(x::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, coefs::Main.SpeedyWeather.GenLogisticCoefs{Posit16})
@ Main.SpeedyWeather ~/git/SpeedyWeather.jl/src/geometry.jl:195
[9] vertical_coordinates(P::Parameters)
@ Main.SpeedyWeather ~/git/SpeedyWeather.jl/src/geometry.jl:184
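The fix pattern for failures like this is to keep every scalar in the target number format before broadcasting, so no Float64-to-Posit16 promotion is ever attempted. A hedged sketch of the pattern (generic coefficient names, not SpeedyWeather's actual GenLogisticCoefs fields or src/geometry.jl code):

```julia
# Generalised logistic with all coefficients converted to the vector's
# number format NF first: broadcasting then never mixes NF with Float64.
function generalised_logistic(x::AbstractVector{NF}; A=0, K=1, C=1, Q=1, B=1, M=0, ν=1) where NF
    A, K, C, Q, B, M, ν = convert.(NF, (A, K, C, Q, B, M, ν))
    return @. A + (K - A) / (C + Q * exp(-B * (x - M)))^inv(ν)
end
```

With every scalar in NF, the same code runs for Float32, Float64, or (in principle) a format like Posit16 that deliberately has no promotion rule with Float64.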
We have a models.jl with the Model struct that should define all model variables, e.g.
P = Parameters(NF=NF,kwargs...)
C = Constants{NF}(P)
G = GeoSpectral{NF}(P)
M = Model(P,C,G)
This needs updating to make it consistent with the rest of the Speedy structure and notation.
The Implicit and HorizontalDiffusion structs need updating in particular. For example, HorizontalDiffusion calls parameters g and $\gamma$ from Parameters which are not defined, and Model() redefines the Earth radius. A very basic new Model struct is defined in #39 for the testing of the tendencies calculations. I will fold a complete updated Model() struct into that PR.
This is an issue to track progress in speeding up the spectral transform. We are starting with
- Fourier transform
- Legendre transform
- @inbounds
- dot (probably OpenBLAS)

With the following benchmark (back & forth transform), timings are
6.020 ms (4084 allocations: 580.59 KiB) # Fourier
131.585 μs (1345 allocations: 684.67 KiB) # Legendre
6.434 ms (5429 allocations: 1.24 MiB) # Both
What is actually a reasonable warm-up / transient integration time for Speedy?
That's at the moment not straightforward to say. Most simulations I've run so far just used some initial conditions for vorticity. In that case you get reasonably developed turbulence in 10-20 days, even at higher resolution. However, that's decaying turbulence due to the lack of forcing, obviously. Hence there's no actually interesting invariant measure the model converges to.
With #144 I've added some forcing for the shallow water equations, even including a seasonal cycle that you can switch on with interface_relaxation = true
(default false) and control with interface_relax_time
and interface_relax_amplitude
.
What's happening there is that one pulls the l=1, m=0 zonal mode of the interface displacement / SSH towards something that mimics an equatorial heating. Other modes are not affected, so you literally just dump energy into that mode. With seasonal_cycle = true (default true) that equatorial heating moves up and down with the seasons, similar to an ITCZ, but actually all the way to the tropic of Cancer, back to the Equator, down to the tropic of Capricorn and back.
My hope was to get an interesting and stable climate also in the shallow water system. But at the moment it still converges to a steady state (or one that closely follows the seasonal cycle). I probably haven't played around with it enough to get an actually nice setup that produces eddies at all resolutions, with a proper energy cascade from some large-scale forcing to some small-scale dissipation.
Installed SpeedyWeather.jl on Julia 1.6.4 for Windows. After executing, as in the example, run_speedy(n_days=30, trunc=63, Grid=OctahedralGaussianGrid, model=:shallowwater, output=true)
I get:
ERROR: UndefVarError: OctahedralGaussianGrid not defined
Stacktrace:
[1] top-level scope
@ REPL[3]:1
This is an issue to discuss the implementation of the spectral dynamical core. There is a document that describes the n=l-m spectral packing, illustrates the calculation of some gradients in spectral space, outlines the algorithm and provides a simple test case.

julia> run_speedy(Float64; n_days=20, model=:shallowwater, output=true, trunc=62)
ERROR: MethodError: no method matching scale_coslat⁻¹!(::SubArray{Float32, 2, Array{Float32, 3}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Int64}, true}, ::Geometry{Float64})
Closest candidates are:
scale_coslat⁻¹!(::AbstractMatrix{NF}, ::Geometry{NF}) where NF at ~/.julia/packages/SpeedyWeather/UDuyp/src/spectral_gradients.jl:266
scale_coslat⁻¹!(::AbstractArray{NF, 3}, ::Any...) where NF at ~/.julia/packages/SpeedyWeather/UDuyp/src/distributed_vertical.jl:5
Likely because they are being converted into Float32
while saving but before scaling.
This is kind of a placeholder, but maybe we should consider using https://github.com/psychrometrics/psychrolib for psychrometric calculations as it may come in handy later with some parametrizations like land and diagnostics. Let me know if this could be useful in the future, as it will need to be ported to Julia...
One discussion we need to have is whether for the parametrizations we loop in the vert, lat, lon order as the arrays are laid out in memory (and as we do for the dynamics, since most dependencies are in the horizontal and only(?) the geopotential couples the vertical levels), or whether for the parametrizations we should do lat, lon, vert because they act on columns. In the latter case we could decompose the domain in the horizontal, pass vertical views to the functions, and parallelise that way. I believe the get_parametrization_tendencies
function could literally just do
@distributed for j in 1:nlat, i in 1:nlon
    temp_column = view(temp_grid, i, j, :)
    ...
    get_parametrization_tendencies(temp_column, ...)
end
and hence every grid point would get its own thread and we'd have just one place where the horizontal domain decomposition happens.
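A hedged sketch of that column-wise decomposition, using Threads.@threads over a flattened horizontal index (the threading macros take a single loop range); the array and function names follow the comment above but are assumptions, and the column physics is a placeholder.

```julia
using Base.Threads

# placeholder column physics: relax each column towards 250 K
column_tendencies!(tend_column, temp_column) = (tend_column .= 250 .- temp_column)

function get_parametrization_tendencies!(tend_grid, temp_grid)
    nlon, nlat, _ = size(temp_grid)
    @threads for ij in 1:nlon*nlat                    # one iteration per grid point
        i, j = mod1(ij, nlon), cld(ij, nlon)          # unflatten the horizontal index
        column_tendencies!(view(tend_grid, i, j, :),  # no-copy vertical columns
                           view(temp_grid, i, j, :))
    end
    return tend_grid
end
```

Every grid point becomes one loop iteration, so the horizontal domain decomposition lives in exactly one place.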
@white-alistair copying over from zulip
@samhatfield do you think this is a reasonable idea for speedy?
While working with SpeedyWeather.jl for a bit now, I noticed something that I would consider impractical: when running Speedy, the output is only written to NetCDF in a path that is set during runtime and not saved somewhere in a struct that is returned to the user.
A practical example:
using SpeedyWeather
p = run_speedy(Float32, output=true)
Now I want to continue to work with this simulation run in the same script. Just from running it like this there's no way of knowing what the path is. If it is the first time I've run Speedy in this folder it is run0001, but maybe I've already run it a few times, then it is something else.
vor = ncread("run0001/output.nc", "vor") # here I have to fill in the 0001 manually and can't use some value from p
time = ncread("run0001/output.nc", "time")
A way of fixing this would be to either directly return the Output object as well (maybe only optionally via a kwarg) or to create a separate Solution object that can hold all kinds of information that is only set during run time / time stepping. A separate Solution object could e.g. also have some wrapper functions around NetCDF.jl that make it mimic the solution objects from DiffEq.jl a bit.
This would also make writing tests for the netCDF output more reliable.
So an easy way to address this would be to just return the outputter alongside the PrognosticVariables instance if the output=true kwarg is there. A more general way would be to create such a Solution object. I would say the latter is preferable if we can also think of other things that are saved / computed during time stepping that are not already stored in the ModelSetup or Output.
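A minimal sketch of what such a Solution object could look like; the field and function names here are hypothetical, not an existing SpeedyWeather API.

```julia
# Hypothetical Solution object holding information only known at run time.
struct Solution
    run_id::String        # e.g. "run0001", determined during run_speedy
    output_path::String   # folder containing the netCDF output
end

output_file(sol::Solution) = joinpath(sol.output_path, "output.nc")

sol = Solution("run0001", "run0001")
output_file(sol)   # "run0001/output.nc" on Unix
```

With something like this, the ncread calls above could take output_file(sol) instead of a hard-coded path.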
I've just created github.com/SpeedyWeather but before transferring the ownership of this repository to the orga (which is in the "danger zone") I wanted to make sure that this is the right way to go and that everyone agrees.
This is a leftover from this comment on #82
To implement several sets of equations (barotropic vorticity equation, shallow water equations, primitive equations) one could define
BarotropicModel <: ModelSetup
ShallowWaterModel <: ModelSetup
PrimitiveEquationModel <: ModelSetup
and then dispatch to functions like
get_tendencies(..., M::BarotropicModel) = ...
get_tendencies(..., M::ShallowWaterModel) = ...
wherever functions should do something different, and otherwise
leapfrog(..., M::ModelSetup) = ...
when they are the same. That would be a Julian way of avoiding if-branches in an effort to support various equations simultaneously. But it is hard to tell whether later on this will lead to ugly design choices to still enable compatibility with BarotropicModel and ShallowWaterModel ...
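The pattern above can be made concrete in a few lines; the method bodies here are placeholders just to show the dispatch.

```julia
abstract type ModelSetup end
struct BarotropicModel <: ModelSetup end
struct ShallowWaterModel <: ModelSetup end

# different method per model where the equations differ
get_tendencies(::BarotropicModel)   = :barotropic_tendencies
get_tendencies(::ShallowWaterModel) = :shallowwater_tendencies

# one shared method where the models agree
leapfrog(::ModelSetup) = :shared_leapfrog

get_tendencies(ShallowWaterModel())  # → :shallowwater_tendencies
leapfrog(BarotropicModel())          # → :shared_leapfrog
```

Julia picks the most specific method at the call site, so no if-branches on a model flag are needed.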
julia> run_speedy(trunc=341,output=true,output_dt=3,model=:shallowwater)
ERROR: ArgumentError: invalid base 10 digit '.' in "0007.tar.xz"
Stacktrace:
[1] tryparse_internal(#unused#::Type{Int64}, s::String, startpos::Int64, endpos::Int64, base_::Int64, raise::Bool)
@ Base ./parse.jl:137
[2] parse(::Type{Int64}, s::String; base::Nothing)
@ Base ./parse.jl:241
[3] parse
@ ./parse.jl:241 [inlined]
[4] (::SpeedyWeather.var"#37#39")(id::String)
@ SpeedyWeather ./none:0
[5] iterate
@ ./generator.jl:47 [inlined]
[6] collect_to!
@ ./array.jl:782 [inlined]
[7] collect_to_with_first!
@ ./array.jl:760 [inlined]
[8] collect(itr::Base.Generator{Vector{String}, SpeedyWeather.var"#37#39"})
@ Base ./array.jl:734
[9] get_run_id_path(P::Parameters)
@ SpeedyWeather ~/.julia/packages/SpeedyWeather/OJt3A/src/output.jl:14
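The trace shows get_run_id_path parsing every name in the run folder as an integer, so an unrelated file like 0007.tar.xz breaks it. A hedged fix sketch (not SpeedyWeather's actual code) would only consider names matching the runXXXX pattern:

```julia
# Collect run ids only from names matching "runXXXX"; ignore e.g. "0007.tar.xz".
function existing_run_ids(dir::String)
    ids = Int[]
    for name in readdir(dir)
        m = match(r"^run(\d{4})$", name)
        m === nothing || push!(ids, parse(Int, m.captures[1]))
    end
    return ids
end

next_run_id(dir) = maximum(existing_run_ids(dir); init=0) + 1
```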