speedyweather / SpeedyWeather.jl
Play atmospheric modelling like it's LEGO.
Home Page: https://speedyweather.github.io/SpeedyWeather.jl/dev
License: MIT License
Currently, there is not a clear distinction between universal physical constants, which never change, and model parameters, which might.
Copying @milankl's comment from PR #82:
What about using "constants" for actual physical constants that nobody would want to change, "parameters" for everything that run_speedy may take as an argument and "constants_runtime" for every value that has to be precalculated, probably from parameters, but is constant at runtime. Meaning we should rename the constants.jl file and constants.jl should then be loaded before default_parameters?
This is an issue to collect ideas around the linear vs quadratic vs cubic truncation when looping over the zonal wavenumbers
for m in 1:min(nfreq, mmax+1)

with nfreq = nlon÷2 + 1 and nlon the number of longitude points at a given latitude ring. So for full grids the loop goes over all m, but for reduced grids it can be shortened. IFS seems to be doing the following (j, nlon, nfreq) at Tco79:
( 1 20 8) ( 2 24 10) ( 3 28 12) ( 4 32 14) ( 5 36 16) ( 6 40 18) ( 7 44 20) ( 8 48 22)
( 9 52 24) ( 10 56 26) ( 11 60 27) ( 12 64 29) ( 13 68 31) ( 14 72 33) ( 15 76 35) ( 16 80 36)
( 17 84 38) ( 18 88 40) ( 19 92 41) ( 20 96 43) ( 21 100 44) ( 22 104 46) ( 23 108 47) ( 24 112 49)
( 25 116 50) ( 26 120 52) ( 27 124 53) ( 28 128 55) ( 29 132 56) ( 30 136 57) ( 31 140 58) ( 32 144 60)
( 33 148 61) ( 34 152 62) ( 35 156 63) ( 36 160 64) ( 37 164 65) ( 38 168 67) ( 39 172 68) ( 40 176 69)
( 41 180 70) ( 42 184 71) ( 43 188 72) ( 44 192 73) ( 45 196 74) ( 46 200 75) ( 47 204 76) ( 48 208 77)
( 49 212 78) ( 50 216 79) ( 51 220 79) ( 52 224 79) ( 53 228 79) ( 54 232 79) ( 55 236 79) ( 56 240 79)
( 57 244 79) ( 58 248 79) ( 59 252 79) ( 60 256 79) ( 61 260 79) ( 62 264 79) ( 63 268 79) ( 64 272 79)
( 65 276 79) ( 66 280 79) ( 67 284 79) ( 68 288 79) ( 69 292 79) ( 70 296 79) ( 71 300 79) ( 72 304 79)
( 73 308 79) ( 74 312 79) ( 75 316 79) ( 76 320 79) ( 77 324 79) ( 78 328 79) ( 79 332 79) ( 80 336 79)
( 81 336 79) ( 82 332 79) ( 83 328 79) ( 84 324 79) ( 85 320 79) ( 86 316 79) ( 87 312 79) ( 88 308 79)
( 89 304 79) ( 90 300 79) ( 91 296 79) ( 92 292 79) ( 93 288 79) ( 94 284 79) ( 95 280 79) ( 96 276 79)
( 97 272 79) ( 98 268 79) ( 99 264 79) ( 100 260 79) ( 101 256 79) ( 102 252 79) ( 103 248 79) ( 104 244 79)
( 105 240 79) ( 106 236 79) ( 107 232 79) ( 108 228 79) ( 109 224 79) ( 110 220 79) ( 111 216 79) ( 112 212 78)
( 113 208 77) ( 114 204 76) ( 115 200 75) ( 116 196 74) ( 117 192 73) ( 118 188 72) ( 119 184 71) ( 120 180 70)
( 121 176 69) ( 122 172 68) ( 123 168 67) ( 124 164 65) ( 125 160 64) ( 126 156 63) ( 127 152 62) ( 128 148 61)
( 129 144 60) ( 130 140 58) ( 131 136 57) ( 132 132 56) ( 133 128 55) ( 134 124 53) ( 135 120 52) ( 136 116 50)
( 137 112 49) ( 138 108 47) ( 139 104 46) ( 140 100 44) ( 141 96 43) ( 142 92 41) ( 143 88 40) ( 144 84 38)
( 145 80 36) ( 146 76 35) ( 147 72 33) ( 148 68 31) ( 149 64 29) ( 150 60 27) ( 151 56 26) ( 152 52 24)
( 153 48 22) ( 154 44 20) ( 155 40 18) ( 156 36 16) ( 157 32 14) ( 158 28 12) ( 159 24 10) ( 160 20 8)
So that's about 3 fewer m near the poles and less than half at the Equator:
julia> cat(ifs[1:80],spd,spd-ifs[1:80],dims=2)
80×3 Matrix{Int64}:
8 10 2
10 12 2
12 14 2
14 16 2
16 18 2
18 20 2
20 22 2
22 24 2
24 26 2
26 28 2
27 30 3
29 32 3
31 34 3
33 36 3
35 38 3
36 40 4
38 42 4
40 44 4
41 46 5
43 48 5
44 50 6
46 52 6
47 54 7
49 56 7
50 58 8
52 60 8
⋮
and apparently follows the formula nfreq = floor((nlon - 1)/(2 + coslat^2)) - 1. At that resolution this is 4784 loops over m per hemisphere, compared to 6400 in total (80 rings with 80 orders each), meaning there's a good amount of performance to gain if we follow something similar. What needs to be checked, though, is how much this applies to other grids and whether we can formulate it directly as a function of the truncation order.
More information on this is apparently in Courtier and Naughton, 1994.
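To make the potential saving concrete, here is a small sketch that counts the loop iterations for the simple bound m in 1:min(nfreq, mmax+1) with nfreq = nlon÷2 + 1 per ring. The helper names are hypothetical and the ring lengths only mimic an octahedral-style grid like the table above; the IFS numbers are smaller still because its latitude-dependent formula cuts nfreq further.

```julia
# How many (ring, m) iterations does the Legendre loop take if we only
# loop over m in 1:min(nfreq, mmax+1) with nfreq = nlon÷2 + 1 per ring?
loop_length(nlon, mmax) = min(nlon ÷ 2 + 1, mmax + 1)
total_iterations(nlons, mmax) = sum(loop_length(nlon, mmax) for nlon in nlons)

mmax = 79                                  # T79-like truncation
full    = fill(320, 80)                    # full grid: same nlon on every ring
reduced = [20 + 4*(j - 1) for j in 1:80]   # octahedral-style rings: 20, 24, ..., 336

total_iterations(full, mmax)               # 6400: all 80 orders on every ring
total_iterations(reduced, mmax)            # 5175: already ~20% fewer iterations
```

Even this crude bound trims a fifth of the Legendre loop; the coslat-dependent formula above would trim more.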
Compute-intensive loops for which we'll define GPU kernels basically fall into one of the following categories (sorted from simple to complex):
1. for lm in eachharmonic. These kernels loop over the non-zero indices of one or several LowerTriangularMatrixs but only access/write the lm-th element on every iteration. No cross dependencies to other harmonics. May include scalar constants. All input arrays are of the same size.
2. for i,j in eachentry(::Matrix) with vec[j]. These kernels loop over all entries of a Matrix (or LowerTriangularMatrix) but also pull data from a vector at index j.
3. for l,m in eachharmonic with A[l+1,m] and A[l-1,m]. These kernels loop over the non-zero indices of LowerTriangularMatrixs and, for every harmonic, also access the neighbouring coefficients in l.
4. Spherical harmonic transforms. Like 2) but with signs depending on odd and even modes, and combined with Fourier transforms.
Examples
for lm in eachharmonic
a) The horizontal diffusion
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/diffusion.jl#L12-L21
b) The leapfrog time integration
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/time_integration.jl#L13-L43
for i,j in eachentry(::Matrix) with vec[j]
a) The vorticity fluxes (in grid-point space)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/tendencies_dynamics.jl#L287-L313
b) The Bernoulli potential (in grid-point space)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/tendencies_dynamics.jl#L349-L373
c) The Laplace operator (in spectral space)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_gradients.jl#L263-L289
for l,m in eachharmonic with A[l+1,m] and A[l-1,m]
a) The divergence/curl operator
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_gradients.jl#L68-L99
b)
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_gradients.jl#L166-L210
spherical harmonic transforms
a) spectral to grid
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_transform.jl#L279-L335
b) grid to spectral
https://github.com/milankl/SpeedyWeather.jl/blob/e1c1e79fe43cf5c23603a87039b27c4fc59d4250/src/spectral_transform.jl#L384-L435
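As a minimal sketch of what a category-1 kernel looks like (names are hypothetical; eachindex over plain vectors stands in for eachharmonic over packed LowerTriangularMatrix data):

```julia
# One fused loop over same-size packed arrays, no cross-harmonic dependencies:
# every iteration reads/writes only the lm-th element, plus scalar constants.
function damp_harmonics!(tend::AbstractVector, alms::AbstractVector, damping::AbstractVector)
    length(tend) == length(alms) == length(damping) ||
        throw(DimensionMismatch("arrays must have equal length"))
    @inbounds for lm in eachindex(alms)        # stands in for `for lm in eachharmonic`
        tend[lm] = muladd(-damping[lm], alms[lm], tend[lm])
    end
    return tend
end
```

Because every iteration is independent, loops of this category map one-to-one onto GPU threads.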
Is it intended that both return L? One should return M, no?
#110 explores a custom matrix format that only stores the lower triangle of coefficient matrices explicitly. The idea is that many operations just loop over m = 1:mmax+1, l = m:lmax+1, which technically jumps over the explicitly stored zeros in the upper triangle of a Matrix. A LowerTriangularMatrix still allows access like [i,j], which needs to be translated to a single index [k] into the underlying vector of the LowerTriangularMatrix (and returns zero(T) for j > i).
The question remains how to best implement the getindex functions to avoid the boundscheck (or propagate an @inbounds). Currently this is just

ij2k(i::Integer, j::Integer, m::Integer) = i + (j-1)*m - fibonacci(j)
Base.getindex(L::LowerTriangularMatrix, k::Integer) = @inbounds L.v[k]
function Base.getindex(L::LowerTriangularMatrix{T}, i::Integer, j::Integer) where T
    @boundscheck (i > L.m || j > L.n) && throw(BoundsError(L, (i, j)))
    j > i && return zero(T)
    return getindex(L.v, ij2k(i, j, L.m))
end
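For reference, the index mapping can be checked against a brute-force enumeration of the lower triangle. A self-contained sketch, where triangle(j) = j*(j-1)÷2 plays the role of the subtracted helper term (a triangular number, whatever its name in the codebase):

```julia
# Column-major packing of the lower triangle of an m×n matrix: column j stores
# rows j..m, so the linear index is k = i + (j-1)*m - triangle(j), where
# triangle(j) = j*(j-1)÷2 counts the entries skipped in the upper triangle.
triangle(j) = j * (j - 1) ÷ 2
ij2k(i, j, m) = i + (j - 1) * m - triangle(j)

# brute-force check: enumerating the lower triangle column-major hits 1, 2, 3, ...
m, n = 5, 5
ks = [ij2k(i, j, m) for j in 1:n for i in j:m]
@assert ks == 1:binomial(n + 1, 2)    # 1:15 for a 5×5 lower triangle
```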
and needs some care. Anyway, changing alms[i,j] to alms[k] in gridded! (as the zero elements are never accessed anyway) gives similar speeds at half the memory:
julia> A = randn(ComplexF32,86,86);
julia> L = LowerTriangularMatrix(A);
julia> sizeof(A)/1000 # in KB
59.168
julia> sizeof(L)/1000 # in KB
29.928
julia> @btime SpeedyWeather.gridded!($map,$A,$m.geospectral.spectral_transform);
686.543 μs (194 allocations: 12.12 KiB)
julia> @btime SpeedyWeather.gridded!($map,$L,$m.geospectral.spectral_transform);
685.417 μs (194 allocations: 12.12 KiB)
For large grids a LowerTriangularMatrix is faster, presumably because of contiguous memory access:
julia> @btime SpeedyWeather.gridded!($map,$L,$m.geospectral.spectral_transform);
254.260 ms (1540 allocations: 112.22 KiB)
julia> @btime SpeedyWeather.gridded!($map,$A,$m.geospectral.spectral_transform);
274.818 ms (1540 allocations: 112.22 KiB)
The most up-to-date documentation of the original Speedy is here.
With respect to the physical parameterizations, a number of apparent inconsistencies exist between the documentation and the implementation in speedy.f90.
We will record them in this issue.
The time variable in the NetCDF files is always in Int32 precision with hours as unit. However, one can also have time steps at non-integer hours. I'd change the time variable to also be in output_NF, or is there anything speaking against that?
https://milankl.github.io/SpeedyWeather.jl/dev/spectral_transform/
Added in #89
There is https://upload.wikimedia.org/wikipedia/commons/1/12/Rotating_spherical_harmonics.gif available?
![Rotating spherical harmonics](https://upload.wikimedia.org/wikipedia/commons/1/12/Rotating_spherical_harmonics.gif)
I might misunderstand the way the netCDF is written, but at least for a FullGaussianGrid
I expected this to work:
SpeedyWeather.run_speedy(Float32, trunc=ntrunc, σ_levels_half=[0.,0.4,0.6,1.], n_days=5, nlev=3, output=true)
P, D, M = SpeedyWeather.initialize_speedy(NF, initial_conditions=:restart, restart_id=1) # output dt made to fit this ntrunc to save every time step
vor = ncread("run0001/output.nc","vor")
S = M.spectral_transform
G = M.geometry
Grid = G.Grid
Grid(vor[:,:,1,1]) # yields a 8192-element FullGaussianGrid
gridded(P.layers[1].leapfrog[1].vor, S) # yields a 4608-element FullGaussianGrid
So half the grid points. What am I missing there?
This also results in things like spectral(Grid(vor[:,:,1,1]), S) not working. It would be nice to have an easy way to load the netCDF data and then work with the transforms etc. from the library.
We currently use FFTW.jl, the Julia bindings to the C library of the same name. It implements functions like fft, ifft, rfft, irfft, plan_fft for Float32 and Float64, but there's currently no 16-bit FFT for Float16. Sure, we can always promote Float16 to Float32 and round back down after the (inverse) Fourier transform, but we should look out for 16-bit FFT implementations.
I once started the pure-Julia FFT package coolFFT.jl, which is just a very simple implementation of the Cooley-Tukey algorithm and therefore only works with arrays of length 2^n, but we could use a polished version of that as a fallback for arbitrary number formats?
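For illustration, a radix-2 Cooley-Tukey transform is only a few lines of generic Julia, and being type-generic it would also run (slowly) for Complex{Float16} or other formats. This is a sketch of the algorithm, not coolFFT.jl's actual code:

```julia
# Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two.
# Generic over the element type T, so Float16 and friends work too.
function ct_fft(x::AbstractVector{Complex{T}}) where T
    n = length(x)
    n == 1 && return [x[1]]
    ispow2(n) || throw(ArgumentError("length must be a power of 2"))
    even = ct_fft(x[1:2:end])                        # DFT of even-indexed samples
    odd  = ct_fft(x[2:2:end])                        # DFT of odd-indexed samples
    twiddled = [cis(-T(2π * k / n)) * odd[k+1] for k in 0:(n÷2 - 1)]
    return vcat(even .+ twiddled, even .- twiddled)  # butterfly combine
end
```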
At the moment the number of vertical levels nlev can be chosen for BarotropicModel and ShallowWaterModel, but the layers aren't actually coupled (BarotropicModel: independently executed; ShallowWaterModel: only the first layer is part of the time integration). I believe for the BarotropicModel we can do Charney and Phillips, 1953 for every layer. This is interesting because it suddenly couples the layers through the stream function, although vorticity remains the only prognostic variable in each layer. I believe that could be easily set up, even for the n-layer system. At the moment we don't calculate the stream function explicitly, as it's not used anywhere else, but thanks to #142 the operators are easily available. We could cut the algorithm to get from vorticity to stream function and from stream function to velocities into two parts to get the numerator. That looks like what's described in section 3 as method B.
The current initial conditions with implicit_alpha=0.5 produce after some days a crazy gravity wave that bounces from pole to pole, visible here in divergence, which kills the simulation. The problem is solved with implicit_alpha=1, but this means we should probably find better initial conditions for the shallow water equations.
At the moment Float64 seems to be considerably faster than Float32, especially at high resolution. I don't know why that is, but just to flag it already
julia> run_speedy(Float64,model=:shallowwater,n_days=10,trunc=85);
Weather is speedy: Time: 0:00:04 (521.07 years/day)
julia> run_speedy(Float32,model=:shallowwater,n_days=10,trunc=85);
Weather is speedy: Time: 0:00:05 (402.84 years/day)
julia> run_speedy(Float32,model=:shallowwater,n_days=2,trunc=170);
Weather is speedy: Time: 0:00:20 (22.93 years/day)
julia> run_speedy(Float64,model=:shallowwater,n_days=2,trunc=170);
Weather is speedy: Time: 0:00:13 (34.85 years/day)
julia> run_speedy(Float64,model=:shallowwater,n_days=0.25,trunc=341);
Weather is speedy: Time: 0:00:22 ( 2.61 years/day)
julia> run_speedy(Float32,model=:shallowwater,n_days=0.25,trunc=341);
Weather is speedy: Time: 0:00:39 (558.52 days/day)
We currently have a @simd annotation in the Legendre transform. However, following a suggestion from @hottad, I've just checked whether this actually makes a difference given that we already use muladd. Using Julia 1.8.2 and Float32:
julia> @btime SpeedyWeather.spectral!($alms,$map,$S);
43.376 μs (107 allocations: 5.55 KiB)
with @simd, and without:
julia> @btime SpeedyWeather.spectral!($alms,$map,$S);
36.357 μs (107 allocations: 5.55 KiB)
Similar for Float64. So at the moment @simd makes things somewhat slower. I remember having introduced it because I got slower performance with Float32 compared to Float64, which it reduced. But maybe in the meantime some of these compiler issues got addressed.
For testing, it would be useful to have lightweight and minimal factory methods in the test
directory which return a model initialised with realistic values (possibly from a previous run of the model), so you can just do, for example,
prog, diag, model = model_factory()
or
diag = diag_factory()
Much of this already exists, e.g. the initialize_from_rest method in prognostic_variables.jl, but it would be good to extend and formalise it.
At the moment we use a regular Gaussian grid for grid-point space. This comes with advantages of easy storage in a matrix internally and for output but with the disadvantage that a lot of points are put near the poles that aren't actually needed. Alternatives are
This is an issue to discuss and collect ideas and reasons to implement either of these grids at some point in the future.
Broadcasting with LowerTriangularMatrix still returns unexpected results. The biggest problem I found was dotted operations like .*=; see how suddenly a zero appears where it shouldn't:
julia> L = rand(LowerTriangularMatrix,3,3)
3×3 LowerTriangularMatrix{Float64}:
0.998746 0.0 0.0
0.185934 0.783976 0.0
0.181445 0.950264 0.970376
julia> L .*= 2
3×3 LowerTriangularMatrix{Float64}:
1.99749 0.0 0.0
0.371868 0.0 0.0
0.0 0.0 1.94075
This is (I think) because broadcasting loops over each index (i, j) with an @inbounds, such that the @boundscheck in setindex! is disabled, and then converts (i, j) to a single index k, which isn't safe for j > i. The problem is apparently that the broadcasting of * creates a Matrix, which is then fused with .= and writes the entries back into a LowerTriangularMatrix:
julia> L = rand(LowerTriangularMatrix,3,3)
3×3 LowerTriangularMatrix{Float64}:
0.0948286 0.0 0.0
0.10467 0.821121 0.0
0.971535 0.0221127 0.744497
julia> L .* 2
3×3 Matrix{Float64}:
0.189657 0.0 0.0
0.20934 1.64224 0.0
1.94307 0.0442254 1.48899
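One way to at least fail loudly instead of silently corrupting data is an unconditional guard in setindex!, i.e. a plain check rather than a @boundscheck block, so @inbounds cannot elide it. A minimal sketch with a hypothetical LowerTri type backed by dense storage (SpeedyWeather's type packs a vector instead):

```julia
# Minimal stand-in for LowerTriangularMatrix: zero upper triangle on read,
# refuse writes to the upper triangle with a check that survives @inbounds.
struct LowerTri{T} <: AbstractMatrix{T}
    data::Matrix{T}                          # dense storage for simplicity
end
Base.size(L::LowerTri) = size(L.data)
Base.getindex(L::LowerTri{T}, i::Int, j::Int) where {T} = j > i ? zero(T) : L.data[i, j]
function Base.setindex!(L::LowerTri, x, i::Int, j::Int)
    j > i && throw(BoundsError(L, (i, j)))   # plain check: not elided by @inbounds
    L.data[i, j] = x
end

L = LowerTri([1.0 0 0; 2 3 0; 4 5 6])
L .* 2        # broadcasting still falls back to a plain Matrix, as in the issue
# L .*= 2 would now throw a BoundsError instead of silently writing zeros
```

Whether throwing or defining a proper Broadcast.BroadcastStyle that skips the upper triangle is the better fix is exactly what this issue should decide.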
It may be useful to use the logging infrastructure for the feedback messages. One benefit is that you can actually check the content of logged messages with @test_logs
and this doesn't show up when running the tests. Also, you may consider using ProgressMeter.jl
for the progress bar.
Something is wrong with the spectral transform in Float32 at high resolutions (>=T682), if I start with
julia> alms = randn(ComplexF32,854,854); # T853
julia> map = gridded(alms); # = 2560x1280 grid points
then NaNs are created at mid-latitudes of both hemispheres. @jmert have you seen something like this before and could give us a hint what is going wrong?
In contrast, using Float64 everything seems to be fine.
I highly vote for getting this functionality built in
Base.show(io::IO, P::PrognosticVariables) = print(io, heatmap(gridded(P.temp[:,:,end])[:,end:-1:1]' .- 273.15))
such that we immediately get a nice plot every time the model runs in the REPL.
I noticed surface pressure pres is overloaded to also mean normalised surface pressure in the current convection parameterization
https://github.com/milankl/SpeedyWeather.jl/blob/4b3176764bdd4897978ce2312948bd0d2f03da79/src/convection.jl#L43
this is also confirmed in the tests, where a normalised pressure is used instead
https://github.com/milankl/SpeedyWeather.jl/blob/4b3176764bdd4897978ce2312948bd0d2f03da79/test/convection.jl#L10
I think this should be changed throughout to make clear this is not the surface pressure in hPa.
@milankl separate and perhaps for later, but I find pres meaning surface pressure unclear (i.e. when I first saw it I was expecting a vector of pressures in hPa!); can this be changed to surf_pres or something like that?
Following this document
we should be able to formulate a nice unit test for the spectral gradients.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
speedy.f90 relies on a number of climatological and boundary fields. These are available currently on the T30/96×48 grid but we would like them for all available resolutions.
I will begin with the orography field as I already have the data for this on a high-resolution TCO1279 (2560×1280, 9 km) grid, taken from the IFS.
The performance would be awful (because of emulated floating point), but it would be interesting to use https://github.com/milankl/SoftPosit.jl to test posit accuracy at 16 bits.
One easy way to parallelise Speedy might be to distribute the calculation of the spectral transform across n workers in the vertical, using SharedArrays (documentation here) from Julia's standard library. This limits us to n× speedups from parallelisation, which might be fine as SpeedyWeather.jl will probably run on small clusters only anyway. For T30 and n=8 levels we can (hopefully) efficiently run on 8 cores, and for n=48 or 64 levels we could get significant speedups for higher-resolution versions of SpeedyWeather.jl (T100-T500). Given the shared memory of this approach, we'll be limited by the 48 cores on A64FX, but that might be absolutely sufficient for now.
I’d need the inverse Laplacian to convert vorticity to streamfunction. I’ve seen that this is available in the code, but commented out in spectral_gradients! as ∇⁻²!. Why is that so? Could we make it functional again? SpectralTransform still has the eigenvalues⁻¹ field that is also only used by ∇⁻²!. UV_from_vor! also somehow inverts the Laplacian, but I don't understand how. Anyway, I need to compute the stream function, not the wind field.
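For reference, inverting the Laplacian in spectral space is just a division by its eigenvalues, since ∇²Y_lm = -l(l+1)/R² Y_lm; the l=0 mode has eigenvalue zero and is conventionally set to zero. A hedged sketch with hypothetical names, for the coefficients of a single order m:

```julia
# Streamfunction from vorticity in spectral space: ψ_l = ζ_l / (-l(l+1)/R²),
# with the undetermined l=0 mode set to zero.
function inverse_laplacian(vor::AbstractVector{<:Complex}, R)
    lmax = length(vor) - 1                  # coefficients for l = 0:lmax
    ψ = similar(vor)
    for (k, l) in enumerate(0:lmax)
        ψ[k] = l == 0 ? zero(eltype(vor)) : vor[k] / (-l * (l + 1) / R^2)
    end
    return ψ
end

# forward Laplacian, to check the inversion round-trips
laplacian(ψ, R) = [(-l * (l + 1) / R^2) * ψ[l+1] for l in 0:length(ψ)-1]
```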
This is perhaps not the best place to discuss, but I would like to better understand what general API you have in mind for SpeedyWeather.jl, especially for dynamics and parametrization schemes and the passing of variables and parameters. I would like to try to avoid pitfalls of other models.
Now as there are more and more parameterisations coming in, I would suggest to move everything parameterization-related to src/parameterizations. Eventually this could even become its own module, but I don't think this is needed now. At the moment I would move in there basically everything that is called within parameterization_tendencies! (but I would leave that one as it defines the interface on the dynamics side).
Variables (e.g. in column_variables.jl) are initialised with zero. Is there any reason for doing so? If not, why not use NaN for floats and typemax or typemin for ints?
While it might be nice to have a wide range of resolutions available, in practice this might be difficult for speedy. There might be stability constraints such that we may want to agree on a small subset (n=3...8?) of available resolutions, e.g. T30, T60, ... Speedy's default is
It seems that ECMWF used operationally in the past decades
Which sounds like a good subset of resolutions to aim for. @tomkimpson @samhatfield what do you think, and what are other constraints we should be worried about?
At the moment we have within the GeoSpectral struct a SpectralTransform struct called spectral, such that one can do
@unpack spectral = model_setup.geospectral
as well as the function spectral that does a spectral transform, i.e. double naming. @unpacking spectral overwrites the available function in local scope. In most cases this isn't an issue, as we only use the in-place function spectral! within the code, hence no conflict, but for clarity and bug prevention we should resolve that.
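The clash is easy to reproduce in isolation (illustrative names only, not the actual structs): destructuring a field named spectral shadows the function spectral for the rest of that scope.

```julia
spectral(x) = x + 1                      # stands in for the transform function

function transform_from(gs)
    (; spectral) = gs                    # like @unpack spectral = gs
    # from here on, `spectral` is the struct field, not the function
    return spectral
end

gs = (spectral = "a SpectralTransform", geometry = "a Geometry")
```

Calling spectral(x) after the destructuring inside transform_from would error, which is exactly the bug class worth preventing by renaming one of the two.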
Copying @milankl's comment from PR #82:
I mean rewriting the entire model such that temperature uses ˚C instead of Kelvin. I expect everything to work very similar, instead of probably radiation, where somewhere the Stefan-Boltzmann ~T^4 should appear. We could then convert from ˚C to Kelvin therein. But from a precision point-of-view using Kelvin is awful as you'll hit massive rounding errors in the time integration, e.g. T = 300K, tendency dT = 0.1K
julia> T = Float16(300)
Float16(300.0)
julia> dT = Float16(0.1)
Float16(0.1)
julia> T + dT
Float16(300.0)
julia> T2 = Float16(300 - 273.15)
Float16(26.84)
julia> T2 + dT
Float16(26.94)

In Kelvin you can't resolve the increment, but in ˚C you can.
Having said that, we obviously do the time stepping in spectral space, meaning that we'd only have that problem on the l=m=0 mode which we could also solve with compensated summation, but I think in general, we should try to use ˚C if possible. If it turns out that ˚C is a bad idea then we can still revert back.
While not needed for the octahedral grids, as the first longitude point on every ring sits right on the prime meridian, the HEALPix grids need a longitude offset rotation; see Justin's SphericalHarmonicTransforms.jl documentation. For example:
julia> S = SpectralTransform(Float64,HEALPixGrid,15,false);
julia> o = S.lon_offsets[2,:] # m=1
16-element Vector{ComplexF64}:
0.7071067811865476 + 0.7071067811865476im
0.9238795325112867 + 0.3826834323650898im
0.9659258262890683 + 0.25881904510252074im
0.9807852804032304 + 0.19509032201612828im
0.9876883405951378 + 0.15643446504023087im
0.9914448613738104 + 0.13052619222005157im
0.9937122098932426 + 0.11196447610330786im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
1.0 + 0.0im
0.9951847266721969 + 0.0980171403295606im
julia> @. atand(imag(o)/real(o)) # retrieve the angle in degree
16-element Vector{Float64}:
45.0
22.500000000000004
14.999999999999998
11.25
9.0
7.499999999999999
6.428571428571429
5.625
0.0
5.625
0.0
5.625
0.0
5.625
0.0
5.625
and
julia> [360/4j/2 for j in 1:8]
8-element Vector{Float64}:
45.0
22.5
15.0
11.25
9.0
7.5
6.428571428571429
5.625
However (and now coming to the actual issue) when enabling the longitude offset rotation for HEALPixGrid and HEALPix4Grid some tests fail, whereas they pass for no rotation. No rotation is equivalent to rings being rotated to start at the prime meridian instead.
Test Summary: | Pass Fail Total Time
Transform: Individual Legendre polynomials (inexact transforms) | 1993714 14 1993728 13.2s
trunc = 127 | 402422 10 402432 4.1s
NF = Float32 | 201211 5 201216 2.2s
Grid = HEALPixGrid | 50301 3 50304 0.9s
Grid = HEALPix4Grid | 50302 2 50304 0.5s
Grid = FullHEALPixGrid | 50304 50304 0.4s
Grid = FullHEALPix4Grid | 50304 50304 0.4s
NF = Float64 | 201211 5 201216 1.9s
Grid = HEALPixGrid | 50301 3 50304 0.6s
Grid = HEALPix4Grid | 50302 2 50304 0.5s
Grid = FullHEALPixGrid | 50304 50304 0.4s
Grid = FullHEALPix4Grid | 50304 50304 0.4s
trunc = 255 | 1591292 4 1591296 9.0s
NF = Float32 | 795646 2 795648 4.1s
Grid = HEALPixGrid | 198911 1 198912 1.0s
Grid = HEALPix4Grid | 198911 1 198912 1.0s
Grid = FullHEALPixGrid | 198912 198912 1.0s
Grid = FullHEALPix4Grid | 198912 198912 1.1s
NF = Float64 | 795646 2 795648 5.0s
Grid = HEALPixGrid | 198911 1 198912 1.1s
Grid = HEALPix4Grid | 198911 1 198912 1.3s
Grid = FullHEALPixGrid | 198912 198912 1.3s
Grid = FullHEALPix4Grid | 198912 198912 1.3s
ERROR: Some tests did not pass: 1993714 passed, 14 failed, 0 errored, 0 broken.
When starting with a random vorticity field in spectral space (randn(Complex{NF}) for l, m in 2:25, 2:25, or other lmax, mmax) one should be able to
And hence reobtain the original vorticity field. This works somehow:
However, it's unclear to me how exact this method should be, and whether there's another (simpler) loop one can do to check that all gradients work as they are supposed to. @white-alistair @maximilian-gelbrecht any idea?
Given that we now have the flexibility to use various grids, we need to think about how to output gridded data to netCDF. Several issues:
So I see several possibilities
- FullGaussianGrid (currently), but that's expensive and considerably slows down the simulation
- OctahedralGaussianGrid-FullGaussianGrid, OctahedralClenshawGrid-FullClenshawGrid, meaning we would need to define a full version of the HEALPix grid and create a S_output = SpectralTransform(S, output_grid=...) which changes parameters like nlon but points to the already precomputed Legendre polynomials
- an interpolate!(grid::Grid, lat, lon) function for every Grid <: AbstractGrid, which, however, we may need eventually anyway if we want semi-Lagrangian advection at some point

I don't like 1), what we currently do, hence this issue. I do like 2) and 3) and think that's the simplest to implement. 4) is possible but sounds more like a long-term solution. I reckon it's faster than 2) but not than 3) (which is just memory reordering).
@hottad do you agree?
As raised in #31 and #77, v0.3 is still not fully number format flexible. Main bottleneck is the Fourier Transform as FFTW.jl only supports Float32/64. For other formats we'd need to fall back to GenericFFT.jl. #136 addresses that, but there's more that needs to be done. This issue is to collect efforts to remove remaining type instabilities
julia> using SoftPosit
julia> run_speedy(Posit16)
ERROR: promotion of types Float64 and Posit16 failed to change any arguments
Stacktrace:
[1] error(::String, ::String, ::String)
@ Base ./error.jl:44
[2] sametype_error(input::Tuple{Float64, Posit16})
@ Base ./promotion.jl:383
[3] not_sametype(x::Tuple{Float64, Posit16}, y::Tuple{Float64, Posit16})
@ Base ./promotion.jl:377
[4] promote
@ ./promotion.jl:360 [inlined]
[5] -(x::Float64, y::Posit16)
@ Base ./promotion.jl:390
[6] broadcasted(#unused#::Base.Broadcast.DefaultArrayStyle{1}, #unused#::typeof(-), r::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, x::Posit16)
@ Base.Broadcast ./broadcast.jl:1110
[7] broadcasted
@ ./broadcast.jl:1304 [inlined]
[8] generalised_logistic(x::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, coefs::Main.SpeedyWeather.GenLogisticCoefs{Posit16})
@ Main.SpeedyWeather ~/git/SpeedyWeather.jl/src/geometry.jl:195
[9] vertical_coordinates(P::Parameters)
@ Main.SpeedyWeather ~/git/SpeedyWeather.jl/src/geometry.jl:184
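The fix pattern for failures like this is to keep every scalar in the target number format before broadcasting, so no Float64-to-Posit16 promotion is ever attempted. A hedged sketch of the pattern (generic coefficient names, not SpeedyWeather's actual GenLogisticCoefs fields or src/geometry.jl code):

```julia
# Generalised logistic with all coefficients converted to the vector's
# number format NF first: broadcasting then never mixes NF with Float64.
function generalised_logistic(x::AbstractVector{NF}; A=0, K=1, C=1, Q=1, B=1, M=0, ν=1) where NF
    A, K, C, Q, B, M, ν = convert.(NF, (A, K, C, Q, B, M, ν))
    return @. A + (K - A) / (C + Q * exp(-B * (x - M)))^inv(ν)
end
```

With every scalar in NF, the same code runs for Float32, Float64, or (in principle) a format like Posit16 that deliberately has no promotion rule with Float64.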
We have a models.jl with the Model struct that should define all model variables, e.g.
P = Parameters(NF=NF,kwargs...)
C = Constants{NF}(P)
G = GeoSpectral{NF}(P)
M = Model(P,C,G)
This needs updating to make it consistent with the rest of the Speedy structure and notation.
The Implicit and HorizontalDiffusion structs need updating in particular. For example, HorizontalDiffusion calls parameters g and $\gamma$ from Parameters which are not defined, and Model() redefines the Earth radius. A very basic new Model struct is defined in #39 for the testing of the tendencies calculations. I will fold a complete updated Model() struct into that PR.
This is an issue to track progress in speeding up the spectral transform. We are starting with
- Fourier transform
- Legendre transform
- @inbounds
- dot (probably OpenBLAS)

With the following benchmark (back & forth transform), timings are
6.020 ms (4084 allocations: 580.59 KiB) # Fourier
131.585 μs (1345 allocations: 684.67 KiB) # Legendre
6.434 ms (5429 allocations: 1.24 MiB) # Both
What is actually a reasonable warm-up / transient integration time for Speedy?
That's at the moment not straightforward to say. Most simulations I've run so far just used some initial conditions for vorticity. In that case you get reasonably developed turbulence in 10-20 days, even at higher resolution. However, that's decaying turbulence due to the lack of forcing, obviously. Hence there's no actually interesting invariant measure the model converges to.
With #144 I've added some forcing for the shallow water equations, even including a seasonal cycle that you can switch on with interface_relaxation = true
(default false) and control with interface_relax_time
and interface_relax_amplitude
.
What's happening there is that one pulls the l=1, m=0 zonal mode of the interface displacement / SSH towards something that mimics an equatorial heating. Other modes are not affected, so you literally just dump energy into that mode. With seasonal_cycle = true (default true) that equatorial heating moves up and down with the seasons, similar to an ITCZ, but actually all the way to the tropic of Cancer, back to the Equator, down to the tropic of Capricorn and back.
My hope was to get an interesting and stable climate also in the shallow water system. But at the moment it still converges to a steady state (or one that closely follows the seasonal cycle). I probably haven't played around with it enough to get an actually nice setup that produces eddies at all resolutions, with a proper energy cascade from some large-scale forcing to some small-scale dissipation.
Installed SpeedyWeather.jl on Julia 1.6.4 for Windows. After executing, as in the example, run_speedy(n_days=30, trunc=63, Grid=OctahedralGaussianGrid, model=:shallowwater, output=true)
I get:
ERROR: UndefVarError: OctahedralGaussianGrid not defined
Stacktrace:
[1] top-level scope
@ REPL[3]:1
This is an issue to discuss the implementation of the spectral dynamical core. There is a document that describes the n=l-m spectral packing, illustrates the calculation of some gradients in spectral space, outlines the algorithm and provides a simple test case.

julia> run_speedy(Float64; n_days=20, model=:shallowwater, output=true, trunc=62)
ERROR: MethodError: no method matching scale_coslat⁻¹!(::SubArray{Float32, 2, Array{Float32, 3}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Int64}, true}, ::Geometry{Float64})
Closest candidates are:
scale_coslat⁻¹!(::AbstractMatrix{NF}, ::Geometry{NF}) where NF at ~/.julia/packages/SpeedyWeather/UDuyp/src/spectral_gradients.jl:266
scale_coslat⁻¹!(::AbstractArray{NF, 3}, ::Any...) where NF at ~/.julia/packages/SpeedyWeather/UDuyp/src/distributed_vertical.jl:5
Likely because they are being converted into Float32
while saving but before scaling.
This is kind of a placeholder, but maybe we should consider using https://github.com/psychrometrics/psychrolib for psychrometric calculations as it may come in handy later with some parametrizations like land and diagnostics. Let me know if this could be useful in the future, as it will need to be ported to Julia...
One discussion we need to have is whether for the parametrizations we loop in the vert, lat, lon order as the arrays are laid out in memory (and as we do for the dynamics, since most dependencies are in the horizontal and only(?) the geopotential couples the vertical levels), or whether for the parametrizations we should do lat, lon, vert because they act on columns. In the latter case we could decompose the domain in the horizontal, pass vertical views to the functions, and parallelise that way. I believe the get_parametrization_tendencies
function could literally just do
@distributed for j in 1:nlat, i in 1:nlon
    temp_column = view(temp_grid, i, j, :)
    ...
    get_parametrization_tendencies(temp_column, ...)
end
and hence every grid point would get its own thread and we'd have just one place where the horizontal domain decomposition happens.
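A hedged sketch of that column-wise decomposition, using Threads.@threads over a flattened horizontal index (the threading macros take a single loop range); the array and function names follow the comment above but are assumptions, and the column physics is a placeholder.

```julia
using Base.Threads

# placeholder column physics: relax each column towards 250 K
column_tendencies!(tend_column, temp_column) = (tend_column .= 250 .- temp_column)

function get_parametrization_tendencies!(tend_grid, temp_grid)
    nlon, nlat, _ = size(temp_grid)
    @threads for ij in 1:nlon*nlat                    # one iteration per grid point
        i, j = mod1(ij, nlon), cld(ij, nlon)          # unflatten the horizontal index
        column_tendencies!(view(tend_grid, i, j, :),  # no-copy vertical columns
                           view(temp_grid, i, j, :))
    end
    return tend_grid
end
```

Every grid point becomes one loop iteration, so the horizontal domain decomposition lives in exactly one place.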
@white-alistair copying over from zulip
@samhatfield do you think this is a reasonable idea for speedy?
While working with SpeedyWeather.jl for a bit now, I noticed something that I would consider impractical: when running Speedy, the output is only written to NetCDF in a path that is set during runtime and not saved somewhere in a struct that is returned to the user.
A practical example:
using SpeedyWeather
p = run_speedy(Float32, output=true)
Now I want to continue to work with this simulation run in the same script. Just from running it like this there's no way of knowing what the path is. If it is the first time I've run Speedy in this folder it is run0001, but maybe I've already run it a few times, then it is something else.
vor = ncread("run0001/output.nc", "vor") # here I have to fill in the 0001 manually and can't use some value from p
time = ncread("run0001/output.nc", "time")
A way of fixing this would be to either directly return the Output object as well (maybe only optionally via a kwarg) or to create a separate Solution object that can hold all kinds of information that is only set during run time / time stepping. A separate Solution object could e.g. also have some wrapper functions around NetCDF.jl that make it mimic the solution objects from DiffEq.jl a bit.
This would also make writing tests for the netCDF output more reliable.
So an easy way to address this would be to just return the outputter alongside the PrognosticVariables instance if the output=true kwarg is there. A more general way would be to create such a Solution object. I would say the latter is preferable if we can also think of other things that are saved / computed during time stepping that are not already stored in the ModelSetup or Output.
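A minimal sketch of what such a Solution object could look like; the field and function names here are hypothetical, not an existing SpeedyWeather API.

```julia
# Hypothetical Solution object holding information only known at run time.
struct Solution
    run_id::String        # e.g. "run0001", determined during run_speedy
    output_path::String   # folder containing the netCDF output
end

output_file(sol::Solution) = joinpath(sol.output_path, "output.nc")

sol = Solution("run0001", "run0001")
output_file(sol)   # "run0001/output.nc" on Unix
```

With something like this, the ncread calls above could take output_file(sol) instead of a hard-coded path.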
I've just created github.com/SpeedyWeather but before transferring the ownership of this repository to the orga (which is in the "danger zone") I wanted to make sure that this is the right way to go and that everyone agrees.
This is a leftover from this comment on #82
To implement several sets of equations (barotropic vorticity equation, shallow water equations, primitive equations) one could define
BarotropicModel <: ModelSetup
ShallowWaterModel <: ModelSetup
PrimitiveEquationModel <: ModelSetup
and then dispatch to functions like
get_tendencies(..., M::BarotropicModel) = ...
get_tendencies(..., M::ShallowWaterModel) = ...
wherever functions should do something different, and otherwise
leapfrog(..., M::ModelSetup) = ...
when they are the same. That would be a Julian way of avoiding if-branches in an effort to support various equations simultaneously. But it is hard to tell whether later on this will lead to ugly design choices to still enable compatibility with BarotropicModel and ShallowWaterModel ...
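The pattern above can be made concrete in a few lines; the method bodies here are placeholders just to show the dispatch.

```julia
abstract type ModelSetup end
struct BarotropicModel <: ModelSetup end
struct ShallowWaterModel <: ModelSetup end

# different method per model where the equations differ
get_tendencies(::BarotropicModel)   = :barotropic_tendencies
get_tendencies(::ShallowWaterModel) = :shallowwater_tendencies

# one shared method where the models agree
leapfrog(::ModelSetup) = :shared_leapfrog

get_tendencies(ShallowWaterModel())  # → :shallowwater_tendencies
leapfrog(BarotropicModel())          # → :shared_leapfrog
```

Julia picks the most specific method at the call site, so no if-branches on a model flag are needed.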
julia> run_speedy(trunc=341,output=true,output_dt=3,model=:shallowwater)
ERROR: ArgumentError: invalid base 10 digit '.' in "0007.tar.xz"
Stacktrace:
[1] tryparse_internal(#unused#::Type{Int64}, s::String, startpos::Int64, endpos::Int64, base_::Int64, raise::Bool)
@ Base ./parse.jl:137
[2] parse(::Type{Int64}, s::String; base::Nothing)
@ Base ./parse.jl:241
[3] parse
@ ./parse.jl:241 [inlined]
[4] (::SpeedyWeather.var"#37#39")(id::String)
@ SpeedyWeather ./none:0
[5] iterate
@ ./generator.jl:47 [inlined]
[6] collect_to!
@ ./array.jl:782 [inlined]
[7] collect_to_with_first!
@ ./array.jl:760 [inlined]
[8] collect(itr::Base.Generator{Vector{String}, SpeedyWeather.var"#37#39"})
@ Base ./array.jl:734
[9] get_run_id_path(P::Parameters)
@ SpeedyWeather ~/.julia/packages/SpeedyWeather/OJt3A/src/output.jl:14
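The trace shows get_run_id_path parsing every name in the run folder as an integer, so an unrelated file like 0007.tar.xz breaks it. A hedged fix sketch (not SpeedyWeather's actual code) would only consider names matching the runXXXX pattern:

```julia
# Collect run ids only from names matching "runXXXX"; ignore e.g. "0007.tar.xz".
function existing_run_ids(dir::String)
    ids = Int[]
    for name in readdir(dir)
        m = match(r"^run(\d{4})$", name)
        m === nothing || push!(ids, parse(Int, m.captures[1]))
    end
    return ids
end

next_run_id(dir) = maximum(existing_run_ids(dir); init=0) + 1
```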