Comments (10)
Because paywalls are 💩, here is the paper:
from speedyweather.jl.
@hottad maybe you can clarify: if IFS/ectrans loops until nfreq = floor((nlon - 1)/(2 + coslat^2)) - 1
then towards the equator this is effectively nlon/3, hence quadratic and not cubic. I currently don't understand why you wouldn't want to loop only until nlon/4 at the equator for a cubic truncation. Visualised, this looks like
with IFS being quadratic at the equator but with a coslat^2 scaling towards linear near the poles. Could we do something like what I labelled "new", which is also linear near the poles and uses a coslat scaling towards cubic at the equator? That would save another 10% in the Legendre transform compared to IFS. The first few rings in comparison (ring, nlon, ifs, new):
ring  nlon  ifs  new
1     20    8    11
2     24    10   13
3     28    12   15
4     32    14   16
5     36    16   18
6     40    18   19
7     44    20   21
8     48    22   22
which for "new" is truly linear for rings 1, 2 and 3 and then drops off only slowly.
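As a minimal sketch of the IFS/ectrans loop bound quoted above, the following evaluates floor((nlon - 1)/(2 + coslat^2)) - 1 for a single ring. The name `ifs_nfreq` is hypothetical, and `coslat` is passed in directly instead of being taken from a grid; setting it to 0 is only an approximation for the rings closest to the pole.

```julia
# Hypothetical helper (name is an assumption): number of zonal frequencies
# retained on a ring according to the formula quoted in this thread.
ifs_nfreq(nlon::Integer, coslat::Real) = floor(Int, (nlon - 1) / (2 + coslat^2)) - 1

# nlon of the first 8 rings, as listed in the table above
nlons = 20:4:48

# coslat ≈ 0 near the pole (an approximation, not the exact ring latitudes)
polar = [ifs_nfreq(nlon, 0.0) for nlon in nlons]

# coslat = 1 at the equator gives roughly nlon/3, i.e. a quadratic truncation
equator_48 = ifs_nfreq(48, 1.0)
```

With `coslat = 0` this gives the same values as the ifs column of the table for rings 1 to 8.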
Hi Milan @milankl,
Thank you for this plot!
I honestly do not understand much about this, but I found something interesting about the IFS formula.
In addition to zonal aliasing, there is another factor to consider when we determine the size of the loop over m.
At latitudes outside of the Tropics, the associated Legendre function decays quickly as m gets larger. Thus, if we define some threshold ϵ below which its magnitude can be ignored, there is an m above which we do not need to loop when performing the direct or inverse Legendre transform.
If we call such m m_to_retain(ϵ), I found that m_to_retain(ϵ) is rather insensitive to the exact choice of ϵ and, interestingly, the IFS formula gives a good approximation to m_to_retain(ϵ).
This is a plot for the O80 grid (nlat_half=80) with Nmax=79 truncation (cubic). As we can see, the IFS formula and m_to_retain(ϵ=1e-6) agree quite well.
Please have a look at this Jupyter notebook https://gist.github.com/hottad/5536882443cb5d018bbd57a88f10eb0e for precise definitions and derivation (and code).
I do not know if this is related to how the IFS formula was devised, but the agreement seems too good to be just a coincidence.
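The precise definition of m_to_retain(ϵ) is in the linked notebook. The sketch below shows only one possible reading, assumed here for illustration: for a single latitude, take the largest m for which any associated Legendre function of that order still reaches ϵ in magnitude. The matrix layout `P[l+1, m+1]` and the toy data are assumptions, not taken from the notebook.

```julia
# One possible reading of m_to_retain(ϵ) for a single latitude (an assumption;
# see the linked notebook for the actual definition). P[l+1, m+1] holds the
# associated Legendre function of degree l and order m at that latitude.
function m_to_retain(P::AbstractMatrix{<:Real}, ϵ::Real)
    mmax = size(P, 2) - 1
    for m in mmax:-1:0
        # entries with l < m are unused in a lower-triangular layout
        column = @view P[m+1:end, m+1]
        any(x -> abs(x) >= ϵ, column) && return m
    end
    return -1    # nothing reaches the threshold
end

# Toy data, not real Legendre functions: magnitudes decay like 10^(-m)
P = [l >= m ? 10.0^(-m) : 0.0 for l in 0:9, m in 0:9]

m_small_threshold = m_to_retain(P, 5e-7)
m_large_threshold = m_to_retain(P, 0.5)
```

Under this reading, a looser (larger) ϵ can only shorten the m loop, never lengthen it.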
In the above post I forgot to mention that this m_to_retain plot is basically a reproduction of Figure 3 (and other similar plots) of Courtier and Naughton 1994, except that Courtier and Naughton used quadratic truncation whereas I plotted for cubic truncation.
I computed up to Tc2559 with the same code. The results below:
The IFS formula approximates m_to_retain(ϵ=1e-6) quite well at lower resolutions, but as the resolution (truncation wavenumber) increases, it starts to deviate from m_to_retain(ϵ=1e-6) outside the polar areas.
Extrapolating the trend, I guess your "new" formula will be a better alternative to the IFS formula at very high resolutions beyond, say, Tc3999, but perhaps that's outside the scope of SpeedyWeather?
If we use the "new" formula at the resolutions tested here, we will ignore a lot of
Alternatively, do you think it makes sense to compute m_to_retain(ϵ) for some reasonable ϵ within the constructor of the SpectralTransform type? This may slow down the setup process, but will save transform cost in comparison to the IFS formula, especially at relatively high resolution.
In any case, we should decide which way to go by doing experiments.
@samhatfield @milankl Any thoughts?
Quick comment: I believe the dip in m_to_retain between 500 and 1700 at T2559 is due to issues with the computation of the Legendre polynomials at very high resolution, similar to jmert/AssociatedLegendrePolynomials.jl#27 (which is the package we are currently using to do that).
General comment: I agree that we would need to test these ideas while actually running the model. Daisuke, do you know of a good test case to do that?
In general, I like the m_to_retain idea, but I'm not sold that this is actually what we want. The reason is that a cubic truncation is a form of filtering, so we actually want a certain error if that means we can scale-selectively filter out some high-frequency waves. I believe all of these methods are idempotent, meaning you may start with a field that contains some waves that shouldn't be representable with a cubic truncation, but once they are filtered out the transform should approach exactness (up to rounding errors). In that sense, I see the m_to_retain idea as an upper bound on m beyond which we shouldn't loop, but maybe we do want to shortcut the loop over m even further.
Going forward I suggest we create a parameter like shortcut_legendre::Symbol with options like :linear, :quadratic, :cubic, :ifs, :lincub_coslat which implements exactly these formulas and precomputes them in SpectralTransform. In spectral! we then just load those, so that we only have to change the precomputation when initialising SpectralTransform and not the spectral transforms themselves.
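A rough sketch of how such a Symbol-based option could be precomputed per ring follows. Everything here is hypothetical: the function names, the exact rounding and offsets, and in particular the `:lincub_coslat` expression, which is not spelled out in this thread and is only a guess that gives nlon/2 at the pole and nlon/4 at the equator. It is not SpeedyWeather's implementation.

```julia
# Hypothetical sketch of a per-ring loop bound selected by a Symbol.
# :linear, :quadratic and :cubic use nlon/2, nlon/3 and nlon/4;
# :ifs uses the formula quoted earlier in this thread;
# :lincub_coslat is a guess (nlon/2 at the pole, nlon/4 at the equator).
function m_truncation(shortcut::Symbol, nlon::Integer, coslat::Real)
    if shortcut == :linear
        return nlon ÷ 2
    elseif shortcut == :quadratic
        return nlon ÷ 3
    elseif shortcut == :cubic
        return nlon ÷ 4
    elseif shortcut == :ifs
        return floor(Int, (nlon - 1) / (2 + coslat^2)) - 1
    elseif shortcut == :lincub_coslat
        return floor(Int, nlon / (2 + 2coslat))
    else
        throw(ArgumentError("unknown shortcut: $shortcut"))
    end
end

# Precompute once per ring, clamped to the spectral truncation mmax,
# so the transform itself only has to look the values up.
function precompute_m_truncation(shortcut::Symbol, nlons, coslats, mmax::Integer)
    return [min(m_truncation(shortcut, nlon, c), mmax) for (nlon, c) in zip(nlons, coslats)]
end

bounds = precompute_m_truncation(:linear, [20, 24], [0.0, 0.0], 9)
```

Keeping the choice inside the precomputation means the transform loops do not need to know which formula produced the bounds.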
And I've just created a plot that compares m_to_retain, ifs and lincub_coslat, which is the linear-to-cubic transition via coslat scaling. I've also added the savings in % (total loop iterations over m for all latitude rings) relative to looping over 1:mmax+1 for every ring.
Thank you Milan. I agree. m_to_retain should be interpreted as the upper bound of the size of the m-loop.
What is not entirely clear to me is if we can ensure orthonormality of m beyond m_to_retain. Perhaps, before trying with model runs, we should start by checking the orthonormality (I mean, if we initialize alm with the (l,m)-element being one and all the others zero, and do a round-trip of transforms by successively calling gridded!() and then spectral!(), does that restore the original alm perfectly (within the desired tolerance), and does this hold for any pair of (l,m)?). I guess this is equivalent to what you call idempotency?
"Going forward I suggest we create a parameter like shortcut_legendre::Symbol with options like :linear, :quadratic, :cubic, :ifs, :lincub_coslat which implements exactly these formulas and precomputes them in SpectralTransform."
Thanks. That's a good idea. Will you be able to implement and test this?
"Quick comment: I believe the dip in m_to_retain between 500 and 1700 at T2559 is due to issues with the computation of the Legendre polynomials at very high resolution similar to jmert/AssociatedLegendrePolynomials.jl#27 (which is the package we are currently using to do that)."
This underflow/overflow problem is a well known issue. Just in case you are not aware, there are two known resolutions to this.
One way to resolve this is to use an exponent-extension method, i.e., represent Pnm with a struct (called X-number), which is a pair of an integer (to save the exponent) and a floating-point number (either Float32 or Float64), and override operators like mul(Float,X-number) and add(X-number,X-number) with a bespoke method. A detailed explanation and sample code in Fortran are documented in Fukushima (2012, Journal of Geodesy) (it's behind a paywall, but the author's copy is available on researchgate.net). It should be easy to implement X-number in Julia as something like Xnumber{T} <: AbstractFloat where T. Experience from a Fortran 90 implementation is that the performance penalty from using the X-number representation is quite small.
Alternatively, when you compute Pnm by a recurrence formula like Belousov's, you can check whether Pnm falls below or goes beyond some thresholds (like 1e-16 and 1e+16 for Float64) and, if that occurs, apply scaling by some factors (e.g., 2^20 and 2^(-20); these can be arbitrary but should be powers of 2 to avoid rounding error), keeping track of how many times the scaling was applied for each Pnm. This is what is done in IFS or JMA's global model. Wedi et al. (2013), Section 2, very briefly mentions this strategy in IFS:
Equation (9) is also unstable but can be made stable by tracking the values of the numbers and keeping them within an acceptable range that can be represented with double precision
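To illustrate the second approach (rescaling with bookkeeping, not the X-number type), here is a minimal sketch on a simpler stand-in problem: a running product x^m that underflows in Float64, much like sectoral Legendre functions, which are proportional to sin(colatitude)^m. The function name `scaled_power` and the thresholds are assumptions taken loosely from the comment above; this is not the IFS or JMA code.

```julia
# Sketch of threshold-based rescaling: keep a running product inside a safe
# range by multiplying with powers of two (exact in floating point) and
# counting how often the scaling was applied.
function scaled_power(x::Float64, m::Integer)
    scale = 2.0^20
    value, nscale = 1.0, 0
    for _ in 1:m
        value *= x
        while 0 < abs(value) < 1e-16      # about to underflow: scale up
            value *= scale
            nscale -= 1
        end
        while abs(value) > 1e16           # about to overflow: scale down
            value /= scale
            nscale += 1
        end
    end
    return value, nscale                  # x^m == value * scale^nscale
end

x = sin(0.01)                             # small colatitude, close to the pole
naive = x^3000                            # underflows in plain Float64
value, nscale = scaled_power(x, 3000)
```

The pair `(value, nscale)` stays representable even though the plain Float64 power does not; the true magnitude can be recovered in log space as `log2(value) + 20nscale`.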
The first approach Daisuke suggests can already be easily done with AssociatedLegendrePolynomials.jl as it's type-flexible but currently quadmath arithmetics like Float128 aren't fast.
julia> using AssociatedLegendrePolynomials, Quadmath, BenchmarkTools
julia> Λ = zeros(Float32,3001,3001) # always store in a Float32 array
julia> @btime λlm!(Λ, 3000, 3000, Float32(cos(π/4))); # the type of cos(colat) determines the format used for calculation
18.604 ms (5 allocations: 80 bytes)
julia> @btime λlm!(Λ, 3000, 3000, Float64(cos(π/4)));
22.545 ms (5 allocations: 80 bytes)
julia> @btime λlm!(Λ, 3000, 3000, Float128(cos(π/4)));
6.238 s (6 allocations: 192 bytes)
Although I have quite some expertise in writing new number formats, I'm not particularly keen to implement an X-number myself. On the other hand, how hard can it be? 😆 The good thing in Julia is that you don't have to overload the arithmetics.
The 2nd approach would require some tweaks to AssociatedLegendrePolynomials, @jmert you may find this interesting.