Comments (10)

samhatfield commented on July 28, 2024

Because paywalls are 💩, here is the paper:

Quart J Royal Meteoro Soc - July 1994 Part B - Courtier - A pole problem in the reduced Gaussian grid.pdf

from speedyweather.jl.

milankl commented on July 28, 2024

@hottad maybe you can clarify: if IFS/ectrans loops until nfreq = floor((nlon - 1)/(2 + coslat^2)) - 1, then towards the equator this is effectively nlon/3, hence quadratic and not cubic. I currently don't understand why you wouldn't want to loop only until nlon/4 at the equator for a cubic truncation. Visualised, this looks like

image

with IFS being quadratic at the equator, but with a coslat^2 scaling towards linear near the poles. Could we do something like what I labelled "new", which is also linear near the poles and uses a coslat scaling towards cubic at the equator? That would save another 10% in the Legendre transform compared to IFS. The first few rings in comparison (ring, nlon, ifs, new):

 1  20   8  11
 2  24  10  13
 3  28  12  15
 4  32  14  16
 5  36  16  18
 6  40  18  19
 7  44  20  21
 8  48  22  22

so "new" is truly linear for rings 1, 2 and 3 and then drops off only slowly.
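As a minimal sketch, the IFS limit quoted above can be written as a one-line Julia function to check its two limiting cases. The name ifs_nfreq is made up for illustration; only the formula itself comes from the discussion.

```julia
# Upper limit of the loop over zonal wavenumber m on one latitude ring,
# using the formula quoted above: floor((nlon - 1)/(2 + coslat^2)) - 1.
# `ifs_nfreq` is a hypothetical name, not taken from IFS/ectrans.
ifs_nfreq(nlon::Integer, coslat::Real) = floor(Int, (nlon - 1) / (2 + coslat^2)) - 1

# Polar limit (coslat -> 0): roughly nlon/2, i.e. linear truncation.
ifs_nfreq(20, 0.0)     # floor(19/2) - 1 = 8

# Equatorial limit (coslat = 1): roughly nlon/3, i.e. quadratic truncation.
ifs_nfreq(320, 1.0)    # floor(319/3) - 1 = 105
```

The first call reproduces the ifs entry for ring 1 in the table above, provided coslat on that ring is small enough not to change the rounding.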

hottad commented on July 28, 2024

Hi Milan @milankl,
Thank you for this plot!
I honestly do not understand much about this, but I found something interesting about the IFS formula.

In addition to zonal aliasing, there is another factor to consider when we determine the size of loop over m.
At latitudes outside the tropics, the associated Legendre function decays quickly as m gets larger. Thus, if we define some threshold ϵ below which we can ignore $P_n^m(\sin\rm{lat})$, we can define an m above which we do not need to loop when performing the direct or inverse Legendre transform.
If we call this m m_to_retain(ϵ), I found that m_to_retain(ϵ) is rather insensitive to the exact choice of ϵ and, interestingly, the IFS formula gives a good approximation to m_to_retain(ϵ).

This is a plot for the O80 grid (nlat_half=80) with Nmax=79 truncation (cubic). As we can see, the IFS formula and m_to_retain(ϵ=1e-6) agree quite well.
image
Please have a look at this Jupyter notebook https://gist.github.com/hottad/5536882443cb5d018bbd57a88f10eb0e for precise definitions and derivation (and code).

I do not know if this is related to how the IFS formula was devised, but the agreement seems too good to be just a coincidence.
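For readers without the notebook at hand, here is a rough sketch of one way such a threshold could be evaluated. The definition used here is an assumption and may differ from the notebook: it returns the largest m for which at least one degree still has a magnitude of ϵ or more. The helper name and the matrix layout Λ[l+1, m+1] are also illustrative.

```julia
# Hypothetical helper: largest zonal wavenumber m whose column of Λ still
# holds an entry with magnitude >= ϵ. Λ[l+1, m+1] is assumed to hold the
# normalised associated Legendre function of degree l and order m,
# evaluated at a single latitude.
function m_to_retain(Λ::AbstractMatrix, ϵ::Real)
    for m in size(Λ, 2)-1:-1:0                  # search from the highest order down
        if any(x -> abs(x) >= ϵ, view(Λ, :, m + 1))
            return m
        end
    end
    return -1                                   # no entry reaches ϵ at all
end

# Toy matrix for illustration only, not real Legendre values.
Λ = [1.0 0.0  0.0;
     0.5 0.3  0.0;
     0.2 1e-3 1e-9]
m_to_retain(Λ, 1e-6)    # 1, because the m = 2 column stays below 1e-6
```

Taking a precomputed matrix keeps the sketch independent of any particular Legendre package; calling it once per latitude ring would give a curve like the one plotted above.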

hottad commented on July 28, 2024

In the above post I forgot to mention that this m_to_retain plot is basically a reproduction of Figure 3 (and other similar plots) of Courtier and Naughton 1994, except that in Courtier and Naughton they used quadratic truncation whereas I plotted for cubic truncation.

I computed up to Tc2559 with the same code. The results below:
tile_m_truncation_comparison
The IFS formula approximates m_to_retain(ϵ=1e-6) quite well at lower resolutions, but as the resolution (truncation wavenumber) increases, it starts to deviate from m_to_retain(ϵ=1e-6) outside the polar areas.

Extrapolating the trend, I guess your "new" formula will be a better alternative to the IFS formula at very high resolutions, like Tc3999 and beyond, but perhaps that's outside the scope of SpeedyWeather?
If we use the "new" formula at the resolutions tested here, we will ignore a lot of $P_n^m(\sin\rm{lat})$ that are not small, so we may end up with an inexact Legendre transform.

Alternatively, do you think it makes sense to compute m_to_retain(ϵ) for some reasonable ϵ within the constructor of the SpectralTransform type? This may slow down the setup, but it will reduce the cost of the transforms compared to the IFS formula, especially at relatively high resolution.

In any case, we should decide which way to go by doing experiments.

@samhatfield @milankl Any thoughts?

milankl commented on July 28, 2024

Quick comment: I believe the dip in m_to_retain between 500 and 1700 at T2559 is due to issues with the computation of the Legendre polynomials at very high resolution similar to jmert/AssociatedLegendrePolynomials.jl#27 (which is the package we are currently using to do that).

milankl commented on July 28, 2024

General comment: I agree that we would need to test these ideas while actually running the model. Daisuke, do you know of a good test case to do that?

In general, I like the m_to_retain idea, but I'm not sold that this is actually what we want: The reason is that a cubic truncation is a form of filtering, so we actually want a certain error if that means we can scale-selectively filter out some high-frequency waves. I believe all of these methods are idempotent, meaning you may start with a field that contains some waves that shouldn't be representable with a cubic truncation, but once they are filtered out the transform should approach exactness (up to rounding errors). In that sense, I see the m_to_retain idea as an upper bound of m beyond which we shouldn't loop, but maybe we do want to shortcut the loop over m even further.

Going forward, I suggest we create a parameter like shortcut_legendre::Symbol with options like :linear, :quadratic, :cubic, :ifs, :lincub_coslat, which implements exactly these formulas and precomputes them in SpectralTransform. In spectral! we then just load the precomputed values, so that we only have to change the precomputation when initialising SpectralTransform and not the spectral transforms themselves.
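A minimal sketch of how such an option might look follows. It is not the SpeedyWeather implementation: the function name m_truncation, the integer rounding for :linear, :quadratic and :cubic, and the coslat values are all assumptions made for illustration. :lincub_coslat is left out because its formula is not spelled out in this thread.

```julia
# Hypothetical sketch: map a shortcut symbol to the largest m to loop over
# on one ring. Rounding and offsets for the first three options are assumed.
function m_truncation(shortcut::Symbol, nlon::Integer, coslat::Real)
    shortcut == :linear    && return nlon ÷ 2
    shortcut == :quadratic && return nlon ÷ 3
    shortcut == :cubic     && return nlon ÷ 4
    shortcut == :ifs       && return floor(Int, (nlon - 1) / (2 + coslat^2)) - 1
    throw(ArgumentError("unknown shortcut: $shortcut"))
end

# Precompute once per ring, e.g. when the transform is set up ...
nlons   = [20, 24, 28, 32]
coslats = [0.02, 0.04, 0.06, 0.08]     # made-up values for illustration
m_max   = [m_truncation(:ifs, n, c) for (n, c) in zip(nlons, coslats)]

# ... and only read m_max[j] inside the transform loop.
m_max                                   # [8, 10, 12, 14]
```

Keeping the formula in a single function means a new option only touches the precomputation, which is the point of the proposal above.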

milankl commented on July 28, 2024

I've just created a plot that compares m_to_retain, ifs and lincub_coslat, the linear-to-cubic transition via coslat scaling. I've also added the savings in % (total loop iterations over m for all latitude rings) relative to looping over 1:mmax+1 for every ring.

image

hottad commented on July 28, 2024

Thank you Milan. I agree. m_to_retain should be interpreted as the upper bound of the size of m-loop.

What is not entirely clear to me is whether we can ensure orthonormality of $Y_n^m(\rm{lon},\rm{lat})$ if we truncate the m-loop below m_to_retain. Perhaps, before trying model runs, we should start by checking the orthonormality. I mean: if we initialize alm with the (l,m) element set to one and all others zero, and do a round trip of the transform by successively calling gridded!() and then spectral!(), does that restore the original alm perfectly (within the desired tolerance), and does this hold for any pair (l,m)? I guess this is equivalent to what you call idempotency?
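The round-trip check described here could be sketched as below. The two transforms are passed in as plain functions, so nothing is assumed about the actual gridded!/spectral! signatures; to_grid, to_spec and roundtrip_error are hypothetical names.

```julia
# Hypothetical check: for every (l, m), start from a unit coefficient, go to
# grid space and back, and record the largest deviation from the input.
# `to_grid` and `to_spec` stand in for the real transforms.
function roundtrip_error(to_grid, to_spec, lmax::Integer)
    worst = 0.0
    for m in 0:lmax, l in m:lmax
        alm = zeros(ComplexF64, lmax + 1, lmax + 1)
        alm[l + 1, m + 1] = 1                   # unit coefficient at (l, m)
        back = to_spec(to_grid(alm))            # spectral -> grid -> spectral
        worst = max(worst, maximum(abs, back - alm))
    end
    return worst
end

# Trivial stand-in transforms (identity), just to show the call.
roundtrip_error(identity, identity, 3)    # 0.0
```

With real transforms, the returned value would be compared against the desired tolerance for each choice of m truncation.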

> Going forward I suggest we create a parameter like shortcut_legendre::Symbol with options like :linear, :quadratic, :cubic, :ifs, :lincub_coslat which implements exactly these formulas and precomputes them in SpectralTransform.

Thanks. That's a good idea. Will you be able to implement and test this?

hottad commented on July 28, 2024

> Quick comment: I believe the dip in m_to_retain between 500 and 1700 at T2559 is due to issues with the computation of the Legendre polynomials at very high resolution similar to jmert/AssociatedLegendrePolynomials.jl#27 (which is the package we are currently using to do that).

This underflow/overflow problem is a well-known issue. In case you are not aware, there are two known ways to resolve it.

One way is to use an exponent-extension method: represent Pnm with a struct (called an X-number), which is a pair of an integer (to store the exponent) and a floating-point number (either Float32 or Float64), and override operators like mul(Float,X-number) and add(X-number,X-number) with bespoke methods. A detailed explanation and sample code in Fortran are given in Fukushima (2012, Journal of Geodesy) (it's behind a paywall, but the author's copy on researchgate.net is available from here). It should be easy to implement an X-number in Julia as something like Xnumber{T} <: AbstractFloat where T. Experience with a Fortran 90 implementation is that the performance penalty of the X-number representation is quite small.
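To make the idea concrete, here is a simplified sketch of an exponent-extended number in Julia. It is not Fukushima's exact X-number layout: it renormalises after every operation with frexp, does not subtype AbstractFloat, and all names (XNumber, normalise, value) are made up for illustration.

```julia
# Simplified exponent-extended number: value = f * 2.0^e.
# A sketch only; not Fukushima's exact X-number layout.
struct XNumber{T<:AbstractFloat}
    f::T     # significand, magnitude in [0.5, 1) or exactly 0
    e::Int   # binary exponent carried outside the float
end

# Bring the significand back into [0.5, 1) and move the rest into e.
function normalise(f::T, e::Int) where {T<:AbstractFloat}
    s, k = frexp(f)                       # f == s * 2.0^k
    return XNumber{T}(s, iszero(f) ? 0 : e + k)
end

XNumber(x::AbstractFloat) = normalise(x, 0)

# Multiplication: multiply significands, add exponents.
Base.:*(a::XNumber{T}, b::XNumber{T}) where {T} = normalise(a.f * b.f, a.e + b.e)

# Addition: shift the smaller operand to the larger exponent first.
function Base.:+(a::XNumber{T}, b::XNumber{T}) where {T}
    iszero(a.f) && return b
    iszero(b.f) && return a
    a.e < b.e && return b + a
    return normalise(a.f + ldexp(b.f, b.e - a.e), a.e)
end

# Convert back to an ordinary float (may under/overflow again).
value(x::XNumber) = ldexp(x.f, x.e)

tiny = XNumber(1e-200)
p = tiny * tiny                                # 1e-400 is below the Float64 range
value(p)                                       # 0.0 once converted back
value(p * XNumber(1e300) * XNumber(1e100))     # ≈ 1.0, the magnitude was preserved
```

Renormalising on every operation is simpler but slower than renormalising only when a threshold is crossed, so this sketch says nothing about the performance mentioned above.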

Alternatively, when you compute Pnm with a recurrence formula like Belousov's, you can check whether Pnm goes above or below some thresholds (like 1e-16 and 1e+16 for Float64), apply a scaling factor if that occurs (e.g., 2^20 or 2^(-20); the factors can be arbitrary but should be powers of 2 to avoid rounding error), and keep track of how many times the scaling was applied for each Pnm. This is what is done in IFS and in JMA's global model. Wedi et al. (2013), Section 2, very briefly mentions this strategy in IFS:

> Equation (9) is also unstable but can be made stable by tracking the values of the numbers and keeping them within an acceptable range that can be represented with double precision
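The rescaling strategy can be illustrated on a plain running product. This is only a sketch of the bookkeeping, using the example thresholds and factors mentioned above; it is not the IFS or JMA code, and scaled_product is a made-up name.

```julia
# Hypothetical sketch: multiply many factors, rescale by 2^20 or 2^-20
# whenever the running value leaves [1e-16, 1e16], and count the rescalings.
# The true product equals p * 2.0^(-20 * nscale).
function scaled_product(factors)
    p, nscale = 1.0, 0
    for f in factors
        p *= f
        while abs(p) < 1e-16 && !iszero(p)
            p *= 2.0^20                  # power of 2, so no rounding error
            nscale += 1
        end
        while abs(p) > 1e16 && isfinite(p)
            p *= 2.0^-20
            nscale -= 1
        end
    end
    return p, nscale
end

# 0.1^400 = 1e-400 would underflow to 0.0 in plain Float64.
p, nscale = scaled_product(fill(0.1, 400))
log2(p) - 20 * nscale                    # ≈ 400 * log2(0.1)
```

In a Legendre recurrence the same counter would be stored per Pnm and undone when the value is finally used.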

milankl commented on July 28, 2024

The first approach Daisuke suggests can already be easily done with AssociatedLegendrePolynomials.jl as it's type-flexible but currently quadmath arithmetics like Float128 aren't fast.

julia> using AssociatedLegendrePolynomials, Quadmath, BenchmarkTools
julia> Λ = zeros(Float32,3001,3001)    # always store in a Float32 array
julia> @btime λlm!(Λ, 3000, 3000, Float32(cos(π/4)));    # the type of cos(colat) determines the format used for calculation
  18.604 ms (5 allocations: 80 bytes)
julia> @btime λlm!(Λ, 3000, 3000, Float64(cos(π/4)));
  22.545 ms (5 allocations: 80 bytes)
julia> @btime λlm!(Λ, 3000, 3000, Float128(cos(π/4)));
  6.238 s (6 allocations: 192 bytes)

Although I have quite some expertise in writing new number formats, I'm not particularly keen to implement an X-number myself. On the other hand, how hard can it be? 😆 Good thing in Julia is, you don't have to overload the arithmetics.

The 2nd approach would require some tweaks to AssociatedLegendrePolynomials; @jmert, you may find this interesting.
