GithubHelp home page GithubHelp logo

Comments (5)

milankl avatar milankl commented on July 28, 2024 1

One thread

julia> run_speedy(Float32,PrimitiveDryCore,trunc=127,nlev=26,n_days=1,orography=ZonalRidge)
475.47 days/day

vs 8 threads (4.1x)

julia> run_speedy(Float32,PrimitiveDryCore,trunc=127,nlev=26,n_days=1,orography=ZonalRidge)
5.50 years/day

vs 16 threads (6.1x)

julia> run_speedy(Float32,PrimitiveDryCore,trunc=127,nlev=26,n_days=1,orography=ZonalRidge)
8.21 years/day

from speedyweather.jl.

milankl avatar milankl commented on July 28, 2024

As outlined here, written with @eval metaprogramming style one could similar collect all functions that need a vertical loop and think about adding an @distributed here once SharedArrays are set up
https://github.com/milankl/SpeedyWeather.jl/blob/f21b69eae24f9be0b0667ebe381671d994b9b14f/src/distributed_vertical.jl#L1-L16

from speedyweather.jl.

milankl avatar milankl commented on July 28, 2024

With #117 the @eval-based looping on the level of individiual functions has been removed. We now have, e.g. for the barotropic model a single place in timestep! where the looping over the vertical is happening, hence we moved that loop as far up as possible
https://github.com/milankl/SpeedyWeather.jl/blob/4a3ad4a7659bd1ae8437b15617781420b396678e/src/time_integration.jl#L246-L252
this is where any @distributed-like parallelism should be applied, as the evaluation within the loop is conceptually independent across layers.

from speedyweather.jl.

white-alistair avatar white-alistair commented on July 28, 2024

@milankl what's the versioninfo for these tests? Where do you suspect the bottlenecks are? What kind of scaling would you be satisfied with?

One thing I noticed is that the speedups here are quite consistent with the parallel mergesort showcased in the original Julia multi- threading blogpost.

from speedyweather.jl.

milankl avatar milankl commented on July 28, 2024

That's a good resource, thanks for sharing. In contrast to mergesort the tasks here are quite a bit more expensive, e.g. we can multi-thread this part of the right-hand side completely

    @floop for layer in diagn.layers
        vertical_velocity!(layer,surface,model)     # calculate σ̇ for the vertical mass flux M = pₛσ̇
                                                    # add the RTₖlnpₛ term to geopotential
        linear_pressure_gradient!(layer,progn,model,lf_implicit)
        vertical_advection!(layer,diagn,model)      # use σ̇ for the vertical advection of u,v,T,q

        vordiv_tendencies!(layer,surface,model)     # vorticity advection, pressure gradient term
        temperature_tendency!(layer,surface,model)  # hor. advection + adiabatic term
        humidity_tendency!(layer,model)             # horizontal advection of humidity (nothing for wetcore)
        bernoulli_potential!(layer,S)               # add -∇²(E+ϕ+RTₖlnpₛ) term to div tendency
    end

across layers, which includes several spectral transforms and other expensive operations. So on this I'd expect scaling to be nearly perfect. On the other hand, there's things like the vertical integrations and geopotential which are currently single-threaded (maybe the can be spawned off though). So in short: I haven't done enough profiling to know what parallelism potential there is and at what resolution/number of vertical levels.

from speedyweather.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.