Comments (17)

b-fg commented on August 19, 2024

Good! I will create the PR and (re)activate some of the compatibilities. If that still works, then we can merge it :)

from waterlily.jl.

b-fg commented on August 19, 2024

That's unfortunate. This week I will get access to LUMI, which I presume has a more updated ROCm version. I will look into it and let you know how that goes.

SimonDanisch commented on August 19, 2024

Yes, thank you :)

b-fg commented on August 19, 2024

Hi Simon, thanks for reporting this. Which ROCm version is installed on your system? I was recently able to run WaterLily on an AMD GPU, but because the ROCm version on that system was quite old (5.1.1), I had to pin AMDGPU.jl to version 0.4.15. Maybe something similar is happening here?
Also, we could restrict WaterLily v1.1 so that it only resolves with AMDGPU.jl versions newer than a certain release.
It would also help to see the full log of which packages changed version when AMDGPU was added (if it was more than just WaterLily); maybe there is an incompatibility with another package.
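
As a rough sketch of how the resolved versions could be collected for such a report, the snippet below queries the active environment through `Pkg.dependencies()`. The helper name `installed_versions` is hypothetical and not part of Pkg, WaterLily, or AMDGPU.jl; the package names in the example call are simply the ones that appear in this thread.

```julia
using Pkg

# Hypothetical helper (not part of Pkg, WaterLily, or AMDGPU.jl):
# return name => version for the requested packages in the active environment.
function installed_versions(names)
    wanted = Set(names)
    return Dict(info.name => info.version
                for info in values(Pkg.dependencies())
                if info.name in wanted)
end

# Packages that are not installed are simply absent from the result.
println(installed_versions(["AMDGPU", "WaterLily", "KernelAbstractions", "GPUArrays"]))
```

Running this before and after adding AMDGPU, and comparing the two outputs, would show which packages changed version.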

SimonDanisch commented on August 19, 2024

That wasn't the reason; I just updated all my drivers, including ROCm (5.7 + 6.1). I also don't think Pkg can install a different AMDGPU version based on the available drivers, if I'm not mistaken.

You can actually reproduce it with this minimal setup (Julia 1.10.4):

]activate --temp
]add AMDGPU@0.9.6 WaterLily@1.1.0

b-fg commented on August 19, 2024

What happens when you don't specify the versions, i.e.

]activate --temp
]add AMDGPU WaterLily

This works fine on my system, even though it installs AMDGPU v0.6.1. So maybe we should update our version compatibilities.

b-fg commented on August 19, 2024

Can you please try my temporary fix in the fix_compat branch and report back?

]
activate --temp
add WaterLily#fix_compat
add [email protected]

SimonDanisch commented on August 19, 2024

Yes, this seems to work :)

SimonDanisch commented on August 19, 2024

It does not work though :(
I'm getting lots of:

Reason: unsupported call through a literal pointer (call to jl_gc_run_pending_finalizers)
Reason: unsupported call to an unknown function (call to ijl_pop_handler)

I guess some kernel isn't set up correctly, although I would expect CUDA.jl to also choke on something like jl_gc_run_pending_finalizers...
Do you have an example that you have tested with AMDGPU that runs fine?

b-fg commented on August 19, 2024

You can try the following MWE

using WaterLily
using AMDGPU

function tgv(p, backend; Re=1600, T=Float32)
    L = 2^p; U = 1; κ=π/L; ν = 1/(κ*Re)
    function uλ(i,xyz)
        x,y,z = @. xyz/L*π                # scaled coordinates
        i==1 && return -U*sin(x)*cos(y)*cos(z) # u_x
        i==2 && return  U*cos(x)*sin(y)*cos(z) # u_y
        return 0.                              # u_z
    end
    Simulation((L, L, L), (0, 0, 0), 1/κ; U=U, uλ=uλ, ν=ν, T=T, mem=backend)
end

function main()
    sim = tgv(5, ROCArray)
    sim_step!(sim)
end

main()

I can correctly run this using ROCm/5.1.1, WaterLily.jl/1.1, AMDGPU.jl/0.4.15 on a Radeon Instinct MI50 32GB.

SimonDanisch commented on August 19, 2024

That actually kills the julia session 😓
[email protected], WaterLily v1.1.0 and ROCm/6.1 on a 7900xtx.
The AMDGPU tests seem to get much further on these new versions than on any before!

weymouth commented on August 19, 2024

The PR doesn't actually fix this issue.

b-fg commented on August 19, 2024

Hey Simon, I successfully ran WaterLily on LUMI today (AMD MI250x), which has ROCm/5.2.3. There are a couple of fixes for flows with bodies, but this example worked out of the box. This is my project environment right now:

  [21141c5a] AMDGPU v0.9.6
  [0c68f7d7] GPUArrays v10.3.0
  [63c18a36] KernelAbstractions v0.9.22
  [90137ffa] StaticArrays v1.9.7
  [ed894a53] WaterLily v1.2.0

And I get the following AMDGPU.jl info

julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬──────────────────────────────────────┐
│ Available │ Name             │ Version   │ Path                                 │
├───────────┼──────────────────┼───────────┼──────────────────────────────────────┤
│     +     │ LLD              │ -         │ /opt/rocm/llvm/bin/ld.lld            │
│     +     │ Device Libraries │ -         │ /opt/rocm/amdgcn/bitcode             │
│     +     │ HIP              │ 5.2.21153 │ /opt/rocm-5.2.3/lib/libamdhip64.so   │
│     +     │ rocBLAS          │ 2.44.0    │ /opt/rocm-5.2.3/lib/librocblas.so    │
│     +     │ rocSOLVER        │ 3.18.0    │ /opt/rocm-5.2.3/lib/librocsolver.so  │
│     +     │ rocALUTION       │ -         │ /opt/rocm-5.2.3/lib/librocalution.so │
│     +     │ rocSPARSE        │ -         │ /opt/rocm-5.2.3/lib/librocsparse.so  │
│     +     │ rocRAND          │ 2.10.5    │ /opt/rocm-5.2.3/lib/librocrand.so    │
│     +     │ rocFFT           │ 1.0.27    │ /opt/rocm-5.2.3/lib/librocfft.so     │
│     +     │ MIOpen           │ 2.17.0    │ /opt/rocm-5.2.3/lib/libMIOpen.so     │
└───────────┴──────────────────┴───────────┴──────────────────────────────────────┘

[ Info: AMDGPU devices
┌────┬──────┬────────────────────────┬───────────┬────────────┐
│ Id │ Name │               GCN arch │ Wavefront │     Memory │
├────┼──────┼────────────────────────┼───────────┼────────────┤
│  1 │      │ gfx90a:sramecc+:xnack- │        64 │ 63.984 GiB │
└────┴──────┴────────────────────────┴───────────┴────────────┘

I cannot try anything with ROCm/6.x though... which could be the problem.

SimonDanisch commented on August 19, 2024

I just found out that AMDGPU can be used on WSL2 :-O
Now I get:

julia> using AMDGPU

julia> function tgv(p, backend; Re=1600, T=Float32)
           L = 2^p; U = 1; κ=π/L; ν = 1/(κ*Re)
           function uλ(i,xyz)
               x,y,z = @. xyz/L*π                # scaled coordinates
               i==1 && return -U*sin(x)*cos(y)*cos(z) # u_x
               i==2 && return  U*cos(x)*sin(y)*cos(z) # u_y
               return 0.                              # u_z
           end
           Simulation((L, L, L), (0, 0, 0), 1/κ; U=U, uλ=uλ, ν=ν, T=T, mem=backend)
       end
tgv (generic function with 1 method)

julia> function main()
           sim = tgv(5, ROCArray)
           sim_step!(sim)
       end
main (generic function with 1 method)

julia> main()
ERROR: Scalar indexing is disallowed.
Invocation of setindex! resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

b-fg commented on August 19, 2024

OK, that's progress! I think the only remaining problem is that you might be running Julia with a single thread instead of auto. Can you try julia -t auto ..., or export JULIA_NUM_THREADS=auto? Then check that Threads.nthreads() > 1. This behaviour will eventually be fixed by #133.
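
The thread-count check can be sketched as a small guard to run before building the simulation. The helper name `check_threads` is hypothetical and not part of WaterLily or AMDGPU.jl; it only wraps `Threads.nthreads()` from Julia's standard library.

```julia
# Hypothetical helper (not part of WaterLily or AMDGPU.jl):
# report whether the current Julia session was started with more than one thread.
function check_threads(n::Integer = Threads.nthreads())
    if n > 1
        return "OK: Julia is running with $n threads."
    else
        return "Julia is running with a single thread; restart with `julia -t auto` " *
               "or set JULIA_NUM_THREADS=auto before launching."
    end
end

println(check_threads())
```

The thread count is fixed when Julia starts, so a single-threaded session has to be restarted rather than reconfigured in place.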

SimonDanisch commented on August 19, 2024

Yay, that makes the example work on WSL2 Ubuntu!

I do notice now that there seems to be a mismatch in the HIP version on Windows:

Windows

julia> AMDGPU.versioninfo()
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬─────────────────────────────────────────────────────────────────────────────────────────
│ Available │ Name             │ Version   │ Path                                                                                   ⋯
├───────────┼──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────────────────────
│     +     │ LLD              │ -         │ C:\\Program Files\\AMD\\ROCm\\6.1\\bin\\ld.lld.exe                                     ⋯
│     +     │ Device Libraries │ -         │ C:\\Users\\sdani\\.julia\\artifacts\\5ad5ecb46e3c334821f54c1feecc6c152b7b6a45\\amdgcn/ ⋯
│     +     │ HIP              │ 5.7.32000 │ C:\\WINDOWS\\SYSTEM32\\amdhip64.DLL                                                    ⋯
│     +     │ rocBLAS          │ 4.1.2     │ C:\\Program Files\\AMD\\ROCm\\6.1\\bin\\rocblas.dll                                    ⋯
│     +     │ rocSOLVER        │ 3.25.0    │ C:\\Program Files\\AMD\\ROCm\\6.1\\bin\\rocsolver.dll                                  ⋯
│     +     │ rocALUTION       │ -         │ C:\\Program Files\\AMD\\ROCm\\6.1\\bin\\rocalution.dll                                 ⋯
│     +     │ rocSPARSE        │ -         │ C:\\Program Files\\AMD\\ROCm\\6.1\\bin\\rocsparse.dll                                  ⋯
│     +     │ rocRAND          │ 2.10.5    │ C:\\Program Files\\AMD\\ROCm\\6.1\\bin\\rocrand.dll                                    ⋯
│     +     │ rocFFT           │ 1.0.27    │ C:\\Program Files\\AMD\\ROCm\\6.1\\bin\\rocfft.dll                                     ⋯
│     -     │ MIOpen           │ -         │ -                                                                                      ⋯
└───────────┴──────────────────┴───────────┴─────────────────────────────────────────────────────────────────────────────────────────
                                                                                                                     1 column omitted

[ Info: AMDGPU devices
┌────┬─────────────────────────┬──────────┬───────────┬────────────┐
│ Id │                    Name │ GCN arch │ Wavefront │     Memory │
├────┼─────────────────────────┼──────────┼───────────┼────────────┤
│  1 │  AMD Radeon RX 7900 XTX │  gfx1100 │        32 │ 23.984 GiB │
│  2 │ AMD Radeon(TM) Graphics │  gfx1036 │        32 │ 12.019 GiB │
└────┴─────────────────────────┴──────────┴───────────┴────────────┘

WSL Ubuntu

[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬─────────────────────────────────────────────────────────────────────────────────────┐
│ Available │ Name             │ Version   │ Path                                                                                │
├───────────┼──────────────────┼───────────┼─────────────────────────────────────────────────────────────────────────────────────┤
│     +     │ LLD              │ -         │ /opt/rocm/llvm/bin/ld.lld                                                           │
│     +     │ Device Libraries │ -         │ /home/simi/.julia/artifacts/5ad5ecb46e3c334821f54c1feecc6c152b7b6a45/amdgcn/bitcode │
│     +     │ HIP              │ 6.1.40093 │ /opt/rocm-6.1.3/lib/libamdhip64.so                                                  │
│     +     │ rocBLAS          │ 4.1.2     │ /opt/rocm-6.1.3/lib/librocblas.so                                                   │
│     +     │ rocSOLVER        │ 3.25.0    │ /opt/rocm-6.1.3/lib/librocsolver.so                                                 │
│     +     │ rocALUTION       │ -         │ /opt/rocm-6.1.3/lib/librocalution.so                                                │
│     +     │ rocSPARSE        │ -         │ /opt/rocm-6.1.3/lib/librocsparse.so                                                 │
│     +     │ rocRAND          │ 2.10.5    │ /opt/rocm-6.1.3/lib/librocrand.so                                                   │
│     +     │ rocFFT           │ 1.0.27    │ /opt/rocm-6.1.3/lib/librocfft.so                                                    │
│     +     │ MIOpen           │ 3.1.0     │ /opt/rocm-6.1.3/lib/libMIOpen.so                                                    │
└───────────┴──────────────────┴───────────┴─────────────────────────────────────────────────────────────────────────────────────┘

[ Info: AMDGPU devices
┌────┬────────────────────────┬──────────┬───────────┬────────────┐
│ Id │                   Name │ GCN arch │ Wavefront │     Memory │
├────┼────────────────────────┼──────────┼───────────┼────────────┤
│  1 │ AMD Radeon RX 7900 XTX │  gfx1100 │        32 │ 23.938 GiB │
└────┴────────────────────────┴──────────┴───────────┴────────────┘

b-fg commented on August 19, 2024

Great! And yes, that mismatch might have caused your original error. For that, you could submit an issue on AMDGPU.jl I guess. So, is this issue resolved now? :)
