GithubHelp home page GithubHelp logo

Problem with RNN and CUDA. about flux.jl HOT 7 OPEN

kvantitative avatar kvantitative commented on June 10, 2024
Problem with RNN and CUDA.

from flux.jl.

Comments (7)

kvantitative avatar kvantitative commented on June 10, 2024 1

I have attached the complete stack trace as a file.
stacktrace.txt

from flux.jl.

jeremiedb avatar jeremiedb commented on June 10, 2024

It looks like the issue comes from latest Zygote v0.6.67, which merged this issue: FluxML/Zygote.jl#1328
I could reproduce the issue on Zygote v0.6.67, but it worked fine on Zygote v0.6.66.
@ToucheSir

from flux.jl.

ToucheSir avatar ToucheSir commented on June 10, 2024

Changes made on our side should not be directly causing memory access errors on the CUDA one since we operate at a much higher level, so I'm inclined to say the problem lies elsewhere and was exposed by Zygote switching to the ChainRules rule. Can you get a stacktrace which includes Zygote and ChainRules(Core)? Having just CUDA.jl in there is not helpful.

from flux.jl.

jeremiedb avatar jeremiedb commented on June 10, 2024

Sorry I'm afraid I missunderstand your comment regarding having CUDA.jl not being useful, as there's no issue raised when the above issue is run on the cpu.

Working on GPU - Zygote v0.6.67

using Flux
using CUDA
using Flux.Losses: mse
dev = gpu # cpu is working fine

#######################
# no indexing
#######################
m = RNN(2 => 1) |> dev
x = rand(Float32, 2, 3) |> dev;
y = rand(Float32, 1, 3) |> dev

m.state
Flux.reset!(m)
m(x)

loss(m, x, y) = mse(m(x), y)
Flux.reset!(m)
gs = gradient(m) do model
    loss(model, x, y)
end

Failing on GPU - Zygote v0.6.67 (working on Zygote v0.6.66):

m = RNN(2 => 1) |> dev
x = [rand(Float32, 2, 3) for i in 1:4] |> dev;
y = rand(Float32, 1, 3) |> dev

m.state
Flux.reset!(m)
m(x[1])

function loss_1(m, x, y)
    p = [m(xi) for xi in x]
    mse(p[1], y)
end

loss_1(m, x, y)

Flux.reset!(m)
gs = gradient(m) do model
    loss_1(model, x, y)
end

And the above return following error message:

julia> gs = gradient(m) do model
           loss_1(model, x, y)
       end
ERROR: MethodError: no method matching parent(::Type{SubArray{Union{ChainRulesCore.ZeroTangent, CuMatrix{Float32, CUDA.Mem.DeviceBuffer}, DenseCuMatrix{Float32, CUDA.Mem.DeviceBuffer}}, 0, Vector{Union{ChainRulesCore.ZeroTangent, CuMatrix{Float32, CUDA.Mem.DeviceBuffer}, DenseCuMatrix{Float32, CUDA.Mem.DeviceBuffer}}}, Tuple{Int64}, true}})
Closest candidates are:
  parent(::Union{LinearAlgebra.Adjoint{T, S}, LinearAlgebra.Transpose{T, S}} where {T, S}) at C:\Users\jerem\AppData\Local\Programs\Julia-1.8.5\share\julia\stdlib\v1.8\LinearAlgebra\src\adjtrans.jl:218
  parent(::Union{LinearAlgebra.Hermitian{T, S}, LinearAlgebra.Symmetric{T, S}} where {T, S}) at C:\Users\jerem\AppData\Local\Programs\Julia-1.8.5\share\julia\stdlib\v1.8\LinearAlgebra\src\symmetric.jl:275
  parent(::Union{NNlib.BatchedAdjoint{T, S}, NNlib.BatchedTranspose{T, S}} where {T, S}) at C:\Users\jerem\.julia\packages\NNlib\Fg3DQ\src\batched\batchedadjtrans.jl:73
  ...
Stacktrace:
  [1] backend(#unused#::Type{SubArray{Union{ChainRulesCore.ZeroTangent, CuMatrix{Float32, CUDA.Mem.DeviceBuffer}, DenseCuMatrix{Float32, CUDA.Mem.DeviceBuffer}}, 0, Vector{Union{ChainRulesCore.ZeroTangent, CuMatrix{Float32, CUDA.Mem.DeviceBuffer}, DenseCuMatrix{Float32, CUDA.Mem.DeviceBuffer}}}, Tuple{Int64}, true}})
    @ GPUArraysCore C:\Users\jerem\.julia\packages\GPUArraysCore\uOYfN\src\GPUArraysCore.jl:151
  [2] backend(x::SubArray{Union{ChainRulesCore.ZeroTangent, CuMatrix{Float32, CUDA.Mem.DeviceBuffer}, DenseCuMatrix{Float32, CUDA.Mem.DeviceBuffer}}, 0, Vector{Union{ChainRulesCore.ZeroTangent, CuMatrix{Float32, CUDA.Mem.DeviceBuffer}, DenseCuMatrix{Float32, CUDA.Mem.DeviceBuffer}}}, Tuple{Int64}, true})
    @ GPUArraysCore C:\Users\jerem\.julia\packages\GPUArraysCore\uOYfN\src\GPUArraysCore.jl:149
  [3] _copyto!
    @ C:\Users\jerem\.julia\packages\GPUArrays\5XhED\src\host\broadcast.jl:65 [inlined]
  [4] materialize!
    @ C:\Users\jerem\.julia\packages\GPUArrays\5XhED\src\host\broadcast.jl:41 [inlined]
  [5] materialize!
    @ .\broadcast.jl:868 [inlined]
  [6] ∇getindex!(dx::Vector{Union{ChainRulesCore.ZeroTangent, CuMatrix{Float32, CUDA.Mem.DeviceBuffer}, DenseCuMatrix{Float32, CUDA.Mem.DeviceBuffer}}}, dy::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, inds::Int64)
    @ ChainRules C:\Users\jerem\.julia\packages\ChainRules\DSuXy\src\rulesets\Base\indexing.jl:147
  [7] ∇getindex(x::Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, dy::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, inds::Int64)
    @ ChainRules C:\Users\jerem\.julia\packages\ChainRules\DSuXy\src\rulesets\Base\indexing.jl:89
  [8] (::ChainRules.var"#1583#1585"{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Int64}})()
    @ ChainRules C:\Users\jerem\.julia\packages\ChainRules\DSuXy\src\rulesets\Base\indexing.jl:69
  [9] unthunk
    @ C:\Users\jerem\.julia\packages\ChainRulesCore\7MWx2\src\tangent_types\thunks.jl:204 [inlined]
 [10] unthunk(x::ChainRulesCore.InplaceableThunk{ChainRulesCore.Thunk{ChainRules.var"#1583#1585"{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Int64}}}, ChainRules.var"#1582#1584"{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Int64}}})
    @ ChainRulesCore C:\Users\jerem\.julia\packages\ChainRulesCore\7MWx2\src\tangent_types\thunks.jl:237
 [11] wrap_chainrules_output
    @ C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\chainrules.jl:110 [inlined]
 [12] map
    @ .\tuple.jl:223 [inlined]
 [13] wrap_chainrules_output
    @ C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\chainrules.jl:111 [inlined]
 [14] ZBack
    @ C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\chainrules.jl:211 [inlined]
 [15] Pullback
    @ C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\tools\builtins.jl:12 [inlined]
 [16] (::Zygote.Pullback{Tuple{typeof(Zygote.literal_getindex), Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Val{1}}, Tuple{Zygote.ZBack{ChainRules.var"#getindex_pullback#1581"{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Int64}, Tuple{ChainRulesCore.NoTangent}}}}})(Δ::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
    @ Zygote C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\interface2.jl:0
 [17] Pullback
    @ c:\Users\jerem\OneDrive\github\ADTests.jl\FluxRNNTest\rnn-gpu.jl:94 [inlined]
 [18] (::Zygote.Pullback{Tuple{typeof(loss_1), Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.Pullback{Tuple{Type{Base.Generator}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.Pullback{Tuple{Type{Base.Generator{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.Pullback{Tuple{typeof(convert), Type{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Tuple{}}, Zygote.var"#2193#back#313"{Zygote.Jnew{Base.Generator{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Nothing, false}}, Zygote.Pullback{Tuple{typeof(convert), Type{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Any}}}}}, Zygote.var"#2193#back#313"{Zygote.Jnew{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Nothing, false}}, Zygote.Pullback{Tuple{typeof(mse), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.Pullback{Tuple{Flux.Losses.var"##mse#14", typeof(Statistics.mean), typeof(mse), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{ChainRules.var"#mean_pullback#1821"{Int64, ChainRules.var"#sum_pullback#1633"{Colon, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ChainRulesCore.ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ChainRulesCore.ProjectTo{Float32, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}}}}}}, Zygote.var"#3960#back#1287"{Zygote.var"#1283#1286"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#3752#back#1189"{Zygote.var"#1185#1188"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{Flux.Losses.var"#_check_sizes_pullback#12"}, Zygote.Pullback{Tuple{typeof(Base.Broadcast.materialize), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}}}}}, Zygote.var"#collect_pullback#704"{Zygote.var"#map_back#666"{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, 1, Tuple{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Tuple{Base.OneTo{Int64}}}, Vector{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Zygote.Pullback{Tuple{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:m, Zygote.Context{false}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:cell, Zygote.Context{false}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 2, Zygote.Context{false}, Int64}}, Zygote.Pullback{Tuple{typeof(setproperty!), Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Symbol, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{ChainRules.var"#typeof_pullback#45"}, Zygote.Pullback{Tuple{typeof(convert), Type{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}, Zygote.var"#2182#back#311"{Zygote.var"#309#310"{Symbol, Base.RefValue{Any}}}}}, Zygote.var"#back#246"{Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 2, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.var"#back#245"{Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{Flux.var"#174#175"}, Zygote.var"#3736#back#1181"{Zygote.var"#1175#1179"{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, Zygote.ZBack{ChainRules.var"#times_pullback#1466"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:b, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{ChainRules.var"#size_pullback#919"}, Zygote.var"#1999#back#204"{typeof(identity)}, Zygote.var"#3736#back#1181"{Zygote.var"#1175#1179"{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.ZBack{NNlib.var"#broadcasted_tanh_fast_pullback#145"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.Pullback{Tuple{typeof(Flux.reshape_cell_output), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2047#back#232"{Zygote.var"#226#230"{2, UnitRange{Int64}}}, Zygote.var"#2155#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing, Nothing}, Tuple{Nothing}}, Zygote.var"#2746#back#609"{Zygote.var"#603#607"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Colon, Int64}}}}}, Zygote.ZBack{ChainRules.var"#size_pullback#917"}, Zygote.Pullback{Tuple{typeof(lastindex), Tuple{Int64, Int64}}, Tuple{Zygote.ZBack{ChainRules.var"#length_pullback#747"}}}, Zygote.var"#1999#back#204"{typeof(identity)}, Zygote.ZBack{ChainRules.var"#:_pullback#276"{Tuple{Int64, Int64}}}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:σ, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, typeof(tanh)}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:Wi, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:Wh, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{ChainRules.var"#times_pullback#1466"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.Pullback{Tuple{Type{Pair}, Int64, Int64}, Tuple{Zygote.Pullback{Tuple{typeof(Core.convert), Type{Int64}, Int64}, Tuple{}}, Zygote.var"#2193#back#313"{Zygote.Jnew{Pair{Int64, Int64}, Nothing, false}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}, Zygote.Pullback{Tuple{typeof(Core.convert), Type{Int64}, Int64}, Tuple{}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}}}, Zygote.Pullback{Tuple{typeof(Base.Broadcast.materialize), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}, Zygote.ZBack{Flux.var"#_size_check_pullback#201"{Tuple{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Pair{Int64, Int64}}}}, Zygote.Pullback{Tuple{typeof(NNlib.fast_act), typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#1972#back#194"{Zygote.var"#190#193"{Zygote.Context{false}, GlobalRef, typeof(tanh_fast)}}}}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:state, Zygote.Context{false}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}}}}}}}, Nothing}, Zygote.Pullback{Tuple{typeof(Zygote.literal_getindex), Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Val{1}}, Tuple{Zygote.ZBack{ChainRules.var"#getindex_pullback#1581"{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Int64}, Tuple{ChainRulesCore.NoTangent}}}}}}})(Δ::Float32)
    @ Zygote C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\interface2.jl:0
 [19] Pullback
    @ c:\Users\jerem\OneDrive\github\ADTests.jl\FluxRNNTest\rnn-gpu.jl:101 [inlined]
 [20] (::Zygote.Pullback{Tuple{var"#115#116", Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.var"#1972#back#194"{Zygote.var"#190#193"{Zygote.Context{false}, GlobalRef, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{typeof(loss_1), Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.Pullback{Tuple{Type{Base.Generator}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.Pullback{Tuple{Type{Base.Generator{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.Pullback{Tuple{typeof(convert), Type{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Tuple{}}, Zygote.var"#2193#back#313"{Zygote.Jnew{Base.Generator{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Nothing, false}}, Zygote.Pullback{Tuple{typeof(convert), Type{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Any}}}}}, Zygote.var"#2193#back#313"{Zygote.Jnew{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Nothing, false}}, Zygote.Pullback{Tuple{typeof(mse), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.Pullback{Tuple{Flux.Losses.var"##mse#14", typeof(Statistics.mean), typeof(mse), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{ChainRules.var"#mean_pullback#1821"{Int64, ChainRules.var"#sum_pullback#1633"{Colon, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ChainRulesCore.ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ChainRulesCore.ProjectTo{Float32, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}}}}}}, Zygote.var"#3960#back#1287"{Zygote.var"#1283#1286"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#3752#back#1189"{Zygote.var"#1185#1188"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{Flux.Losses.var"#_check_sizes_pullback#12"}, Zygote.Pullback{Tuple{typeof(Base.Broadcast.materialize), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}}}}}, Zygote.var"#collect_pullback#704"{Zygote.var"#map_back#666"{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, 1, Tuple{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Tuple{Base.OneTo{Int64}}}, Vector{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Zygote.Pullback{Tuple{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:m, Zygote.Context{false}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:cell, Zygote.Context{false}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 2, Zygote.Context{false}, Int64}}, Zygote.Pullback{Tuple{typeof(setproperty!), Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Symbol, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{ChainRules.var"#typeof_pullback#45"}, Zygote.Pullback{Tuple{typeof(convert), Type{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}, Zygote.var"#2182#back#311"{Zygote.var"#309#310"{Symbol, Base.RefValue{Any}}}}}, Zygote.var"#back#246"{Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 2, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.var"#back#245"{Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{Flux.var"#174#175"}, Zygote.var"#3736#back#1181"{Zygote.var"#1175#1179"{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, Zygote.ZBack{ChainRules.var"#times_pullback#1466"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:b, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{ChainRules.var"#size_pullback#919"}, Zygote.var"#1999#back#204"{typeof(identity)}, Zygote.var"#3736#back#1181"{Zygote.var"#1175#1179"{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.ZBack{NNlib.var"#broadcasted_tanh_fast_pullback#145"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.Pullback{Tuple{typeof(Flux.reshape_cell_output), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2047#back#232"{Zygote.var"#226#230"{2, UnitRange{Int64}}}, Zygote.var"#2155#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing, Nothing}, Tuple{Nothing}}, Zygote.var"#2746#back#609"{Zygote.var"#603#607"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Colon, Int64}}}}}, Zygote.ZBack{ChainRules.var"#size_pullback#917"}, Zygote.Pullback{Tuple{typeof(lastindex), Tuple{Int64, Int64}}, Tuple{Zygote.ZBack{ChainRules.var"#length_pullback#747"}}}, Zygote.var"#1999#back#204"{typeof(identity)}, Zygote.ZBack{ChainRules.var"#:_pullback#276"{Tuple{Int64, Int64}}}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:σ, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, typeof(tanh)}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:Wi, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:Wh, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{ChainRules.var"#times_pullback#1466"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.Pullback{Tuple{Type{Pair}, Int64, Int64}, Tuple{Zygote.Pullback{Tuple{typeof(Core.convert), Type{Int64}, Int64}, Tuple{}}, Zygote.var"#2193#back#313"{Zygote.Jnew{Pair{Int64, Int64}, Nothing, false}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}, Zygote.Pullback{Tuple{typeof(Core.convert), Type{Int64}, Int64}, Tuple{}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}}}, Zygote.Pullback{Tuple{typeof(Base.Broadcast.materialize), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}, Zygote.ZBack{Flux.var"#_size_check_pullback#201"{Tuple{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Pair{Int64, Int64}}}}, Zygote.Pullback{Tuple{typeof(NNlib.fast_act), typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#1972#back#194"{Zygote.var"#190#193"{Zygote.Context{false}, GlobalRef, typeof(tanh_fast)}}}}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:state, Zygote.Context{false}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}}}}}}}, Nothing}, Zygote.Pullback{Tuple{typeof(Zygote.literal_getindex), Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Val{1}}, Tuple{Zygote.ZBack{ChainRules.var"#getindex_pullback#1581"{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Int64}, Tuple{ChainRulesCore.NoTangent}}}}}}}, Zygote.var"#1972#back#194"{Zygote.var"#190#193"{Zygote.Context{false}, GlobalRef, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}})(Δ::Float32)
    @ Zygote C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\interface2.jl:0
 [21] (::Zygote.var"#75#76"{Zygote.Pullback{Tuple{var"#115#116", Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.var"#1972#back#194"{Zygote.var"#190#193"{Zygote.Context{false}, GlobalRef, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{typeof(loss_1), Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.Pullback{Tuple{Type{Base.Generator}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.Pullback{Tuple{Type{Base.Generator{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Zygote.Pullback{Tuple{typeof(convert), Type{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Tuple{}}, Zygote.var"#2193#back#313"{Zygote.Jnew{Base.Generator{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Nothing, false}}, Zygote.Pullback{Tuple{typeof(convert), Type{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Any}}}}}, Zygote.var"#2193#back#313"{Zygote.Jnew{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Nothing, false}}, Zygote.Pullback{Tuple{typeof(mse), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.Pullback{Tuple{Flux.Losses.var"##mse#14", typeof(Statistics.mean), typeof(mse), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{ChainRules.var"#mean_pullback#1821"{Int64, ChainRules.var"#sum_pullback#1633"{Colon, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ChainRulesCore.ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ChainRulesCore.ProjectTo{Float32, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}}}}}}, Zygote.var"#3960#back#1287"{Zygote.var"#1283#1286"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#3752#back#1189"{Zygote.var"#1185#1188"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{Flux.Losses.var"#_check_sizes_pullback#12"}, Zygote.Pullback{Tuple{typeof(Base.Broadcast.materialize), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}}}}}, Zygote.var"#collect_pullback#704"{Zygote.var"#map_back#666"{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, 1, Tuple{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Tuple{Tuple{Base.OneTo{Int64}}}, Vector{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Zygote.Pullback{Tuple{var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:m, Zygote.Context{false}, var"#113#114"{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:cell, Zygote.Context{false}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 2, Zygote.Context{false}, Int64}}, Zygote.Pullback{Tuple{typeof(setproperty!), Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Symbol, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{ChainRules.var"#typeof_pullback#45"}, Zygote.Pullback{Tuple{typeof(convert), Type{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}, Zygote.var"#2182#back#311"{Zygote.var"#309#310"{Symbol, Base.RefValue{Any}}}}}, Zygote.var"#back#246"{Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 2, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.var"#back#245"{Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.Pullback{Tuple{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.ZBack{Flux.var"#174#175"}, Zygote.var"#3736#back#1181"{Zygote.var"#1175#1179"{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}}, Zygote.ZBack{ChainRules.var"#times_pullback#1466"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:b, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{ChainRules.var"#size_pullback#919"}, Zygote.var"#1999#back#204"{typeof(identity)}, Zygote.var"#3736#back#1181"{Zygote.var"#1175#1179"{Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}, Zygote.ZBack{NNlib.var"#broadcasted_tanh_fast_pullback#145"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.Pullback{Tuple{typeof(Flux.reshape_cell_output), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#2047#back#232"{Zygote.var"#226#230"{2, UnitRange{Int64}}}, Zygote.var"#2155#back#293"{Zygote.var"#291#292"{Tuple{Tuple{Nothing, Nothing}, Tuple{Nothing}}, Zygote.var"#2746#back#609"{Zygote.var"#603#607"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Colon, Int64}}}}}, Zygote.ZBack{ChainRules.var"#size_pullback#917"}, Zygote.Pullback{Tuple{typeof(lastindex), Tuple{Int64, Int64}}, Tuple{Zygote.ZBack{ChainRules.var"#length_pullback#747"}}}, Zygote.var"#1999#back#204"{typeof(identity)}, Zygote.ZBack{ChainRules.var"#:_pullback#276"{Tuple{Int64, Int64}}}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:σ, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, typeof(tanh)}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:Wi, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:Wh, Zygote.Context{false}, Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.ZBack{ChainRules.var"#times_pullback#1466"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.Pullback{Tuple{Type{Pair}, Int64, Int64}, Tuple{Zygote.Pullback{Tuple{typeof(Core.convert), Type{Int64}, Int64}, Tuple{}}, Zygote.var"#2193#back#313"{Zygote.Jnew{Pair{Int64, Int64}, Nothing, false}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}, Zygote.Pullback{Tuple{typeof(Core.convert), Type{Int64}, Int64}, Tuple{}}, Zygote.ZBack{ChainRules.var"#fieldtype_pullback#421"}}}, Zygote.Pullback{Tuple{typeof(Base.Broadcast.materialize), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{}}, Zygote.ZBack{Flux.var"#_size_check_pullback#201"{Tuple{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Pair{Int64, Int64}}}}, Zygote.Pullback{Tuple{typeof(NNlib.fast_act), typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Zygote.var"#1972#back#194"{Zygote.var"#190#193"{Zygote.Context{false}, GlobalRef, typeof(tanh_fast)}}}}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2015#back#213"{Zygote.var"#back#211"{2, 1, Zygote.Context{false}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}, Zygote.var"#2166#back#303"{Zygote.var"#back#302"{:state, Zygote.Context{false}, Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}}}}}}}, Nothing}, Zygote.Pullback{Tuple{typeof(Zygote.literal_getindex), Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Val{1}}, Tuple{Zygote.ZBack{ChainRules.var"#getindex_pullback#1581"{Vector{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Tuple{Int64}, Tuple{ChainRulesCore.NoTangent}}}}}}}, Zygote.var"#1972#back#194"{Zygote.var"#190#193"{Zygote.Context{false}, GlobalRef, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}}}}})(Δ::Float32)
    @ Zygote C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\interface.jl:45
 [22] gradient(f::Function, args::Flux.Recur{Flux.RNNCell{typeof(tanh), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}})
    @ Zygote C:\Users\jerem\.julia\packages\Zygote\YYT6v\src\compiler\interface.jl:97
 [23] top-level scope
    @ c:\Users\jerem\OneDrive\github\ADTests.jl\FluxRNNTest\rnn-gpu.jl:100

from flux.jl.

avik-pal avatar avik-pal commented on June 10, 2024

I don't know why exactly this is happening, but I had to push a temporary patch for this. See https://github.com/LuxDL/Lux.jl/pull/442/files

from flux.jl.

ToucheSir avatar ToucheSir commented on June 10, 2024

@jeremiedb I meant that the stacktrace @kvantitative posted only had stack frames from CUDA.jl and not Zygote or ChainRules. It wasn't at all clear how higher-level libraries would've been responsible for a low-level illegal memory access error because of that.

The issue you found is FluxML/Zygote.jl#1470 (comment). If you have a better understanding of what's going on with GPU broadcast styles there or know someone who does, additional guidance would be much appreciated.

from flux.jl.

nomadbl avatar nomadbl commented on June 10, 2024

I commented on the issue @ToucheSir raised here FluxML/Zygote.jl#1470 (comment) but it does not bear directly on the RNN.
@jeremiedb 's second example works for me using JLArrays if I modify it slightly to

...
dev = m-> fmap(jl, m; exclude=Flux.Optimisers.maywrite) # move weights/inputs to "gpu"
x = tuple((rand(Float32, 2, 3) for i in 1:4)...) |> dev; # Tuple{JLArray}
...

So I think the RNN is fine, the issue is the problematic use of Vectors

I used a temp env with

ChainRulesCore v1.18.0
Flux v0.14.6
JLArrays v0.1.

from flux.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.