exapf.jl's People

Contributors

amontoison, dmaldona, frapac, michel2323

exapf.jl's Issues

Time to first power flow

Currently, we are spending almost a minute to solve the first power flow. On Julia 1.6 and my local machine, I get in a fresh Julia session:

julia> include("tmp/launch_powerflow.jl")
 42.329781 seconds (50.81 M allocations: 2.880 GiB, 10.27% gc time)

This is mostly due to type inference issues. On Julia 1.6, fixing type inference in ReducedSpaceEvaluator reduced the first compile time of ExaPF.hessprod! from 3 min to 10 s (see #98). We should be able to do the same for powerflow and for most functions exposed to users.
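To illustrate the kind of type inference problem meant here, the following is a minimal sketch with hypothetical types (LooseSolver, TightSolver and scaled_tol are made up for the example and are not part of ExaPF). A field with an abstract type prevents the compiler from inferring a concrete return type, whereas a parametric field does not:

```julia
# Hypothetical types for illustration only; they are not ExaPF types.
struct LooseSolver
    tol::Real              # abstract field type: concrete type unknown at compile time
end

struct TightSolver{T<:Real}
    tol::T                 # concrete as soon as T is known
end

scaled_tol(s) = 2 * s.tol

# Ask the compiler which return type it infers for each argument type.
loose_type = only(Base.return_types(scaled_tol, (LooseSolver,)))
tight_type = only(Base.return_types(scaled_tol, (TightSolver{Float64},)))

println("abstract field -> ", loose_type)
println("parametric field -> ", tight_type)
```

In a REPL, @code_warntype scaled_tol(LooseSolver(1e-6)) gives the same information interactively.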

ExaPF 0.6

This issue lists the remaining TODOs required before releasing ExaPF 0.6

New features

  • Multiple generators per bus (#168)
  • Merge new overlapping Schwarz preconditioner (#86 )
  • Integration of batch Hessian (PR #179 #185)
  • Figure out new interface for LinearSolvers (PR #176 )

Refactoring

  • Move all Evaluators in a separate package (#191 )
  • Move all CUDA related code in a subpackage in ExaPF (#175 )
  • Clean API of ExaPF (operational constraints in Polar)

Fixes

  • Fix invalidations (PR #182 )
  • Solve issue with allowscalar on CUDA.jl v3.3 (#80 )

Documentation

What do you guys think about documentation? Yes/No? If "Yes", any particular system that you prefer?

Preconditioner weights

The preconditioner currently works without edge weights (all weights = 1). It would make sense to use the electrical distance as an edge weight. The Julia package Metis.jl only allows for vertex weights, whereas the underlying Metis library supports edge weights. A PR to Metis.jl would make sense.

Non-deterministic out-of-bounds error in AD backend

Full log:

Active constraints: Error During Test at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:38
  Got exception outside of a @test
  BoundsError: attempt to access 14×106 Array{Float64,2} at index [65, 1]
  Stacktrace:
   [1] getindex at ./array.jl:810 [inlined]
   [2] uncompress!(::SparseMatrixCSC{Float64,Int64}, ::Array{Float64,2}, ::Array{Int64,1}) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/ad.jl:412
   [3] macro expansion at /home/runner/work/ExaPF.jl/ExaPF.jl/src/ad.jl:490 [inlined]
   [4] macro expansion at /home/runner/.julia/packages/TimerOutputs/ZmKD7/src/TimerOutput.jl:190 [inlined]
   [5] residualJacobianAD!(::ExaPF.AD.StateJacobianAD{Array{Int64,1},Array{Float64,1},Array{Float64,2},SparseMatrixCSC,Array{ForwardDiff.Partials{14,Float64},1},Array{ForwardDiff.Dual{Nothing,Float64,14},1},SubArray{Float64,1,Array{Float64,1},Tuple{Array{Int64,1}},false},SubArray{ForwardDiff.Dual{Nothing,Float64,14},1,Array{ForwardDiff.Dual{Nothing,Float64,14},1},Tuple{Array{Int64,1}},false}}, ::typeof(ExaPF.residualFunction_polar!), ::Array{Float64,1}, ::Array{Float64,1}, ::ExaPF.Spmat{Array{Int64,1},Array{Float64,1}}, ::ExaPF.Spmat{Array{Int64,1},Array{Float64,1}}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Int64,1}, ::Array{Int64,1}, ::Array{Int64,1}, ::Int64, ::TimerOutput) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/ad.jl:489
   [6] macro expansion at /home/runner/work/ExaPF.jl/ExaPF.jl/src/models/polar/polar.jl:317 [inlined]
   [7] macro expansion at /home/runner/.julia/packages/TimerOutputs/ZmKD7/src/TimerOutput.jl:190 [inlined]
   [8] macro expansion at /home/runner/work/ExaPF.jl/ExaPF.jl/src/models/polar/polar.jl:316 [inlined]
   [9] macro expansion at /home/runner/.julia/packages/TimerOutputs/ZmKD7/src/TimerOutput.jl:190 [inlined]
   [10] powerflow(::PolarForm{Float64,Array{Int64,1},Array{Float64,1},Array{Float64,2}}, ::ExaPF.AD.StateJacobianAD{Array{Int64,1},Array{Float64,1},Array{Float64,2},SparseMatrixCSC,Array{ForwardDiff.Partials{14,Float64},1},Array{ForwardDiff.Dual{Nothing,Float64,14},1},SubArray{Float64,1,Array{Float64,1},Tuple{Array{Int64,1}},false},SubArray{ForwardDiff.Dual{Nothing,Float64,14},1,Array{ForwardDiff.Dual{Nothing,Float64,14},1},Tuple{Array{Int64,1}},false}}, ::ExaPF.PolarNetworkState{Array{Float64,1}}; solver::ExaPF.LinearSolvers.DirectSolver, tol::Float64, maxiter::Int64, verbose_level::Int64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/models/polar/polar.jl:312
   [11] update!(::ExaPF.ReducedSpaceEvaluator{Float64}, ::Array{Float64,1}; verbose_level::Int64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/reduced_evaluator.jl:78
   [12] update! at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/reduced_evaluator.jl:73 [inlined]
   [13] ExaPF.MaxScaler(::ExaPF.ReducedSpaceEvaluator{Float64}, ::Array{Float64,1}; η::Float64, tol::Float64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/common.jl:64
   [14] MaxScaler at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/common.jl:62 [inlined]
   [15] ExaPF.PenaltyEvaluator(::ExaPF.ReducedSpaceEvaluator{Float64}, ::Array{Float64,1}; scale::Bool, penalties::Array{Float64,1}, c₀::Float64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/penalty.jl:28
   [16] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:51
   [17] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [18] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:39
   [19] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [20] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:2
   [21] include(::String) at ./client.jl:457
   [22] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/runtests.jl:35
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [24] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/runtests.jl:30
   [25] include(::String) at ./client.jl:457
   [26] top-level scope at none:6
   [27] eval(::Module, ::Any) at ./boot.jl:331
   [28] exec_options(::Base.JLOptions) at ./client.jl:272
   [29] _start() at ./client.jl:506

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Improve the naming of the API

At some point, it would be nice to think about a more proper naming for the API.
For instance:

  • take a look at the abstract functions in src/models/models.jl and decide if their names are explicit enough
  • we could do the same for the abstract attributes implemented in src/PowerSystem/PowerSystem.jl
  • a more thorough discussion should take place for the functions implemented in src/evaluators.jl. Some of the function names in this file are too concise, and it is unclear what exactly they correspond to (take update! for instance).

In my opinion, naming is a non-trivial task, and we could benefit a lot by discussing it together 😸

Implement reduced Hessian on GPU

This issue tracks the implementation of the reduced Hessian on the GPU. The reduced Hessian is computed in two steps:

  1. Compute the Hessians of the objective and the constraints in the full-space.
    The Hessians are computed in the full-space using AD (forward over reverse), with hand-coded adjoints. Hessians are implemented as adjoint-Hessian-vector product routines.
  2. Compute the second-order adjoints by solving two linear systems (see document)
    The second-order adjoints are computed by solving two linear systems with different RHS (each RHS corresponding to a given vector v). We should find an efficient way to do that on the GPU.
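The two steps above can be sketched on the CPU with a toy problem. This is only an illustration under stated assumptions, not the ExaPF implementation: the constraint is taken linear, g(x, u) = A*x + B*u - c, and the objective quadratic, f(x, u) = x'Qx/2 + u'Ru/2 + x'Su, so that the full-space Hessian blocks are simply Q, S and R. The function name reduced_hessvec and all matrices are hypothetical.

```julia
using LinearAlgebra

# Reduced Hessian-vector product for the toy problem described above.
# One factorization of A is reused for both linear solves.
function reduced_hessvec(A, B, Q, R, S, v)
    Afac = lu(A)
    # First linear system: sensitivity of the state, A * z = -B * v.
    z = -(Afac \ (B * v))
    # Full-space Hessian-vector products applied to the direction (z, v).
    hx = Q * z + S * v
    hu = S' * z + R * v
    # Second linear system (second-order adjoint): A' * ψ = -hx.
    ψ = -(Afac' \ hx)
    return hu + B' * ψ
end

A = [4.0 1.0; 2.0 3.0]
B = reshape([1.0, 0.5], 2, 1)
Q = [2.0 0.0; 0.0 1.0]
R = fill(3.0, 1, 1)
S = reshape([0.5, -1.0], 2, 1)
v = [1.0]

Hv = reduced_hessvec(A, B, Q, R, S, v)
```

For this toy problem the result can be checked against the explicit reduced Hessian Z'QZ + Z'S + S'Z + R with Z = -A \ B. With several vectors v, the two solves become solves with multiple right-hand sides, which is the part to make efficient on the GPU.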

Current state:

  • Implement the reduced Hessian on the CPU using MATPOWER's expressions
  • Implement the adjoints for the objective and the constraints on the GPU
    • Adjoint of power_balance (#99)
    • Adjoint of reactive_power_constraints (#109)
    • Adjoint of active_power_constraints (#112)
    • Adjoint of flow_constraints (#107)
    • Adjoint of objective (#112)
  • Implement the full-space adjoint-Hessian-vector on the GPU and check results with MATPOWER
    • Hv of power_balance (#99)
    • Hv of reactive_power_constraints (#109)
    • Hv of active_power_constraints (#112)
    • Hv of flow_constraints (#111)
    • Hv of objective (#112)
  • Implement the reduced Hessian on the GPU, using full-space adjoint-Hessian-vector products
    • Implement reduced Hessian + AutoDiff in ReducedSpaceEvaluator (#118 )
    • Implement reduced Hessian + AutoDiff in ProxALEvaluator (#124)
    • Test resolution of linear systems with multiple RHS with CUSOLVER (integration with cusolverRF)
    • Test resolution of linear systems with multiple RHS with block-BiCGSTAB implemented by @amontoison

Release 0.5.0

This issue is opened to track the remaining points to address before the release 0.5.0

  • Discuss together the naming of the functions exposed in the API, and check the consistency of the signatures (#146)
  • Change signature of DirectSolver to allow storing the Factorization (breaking change) (#144)
  • Fix warning issued when loading ExaPF: Warning: Replacing docs for ExaPF.bounds :: Union{} in module ExaPF (#146)
  • Check that all Julia scripts in scripts/ are working (#146)
  • Add in benchmarks the script to reproduce the results presented at JuliaCon
  • Check that ProxAL is working on ExaPF#develop (https://github.com/exanauts/ProxAL.jl/runs/2331420226)
  • Finish updating the documentation (#148)
  • Update README (#148)

Refactor power balance equations

Right now, the power balance equations are more or less embedded in the solve() routine, and the Jacobians are created inside it too. To decouple them, we should implement the power balance equations as a stand-alone function of the form

g(pf, x, u, p)
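A minimal sketch of such a stand-alone function, assuming a hypothetical 2-bus layout (bus 1 is the reference bus, bus 2 is a PQ bus) and a hypothetical container pf that only holds the admittance matrix. The split into x, u and p below is an assumption made for the example, not the layout used in the code base:

```julia
# Assumed layout (hypothetical):
#   x = [va2, vm2]      state: voltage angle and magnitude at the PQ bus
#   u = [vm1]           control: voltage magnitude at the reference bus
#   p = [pinj2, qinj2]  parameters: net power injections at the PQ bus
function power_balance(pf, x, u, p)
    va = [0.0, x[1]]                 # the reference angle is fixed to zero
    vm = [u[1], x[2]]
    V = vm .* cis.(va)               # complex bus voltages
    S = V .* conj.(pf.ybus * V)      # complex power injected at each bus
    return [real(S[2]) - p[1], imag(S[2]) - p[2]]
end

y = 1 / (0.01 + 0.1im)               # series admittance of the single line
pf = (ybus = [y -y; -y y],)
residual = power_balance(pf, [0.0, 1.0], [1.0], [0.0, 0.0])
```

With this shape, the Newton-Raphson solver and the Jacobian code only need to call power_balance, and neither has to live inside solve().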

CUDA.jl 1.2

CUDA.jl 1.2 breaks the code in the bicgstab when the sparse matrix P is multiplied by a vector.

 x0 .= P * b

This returns zeros in x0.

CUDA.CUSPARSE.mul!(x0, P, b)

This breaks the code generation.

The cause needs to be pinned down.

Refactoring of Newton-Raphson solve function: parsing and data structures.

In its current form, the Newton-Raphson routine includes code that needs to be externalized to accommodate optimization algorithms. This is a partial list:

  • Initial guess (V) should be provided through the function API.
  • Indexing structures (e.g. pv, pq, npv, npq) should be created externally and be part of a PSYSTEM object. These structures are permanent and will be re-used each time the non-linear solve is called.
  • Creation of vectors x, u, p; and mapping between these and V, VANG, P, Q, etc.
  • Update xk accordingly.
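As a rough sketch of the second item, the index sets can be built once from the bus types and stored in a small object. The struct name BusIndex is hypothetical; the bus type codes follow the MATPOWER convention (PQ = 1, PV = 2, REF = 3):

```julia
# MATPOWER bus type codes.
const PQ, PV, REF = 1, 2, 3

# Hypothetical container for the permanent index sets.
struct BusIndex
    ref::Vector{Int}
    pv::Vector{Int}
    pq::Vector{Int}
end

# Build the index sets once, from the vector of bus types.
function BusIndex(bustype::AbstractVector{<:Integer})
    return BusIndex(
        findall(==(REF), bustype),
        findall(==(PV), bustype),
        findall(==(PQ), bustype),
    )
end

idx = BusIndex([3, 2, 1, 1, 2])
npv, npq = length(idx.pv), length(idx.pq)
```

The non-linear solve can then take idx as an argument instead of recomputing pv and pq on every call.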

Reviewers: @michel2323 , @frapac

Warning issued when running unit-tests on the GPU

I was testing ExaPF and got a few warnings when running the tests on the GPU.

┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called atan(x::T) where T<:Union{Float32, Float64} in Base.Math at special/trig.jl:519, maybe you intended to call atan(x::Float64) in CUDA at /home/fpacaud/.julia/packages/CUDA/42B9G/src/device/cuda/math.jl:32 instead?

I guess you may be aware of this issue, as it arises when running the test. It looks like the issue is related to GPUArrays.

My system is:

julia> versioninfo()
Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Efficient transpose Jacobian vector product for reactive power generation

When using an Augmented Lagrangian algorithm, we need a fast evaluation of the transpose-Jacobian-vector product of the constraints. It appears that J' v is not implemented efficiently for the reactive power generation qg: we compute the full Jacobian, sequentially:
https://github.com/exanauts/ExaPF.jl/blob/develop/src/polar/constraints.jl#L231-L238

We should think of a better way to compute that.

Error in cost_gradients: "BoundsError: attempt to access 0-element Array at index 1"

When we run the OPF on some instances (case200_activ, case1888_rte), we get an error the first time we call the function cost_gradients. The stacktrace is:

ERROR: LoadError: BoundsError: attempt to access 0-element Array{Int64,1} at index [1]
Stacktrace:
 [1] getindex(::Array{Int64,1}, ::Int64) at ./array.jl:809
 [2] cost_gradients(::ExaPF.PowerSystem.PowerNetwork, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::KernelAbstractions.CPU) at /home/fpacaud/exa/ExaPF.jl/src/ExaPF.jl:470
 [3] cost_gradients at /home/fpacaud/exa/ExaPF.jl/src/ExaPF.jl:407 [inlined]
 [4] build_callback(::ExaPF.PowerSystem.PowerNetwork, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}) at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:73
 [5] run_reduced_ipopt(::String; hessian::Bool, cons::Bool) at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:270
 [6] top-level scope at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:349
 [7] include(::String) at ./client.jl:457
 [8] top-level scope at REPL[10]:1
in expression starting at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:349

Investigating it further, it appears that the problem arises when we convert PV buses to PQ buses here:
https://github.com/exanauts/ExaPF.jl/blob/dev/rgm/src/powersystem.jl#L375L381
Indeed, we remove the PV buses from the index arrays bustypes and pv, but the array gens keeps the old classification. At some point, we should remove from the array gens the buses whose status changes from PV to PQ.
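A possible shape for the fix, as a sketch only: the function demote_pv_buses! and its arguments are hypothetical and do not match the actual code in powersystem.jl. The idea is to update the bus types and the generator list in the same place, so that they cannot get out of sync:

```julia
# MATPOWER bus type codes.
const PQ, PV, REF = 1, 2, 3

# Convert the buses listed in `demoted` from PV to PQ and drop them from
# the list of generator buses, so both arrays stay consistent.
function demote_pv_buses!(bustype::Vector{Int}, gen_bus::Vector{Int}, demoted::Vector{Int})
    for b in demoted
        bustype[b] == PV || error("bus $b is not a PV bus")
        bustype[b] = PQ
    end
    filter!(b -> !(b in demoted), gen_bus)
    pv = findall(==(PV), bustype)
    return pv, gen_bus
end

bustype = [3, 2, 2, 1]
gen_bus = [1, 2, 3]
pv, gens = demote_pv_buses!(bustype, gen_bus, [3])
```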

Multiple generators per bus

When we solve the power flow we usually have one equation per bus - we merge all the power injections in a bus.

In the OPF we have to keep track of each generator power injection to include in the cost function.

Our current formulation is unable to deal with this case. This affects the pegase and rte cases.
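One common way to handle this, shown here as a sketch rather than as the formulation to adopt, is a sparse bus-generator incidence matrix Cg (as in MATPOWER): the power flow sees the aggregated injection Cg * pg, while the cost function keeps working on the per-generator vector pg. The function name generator_incidence is hypothetical.

```julia
using SparseArrays

# Cg[i, j] = 1 if generator j is connected to bus i, and 0 otherwise.
function generator_incidence(gen_bus::Vector{Int}, nbus::Int)
    ngen = length(gen_bus)
    return sparse(gen_bus, collect(1:ngen), ones(ngen), nbus, ngen)
end

gen_bus = [1, 2, 2]            # two generators share bus 2
pg = [1.0, 0.5, 0.25]          # per-generator active power
Cg = generator_incidence(gen_bus, 3)
pbus = Cg * pg                 # aggregated injection per bus
```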

Support contingencies

We should allow the user to remove some lines when creating a model PolarForm.
Removing a line is easier than removing a bus (as we have less issues with indexing afterwards).
A plan could be:

  • remove the lines directly in PolarForm's constructor, according to the indexes specified in a contingencies argument. We need mostly to re-engineer the following line:
    https://github.com/exanauts/ExaPF.jl/blob/master/src/models/polar/polar.jl#L47
  • check that everything works as expected, as the topology of the network in PolarForm would now differ slightly from the model specified in the underlying PS.PowerNetwork, storing the original data
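The first step of the plan can be sketched as follows, assuming the branches are stored as parallel arrays of "from" and "to" bus indices. The function drop_lines and the argument name contingencies are placeholders, not the actual constructor signature:

```julia
# Return the branch arrays without the lines listed in `contingencies`.
function drop_lines(from_bus::Vector{Int}, to_bus::Vector{Int}, contingencies::Vector{Int})
    keep = setdiff(eachindex(from_bus), contingencies)
    return from_bus[keep], to_bus[keep]
end

from_bus = [1, 1, 2]
to_bus = [2, 3, 3]
new_from, new_to = drop_lines(from_bus, to_bus, [2])   # remove line 2 (1 -> 3)
```

Since bus indices are untouched, the admittance matrix can be rebuilt from the filtered arrays without re-indexing anything else.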

Allow for flexible indexing of buses

The Jacobian matrix structure depends on the indexing of the buses and the way we order the variable vector (i.e. [v1 v2 a1 a2] or [v1 a1 v2 a2], etc.). Having control over this might help us both enhance the performance of the linear solver and decrease divergent scenarios.
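To make the two orderings concrete, here is a small sketch (the helper interleave_perm is hypothetical) that builds the permutation from the blocked ordering [v1 v2 a1 a2] to the interleaved ordering [v1 a1 v2 a2], and applies it to a vector and to a matrix standing in for the Jacobian:

```julia
# Permutation mapping the blocked ordering [v1..vn, a1..an]
# to the interleaved ordering [v1, a1, v2, a2, ...].
function interleave_perm(n::Int)
    return vec(permutedims(reshape(collect(1:2n), n, 2)))
end

perm = interleave_perm(2)
x_blocked = ["v1", "v2", "a1", "a2"]
x_interleaved = x_blocked[perm]

# The same permutation reorders the rows and columns of a Jacobian-like matrix.
J = reshape(collect(1.0:16.0), 4, 4)
J_interleaved = J[perm, perm]
```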

Reviewers: @michel2323 , @frapac

Refactor tests

As a wise man once said:

I don't test the tests anymore since I don't know what to test

I would suggest splitting the runtests.jl file into several files, to test separately:

  • the behavior of PowerSystem
  • the parsing from PSSE and Matpower
  • the behavior of the costs function, the residual function and the other constraints we have to consider
  • the preconditioners
  • the Newton-Raphson algorithm (both on CPU and GPU)
  • the computation of the reduced gradients
  • the OPF resolution

Julia 1.6: unexpected error during compilation of overdub

When trying to run the tests on my local machine (Julia 1.6) on ExaPF#develop, I get:

Internal error: encountered unexpected error during compilation of overdub:
ErrorException("unsupported or misplaced expression "return" in function overdub")
jl_errorf at /buildworker/worker/package_linux64/build/src/rtutils.c:77
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:4581
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:4020
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:4262 [inlined]
<...>
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2238 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2420
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:839
unknown function (ip: (nil))
Computing Jacobian of residuals: Error During Test at /home/frapac/dev/anl/ExaPF.jl/test/powersystem.jl:103
  Got exception outside of a @test
  TaskFailedException
  Stacktrace:
    [1] wait
      @ ./task.jl:317 [inlined]
    [2] wait
      @ ~/.julia/packages/KernelAbstractions/jAutM/src/backends/cpu.jl:65 [inlined]
    [3] wait
      @ ~/.julia/packages/KernelAbstractions/jAutM/src/backends/cpu.jl:29 [inlined]
    [4] ExaPF.AutoDiff.Jacobian(structure::ExaPF.StateJacobianStructure{Vector{Float64}}, F::Vector{Float64}, vm::Vector{Float64}, va::Vector{Float64}, ybus_re::ExaPF.Spmat{Vector{Int64}, Vector{Float64}}, ybus_im::ExaPF.Spmat{Vector{Int64}, Vector{Float64}}, pinj::Vector{Float64}, qinj::Vector{Float64}, pv::Vector{Int64}, pq::Vector{Int64}, ref::Vector{Int64}, type::ExaPF.AutoDiff.StateJacobian)
      @ ExaPF.AutoDiff ~/dev/anl/ExaPF.jl/src/autodiff.jl:125
    [5] macro expansion
      @ ~/dev/anl/ExaPF.jl/test/powersystem.jl:116 [inlined]
    [6] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
    [7] macro expansion
      @ ~/dev/anl/ExaPF.jl/test/powersystem.jl:104 [inlined]
    [8] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
    [9] top-level scope
      @ ~/dev/anl/ExaPF.jl/test/powersystem.jl:12

My current setup:

(ExaPF) pkg> st
     Project ExaPF v0.4.0
      Status `~/dev/anl/ExaPF.jl/Project.toml`
  [052768ef] CUDA v2.3.0
  [6a86dc24] FiniteDiff v2.7.2
  [f6369f11] ForwardDiff v0.10.14
  [42fd0dbc] IterativeSolvers v0.8.4
  [63c18a36] KernelAbstractions v0.4.5
  [ba0b0d4f] Krylov v0.6.0
  [093fc24a] LightGraphs v1.3.4
  [b8f27783] MathOptInterface v0.9.19
  [2679e427] Metis v1.0.0
  [47a9eef4] SparseDiffTools v1.10.2
  [a759f4b9] TimerOutputs v0.5.7
  [e88e6eb3] Zygote v0.6.0
  [37e2e46d] LinearAlgebra
  [de0858da] Printf
  [2f01184e] SparseArrays

Implement a new structure NewtonRaphson to store options of the algorithm

As we did in the LinearSolvers submodule, it may be interesting to implement a NewtonRaphson structure to store all the options of the algorithm in one place, and to ease the implementation of other non-linear algorithms (such as the decoupled formulation introduced in the article Fast decoupled flows).

A prototype could be:

abstract type AbstractNonLinearSolver end 

struct NewtonRaphson <: AbstractNonLinearSolver 
    linear_solver::AbstractLinearSolver # direct or indirect 
    tolerance::Float64
    max_iter::Int
end

and we could dispatch the resolution of the powerflow equations with:

powerflow(
    polar::PolarForm, 
    buffer::PolarNetworkState,
    algo::NewtonRaphson,
)
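To show how the dispatch could look end to end, here is a self-contained sketch on a toy non-linear system. It deliberately leaves out the linear_solver field, and the entry point nlsolve, with its residual and jacobian callables, is hypothetical; the real powerflow function would take a PolarForm and a buffer instead:

```julia
using LinearAlgebra

abstract type AbstractNonLinearSolver end

# Options of the algorithm, stored in one place.
struct NewtonRaphson <: AbstractNonLinearSolver
    tolerance::Float64
    max_iter::Int
end

# Hypothetical generic entry point: dispatches on the algorithm type.
# Returns the solution, the number of Newton steps taken, and a convergence flag.
function nlsolve(algo::NewtonRaphson, residual, jacobian, x0)
    x = copy(x0)
    for iter in 1:algo.max_iter
        F = residual(x)
        norm(F, Inf) < algo.tolerance && return (x, iter - 1, true)
        x -= jacobian(x) \ F
    end
    return (x, algo.max_iter, norm(residual(x), Inf) < algo.tolerance)
end

# Toy problem: solve x^2 - 2 = 0.
residual(x) = [x[1]^2 - 2]
jacobian(x) = fill(2 * x[1], 1, 1)
x, iters, converged = nlsolve(NewtonRaphson(1e-10, 20), residual, jacobian, [1.0])
```

Another algorithm would then only need its own struct and its own nlsolve method.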

Lack of community guidelines

When reviewing your JuliaCon submission, JuliaCon/proceedings-review#72, I submitted a fix, #136, for an issue that was apparently already fixed on the develop branch. It would be helpful if there were clear guidelines for third parties wishing to 1) contribute to the software, 2) report issues or problems with the software, and 3) seek support.

Unable to run code in the Quick Start

In reviewing your JuliaCon submission, JuliaCon/proceedings-review#72, I have been exercising your code. I have been unable to run the code described in the quick start, both on the master branch and on the develop branch.

First I started making changes like the following

diff --git a/docs/src/quickstart.md b/docs/src/quickstart.md
index a9fc2c7..491fe75 100644
--- a/docs/src/quickstart.md
+++ b/docs/src/quickstart.md
@@ -69,6 +69,7 @@ in a few lines of code.
 We first instantiate a `PolarForm` object to adopt a polar formulation
 as a model:
 ```julia-repl
+julia> using KernelAbstractions
 julia> polar = PolarForm(pf, CPU())
 

@@ -90,7 +91,7 @@ Hence, the algorithm requires the following elements:

that translate to the Julia code:

-julia> physical_state = get(polar, PhysicalState())
+julia> physical_state = get(polar, ExaPF.PhysicalState())
julia> jx = ExaPF.init_ad_factory(polar, physical_state)
julia> linear_solver = DirectSolver()

but I ran into the issue that ERROR: UndefVarError: init_ad_factory not defined. I am stuck here.

p.s. it looks like the nested code blocks are messing up the formatting, not sure how to fix that.

Improve the computation of the adjoints on the GPU

At the moment, ExaPF is using its own hand-coded adjoints. Each adjoint proceeds in two steps, sequentially:

  1. compute the adjoints w.r.t. edges
  2. aggregate the edges' adjoints on the nodes

I think we could improve the existing implementation:

  1. for the first step (w.r.t. edges) I think it would be more efficient to parallelize w.r.t. the edges directly, thus avoiding the for loop inside the kernel. We could use directly the "from" and the "to" arrays (with size equal to nnz) that we use for the line flow constraints. As we have one element per branch, we won't have any race condition
  2. for the second step (w.r.t. nodes) I don't think we need to compute the transpose of the admittance matrix at all. Indeed, the node-node admittance Ybus is symmetric.

I think that could simplify the adjoint kernels a lot, and speed up the computation as it would induce fewer allocations.

Add support for single precision

Currently, only Float64 is supported. We should add support for Float32 as well.
PolarForm is already parameterized to support different precision, with its signature:

PolarForm{T, VI, VT, MT}

where T specifies the floating-point type.

What remains is to make the implementation generic.
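As a small sketch of what "generic" means in practice (ToyState, flat_start and the two halve functions are hypothetical and unrelated to the PolarForm internals): allocations and literals should follow the type parameter T, because a bare Float64 literal silently promotes Float32 data back to Float64.

```julia
# Hypothetical state container parameterized by the floating-point type.
struct ToyState{T<:AbstractFloat}
    vmag::Vector{T}
    vang::Vector{T}
end

# Allocate with the requested precision instead of a hard-coded Float64.
flat_start(::Type{T}, nbus::Int) where {T<:AbstractFloat} =
    ToyState{T}(ones(T, nbus), zeros(T, nbus))

# Pitfall: the literal 0.5 is a Float64 and promotes the result.
halve_promoting(s::ToyState) = 0.5 .* s.vmag

# Generic version: convert the literal to T first.
halve(s::ToyState{T}) where {T} = T(0.5) .* s.vmag

state = flat_start(Float32, 3)
```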

Store factorization in LinearSolver.DirectSolver

Now that we have implemented a wrapper to CUSOLVERRF, we should store the factorization of the powerflow matrix J inside the direct solver, to avoid refactorizing the matrix from scratch each time we are solving a linear system:

struct DirectSolver <: AbstractLinearSolver
    A_factorized::LinearAlgebra.Factorization
end

Preliminary results show that we could get roughly a 10x speed-up when solving the powerflow with CUSOLVERRF.
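On the CPU, the idea can be sketched with the standard library alone. This is an illustration, not the CUSOLVERRF wrapper: the type CachedDirectSolver and the function solve are hypothetical names, and the type parameter F keeps the field concretely typed.

```julia
using LinearAlgebra
using SparseArrays

# Hypothetical direct solver that stores the factorization of J.
struct CachedDirectSolver{F<:Factorization}
    factorization::F
end

# Factorize once when the solver is created.
CachedDirectSolver(A::AbstractMatrix) = CachedDirectSolver(lu(A))

# Each solve reuses the stored factorization instead of refactorizing.
solve(s::CachedDirectSolver, b::AbstractVector) = s.factorization \ b

J = sparse([4.0 1.0; 1.0 3.0])
solver = CachedDirectSolver(J)
x1 = solve(solver, [1.0, 2.0])
x2 = solve(solver, [0.0, 1.0])   # second right-hand side, no new factorization
```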

Non-deterministic test failure on a GPU system

With the changes in #136 I was able to successfully run the tests on my local computer using CUDA.jl. However, one time a test failed (see transcript below). At first glance I didn't see any non-determinism in the test so I wonder if the failure is related to issue #110 where other non-deterministic behaviour is described.

❯ julia --project=.                                        
               _                                           
   _       _ _(_)_     |  Documentation: https://docs.julialang.org                                                    
  (_)     | (_) (_)    |                                   
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.                                                        
  | | | | | | |/ _` |  |                                   
  | | |_| | | | (_| |  |  Version 1.6.0 (2021-03-24)       
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release                                                      
|__/                   |                                   

julia> versioninfo()                                       
Julia Version 1.6.0                                        
Commit f9720dc2eb (2021-03-24 12:55 UTC)                   
Platform Info:                                             
  OS: Linux (x86_64-pc-linux-gnu)                          
  CPU: AMD Ryzen 7 2700X Eight-Core Processor              
  WORD_SIZE: 64                                            
  LIBM: libopenlibm                                        
  LLVM: libLLVM-11.0.1 (ORCJIT, znver1)                    
Environment:                                               
  JULIA_MPI_BINARY = system                                

julia> using CUDA

julia> CUDA.versioninfo()
CUDA toolkit 11.0.3, artifact installation
CUDA driver 11.0.0
NVIDIA driver 450.102.4

Libraries: 
- CUBLAS: 11.2.0
- CURAND: 10.2.1
- CUFFT: 10.2.1
- CUSOLVER: 10.6.0
- CUSPARSE: 11.1.1
- CUPTI: 13.0.0
- NVML: 11.0.0+450.102.4
- CUDNN: 8.10.0 (for CUDA 11.2.0)
- CUTENSOR: 1.2.2 (for CUDA 11.1.0)

Toolchain:
- Julia: 1.6.0
- LLVM: 11.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
- Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80

1 device:
  0: GeForce RTX 2060 (sm_75, 4.622 GiB / 5.792 GiB available)

(ExaPF) pkg> test                                          
     Testing ExaPF                                         
      Status `/tmp/jl_8ipLGV/Project.toml`                 
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [052768ef] CUDA v2.6.2                                   
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [093fc24a] LightGraphs v1.3.5                            
  [b8f27783] MathOptInterface v0.9.20                      
  [2679e427] Metis v1.0.0                                  
  [47a9eef4] SparseDiffTools v1.13.0                       
  [a759f4b9] TimerOutputs v0.5.8                           
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [de0858da] Printf `@stdlib/Printf`                       
  [9a3f8284] Random `@stdlib/Random`                       
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [8dfed614] Test `@stdlib/Test`                           
      Status `/tmp/jl_8ipLGV/Manifest.toml`                
  [621f4979] AbstractFFTs v1.0.1                           
  [79e6a3ab] Adapt v3.2.0                                  
  [ec485272] ArnoldiMethod v0.1.0                          
  [4fba245c] ArrayInterface v3.1.6                         
  [ab4f0b2a] BFloat16s v0.1.0                              
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [b99e7846] BinaryProvider v0.5.10                        
  [fa961155] CEnum v0.4.1                                  
  [052768ef] CUDA v2.6.2                                   
  [7057c7e9] Cassette v0.3.5                               
  [d360d2e6] ChainRulesCore v0.9.33                        
  [523fee87] CodecBzip2 v0.7.2                             
  [944b1d66] CodecZlib v0.7.0                              
  [bbf7d656] CommonSubexpressions v0.3.0                   
  [34da2185] Compat v3.25.0                                
  [864edb3b] DataStructures v0.18.9                        
  [163ba53b] DiffResults v1.0.3                            
  [b552c78f] DiffRules v1.0.2                              
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [e2ba6199] ExprTools v0.1.3                              
  [9aa1b823] FastClosures v0.3.2                           
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [0c68f7d7] GPUArrays v6.2.0                              
  [61eb1bfa] GPUCompiler v0.10.0                           
  [cd3eb016] HTTP v0.9.5                                   
  [615f187c] IfElse v0.1.0                                 
  [d25df0c9] Inflate v0.1.2                                
  [83e8ac13] IniFile v0.5.0                                
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [692b3bcd] JLLWrappers v1.2.0                            
  [682c06a0] JSON v0.21.1                                  
  [7d188eb4] JSONSchema v0.3.3                             
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [929cbde3] LLVM v3.6.0                                   
  [093fc24a] LightGraphs v1.3.5                            
  [5c8ed15e] LinearOperators v1.1.0                        
  [1914dd2f] MacroTools v0.5.6                             
  [b8f27783] MathOptInterface v0.9.20                      
  [fdba3010] MathProgBase v0.7.8                           
  [739be429] MbedTLS v1.0.3                                
  [c03570c3] Memoize v0.4.4                                
  [2679e427] Metis v1.0.0                                  
  [d8a4904e] MutableArithmetics v0.2.14                    
  [872c559c] NNlib v0.7.17                                 
  [77ba4419] NaNMath v0.3.5                                
  [bac558e1] OrderedCollections v1.4.0                     
  [69de0a69] Parsers v1.1.0                                
  [3cdcf5f2] RecipesBase v1.1.1                            
  [189a3867] Reexport v1.0.0                               
  [ae029012] Requires v1.1.3                               
  [6c6a2e73] Scratch v1.0.3                                
  [699a6c99] SimpleTraits v0.9.3                           
  [47a9eef4] SparseDiffTools v1.13.0                       
  [276daf66] SpecialFunctions v1.3.0                       
  [aedffcd0] Static v0.2.4                                 
  [90137ffa] StaticArrays v1.0.1                           
  [a759f4b9] TimerOutputs v0.5.8                           
  [3bb67fe8] TranscodingStreams v0.9.5                     
  [5c2747f8] URIs v1.2.0                                   
  [19fa3120] VertexSafeGraphs v0.1.2                       
  [a5390f91] ZipFile v0.9.3                                
  [ae81ac8f] ASL_jll v0.1.1+4                              
  [6e34b625] Bzip2_jll v1.0.6+5                            
  [9cc047cb] Ipopt_jll v3.13.4+0                           
  [d00139f3] METIS_jll v5.1.0+5                            
  [d7ed1dd3] MUMPS_seq_jll v5.2.1+4                        
  [656ef2d0] OpenBLAS32_jll v0.3.12+1                      
  [efe28fd5] OpenSpecFun_jll v0.5.3+4                      
  [0dad84c5] ArgTools `@stdlib/ArgTools`                   
  [56f22d72] Artifacts `@stdlib/Artifacts`                 
  [2a0f44e3] Base64 `@stdlib/Base64`                       
  [ade2ca70] Dates `@stdlib/Dates`                         
  [8bb1440f] DelimitedFiles `@stdlib/DelimitedFiles`       
  [8ba89e20] Distributed `@stdlib/Distributed`             
  [f43a241f] Downloads `@stdlib/Downloads`                 
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`   
  [4af54fe1] LazyArtifacts `@stdlib/LazyArtifacts`         
  [b27032c2] LibCURL `@stdlib/LibCURL`                     
  [76f85450] LibGit2 `@stdlib/LibGit2`                     
  [8f399da3] Libdl `@stdlib/Libdl`                         
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [56ddb016] Logging `@stdlib/Logging`                     
  [d6f4376e] Markdown `@stdlib/Markdown`                   
  [a63ad114] Mmap `@stdlib/Mmap`                           
  [ca575930] NetworkOptions `@stdlib/NetworkOptions`       
  [44cfe95a] Pkg `@stdlib/Pkg`                             
  [de0858da] Printf `@stdlib/Printf`                       
  [3fa0cd96] REPL `@stdlib/REPL`                           
  [9a3f8284] Random `@stdlib/Random`                       
  [ea8e919c] SHA `@stdlib/SHA`                             
  [9e88b42a] Serialization `@stdlib/Serialization`         
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`           
  [6462fe0b] Sockets `@stdlib/Sockets`                     
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [10745b16] Statistics `@stdlib/Statistics`               
  [fa267f1f] TOML `@stdlib/TOML`                           
  [a4e569a6] Tar `@stdlib/Tar`                             
  [8dfed614] Test `@stdlib/Test`                           
  [cf7118a7] UUIDs `@stdlib/UUIDs`                         
  [4ec0a83e] Unicode `@stdlib/Unicode`                     
  [e66e0078] CompilerSupportLibraries_jll `@stdlib/CompilerSupportLibraries_jll`                                       
  [deac9b47] LibCURL_jll `@stdlib/LibCURL_jll`             
  [29816b5a] LibSSH2_jll `@stdlib/LibSSH2_jll`             
  [c8ffd9c3] MbedTLS_jll `@stdlib/MbedTLS_jll`             
  [14a3606d] MozillaCACerts_jll `@stdlib/MozillaCACerts_jll`                                                           
  [83775a58] Zlib_jll `@stdlib/Zlib_jll`                   
  [8e850ede] nghttp2_jll `@stdlib/nghttp2_jll`             
  [3f19e933] p7zip_jll `@stdlib/p7zip_jll`                 
  Progress [========================================>]  22/22                                                          
22 dependencies successfully precompiled in 42 seconds (53 already precompiled)                                        
     Testing Running tests...                              
Reading PSSE format                                        
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`                                                                                                      
└ @ GPUArrays ~/.julia/packages/GPUArrays/WV76E/src/host/indexing.jl:43                                                
Test Summary:        | Pass  Total                         
Problem formulations |   85     85                         
Test Summary:     | Pass  Total                            
Iterative solvers |   36     36                            
Reading PSSE format                                        
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: Newton-Raphson algorithm failed to converge (182.29864183960703)                                            
└ @ ExaPF ~/research/code/ExaPF.jl/src/Evaluators/reduced_evaluator.jl:105                                             
Test API on CUDADevice(): Test Failed at /home/lucas/research/code/ExaPF.jl/test/reduced_evaluator.jl:65               
  Expression: isapprox(grad_fd, g, rtol = 0.0001)          
   Evaluated: isapprox([-9.935589542501313, 83.29418125602783, 176.58606788964127, 0.0, 88.79764713761492, 61.87925810156231, 5.407209203348859, -16.0610823203679, 21.697626222124487, -5.750990592075815, -11.972362422178843], [-9.935589556165517, 83.29418101143494, 176.5860680653392, 54.61617710821696, 88.79764714293702, 61.87925824943278, 5.407209536445407, -16.061082207200116, 21.697626170063813, -5.75099045219747, -11.97236266954647]; rtol = 0.0001)                      
Stacktrace:                                                
 [1] macro expansion                                       
   @ ~/research/code/ExaPF.jl/test/reduced_evaluator.jl:65 [inlined]                                                   
 [2] macro expansion                                       
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1226 [inlined]             
 [3] macro expansion                                       
   @ ~/research/code/ExaPF.jl/test/reduced_evaluator.jl:22 [inlined]                                                   
 [4] top-level scope                                       
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1226                       
┌ Warning: Passing optimizer attributes as keyword arguments to                                                        
│ `Ipopt.Optimizer` is deprecated. Use                     
│     MOI.set(model, MOI.RawParameter("key"), value)       
│ or                                                       
│     JuMP.set_optimizer_attribute(model, "key", value)    
│ instead.                                                 
└ @ Ipopt ~/.julia/packages/Ipopt/P1XLY/src/MOI_wrapper.jl:88                                                          

******************************************************************************                                         
This program contains Ipopt, a library for large-scale nonlinear optimization.                                         
 Ipopt is released as open source code under the Eclipse Public License (EPL).                                         
         For more information visit https://github.com/coin-or/Ipopt                                                   
******************************************************************************                                         

Test Summary:                       | Pass  Fail  Total    
Optimization evaluators             |  192     1    193    
  Powerflow solver                  |   36           36    
  Compute reduced gradient on CPU   |    8            8    
  ReducedSpaceEvaluators (case9.m)  |   62           62    
  ReducedSpaceEvaluators (case30.m) |   61     1     62    
    Constructor                     |   10           10    
    Constructor                     |   10           10    
    Test API on CPU()               |   21           21    
    Test API on CUDADevice()        |   20     1     21    
  PenaltyEvaluators                 |   10           10    
  AugLagEvaluators                  |   13           13    
  MOI wrapper                       |    2            2    
ERROR: LoadError: Some tests did not pass: 192 passed, 1 failed, 0 errored, 0 broken.                                  
in expression starting at /home/lucas/research/code/ExaPF.jl/test/runtests.jl:29                                       
ERROR: Package ExaPF errored during testing                

(ExaPF) pkg> test                                          
     Testing ExaPF                                         
      Status `/tmp/jl_xEjmzT/Project.toml`                 
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [052768ef] CUDA v2.6.2                                   
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [093fc24a] LightGraphs v1.3.5                            
  [b8f27783] MathOptInterface v0.9.20                      
  [2679e427] Metis v1.0.0                                  
  [47a9eef4] SparseDiffTools v1.13.0                       
  [a759f4b9] TimerOutputs v0.5.8                           
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [de0858da] Printf `@stdlib/Printf`                       
  [9a3f8284] Random `@stdlib/Random`                       
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [8dfed614] Test `@stdlib/Test`                           
      Status `/tmp/jl_xEjmzT/Manifest.toml`                
  [621f4979] AbstractFFTs v1.0.1                           
  [79e6a3ab] Adapt v3.2.0                                  
  [ec485272] ArnoldiMethod v0.1.0                          
  [4fba245c] ArrayInterface v3.1.6                         
  [ab4f0b2a] BFloat16s v0.1.0                              
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [b99e7846] BinaryProvider v0.5.10                        
  [fa961155] CEnum v0.4.1                                  
  [052768ef] CUDA v2.6.2                                   
  [7057c7e9] Cassette v0.3.5                               
  [d360d2e6] ChainRulesCore v0.9.33                        
  [523fee87] CodecBzip2 v0.7.2                             
  [944b1d66] CodecZlib v0.7.0                              
  [bbf7d656] CommonSubexpressions v0.3.0                   
  [34da2185] Compat v3.25.0                                
  [864edb3b] DataStructures v0.18.9                        
  [163ba53b] DiffResults v1.0.3                            
  [b552c78f] DiffRules v1.0.2                              
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [e2ba6199] ExprTools v0.1.3                              
  [9aa1b823] FastClosures v0.3.2                           
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [0c68f7d7] GPUArrays v6.2.0                              
  [61eb1bfa] GPUCompiler v0.10.0                           
  [cd3eb016] HTTP v0.9.5                                   
  [615f187c] IfElse v0.1.0                                 
  [d25df0c9] Inflate v0.1.2                                
  [83e8ac13] IniFile v0.5.0                                
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [692b3bcd] JLLWrappers v1.2.0                            
  [682c06a0] JSON v0.21.1                                  
  [7d188eb4] JSONSchema v0.3.3                             
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [929cbde3] LLVM v3.6.0                                   
  [093fc24a] LightGraphs v1.3.5                            
  [5c8ed15e] LinearOperators v1.1.0                        
  [1914dd2f] MacroTools v0.5.6                             
  [b8f27783] MathOptInterface v0.9.20                      
  [fdba3010] MathProgBase v0.7.8                           
  [739be429] MbedTLS v1.0.3                                
  [c03570c3] Memoize v0.4.4                                
  [2679e427] Metis v1.0.0                                  
  [d8a4904e] MutableArithmetics v0.2.14                    
  [872c559c] NNlib v0.7.17                                 
  [77ba4419] NaNMath v0.3.5                                
  [bac558e1] OrderedCollections v1.4.0                     
  [69de0a69] Parsers v1.1.0                                
  [3cdcf5f2] RecipesBase v1.1.1                            
  [189a3867] Reexport v1.0.0                               
  [ae029012] Requires v1.1.3                               
  [6c6a2e73] Scratch v1.0.3                                
  [699a6c99] SimpleTraits v0.9.3                           
  [47a9eef4] SparseDiffTools v1.13.0                       
  [276daf66] SpecialFunctions v1.3.0                       
  [aedffcd0] Static v0.2.4                                 
  [90137ffa] StaticArrays v1.0.1                           
  [a759f4b9] TimerOutputs v0.5.8                           
  [3bb67fe8] TranscodingStreams v0.9.5                     
  [5c2747f8] URIs v1.2.0                                   
  [19fa3120] VertexSafeGraphs v0.1.2                       
  [a5390f91] ZipFile v0.9.3                                
  [ae81ac8f] ASL_jll v0.1.1+4                              
  [6e34b625] Bzip2_jll v1.0.6+5                            
  [9cc047cb] Ipopt_jll v3.13.4+0                           
  [d00139f3] METIS_jll v5.1.0+5                            
  [d7ed1dd3] MUMPS_seq_jll v5.2.1+4                        
  [656ef2d0] OpenBLAS32_jll v0.3.12+1                      
  [efe28fd5] OpenSpecFun_jll v0.5.3+4                      
  [0dad84c5] ArgTools `@stdlib/ArgTools`                   
  [56f22d72] Artifacts `@stdlib/Artifacts`                 
  [2a0f44e3] Base64 `@stdlib/Base64`                       
  [ade2ca70] Dates `@stdlib/Dates`                         
  [8bb1440f] DelimitedFiles `@stdlib/DelimitedFiles`       
  [8ba89e20] Distributed `@stdlib/Distributed`             
  [f43a241f] Downloads `@stdlib/Downloads`                 
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`   
  [4af54fe1] LazyArtifacts `@stdlib/LazyArtifacts`         
  [b27032c2] LibCURL `@stdlib/LibCURL`                     
  [76f85450] LibGit2 `@stdlib/LibGit2`                     
  [8f399da3] Libdl `@stdlib/Libdl`                         
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [56ddb016] Logging `@stdlib/Logging`                     
  [d6f4376e] Markdown `@stdlib/Markdown`                   
  [a63ad114] Mmap `@stdlib/Mmap`                           
  [ca575930] NetworkOptions `@stdlib/NetworkOptions`       
  [44cfe95a] Pkg `@stdlib/Pkg`                             
  [de0858da] Printf `@stdlib/Printf`                       
  [3fa0cd96] REPL `@stdlib/REPL`                           
  [9a3f8284] Random `@stdlib/Random`                       
  [ea8e919c] SHA `@stdlib/SHA`                             
  [9e88b42a] Serialization `@stdlib/Serialization`         
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`           
  [6462fe0b] Sockets `@stdlib/Sockets`                     
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [10745b16] Statistics `@stdlib/Statistics`               
  [fa267f1f] TOML `@stdlib/TOML`                           
  [a4e569a6] Tar `@stdlib/Tar`                             
  [8dfed614] Test `@stdlib/Test`                           
  [cf7118a7] UUIDs `@stdlib/UUIDs`                         
  [4ec0a83e] Unicode `@stdlib/Unicode`                     
  [e66e0078] CompilerSupportLibraries_jll `@stdlib/CompilerSupportLibraries_jll`                                       
  [deac9b47] LibCURL_jll `@stdlib/LibCURL_jll`             
  [29816b5a] LibSSH2_jll `@stdlib/LibSSH2_jll`             
  [c8ffd9c3] MbedTLS_jll `@stdlib/MbedTLS_jll`             
  [14a3606d] MozillaCACerts_jll `@stdlib/MozillaCACerts_jll`                                                           
  [83775a58] Zlib_jll `@stdlib/Zlib_jll`                   
  [8e850ede] nghttp2_jll `@stdlib/nghttp2_jll`             
  [3f19e933] p7zip_jll `@stdlib/p7zip_jll`                 
     Testing Running tests...                              
Reading PSSE format                                        
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`                                                                                                      
└ @ GPUArrays ~/.julia/packages/GPUArrays/WV76E/src/host/indexing.jl:43                                                
Test Summary:        | Pass  Total                         
Problem formulations |   85     85                         
Test Summary:     | Pass  Total                            
Iterative solvers |   36     36                            
Reading PSSE format                                        
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: Passing optimizer attributes as keyword arguments to                                                        
│ `Ipopt.Optimizer` is deprecated. Use                     
│     MOI.set(model, MOI.RawParameter("key"), value)       
│ or                                                       
│     JuMP.set_optimizer_attribute(model, "key", value)    
│ instead.                                                 
└ @ Ipopt ~/.julia/packages/Ipopt/P1XLY/src/MOI_wrapper.jl:88                                                          

******************************************************************************                                         
This program contains Ipopt, a library for large-scale nonlinear optimization.                                         
 Ipopt is released as open source code under the Eclipse Public License (EPL).                                         
         For more information visit https://github.com/coin-or/Ipopt                                                   
******************************************************************************                                         

Test Summary:           | Pass  Total                      
Optimization evaluators |  193    193                      
Test Summary:            | Pass  Total                     
Reduced space algorithms |    2      2                     
     Testing ExaPF tests passed                            

(ExaPF) pkg> test                                          
     Testing ExaPF                                         
      Status `/tmp/jl_rbh8TS/Project.toml`                 
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [052768ef] CUDA v2.6.2                                   
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [093fc24a] LightGraphs v1.3.5                            
  [b8f27783] MathOptInterface v0.9.20                      
  [2679e427] Metis v1.0.0                                  
  [47a9eef4] SparseDiffTools v1.13.0                       
  [a759f4b9] TimerOutputs v0.5.8                           
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [de0858da] Printf `@stdlib/Printf`                       
  [9a3f8284] Random `@stdlib/Random`                       
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [8dfed614] Test `@stdlib/Test`                           
      Status `/tmp/jl_rbh8TS/Manifest.toml`                
  [621f4979] AbstractFFTs v1.0.1                           
  [79e6a3ab] Adapt v3.2.0                                  
  [ec485272] ArnoldiMethod v0.1.0                          
  [4fba245c] ArrayInterface v3.1.6                         
  [ab4f0b2a] BFloat16s v0.1.0                              
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [b99e7846] BinaryProvider v0.5.10                        
  [fa961155] CEnum v0.4.1                                  
  [052768ef] CUDA v2.6.2                                   
  [7057c7e9] Cassette v0.3.5                               
  [d360d2e6] ChainRulesCore v0.9.33                        
  [523fee87] CodecBzip2 v0.7.2                             
  [944b1d66] CodecZlib v0.7.0                              
  [bbf7d656] CommonSubexpressions v0.3.0                   
  [34da2185] Compat v3.25.0                                
  [864edb3b] DataStructures v0.18.9                        
  [163ba53b] DiffResults v1.0.3                            
  [b552c78f] DiffRules v1.0.2                              
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [e2ba6199] ExprTools v0.1.3                              
  [9aa1b823] FastClosures v0.3.2                           
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [0c68f7d7] GPUArrays v6.2.0                              
  [61eb1bfa] GPUCompiler v0.10.0                           
  [cd3eb016] HTTP v0.9.5                                   
  [615f187c] IfElse v0.1.0                                 
  [d25df0c9] Inflate v0.1.2                                
  [83e8ac13] IniFile v0.5.0                                
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [692b3bcd] JLLWrappers v1.2.0                            
  [682c06a0] JSON v0.21.1                                  
  [7d188eb4] JSONSchema v0.3.3                             
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [929cbde3] LLVM v3.6.0                                   
  [093fc24a] LightGraphs v1.3.5                            
  [5c8ed15e] LinearOperators v1.1.0                        
  [1914dd2f] MacroTools v0.5.6                             
  [b8f27783] MathOptInterface v0.9.20                      
  [fdba3010] MathProgBase v0.7.8                           
  [739be429] MbedTLS v1.0.3                                
  [c03570c3] Memoize v0.4.4                                
  [2679e427] Metis v1.0.0                                  
  [d8a4904e] MutableArithmetics v0.2.14                    
  [872c559c] NNlib v0.7.17                                 
  [77ba4419] NaNMath v0.3.5                                
  [bac558e1] OrderedCollections v1.4.0                     
  [69de0a69] Parsers v1.1.0                                
  [3cdcf5f2] RecipesBase v1.1.1                            
  [189a3867] Reexport v1.0.0                               
  [ae029012] Requires v1.1.3                               
  [6c6a2e73] Scratch v1.0.3                                
  [699a6c99] SimpleTraits v0.9.3                           
  [47a9eef4] SparseDiffTools v1.13.0                       
  [276daf66] SpecialFunctions v1.3.0                       
  [aedffcd0] Static v0.2.4                                 
  [90137ffa] StaticArrays v1.0.1                           
  [a759f4b9] TimerOutputs v0.5.8                           
  [3bb67fe8] TranscodingStreams v0.9.5                     
  [5c2747f8] URIs v1.2.0                                   
  [19fa3120] VertexSafeGraphs v0.1.2                       
  [a5390f91] ZipFile v0.9.3                                
  [ae81ac8f] ASL_jll v0.1.1+4                              
  [6e34b625] Bzip2_jll v1.0.6+5                            
  [9cc047cb] Ipopt_jll v3.13.4+0                           
  [d00139f3] METIS_jll v5.1.0+5                            
  [d7ed1dd3] MUMPS_seq_jll v5.2.1+4                        
  [656ef2d0] OpenBLAS32_jll v0.3.12+1                      
  [efe28fd5] OpenSpecFun_jll v0.5.3+4                      
  [0dad84c5] ArgTools `@stdlib/ArgTools`                   
  [56f22d72] Artifacts `@stdlib/Artifacts`                 
  [2a0f44e3] Base64 `@stdlib/Base64`                       
  [ade2ca70] Dates `@stdlib/Dates`                         
  [8bb1440f] DelimitedFiles `@stdlib/DelimitedFiles`       
  [8ba89e20] Distributed `@stdlib/Distributed`             
  [f43a241f] Downloads `@stdlib/Downloads`                 
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`   
  [4af54fe1] LazyArtifacts `@stdlib/LazyArtifacts`         
  [b27032c2] LibCURL `@stdlib/LibCURL`                     
  [76f85450] LibGit2 `@stdlib/LibGit2`                     
  [8f399da3] Libdl `@stdlib/Libdl`                         
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [56ddb016] Logging `@stdlib/Logging`                     
  [d6f4376e] Markdown `@stdlib/Markdown`                   
  [a63ad114] Mmap `@stdlib/Mmap`                           
  [ca575930] NetworkOptions `@stdlib/NetworkOptions`       
  [44cfe95a] Pkg `@stdlib/Pkg`                             
  [de0858da] Printf `@stdlib/Printf`                       
  [3fa0cd96] REPL `@stdlib/REPL`                           
  [9a3f8284] Random `@stdlib/Random`                       
  [ea8e919c] SHA `@stdlib/SHA`                             
  [9e88b42a] Serialization `@stdlib/Serialization`         
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`           
  [6462fe0b] Sockets `@stdlib/Sockets`                     
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [10745b16] Statistics `@stdlib/Statistics`               
  [fa267f1f] TOML `@stdlib/TOML`                           
  [a4e569a6] Tar `@stdlib/Tar`                             
  [8dfed614] Test `@stdlib/Test`                           
  [cf7118a7] UUIDs `@stdlib/UUIDs`                         
  [4ec0a83e] Unicode `@stdlib/Unicode`                     
  [e66e0078] CompilerSupportLibraries_jll `@stdlib/CompilerSupportLibraries_jll`                                       
  [deac9b47] LibCURL_jll `@stdlib/LibCURL_jll`             
  [29816b5a] LibSSH2_jll `@stdlib/LibSSH2_jll`             
  [c8ffd9c3] MbedTLS_jll `@stdlib/MbedTLS_jll`             
  [14a3606d] MozillaCACerts_jll `@stdlib/MozillaCACerts_jll`                                                           
  [83775a58] Zlib_jll `@stdlib/Zlib_jll`                   
  [8e850ede] nghttp2_jll `@stdlib/nghttp2_jll`             
  [3f19e933] p7zip_jll `@stdlib/p7zip_jll`                 
     Testing Running tests...                              
Reading PSSE format                                        
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`                                                                                                      
└ @ GPUArrays ~/.julia/packages/GPUArrays/WV76E/src/host/indexing.jl:43                                                
Test Summary:        | Pass  Total                         
Problem formulations |   85     85                         
Test Summary:     | Pass  Total                            
Iterative solvers |   36     36                            
Reading PSSE format                                        
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: Passing optimizer attributes as keyword arguments to                                                        
│ `Ipopt.Optimizer` is deprecated. Use                     
│     MOI.set(model, MOI.RawParameter("key"), value)       
│ or                                                       
│     JuMP.set_optimizer_attribute(model, "key", value)    
│ instead.                                                 
└ @ Ipopt ~/.julia/packages/Ipopt/P1XLY/src/MOI_wrapper.jl:88                                                          

******************************************************************************                                         
This program contains Ipopt, a library for large-scale nonlinear optimization.                                         
 Ipopt is released as open source code under the Eclipse Public License (EPL).                                         
         For more information visit https://github.com/coin-or/Ipopt                                                   
******************************************************************************                                         

Test Summary:           | Pass  Total                      
Optimization evaluators |  193    193                      
Test Summary:            | Pass  Total                     
Reduced space algorithms |    2      2                     
     Testing ExaPF tests passed                            

(ExaPF) pkg>                                               

Architecture and SIMD agnostic

Right now, kernel.jl implements a CPU and GPU abstraction. It generates either CUDA kernels or regular functions that run on the CPU. Other details are also needed to make target = "cpu"/"cuda" work. Macros are used widely, and the current design allows only one instantiation of either the CUDA or the CPU code.

This should be revisited with another, more flexible abstraction. What is decided here also affects what a framework for nonlinear equations or optimization would look like.
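One possible direction, shown here only as a minimal sketch, is to select the backend by dispatching on a device value instead of a global target string and macros. Every name below (AbstractDevice, HostDevice, ThreadedDevice, launch!, residual_entry) is hypothetical and is not taken from kernel.jl or from the ExaPF API. The threaded backend is only a stand-in for a second target so that the example runs without a GPU.

```julia
# Hypothetical device types; illustrative only, not ExaPF's API.
abstract type AbstractDevice end
struct HostDevice <: AbstractDevice end
struct ThreadedDevice <: AbstractDevice end   # stand-in for a second backend

# The kernel body is written once and knows nothing about the backend.
@inline residual_entry(v, p, i) = v[i]^2 - p[i]

# Each backend only decides how the index space is traversed.
function launch!(::HostDevice, f, out, args...)
    for i in eachindex(out)
        out[i] = f(args..., i)
    end
    return out
end

function launch!(::ThreadedDevice, f, out, args...)
    Threads.@threads for i in eachindex(out)
        out[i] = f(args..., i)
    end
    return out
end

v = [1.0, 2.0, 3.0]
p = [1.0, 1.0, 1.0]

# Both backends can be instantiated in the same session.
r_host = launch!(HostDevice(), residual_entry, zeros(3), v, p)
r_thr  = launch!(ThreadedDevice(), residual_entry, zeros(3), v, p)
```

With this layout, a GPU target would be one more launch! method, and CPU and GPU code paths could coexist in one session. KernelAbstractions, which already appears in the dependency list, follows a similar idea, although whether it fits here is a separate question.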

Non-deterministic behavior when calling pf multiple times from REPL

I ran into non-deterministic behavior. It might have been spurious, but I am noting it here.

target = "cpu" 
include("examples/pf.jl")
include("examples/pf.jl")
datafile = "GO-Data/datasets/Trial_3_Real-Time/Network_13R-015/scenario_11/case.raw"                                                                                                                     
sol, conv, res = pf(datafile, 100)
sol, conv, res = pf(datafile, 100)
sol, conv, res = pf(datafile, 100)

OUTPUT:

julia> target = "cpu"                                                                                                                                                                                              
"cpu"                                                                                                                                                                                                              
                                                                                                                                                                                                                   
julia> include("examples/pf.jl")                                                                                                                                                                                   
[ Info: Precompiling PowerFlow [0cf0e50c-a82e-488f-ac7e-41ffdff1b8aa]                                                                                                                                              
[ Info: Skipping precompilation since __precompile__(false). Importing PowerFlow [0cf0e50c-a82e-488f-ac7e-41ffdff1b8aa].                                                                                           
pf (generic function with 1 method)                                                                                                                                                                                
                                                                                                                                                                                                                   
julia> include("examples/pf.jl")                                                                                                                                                                                   
pf (generic function with 1 method)                                                                                                                                                                                
                                                                                                                                                                                                                   
julia> datafile = "GO-Data/datasets/Trial_3_Real-Time/Network_13R-015/scenario_11/case.raw"                                                                                                                        
"GO-Data/datasets/Trial_3_Real-Time/Network_13R-015/scenario_11/case.raw" 

julia> sol, conv, res = pf(datafile, 100)                                                                                                                                                                          
Target set to cpu                                                                                                                                                                                                  
npartitions = 100                                                                                                                                                                                                  
Blocksize: n = 190.68 Mbytes = 27.73961059570313                                                                                                                                                                   
Partitioning...                                                                                                                                                                                                    
size(A) = (19068, 19068)                                                                                                                                                                                           
Creating matrix                                                                                                                                                                                                    
100 partitions created                                                                                                                                                                                             
Coloring...                                                                                                                                                                                                        
Number of Jacobian colors: 24                                                                                                                                                                                      
Creating arrays...                                                                                                                                                                                                 
Iteration 0. Residual norm: 6.99755.                                                                                                                                                                               
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 187                                                                                                                                                                                 
Iteration 1. Residual norm: 0.748446.                                                                                                                                                                              
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 182                                                                                                                                                                                 
Iteration 2. Residual norm: 0.00999811.                                                                                                                                                                            
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 174                                                                                                                                                                                 
Iteration 3. Residual norm: 1.48092e-06.                                                                                                                                                                           
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 1                                                                                                                                                                                   
Iteration 4. Residual norm: 7.41207e-07.                                                                                                                                                                           
N-R converged in 4 iterations.
 ──────────────────────────────────────────────────────────────────────────────
                                       Time                   Allocations
                               ──────────────────────   ───────────────────────
       Tot / % measured:             367s / 6.37%           17.0GiB / 81.0%

 Section               ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────
 Newton                     1    23.1s  99.0%   23.1s   13.7GiB  100%   13.7GiB
   CPU-BICGSTAB             4    12.3s  52.9%   3.09s   1.55GiB  11.3%   397MiB
   Preconditioner           4    9.57s  41.0%   2.39s   11.9GiB  86.8%  2.98GiB
   Jacobian                 4    921ms  3.95%   230ms    177MiB  1.25%  44.2MiB
     Function               4    405ms  1.73%   101ms   59.6MiB  0.42%  14.9MiB
     Before                 4    165ms  0.71%  41.3ms   20.7MiB  0.15%  5.18MiB
       Seeding              4   38.2ms  0.16%  9.54ms   3.73MiB  0.03%   955KiB
     Uncompress             4    106ms  0.46%  26.6ms   60.4MiB  0.43%  15.1MiB
     Get partials           4   82.2ms  0.35%  20.6ms   15.9MiB  0.11%  3.97MiB
   Residual function        4   6.24ms  0.03%  1.56ms      192B  0.00%    48.0B
   Norm                     4    185μs  0.00%  46.4μs      192B  0.00%    48.0B
 Coloring                   1    236ms  1.01%   236ms   64.7MiB  0.46%  64.7MiB
 ──────────────────────────────────────────────────────────────────────────────(Complex{Float64}[1.075782238095634 - 0.08194689497419173im, 1.0766510629186816 - 0.09983266831460846im, 1.043920757073615 - 5.4910343763173907e-5im, 1.0710716802939646 - 0.13615768251957758im, 1.0711041707137228 - 0.13610482491837006im, 1.0516902702058963 + 0.02577228155565798im, 1.053451801061625 + 0.025933998633740536im, 1.0551558374798902 - 0.2043446390348763im, 1.0538188521223335 - 0.19550711663748885im, 1.0704709320233152 - 0.07967418915897466im  …  0.7319608443063398 - 0.7601390105560851im, 0.7137795618625866 - 0.7131617888440559im, 0.7189463033053043 - 0.707952832442697im, 0.7310298350480449 - 0.7599475325141657im, 0.7319608443063398 - 0.7601390105560851im, 0.8086329904117056 - 0.6776968397126093im, 0.7636660778206675 - 0.730404016112231im, 0.8270028873010823 - 0.6469669422742351im, 0.8123432972154532 - 0.6764291142222301im, 0.8086329904117056 - 0.6776968397126093im], true, 7.412067013490287e-7)

julia> sol, conv, res = pf(datafile, 100)
Target set to cpu
npartitions = 100
Blocksize: n = 190.68 Mbytes = 27.73961059570313
Partitioning...
size(A) = (19068, 19068)
Creating matrix
100 partitions created
Coloring...
Number of Jacobian colors: 25
Creating arrays...
Iteration 0. Residual norm: 6.99755.
ERROR: BoundsError: attempt to access 25×19068 Array{Float64,2} at index [2755, 2]
Stacktrace:
 [1] getindex(::Array{Float64,2}, ::Int64, ::Int64) at ./array.jl:789
 [2] macro expansion at /home/maldonadod/Projects/powerflow.jl/src/ad.jl:85 [inlined]
 [3] macro expansion at /home/maldonadod/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:229 [inlined]
 [4] residualJacobianAD!(::SparseArrays.SparseMatrixCSC{Float64,Int64}, ::typeof(PowerFlow.residualFunction_polar!), ::PowerFlow.AD.comparrays, ::Array{Int64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::PowerFlow.Spmat{Array{T,1} where T}, ::PowerFlow.Spmat{Array{T,1} where T}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Int64,1}, ::Array{Int64,1}, ::Int64, ::TimerOutputs.TimerOutput) at /home/maldonadod/Projects/powerflow.jl/src/ad.jl:80
 [5] macro expansion at /home/maldonadod/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:229 [inlined]
 [6] macro expansion at /home/maldonadod/Projects/powerflow.jl/src/PowerFlow.jl:390 [inlined]
 [7] macro expansion at /home/maldonadod/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:229 [inlined]
 [8] solve(::Pf, ::Int64) at /home/maldonadod/Projects/powerflow.jl/src/PowerFlow.jl:385
 [9] pf(::String, ::Int64) at /home/maldonadod/Projects/powerflow.jl/examples/pf.jl:33
 [10] top-level scope at REPL[4]:1

julia> sol, conv, res = pf(datafile, 100)
Target set to cpu
npartitions = 100
Blocksize: n = 190.68 Mbytes = 27.73961059570313
Partitioning...
size(A) = (19068, 19068)
Creating matrix
100 partitions created
Coloring...
Number of Jacobian colors: 24
Creating arrays...
Iteration 0. Residual norm: 6.99755.
Preconditioner with 100 partitions
Tolerance reached at iteration 187
Iteration 1. Residual norm: 0.748446.
Preconditioner with 100 partitions
Tolerance reached at iteration 182
Iteration 2. Residual norm: 0.00999811.
Preconditioner with 100 partitions
Tolerance reached at iteration 174
Iteration 3. Residual norm: 1.48092e-06.
Preconditioner with 100 partitions
Tolerance reached at iteration 1
Iteration 4. Residual norm: 7.41207e-07.
N-R converged in 4 iterations.
 ──────────────────────────────────────────────────────────────────────────────
                                       Time                   Allocations
                               ──────────────────────   ───────────────────────
       Tot / % measured:            1029s / 4.35%           31.1GiB / 87.8%

 Section               ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────
 Newton                     3    44.4s  99.1%   14.8s   27.1GiB  99.5%  9.05GiB
   CPU-BICGSTAB             8    24.4s  54.4%   3.05s   3.03GiB  11.1%   388MiB
   Preconditioner           8    18.3s  40.9%   2.29s   23.8GiB  87.2%  2.97GiB
   Jacobian                 9    1.42s  3.17%   158ms    296MiB  1.06%  32.9MiB
     Function               9    652ms  1.46%  72.5ms    102MiB  0.36%  11.3MiB
     Before                 9    246ms  0.55%  27.4ms   28.2MiB  0.10%  3.13MiB
       Seeding              9   73.5ms  0.16%  8.16ms   6.45MiB  0.02%   734KiB
     Uncompress             9    219ms  0.49%  24.4ms    121MiB  0.43%  13.4MiB
     Get partials           9    142ms  0.32%  15.7ms   25.0MiB  0.09%  2.77MiB
   Residual function        8   13.4ms  0.03%  1.67ms      384B  0.00%    48.0B
   Norm                     8    395μs  0.00%  49.4μs      384B  0.00%    48.0B
 Coloring                   3    396ms  0.88%   132ms    151MiB  0.54%  50.3MiB
 ──────────────────────────────────────────────────────────────────────────────(Complex{Float64}[1.075782238095634 - 0.08194689497419173im, 1.0766510629186816 - 0.09983266831460846im, 1.043920757073615 - 5.4910343763173907e-5im, 1.0710716802939646 - 0.13615768251957758im, 1.0711041707137228 - 0.13610482491837006im, 1.0516902702058963 + 0.02577228155565798im, 1.053451801061625 + 0.025933998633740536im, 1.0551558374798902 - 0.2043446390348763im, 1.0538188521223335 - 0.19550711663748885im, 1.0704709320233152 - 0.07967418915897466im  …  0.7319608443063398 - 0.7601390105560851im, 0.7137795618625866 - 0.7131617888440559im, 0.7189463033053043 - 0.707952832442697im, 0.7310298350480449 - 0.7599475325141657im, 0.7319608443063398 - 0.7601390105560851im, 0.8086329904117056 - 0.6776968397126093im, 0.7636660778206675 - 0.730404016112231im, 0.8270028873010823 - 0.6469669422742351im, 0.8123432972154532 - 0.6764291142222301im, 0.8086329904117056 - 0.6776968397126093im], true, 7.412067013490287e-7)


julia> sol, conv, res = pf(datafile, 100)
Target set to cpu
npartitions = 100
Blocksize: n = 190.68 Mbytes = 27.73961059570313
Partitioning...
size(A) = (19068, 19068)
Creating matrix
100 partitions created
Coloring...
Number of Jacobian colors: 24
Creating arrays...
Iteration 0. Residual norm: 6.99755.
Preconditioner with 100 partitions
Tolerance reached at iteration 184
Iteration 1. Residual norm: 2.82837.
Preconditioner with 100 partitions
Tolerance reached at iteration 177
Iteration 2. Residual norm: 0.0483687.
Preconditioner with 100 partitions
Tolerance reached at iteration 148
Iteration 3. Residual norm: 6.67291e-06.
Preconditioner with 100 partitions
Tolerance reached at iteration 3
Iteration 4. Residual norm: 1.90235e-07.
N-R converged in 4 iterations.
 ──────────────────────────────────────────────────────────────────────────────
                                       Time                   Allocations
                               ──────────────────────   ───────────────────────
       Tot / % measured:            1079s / 5.99%           44.7GiB / 91.0%

 Section               ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────
 Newton                     4    64.1s  99.2%   16.0s   40.5GiB  100%   10.1GiB
   CPU-BICGSTAB            12    35.3s  54.6%   2.94s   4.43GiB  10.9%   378MiB
   Preconditioner          12    27.0s  41.8%   2.25s   35.6GiB  87.7%  2.97GiB
   Jacobian                13    1.58s  2.44%   121ms    372MiB  0.89%  28.6MiB
     Function              13    679ms  1.05%  52.3ms    117MiB  0.28%  9.02MiB
     Uncompress            13    327ms  0.51%  25.2ms    181MiB  0.44%  13.9MiB
     Before                13    261ms  0.40%  20.1ms   28.2MiB  0.07%  2.17MiB
       Seeding             13   78.3ms  0.12%  6.02ms   6.45MiB  0.02%   508KiB
     Get partials          13    147ms  0.23%  11.3ms   25.0MiB  0.06%  1.92MiB
   Residual function       12   19.6ms  0.03%  1.64ms      576B  0.00%    48.0B
   Norm                    12    577μs  0.00%  48.1μs      576B  0.00%    48.0B
 Coloring                   4    485ms  0.75%   121ms    194MiB  0.47%  48.6MiB
 ──────────────────────────────────────────────────────────────────────────────(Complex{Float64}[1.0757822429731894 - 0.08194683324956545im, 1.07665106881237 - 0.09983260656234079im, 1.0439207570756261 - 5.4852696966641047e-5im, 1.0710716882161502 - 0.13615762136218656im, 1.0711041786328852 - 0.13610476375913055im, 1.0516902688984089 + 0.02577233811727185im, 1.0534517997280568 + 0.02593405540341398im, 1.0551558501377203 - 0.20434457448041943im, 1.0538188645074775 - 0.19550705047313455im, 1.070470936758665 - 0.07967412801532107im  …  0.7319609044227038 - 0.760138952635622im, 0.7137796182786751 - 0.7131617323790927im, 0.7189463593095949 - 0.707952775568737im, 0.7310298951489261 - 0.7599474746667808im, 0.7319609044227038 - 0.760138952635622im, 0.8086330432071404 - 0.6776967766558373im, 0.7636661345950334 - 0.730403956670116im, 0.8270029377315538 - 0.646966877810124im, 0.8123433499223206 - 0.6764290508686757im, 0.8086330432071404 - 0.6776967766558373im], true, 1.9023469910450785e-7)

Change indexing of control variable

Currently, the control u is indexed as:
u = [vmag[ref]; pg[pv]; vmag[pv]]
I believe it would be more consistent to group the voltage magnitudes together:
u = [vmag[ref]; vmag[pv]; pg[pv]].
That would avoid ugly expressions like the one in the attached screenshot [image], and simplify some parts of the code.
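
To make the difference between the two orderings concrete, here is a minimal sketch in plain Julia of the index ranges each one induces. The helper `control_ranges`, the block names, and the bus counts are hypothetical and are not part of ExaPF's API.

```julia
# Hypothetical helper: index range of each block inside the control vector u.
# `order` lists the blocks in the order in which they are stacked.
function control_ranges(nref::Int, npv::Int, order::Vector{Symbol})
    sizes = Dict(:vmag_ref => nref, :pg_pv => npv, :vmag_pv => npv)
    ranges = Dict{Symbol,UnitRange{Int}}()
    offset = 0
    for name in order
        ranges[name] = (offset + 1):(offset + sizes[name])
        offset += sizes[name]
    end
    return ranges
end

nref, npv = 1, 2  # illustrative sizes: one reference bus, two PV buses

# Current ordering: u = [vmag[ref]; pg[pv]; vmag[pv]]
current = control_ranges(nref, npv, [:vmag_ref, :pg_pv, :vmag_pv])
# Proposed ordering: u = [vmag[ref]; vmag[pv]; pg[pv]]
proposed = control_ranges(nref, npv, [:vmag_ref, :vmag_pv, :pg_pv])

# With the proposed ordering, all voltage magnitudes form one contiguous slice.
vmag_slice = first(proposed[:vmag_ref]):last(proposed[:vmag_pv])
```

With the current ordering the voltage magnitudes sit in two disjoint ranges separated by the pg block, whereas the proposed ordering lets a single range cover them.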

Implement ramping constraints for active power generation

To support time-linking constraints in ProxAL, ExaPF should support ramping constraints on the active power generation pg internally.
For each generator g, the ramping constraint reads:

| pg - pg_previous | <= ramp_agc

with ramp_agc a fixed constant, and pg_previous a value we should be able to update.

The absolute-value constraint is equivalent to the two linear constraints:

  pg - pg_previous  <= ramp_agc
- pg + pg_previous  <= ramp_agc

There are different ways to implement that in ExaPF:

  • implement a new set of constraints called ramping_constraints that adds all ramping constraints at once. Although the ramping constraints are purely linear, this adds 2 * n_g constraints to the model, which directly impacts the solution algorithm
  • alternatively, update the bounds pg_min and pg_max on the active power generation pg directly. That would require being able to modify the bounds through ExaPF's API, but this solution would not add any new constraint to the optimization problem.
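
The second option can be sketched as follows in plain Julia. The function `ramped_bounds` and the numerical values are hypothetical; how the resulting bounds would be passed to ExaPF is left open, since that API does not exist yet.

```julia
# Hypothetical helper: tighten the box bounds on pg so that
# |pg - pg_previous| <= ramp_agc holds for every generator.
function ramped_bounds(pg_min, pg_max, pg_previous, ramp_agc)
    lower = max.(pg_min, pg_previous .- ramp_agc)
    upper = min.(pg_max, pg_previous .+ ramp_agc)
    return lower, upper
end

pg_min = [0.0, 0.5]
pg_max = [2.0, 3.0]
pg_previous = [0.25, 2.75]   # illustrative previous dispatch
ramp_agc = 0.5

lower, upper = ramped_bounds(pg_min, pg_max, pg_previous, ramp_agc)
```

As long as pg_previous lies within [pg_min, pg_max], the tightened interval is non-empty, because pg_previous itself satisfies both sets of bounds.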

designJacobianAD: First call is slow on large instances (with > 1000 buses)

The first call to designJacobianAD tends to be very slow for large instances (with more than 1,000 buses).
https://github.com/exanauts/ExaPF.jl/blob/dev/rgm/src/ExaPF.jl#L688L689
I suspect it has to do with compilation: once the function has been compiled, evaluating designJacobianAD becomes fast again.

Some observations:

  • this is not due to the closure: if I call AD.designJacobianAD directly at the end of the function solve, the first call is as slow as when calling through the closure
  • I am wondering whether this is an issue with KernelAbstractions.jl
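
A generic way to separate compilation time from run time is to time the same call twice in a fresh session. The sketch below uses a stand-in function `sumsq`, not designJacobianAD, so it only illustrates the measurement pattern.

```julia
# Generic illustration of first-call latency; `sumsq` is a stand-in function.
sumsq(x) = sum(abs2, x)

x = rand(1_000)
t_first = @elapsed sumsq(x)    # includes JIT compilation for Vector{Float64}
t_second = @elapsed sumsq(x)   # runs the already compiled method
println("first call: ", t_first, " s, second call: ", t_second, " s")
```

If the second timing is far smaller than the first, the gap is compilation rather than the evaluation itself.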

Implement getters/setters

To integrate with ProxAL.jl (ref exanauts/ProxAL.jl#3), we need to implement proper getters/setters at the API level. The goal is to get and modify the values of the problem in place, à la JuMP:

value.(model[:Pg]), value.(model[:Qg])

where model is a proper ExaPF object. We could do that at different levels (level 1: PowerNetwork; level 2: PolarForm; level 3: AbstractNLPFormulation). I think we should discuss where it makes the most sense to implement the getters/setters. I see two solutions:
1- integrate them directly at the second level (PolarForm)
2- create a new object OptimizationModel that would mimic a JuMP.Model. A prototype could be:

abstract type AbstractOptimizationModel end
struct OptimizationModel <: AbstractOptimizationModel
    solution::Dict
    evaluator::AbstractNLPEvaluator
end
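# model[:Pg] lowers to getindex(model, :Pg); Core.getfield is a builtin and cannot be extended
function Base.getindex(model::OptimizationModel, symbol::Symbol)
    ...
end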
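
As one possible reading of the second solution, here is a minimal self-contained sketch of a model whose values are read and written through `model[:Pg]`. The type `SketchModel` and its single `solution` field are hypothetical and deliberately simpler than the prototype; the only library functions used are `Base.getindex` and `Base.setindex!`.

```julia
# Hypothetical sketch: a model exposing its values through `model[:Pg]`.
struct SketchModel
    solution::Dict{Symbol,Vector{Float64}}
end

# Getter: model[:Pg] lowers to getindex(model, :Pg)
Base.getindex(model::SketchModel, name::Symbol) = model.solution[name]

# Setter: model[:Pg] = values lowers to setindex!(model, values, :Pg)
function Base.setindex!(model::SketchModel, values::AbstractVector, name::Symbol)
    copyto!(model.solution[name], values)  # update in place, keep the same array
    return model
end

model = SketchModel(Dict(:Pg => [0.9, 1.6], :Qg => [0.1, 0.2]))
pg = model[:Pg]            # reference to the stored array, no copy
model[:Pg] = [1.0, 1.5]    # overwrites the stored values in place
```

Because the setter copies into the existing array, any reference obtained earlier through the getter sees the updated values.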

A point to discuss is how to remove a line properly in the network.

DesignJacobianAD returns a wrong Jacobian when evaluated on the GPU

It appears that evaluating the design Jacobian on the GPU returns a wrong result, different from the one obtained on the CPU.

A MWE is:

@testset "Test AD on GPU" begin
    datafile = "test/data/case9.m"
    tolerance = 1e-8
    pf = PowerSystem.PowerNetwork(datafile, 1)
    polar = ExaPF.PolarForm(pf, CUDADevice())

    x0 = ExaPF.initial(polar, State())
    u0 = ExaPF.initial(polar, Control())
    p = ExaPF.initial(polar, Parameters())

    jx, ju = ExaPF.init_ad_factory(polar, x0, u0, p)

    # solve power flow
    xk, conv = ExaPF.powerflow(polar, jx, x0, u0, p, tol=1e-12)
    # No need to recompute ∇gₓ
    ∇gₓ = jx.J
    ∇gᵤ = ExaPF.jacobian(polar, ju, xk, u0, p)

    function residualFunction_x!(vecx)
        nx = ExaPF.get(polar, NumberOfState())
        nu = ExaPF.get(polar, NumberOfControl())
        x_ = CuVector{eltype(vecx)}(undef, nx)
        u_ = CuVector{eltype(vecx)}(undef, nu)
        x_ .= vecx[1:length(x)]
        u_ .= vecx[length(x)+1:end]
        g = ExaPF.power_balance(polar, x_, u_, p; V=eltype(x_))
        return g
    end

    x, u = xk, u0
    vecx = CuVector{Float64}(undef, length(x) + length(u))
    vecx[1:length(x)] .= x
    vecx[length(x)+1:end] .= u
    fjac = vecx -> ForwardDiff.jacobian(residualFunction_x!, vecx)
    jac = fjac(vecx)
    jacx = sparse(jac[:,1:length(x)])
    jacu = sparse(jac[:,length(x)+1:end])
    # @info("j", Array(∇gᵤ))
    # @info("j", Array(jacu))
    # This test is passing
    @test isapprox(∇gₓ, jacx, rtol=1e-5)
    # Not this one! 
    @test isapprox(∇gᵤ, jacu, rtol=1e-5)
end

Implement line power constraints

To integrate ExaPF with ProxAL.jl (see exanauts/ProxAL.jl#3), we need to implement line power constraints in the polar formulation implemented in ExaPF.jl. The procedure is:

  • determine how to formulate the line flow constraints. For reference, ProxAL.jl uses the following formulation:
        #branch apparent power limits (from bus)
        Yff_abs2=YffR[l]^2+YffI[l]^2; Yft_abs2=YftR[l]^2+YftI[l]^2
        Yre=YffR[l]*YftR[l]+YffI[l]*YftI[l]; Yim=-YffR[l]*YftI[l]+YffI[l]*YftR[l]
        @NLconstraint(opfmodel,
            Vm[from]^2 *
            ( Yff_abs2*Vm[from]^2 + Yft_abs2*Vm[to]^2
            + 2*Vm[from]*Vm[to]*(Yre*cos(Va[from]-Va[to])-Yim*sin(Va[from]-Va[to]))
            )
            - flowmax
            - (sigma_lineFr[l]/baseMVA)
            <=0
        )

        #branch apparent power limits (to bus)
        Ytf_abs2=YtfR[l]^2+YtfI[l]^2; Ytt_abs2=YttR[l]^2+YttI[l]^2
        Yre=YtfR[l]*YttR[l]+YtfI[l]*YttI[l]; Yim=-YtfR[l]*YttI[l]+YtfI[l]*YttR[l]
        @NLconstraint(opfmodel,
            Vm[to]^2 *
            ( Ytf_abs2*Vm[from]^2 + Ytt_abs2*Vm[to]^2
            + 2*Vm[from]*Vm[to]*(Yre*cos(Va[from]-Va[to])-Yim*sin(Va[from]-Va[to]))
            )
            - flowmax
            - (sigma_lineTo[l]/baseMVA)
            <=0
        )
  • implement the constraint in ExaPF through the following functions:
function line_power_constraint(polar::PolarForm, g, buffer)
    ...
    return
end
is_constraint(::typeof(line_power_constraint)) = true
size_constraint(polar::PolarForm{T, IT, VT, AT}, ::typeof(line_power_constraint)) where {T, IT, VT, AT} = ...
function bounds(polar::PolarForm, ::typeof(line_power_constraint))
    ...
end

function jacobian(polar::PolarForm, ::typeof(line_power_constraint), i_cons, ∂jac, buffer)
    ...
end
function jtprod(polar::PolarForm, ::typeof(line_power_constraint), ∂jac, buffer, v)
    ...
end
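
The from-bus expression in the ProxAL formulation quoted above is the squared magnitude of the apparent power S_f = V_f conj(Yff V_f + Yft V_t), written in polar coordinates. The sketch below evaluates it in plain Julia and compares it with complex arithmetic. The function names and numerical values are hypothetical, and the slack term and the flowmax bound are left out.

```julia
# Squared apparent power at the "from" end of a line, in polar coordinates.
# Same algebra as the from-bus constraint, without slack or bound.
function flow_from_sq(vm_f, vm_t, va_f, va_t, yff::Complex, yft::Complex)
    yff_abs2 = abs2(yff)
    yft_abs2 = abs2(yft)
    yre = real(yff) * real(yft) + imag(yff) * imag(yft)
    yim = -real(yff) * imag(yft) + imag(yff) * real(yft)
    d = va_f - va_t
    return vm_f^2 * (yff_abs2 * vm_f^2 + yft_abs2 * vm_t^2 +
                     2 * vm_f * vm_t * (yre * cos(d) - yim * sin(d)))
end

# Reference value from complex arithmetic: S_f = V_f * conj(yff * V_f + yft * V_t)
function flow_from_sq_complex(vm_f, vm_t, va_f, va_t, yff, yft)
    v_f = vm_f * cis(va_f)
    v_t = vm_t * cis(va_t)
    return abs2(v_f * conj(yff * v_f + yft * v_t))
end

yff, yft = 1.2 - 9.8im, -1.1 + 9.9im   # illustrative admittances
a = flow_from_sq(1.02, 0.98, 0.05, -0.03, yff, yft)
b = flow_from_sq_complex(1.02, 0.98, 0.05, -0.03, yff, yft)
```

The two values agree up to rounding, which gives a cheap unit test for any implementation of line_power_constraint.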

Implement a logger

At some point, we will need a logger to avoid being flooded by log output.

We could use Julia's built-in logging system or a dedicated logger such as Memento.jl. What do you think?
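
For the built-in option, a minimal sketch with the Logging standard library could look like this. The log messages are made up for illustration, and writing to an IOBuffer is only there to make the filtering visible.

```julia
using Logging

# Route log records to a buffer and drop everything below Warn.
io = IOBuffer()
quiet_logger = ConsoleLogger(io, Logging.Warn)

with_logger(quiet_logger) do
    @info "Iteration 1. Residual norm: 0.748446."      # filtered out
    @warn "Linear solver did not reach the tolerance"  # kept
end

captured = String(take!(io))
print(captured)
```

Replacing `Logging.Warn` with `Logging.Info` or `Logging.Debug` would restore the more verbose output without touching the call sites.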
