exanauts / exapf.jl


A Power Flow Solver for GPUs in Julia

License: MIT License

Julia 99.92% Makefile 0.08%

exapf.jl's Issues

Support contingencies

We should allow the user to remove some lines when creating a PolarForm model.
Removing a line is easier than removing a bus (as we have fewer issues with indexing afterwards).
A plan could be:

Remove the lines directly in PolarForm's constructor, according to the indexes specified in a contingencies argument. We mostly need to re-engineer the following line:
https://github.com/exanauts/ExaPF.jl/blob/master/src/models/polar/polar.jl#L47
Check that everything works as expected, as the topology of the network in PolarForm would now differ slightly from the model specified in the underlying PS.PowerNetwork, which stores the original data.
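
As a rough illustration of the first step, here is a minimal sketch that filters the branch arrays by a list of contingency indexes. The helper `remove_lines` and the arrays `f`/`t` (the "from" and "to" bus of each line) are hypothetical names chosen for this example, not ExaPF's API.

```julia
# Hypothetical helper: drop the branches listed in `contingencies`
# from the "from"/"to" bus arrays describing the network topology.
function remove_lines(f::Vector{Int}, t::Vector{Int}, contingencies::Vector{Int})
    nlines = length(f)
    @assert length(t) == nlines
    @assert all(1 .<= contingencies .<= nlines) "contingency index out of range"
    keep = setdiff(1:nlines, contingencies)  # indexes of the remaining lines
    return f[keep], t[keep], keep
end

f = [1, 1, 2, 3]   # "from" bus of each line
t = [2, 3, 3, 4]   # "to" bus of each line
f_new, t_new, keep = remove_lines(f, t, [2])
```

Returning `keep` makes it possible to map the reduced model back to the original data. Bus indexes are untouched, which is why removing a line is the simpler case.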

Wrong evaluation of reduced gradient on the GPU

Improve the computation of the adjoints on the GPU

At the moment, ExaPF is using its own hand-coded adjoints. Each adjoint proceeds in two steps, sequentially:

compute the adjoints w.r.t. edges
aggregate the edges' adjoints on the nodes

I think we could improve the existing implementation:

For the first step (w.r.t. edges), I think it would be more efficient to parallelize over the edges directly, thus avoiding the for loop inside the kernel. We could directly use the "from" and "to" arrays (with size equal to nnz) that we use for the line flow constraints. As we have one element per branch, we won't have any race condition.
For the second step (w.r.t. nodes), I don't think we need to compute the transpose of the admittance matrix at all. Indeed, the node-node admittance matrix Ybus is symmetric.

I think that could simplify the adjoint kernels considerably and speed up the computation, as it would induce fewer allocations.
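
To make the two-step structure concrete, here is a small CPU sketch on a toy edge function s_l = vm[f_l] * vm[t_l]. It is not ExaPF's kernel: the function names and the toy function are assumptions made for this example. The first step writes one slot per edge, so it parallelizes without races. The second step is written sequentially here; on a GPU it would need atomics or a node-parallel gather.

```julia
# Step 1: adjoints w.r.t. the two ends of each edge, one work item per edge.
# Each iteration writes only to its own slots, so there is no race condition.
function edge_adjoints!(adj_from, adj_to, adj_s, vm, f, t)
    Threads.@threads for l in eachindex(f)
        adj_from[l] = adj_s[l] * vm[t[l]]   # d s_l / d vm[f_l]
        adj_to[l]   = adj_s[l] * vm[f[l]]   # d s_l / d vm[t_l]
    end
    return nothing
end

# Step 2: aggregate the edge adjoints on the nodes (sequential in this sketch).
function aggregate!(adj_vm, adj_from, adj_to, f, t)
    fill!(adj_vm, 0)
    for l in eachindex(f)
        adj_vm[f[l]] += adj_from[l]
        adj_vm[t[l]] += adj_to[l]
    end
    return adj_vm
end

vm = [1.0, 2.0, 3.0]
f, t = [1, 2], [2, 3]
adj_s = ones(2)                       # seed: gradient of sum(s)
adj_from, adj_to = zeros(2), zeros(2)
edge_adjoints!(adj_from, adj_to, adj_s, vm, f, t)
adj_vm = aggregate!(zeros(3), adj_from, adj_to, f, t)
```

With the unit seed, `adj_vm` is the gradient of v1*v2 + v2*v3 with respect to `vm`.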

[Evaluator] Add reduced Hessian for ProxALEvaluator

Add support to single precision

Currently, only Float64 is supported. We should add support for Float32 as well.
PolarForm is already parameterized to support different precisions, with its signature:

PolarForm{T, VI, VT, MT}

where T specifies the numeric type.

Now we just need to make the implementation generic.
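
A typical obstacle when making code generic is a hard-coded Float64 literal, which silently promotes Float32 inputs. The sketch below is a toy example (the function names are made up for illustration) showing the difference:

```julia
# Hard-coded Float64 literal: silently promotes Float32 inputs to Float64.
half_sum_f64(v::AbstractVector) = 0.5 * sum(v)

# Generic version: the constant is built from the element type T.
half_sum(v::AbstractVector{T}) where {T<:AbstractFloat} = (one(T) / 2) * sum(v)

v32 = Float32[1.0, 2.0, 3.0]
r64 = half_sum_f64(v32)   # result is a Float64
r32 = half_sum(v32)       # result stays a Float32
```

The same pattern applies to allocations: `zeros(T, n)` rather than `zeros(n)`.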

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Efficient transpose Jacobian vector product for reactive power generation

When we use an Augmented Lagrangian algorithm, we need a fast evaluation of the transpose-Jacobian vector product of the constraints. It appears that J' v is not implemented efficiently for the Jacobian of the reactive power generation qg. Indeed, we compute the full Jacobian, sequentially:
https://github.com/exanauts/ExaPF.jl/blob/develop/src/polar/constraints.jl#L231-L238

We should think of a better way to compute that.
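
One possible direction is to compute J' v without ever forming J. The sketch below shows the idea on a toy expression q_i(x) = x_i (B x)_i, chosen only because it is bilinear like a reactive power injection; it is not ExaPF's qg constraint, and all names are assumptions. Since J = Diagonal(B x) + Diagonal(x) B, the product is J' v = v .* (B x) + B' (x .* v), which needs only two matrix-vector products.

```julia
using LinearAlgebra

# Toy stand-in for a reactive-power-like expression: q_i(x) = x_i * (B x)_i.
q(B, x) = x .* (B * x)

# Dense Jacobian J = Diagonal(B x) + Diagonal(x) * B, kept only as a reference.
dense_jacobian(B, x) = Diagonal(B * x) + Diagonal(x) * B

# Matrix-free product: J' v = v .* (B x) + B' (x .* v).
jtprod(B, x, v) = v .* (B * x) .+ B' * (x .* v)

B = [2.0 -1.0 0.0; -1.0 2.0 -1.0; 0.0 -1.0 2.0]
x = [1.0, 0.9, 1.1]
v = [0.5, -1.0, 2.0]

reference = dense_jacobian(B, x)' * v
matrix_free = jtprod(B, x, v)
```

With a sparse `B`, the matrix-free version costs two sparse matrix-vector products and allocates no Jacobian.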

Race condition in views with same indices on GPUs

We have a race condition on the GPU for views:
https://github.com/FluxML/Zygote.jl/blob/2b17256e79b2eca9a6512207284219d279398fc9/src/lib/array.jl#L35
JuliaLang/julia#31407
JuliaGPU/CUDA.jl#89

Unable to run code in the Quick Start

While reviewing your JuliaCon submission, JuliaCon/proceedings-review#72, I have been exercising your code. I have been unable to run the code described in the quick start, on either the master branch or the develop branch.

First I started making changes like the following

diff --git a/docs/src/quickstart.md b/docs/src/quickstart.md
index a9fc2c7..491fe75 100644
--- a/docs/src/quickstart.md
+++ b/docs/src/quickstart.md
@@ -69,6 +69,7 @@ in a few lines of code.
 We first instantiate a `PolarForm` object to adopt a polar formulation
 as a model:
 ```julia-repl
+julia> using KernelAbstractions
 julia> polar = PolarForm(pf, CPU())

@@ -90,7 +91,7 @@ Hence, the algorithm requires the following elements:

that translate to the Julia code:

-julia> physical_state = get(polar, PhysicalState())
+julia> physical_state = get(polar, ExaPF.PhysicalState())
julia> jx = ExaPF.init_ad_factory(polar, physical_state)
julia> linear_solver = DirectSolver()

but I ran into the issue that ERROR: UndefVarError: init_ad_factory not defined. I am stuck here.

p.s. it looks like the nested code blocks are messing up the formatting, not sure how to fix that.

Non-deterministic behavior when calling pf multiple times from REPL

I ran into a non-deterministic behavior. Might have been something spurious but noting it here.

target = "cpu" 
include("examples/pf.jl")
include("examples/pf.jl")
datafile = "GO-Data/datasets/Trial_3_Real-Time/Network_13R-015/scenario_11/case.raw"                                                                                                                     
sol, conv, res = pf(datafile, 100)
sol, conv, res = pf(datafile, 100)
sol, conv, res = pf(datafile, 100)

OUTPUT:

julia> target = "cpu"                                                                                                                                                                                              
"cpu"                                                                                                                                                                                                              
                                                                                                                                                                                                                   
julia> include("examples/pf.jl")                                                                                                                                                                                   
[ Info: Precompiling PowerFlow [0cf0e50c-a82e-488f-ac7e-41ffdff1b8aa]                                                                                                                                              
[ Info: Skipping precompilation since __precompile__(false). Importing PowerFlow [0cf0e50c-a82e-488f-ac7e-41ffdff1b8aa].                                                                                           
pf (generic function with 1 method)                                                                                                                                                                                
                                                                                                                                                                                                                   
julia> include("examples/pf.jl")                                                                                                                                                                                   
pf (generic function with 1 method)                                                                                                                                                                                
                                                                                                                                                                                                                   
julia> datafile = "GO-Data/datasets/Trial_3_Real-Time/Network_13R-015/scenario_11/case.raw"                                                                                                                        
"GO-Data/datasets/Trial_3_Real-Time/Network_13R-015/scenario_11/case.raw" 

julia> sol, conv, res = pf(datafile, 100)                                                                                                                                                                          
Target set to cpu                                                                                                                                                                                                  
npartitions = 100                                                                                                                                                                                                  
Blocksize: n = 190.68 Mbytes = 27.73961059570313                                                                                                                                                                   
Partitioning...                                                                                                                                                                                                    
size(A) = (19068, 19068)                                                                                                                                                                                           
Creating matrix                                                                                                                                                                                                    
100 partitions created                                                                                                                                                                                             
Coloring...                                                                                                                                                                                                        
Number of Jacobian colors: 24                                                                                                                                                                                      
Creating arrays...                                                                                                                                                                                                 
Iteration 0. Residual norm: 6.99755.                                                                                                                                                                               
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 187                                                                                                                                                                                 
Iteration 1. Residual norm: 0.748446.                                                                                                                                                                              
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 182                                                                                                                                                                                 
Iteration 2. Residual norm: 0.00999811.                                                                                                                                                                            
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 174                                                                                                                                                                                 
Iteration 3. Residual norm: 1.48092e-06.                                                                                                                                                                           
Preconditioner with 100 partitions                                                                                                                                                                                 
Tolerance reached at iteration 1                                                                                                                                                                                   
Iteration 4. Residual norm: 7.41207e-07.                                                                                                                                                                           
N-R converged in 4 iterations.
 ──────────────────────────────────────────────────────────────────────────────
                                       Time                   Allocations
                               ──────────────────────   ───────────────────────
       Tot / % measured:             367s / 6.37%           17.0GiB / 81.0%

 Section               ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────
 Newton                     1    23.1s  99.0%   23.1s   13.7GiB  100%   13.7GiB
   CPU-BICGSTAB             4    12.3s  52.9%   3.09s   1.55GiB  11.3%   397MiB
   Preconditioner           4    9.57s  41.0%   2.39s   11.9GiB  86.8%  2.98GiB
   Jacobian                 4    921ms  3.95%   230ms    177MiB  1.25%  44.2MiB
     Function               4    405ms  1.73%   101ms   59.6MiB  0.42%  14.9MiB
     Before                 4    165ms  0.71%  41.3ms   20.7MiB  0.15%  5.18MiB
       Seeding              4   38.2ms  0.16%  9.54ms   3.73MiB  0.03%   955KiB
     Uncompress             4    106ms  0.46%  26.6ms   60.4MiB  0.43%  15.1MiB
     Get partials           4   82.2ms  0.35%  20.6ms   15.9MiB  0.11%  3.97MiB
   Residual function        4   6.24ms  0.03%  1.56ms      192B  0.00%    48.0B
   Norm                     4    185μs  0.00%  46.4μs      192B  0.00%    48.0B
 Coloring                   1    236ms  1.01%   236ms   64.7MiB  0.46%  64.7MiB
 ──────────────────────────────────────────────────────────────────────────────(Complex{Float64}[1.075782238095634 - 0.08194689497419173im, 1.0766510629186816 - 0.09983266831460846im, 1.043920757073615 - 5.49103
43763173907e-5im, 1.0710716802939646 - 0.13615768251957758im, 1.0711041707137228 - 0.13610482491837006im, 1.0516902702058963 + 0.02577228155565798im, 1.053451801061625 + 0.025933998633740536im, 1.055155837479890
2 - 0.2043446390348763im, 1.0538188521223335 - 0.19550711663748885im, 1.0704709320233152 - 0.07967418915897466im  …  0.7319608443063398 - 0.7601390105560851im, 0.7137795618625866 - 0.7131617888440559im, 0.718946
3033053043 - 0.707952832442697im, 0.7310298350480449 - 0.7599475325141657im, 0.7319608443063398 - 0.7601390105560851im, 0.8086329904117056 - 0.6776968397126093im, 0.7636660778206675 - 0.730404016112231im, 0.8270
028873010823 - 0.6469669422742351im, 0.8123432972154532 - 0.6764291142222301im, 0.8086329904117056 - 0.6776968397126093im], true, 7.412067013490287e-7)

julia> sol, conv, res = pf(datafile, 100)
Target set to cpu
npartitions = 100
Blocksize: n = 190.68 Mbytes = 27.73961059570313
Partitioning...
size(A) = (19068, 19068)
Creating matrix
100 partitions created
Coloring...
Number of Jacobian colors: 25
Creating arrays...
Iteration 0. Residual norm: 6.99755.
ERROR: BoundsError: attempt to access 25×19068 Array{Float64,2} at index [2755, 2]
Stacktrace:
 [1] getindex(::Array{Float64,2}, ::Int64, ::Int64) at ./array.jl:789
 [2] macro expansion at /home/maldonadod/Projects/powerflow.jl/src/ad.jl:85 [inlined]
 [3] macro expansion at /home/maldonadod/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:229 [inlined]
 [4] residualJacobianAD!(::SparseArrays.SparseMatrixCSC{Float64,Int64}, ::typeof(PowerFlow.residualFunction_polar!), ::PowerFlow.AD.comparrays, ::Array{Int64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::PowerFl
ow.Spmat{Array{T,1} where T}, ::PowerFlow.Spmat{Array{T,1} where T}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Int64,1}, ::Array{Int64,1}, ::Int64, ::TimerOutputs.TimerOutput) at /home/maldonadod/Projects/
powerflow.jl/src/ad.jl:80
 [5] macro expansion at /home/maldonadod/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:229 [inlined]
 [6] macro expansion at /home/maldonadod/Projects/powerflow.jl/src/PowerFlow.jl:390 [inlined]
 [7] macro expansion at /home/maldonadod/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:229 [inlined]
 [8] solve(::Pf, ::Int64) at /home/maldonadod/Projects/powerflow.jl/src/PowerFlow.jl:385
 [9] pf(::String, ::Int64) at /home/maldonadod/Projects/powerflow.jl/examples/pf.jl:33
 [10] top-level scope at REPL[4]:1

julia> sol, conv, res = pf(datafile, 100)
Target set to cpu
npartitions = 100
Blocksize: n = 190.68 Mbytes = 27.73961059570313
Partitioning...
size(A) = (19068, 19068)
Creating matrix
100 partitions created
Coloring...
Number of Jacobian colors: 24
Creating arrays...
Iteration 0. Residual norm: 6.99755.
Preconditioner with 100 partitions
Tolerance reached at iteration 187
Iteration 1. Residual norm: 0.748446.
Preconditioner with 100 partitions
Tolerance reached at iteration 182
Iteration 2. Residual norm: 0.00999811.
Preconditioner with 100 partitions
Tolerance reached at iteration 174
Iteration 3. Residual norm: 1.48092e-06.
Preconditioner with 100 partitions
Tolerance reached at iteration 1
Iteration 4. Residual norm: 7.41207e-07.
N-R converged in 4 iterations.
 ──────────────────────────────────────────────────────────────────────────────
                                       Time                   Allocations
                               ──────────────────────   ───────────────────────
       Tot / % measured:            1029s / 4.35%           31.1GiB / 87.8%

 Section               ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────
 Newton                     3    44.4s  99.1%   14.8s   27.1GiB  99.5%  9.05GiB
   CPU-BICGSTAB             8    24.4s  54.4%   3.05s   3.03GiB  11.1%   388MiB
   Preconditioner           8    18.3s  40.9%   2.29s   23.8GiB  87.2%  2.97GiB
   Jacobian                 9    1.42s  3.17%   158ms    296MiB  1.06%  32.9MiB
     Function               9    652ms  1.46%  72.5ms    102MiB  0.36%  11.3MiB
     Before                 9    246ms  0.55%  27.4ms   28.2MiB  0.10%  3.13MiB
       Seeding              9   73.5ms  0.16%  8.16ms   6.45MiB  0.02%   734KiB
     Uncompress             9    219ms  0.49%  24.4ms    121MiB  0.43%  13.4MiB
     Get partials           9    142ms  0.32%  15.7ms   25.0MiB  0.09%  2.77MiB
   Residual function        8   13.4ms  0.03%  1.67ms      384B  0.00%    48.0B
   Norm                     8    395μs  0.00%  49.4μs      384B  0.00%    48.0B
 Coloring                   3    396ms  0.88%   132ms    151MiB  0.54%  50.3MiB
 ──────────────────────────────────────────────────────────────────────────────(Complex{Float64}[1.075782238095634 - 0.08194689497419173im, 1.0766510629186816 - 0.09983266831460846im, 1.043920757073615 - 5.49103
43763173907e-5im, 1.0710716802939646 - 0.13615768251957758im, 1.0711041707137228 - 0.13610482491837006im, 1.0516902702058963 + 0.02577228155565798im, 1.053451801061625 + 0.025933998633740536im, 1.055155837479890
2 - 0.2043446390348763im, 1.0538188521223335 - 0.19550711663748885im, 1.0704709320233152 - 0.07967418915897466im  …  0.7319608443063398 - 0.7601390105560851im, 0.7137795618625866 - 0.7131617888440559im, 0.718946
3033053043 - 0.707952832442697im, 0.7310298350480449 - 0.7599475325141657im, 0.7319608443063398 - 0.7601390105560851im, 0.8086329904117056 - 0.6776968397126093im, 0.7636660778206675 - 0.730404016112231im, 0.8270
028873010823 - 0.6469669422742351im, 0.8123432972154532 - 0.6764291142222301im, 0.8086329904117056 - 0.6776968397126093im], true, 7.412067013490287e-7)


julia> sol, conv, res = pf(datafile, 100)
Target set to cpu
npartitions = 100
Blocksize: n = 190.68 Mbytes = 27.73961059570313
Partitioning...
size(A) = (19068, 19068)
Creating matrix
100 partitions created
Coloring...
Number of Jacobian colors: 24
Creating arrays...
Iteration 0. Residual norm: 6.99755.
Preconditioner with 100 partitions
Tolerance reached at iteration 184
Iteration 1. Residual norm: 2.82837.
Preconditioner with 100 partitions
Tolerance reached at iteration 177
Iteration 2. Residual norm: 0.0483687.
Preconditioner with 100 partitions
Tolerance reached at iteration 148
Iteration 3. Residual norm: 6.67291e-06.
Preconditioner with 100 partitions
Tolerance reached at iteration 3
Iteration 4. Residual norm: 1.90235e-07.
N-R converged in 4 iterations.
 ──────────────────────────────────────────────────────────────────────────────
                                       Time                   Allocations
                               ──────────────────────   ───────────────────────
       Tot / % measured:            1079s / 5.99%           44.7GiB / 91.0%

 Section               ncalls     time   %tot     avg     alloc   %tot      avg
 ──────────────────────────────────────────────────────────────────────────────
 Newton                     4    64.1s  99.2%   16.0s   40.5GiB  100%   10.1GiB
   CPU-BICGSTAB            12    35.3s  54.6%   2.94s   4.43GiB  10.9%   378MiB
   Preconditioner          12    27.0s  41.8%   2.25s   35.6GiB  87.7%  2.97GiB
   Jacobian                13    1.58s  2.44%   121ms    372MiB  0.89%  28.6MiB
     Function              13    679ms  1.05%  52.3ms    117MiB  0.28%  9.02MiB
     Uncompress            13    327ms  0.51%  25.2ms    181MiB  0.44%  13.9MiB
     Before                13    261ms  0.40%  20.1ms   28.2MiB  0.07%  2.17MiB
       Seeding             13   78.3ms  0.12%  6.02ms   6.45MiB  0.02%   508KiB
     Get partials          13    147ms  0.23%  11.3ms   25.0MiB  0.06%  1.92MiB
   Residual function       12   19.6ms  0.03%  1.64ms      576B  0.00%    48.0B
   Norm                    12    577μs  0.00%  48.1μs      576B  0.00%    48.0B
 Coloring                   4    485ms  0.75%   121ms    194MiB  0.47%  48.6MiB
 ──────────────────────────────────────────────────────────────────────────────(Complex{Float64}[1.0757822429731894 - 0.08194683324956545im, 1.07665106881237 - 0.09983260656234079im, 1.0439207570756261 - 5.48526
96966641047e-5im, 1.0710716882161502 - 0.13615762136218656im, 1.0711041786328852 - 0.13610476375913055im, 1.0516902688984089 + 0.02577233811727185im, 1.0534517997280568 + 0.02593405540341398im, 1.055155850137720
3 - 0.20434457448041943im, 1.0538188645074775 - 0.19550705047313455im, 1.070470936758665 - 0.07967412801532107im  …  0.7319609044227038 - 0.760138952635622im, 0.7137796182786751 - 0.7131617323790927im, 0.7189463
593095949 - 0.707952775568737im, 0.7310298951489261 - 0.7599474746667808im, 0.7319609044227038 - 0.760138952635622im, 0.8086330432071404 - 0.6776967766558373im, 0.7636661345950334 - 0.730403956670116im, 0.827002
9377315538 - 0.646966877810124im, 0.8123433499223206 - 0.6764290508686757im, 0.8086330432071404 - 0.6776967766558373im], true, 1.9023469910450785e-7)

Multiple generators per bus

When we solve the power flow we usually have one equation per bus - we merge all the power injections in a bus.

In the OPF we have to keep track of each generator power injection to include in the cost function.

Our current formulation is unable to deal with this case. This affects the pegase and rte cases.
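
The distinction can be sketched as follows, assuming a hypothetical `gen2bus` array that maps each generator to its bus. The power flow only sees the per-bus sum, while the cost is evaluated per generator:

```julia
# Hypothetical data layout: generator k is attached to bus gen2bus[k].
function bus_injections(pg::Vector{T}, gen2bus::Vector{Int}, nbus::Int) where {T}
    pinj = zeros(T, nbus)
    for (k, b) in enumerate(gen2bus)
        pinj[b] += pg[k]   # several generators may share the same bus
    end
    return pinj
end

# Quadratic cost evaluated per generator, not per bus.
generation_cost(pg, c2, c1, c0) = sum(c2 .* pg .^ 2 .+ c1 .* pg .+ c0)

pg = [0.5, 0.25, 1.0]
gen2bus = [1, 1, 3]           # generators 1 and 2 both sit on bus 1
pinj = bus_injections(pg, gen2bus, 3)
cost = generation_cost(pg, ones(3), zeros(3), zeros(3))
```

Because the cost is nonlinear, evaluating it on the merged bus injection would give a different value than summing the per-generator costs.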

Refactor power balance equations

Right now the power balance equations are embedded in the solve() routine, and the Jacobians are created inside it too. To decouple them, we should implement the power balance equations as a stand-alone function of the form

g(pf, x, u, p)
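
A minimal sketch of such a stand-alone function is given below. The layout of x, u and p is an assumption made for this example, not ExaPF's convention, and `pf` is just a NamedTuple holding the real and imaginary parts of Ybus.

```julia
# Toy stand-alone power balance. Assumed layout (for this sketch only):
#   x = [θ; v]    voltage angles and magnitudes at all buses
#   u = [pg; qg]  active and reactive generation
#   p = [pd; qd]  active and reactive load
# `pf` is any object with fields G and B (real and imaginary parts of Ybus).
function g(pf, x, u, p)
    n = size(pf.G, 1)
    θ, v = x[1:n], x[n+1:2n]
    pinj = u[1:n] .- p[1:n]
    qinj = u[n+1:2n] .- p[n+1:2n]
    res = zeros(eltype(x), 2n)
    for i in 1:n, j in 1:n
        Δ = θ[i] - θ[j]
        res[i]   += v[i] * v[j] * (pf.G[i, j] * cos(Δ) + pf.B[i, j] * sin(Δ))
        res[n+i] += v[i] * v[j] * (pf.G[i, j] * sin(Δ) - pf.B[i, j] * cos(Δ))
    end
    res[1:n]    .-= pinj   # active power mismatch
    res[n+1:2n] .-= qinj   # reactive power mismatch
    return res
end

pf = (G = zeros(2, 2), B = [-10.0 10.0; 10.0 -10.0])
x = [0.0, 0.0, 1.0, 1.0]      # flat start
u = [1.0, 0.0, 0.2, 0.0]
p = [0.0, 1.0, 0.0, 0.2]
res = g(pf, x, u, p)
```

Since the function has no side effects, the solver and the Jacobian code can both call it, and it can be tested on its own.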

Move parse.jl to powersystems.jl

We try to unify powersystems.jl

Implement overlapping for the pre-conditioner

Overlapping might enhance linear solver performance. Implementing overlapping will require changing the structure of the preconditioner to make it "matrix free".
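
For reference, one common way to build overlapping blocks is to extend each partition by its neighbors in the sparsity graph of the matrix. The sketch below assumes a structurally symmetric matrix; `add_overlap` is a hypothetical helper, not the implementation used in ExaPF.

```julia
using SparseArrays

# Extend a partition (a list of row/column indexes) by `levels` layers of
# neighbors in the sparsity graph of A. Assumes A is structurally symmetric.
function add_overlap(A::SparseMatrixCSC, part::Vector{Int}, levels::Int=1)
    current = Set(part)
    rows = rowvals(A)
    for _ in 1:levels
        frontier = Set{Int}()
        for j in current
            for k in nzrange(A, j)
                push!(frontier, rows[k])   # row indexes coupled to column j
            end
        end
        union!(current, frontier)
    end
    return sort!(collect(current))
end

# Tridiagonal test matrix: each index is coupled to its two neighbors.
A = spdiagm(-1 => ones(5), 0 => ones(6), 1 => ones(5))
block1 = add_overlap(A, [3, 4])      # one layer of overlap
block2 = add_overlap(A, [3, 4], 2)   # two layers of overlap
```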

Reviewers: @michel2323 , @frapac

Disallow scalar operations on GPU

I believe we could get some improvements by ensuring that no scalar operations are performed on the GPU.

Trigger Julia Registrator

Refactor tests

As a wise man once said:

I don't test the tests anymore since I don't know what to test

I suggest splitting the runtests.jl file into several files, to test separately:

the behavior of PowerSystem
the parsing from PSSE and Matpower
the behavior of the costs function, the residual function and the other constraints we have to consider
the preconditioners
the Newton-Raphson algorithm (both on CPU and GPU)
the computation of the reduced gradients
the OPF resolution

Non-deterministic out-of-bounds error in AD backend

Full log:

Active constraints: Error During Test at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:38
  Got exception outside of a @test
  BoundsError: attempt to access 14×106 Array{Float64,2} at index [65, 1]
  Stacktrace:
   [1] getindex at ./array.jl:810 [inlined]
   [2] uncompress!(::SparseMatrixCSC{Float64,Int64}, ::Array{Float64,2}, ::Array{Int64,1}) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/ad.jl:412
   [3] macro expansion at /home/runner/work/ExaPF.jl/ExaPF.jl/src/ad.jl:490 [inlined]
   [4] macro expansion at /home/runner/.julia/packages/TimerOutputs/ZmKD7/src/TimerOutput.jl:190 [inlined]
   [5] residualJacobianAD!(::ExaPF.AD.StateJacobianAD{Array{Int64,1},Array{Float64,1},Array{Float64,2},SparseMatrixCSC,Array{ForwardDiff.Partials{14,Float64},1},Array{ForwardDiff.Dual{Nothing,Float64,14},1},SubArray{Float64,1,Array{Float64,1},Tuple{Array{Int64,1}},false},SubArray{ForwardDiff.Dual{Nothing,Float64,14},1,Array{ForwardDiff.Dual{Nothing,Float64,14},1},Tuple{Array{Int64,1}},false}}, ::typeof(ExaPF.residualFunction_polar!), ::Array{Float64,1}, ::Array{Float64,1}, ::ExaPF.Spmat{Array{Int64,1},Array{Float64,1}}, ::ExaPF.Spmat{Array{Int64,1},Array{Float64,1}}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Int64,1}, ::Array{Int64,1}, ::Array{Int64,1}, ::Int64, ::TimerOutput) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/ad.jl:489
   [6] macro expansion at /home/runner/work/ExaPF.jl/ExaPF.jl/src/models/polar/polar.jl:317 [inlined]
   [7] macro expansion at /home/runner/.julia/packages/TimerOutputs/ZmKD7/src/TimerOutput.jl:190 [inlined]
   [8] macro expansion at /home/runner/work/ExaPF.jl/ExaPF.jl/src/models/polar/polar.jl:316 [inlined]
   [9] macro expansion at /home/runner/.julia/packages/TimerOutputs/ZmKD7/src/TimerOutput.jl:190 [inlined]
   [10] powerflow(::PolarForm{Float64,Array{Int64,1},Array{Float64,1},Array{Float64,2}}, ::ExaPF.AD.StateJacobianAD{Array{Int64,1},Array{Float64,1},Array{Float64,2},SparseMatrixCSC,Array{ForwardDiff.Partials{14,Float64},1},Array{ForwardDiff.Dual{Nothing,Float64,14},1},SubArray{Float64,1,Array{Float64,1},Tuple{Array{Int64,1}},false},SubArray{ForwardDiff.Dual{Nothing,Float64,14},1,Array{ForwardDiff.Dual{Nothing,Float64,14},1},Tuple{Array{Int64,1}},false}}, ::ExaPF.PolarNetworkState{Array{Float64,1}}; solver::ExaPF.LinearSolvers.DirectSolver, tol::Float64, maxiter::Int64, verbose_level::Int64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/models/polar/polar.jl:312
   [11] update!(::ExaPF.ReducedSpaceEvaluator{Float64}, ::Array{Float64,1}; verbose_level::Int64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/reduced_evaluator.jl:78
   [12] update! at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/reduced_evaluator.jl:73 [inlined]
   [13] ExaPF.MaxScaler(::ExaPF.ReducedSpaceEvaluator{Float64}, ::Array{Float64,1}; η::Float64, tol::Float64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/common.jl:64
   [14] MaxScaler at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/common.jl:62 [inlined]
   [15] ExaPF.PenaltyEvaluator(::ExaPF.ReducedSpaceEvaluator{Float64}, ::Array{Float64,1}; scale::Bool, penalties::Array{Float64,1}, c₀::Float64) at /home/runner/work/ExaPF.jl/ExaPF.jl/src/Evaluators/penalty.jl:28
   [16] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:51
   [17] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [18] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:39
   [19] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [20] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/penalty.jl:2
   [21] include(::String) at ./client.jl:457
   [22] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/runtests.jl:35
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [24] top-level scope at /home/runner/work/ExaPF.jl/ExaPF.jl/test/runtests.jl:30
   [25] include(::String) at ./client.jl:457
   [26] top-level scope at none:6
   [27] eval(::Module, ::Any) at ./boot.jl:331
   [28] exec_options(::Base.JLOptions) at ./client.jl:272
   [29] _start() at ./client.jl:506

DesignJacobianAD returns a wrong Jacobian when evaluated on the GPU

It appears that the evaluation of the design Jacobian returns a wrong result on the GPU, different from the one obtained on the CPU.

A MWE is:

@testset "Test AD on GPU" begin
    datafile = "test/data/case9.m"
    tolerance = 1e-8
    pf = PowerSystem.PowerNetwork(datafile, 1)
    polar = ExaPF.PolarForm(pf, CUDADevice())

    x0 = ExaPF.initial(polar, State())
    u0 = ExaPF.initial(polar, Control())
    p = ExaPF.initial(polar, Parameters())

    jx, ju = ExaPF.init_ad_factory(polar, x0, u0, p)

    # solve power flow
    xk, conv = ExaPF.powerflow(polar, jx, x0, u0, p, tol=1e-12)
    # No need to recompute ∇gₓ
    ∇gₓ = jx.J
    ∇gᵤ = ExaPF.jacobian(polar, ju, xk, u0, p)

    function residualFunction_x!(vecx)
        nx = ExaPF.get(polar, NumberOfState())
        nu = ExaPF.get(polar, NumberOfControl())
        x_ = CuVector{eltype(vecx)}(undef, nx)
        u_ = CuVector{eltype(vecx)}(undef, nu)
        x_ .= vecx[1:length(x)]
        u_ .= vecx[length(x)+1:end]
        g = ExaPF.power_balance(polar, x_, u_, p; V=eltype(x_))
        return g
    end

    x, u = xk, u0
    vecx = CuVector{Float64}(undef, length(x) + length(u))
    vecx[1:length(x)] .= x
    vecx[length(x)+1:end] .= u
    fjac = vecx -> ForwardDiff.jacobian(residualFunction_x!, vecx)
    jac = fjac(vecx)
    jacx = sparse(jac[:,1:length(x)])
    jacu = sparse(jac[:,length(x)+1:end])
    # @info("j", Array(∇gᵤ))
    # @info("j", Array(jacu))
    # This test is passing
    @test isapprox(∇gₓ, jacx, rtol=1e-5)
    # Not this one! 
    @test isapprox(∇gᵤ, jacu, rtol=1e-5)
end

Documentation

What do you guys think about documentation? Yes/No? If "Yes", any particular system that you prefer?

Allow for flexible indexing of buses

The structure of the Jacobian matrix depends on the indexing of the buses and on the way we order the variable vector (e.g. [v1 v2 a1 a2] or [v1 a1 v2 a2]). Having control over this might help us both improve the performance of the linear solver and reduce the number of divergent scenarios.
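
Switching between the two orderings mentioned above amounts to applying a permutation to the variable vector and to the rows and columns of the Jacobian. A small sketch (the helper name `interleave_perm` is made up for this example):

```julia
# Permutation from the blocked ordering [v1 ... vn a1 ... an]
# to the interleaved ordering [v1 a1 v2 a2 ...].
interleave_perm(n::Int) = vec(permutedims(reshape(1:2n, n, 2)))

n = 2
perm = interleave_perm(n)
labels = ["v1", "v2", "a1", "a2"]
interleaved = labels[perm]

# The same permutation reorders the rows and columns of a Jacobian.
J = reshape(collect(1.0:16.0), 4, 4)
J_interleaved = J[perm, perm]
```

The permuted matrix has the same entries but a different sparsity layout, which is what affects fill-in and preconditioner quality.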

Reviewers: @michel2323 , @frapac

Error in cost_gradients: "BoundsError: attempt to access 0-element Array at index 1"

When we run the OPF on some instances (case200_activ, case1888_rte), we get an error when we first call the function cost_gradients. The stacktrace is:

ERROR: LoadError: BoundsError: attempt to access 0-element Array{Int64,1} at index [1]
Stacktrace:
 [1] getindex(::Array{Int64,1}, ::Int64) at ./array.jl:809
 [2] cost_gradients(::ExaPF.PowerSystem.PowerNetwork, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::KernelAbstractions.CPU) at /home/fpacaud/exa/ExaPF.jl/src/ExaPF.jl:470
 [3] cost_gradients at /home/fpacaud/exa/ExaPF.jl/src/ExaPF.jl:407 [inlined]
 [4] build_callback(::ExaPF.PowerSystem.PowerNetwork, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}) at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:73
 [5] run_reduced_ipopt(::String; hessian::Bool, cons::Bool) at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:270
 [6] top-level scope at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:349
 [7] include(::String) at ./client.jl:457
 [8] top-level scope at REPL[10]:1
in expression starting at /home/fpacaud/exa/ExaPF.jl/scripts/reduced_ipopt.jl:349

Investigating it further, it appears that the problem arises when we convert PV buses to PQ buses here:
https://github.com/exanauts/ExaPF.jl/blob/dev/rgm/src/powersystem.jl#L375L381
Indeed, we remove the PV buses from the indexes bustypes and pv, but the array gens keeps the old classification. At some point, we should remove from the array gens the buses whose status changes from PV to PQ.
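
The inconsistency can be sketched on toy data. The arrays below (`bustype`, `gens`, `to_pq`) and their layout are hypothetical and do not reproduce the actual data structures in powersystem.jl; the type codes follow the MATPOWER convention (PQ = 1, PV = 2). The sketch shows one plausible way a lookup of a generator bus in pv can come back empty, and the fix suggested above.

```julia
# Bus type codes (MATPOWER convention).
const PQ, PV = 1, 2

bustype = [PV, PQ, PV, PV]   # type of each bus (hypothetical data)
gens    = [1, 3, 4]          # bus hosting each generator (hypothetical layout)
to_pq   = [3]                # PV buses converted to PQ

# Update the bus classification and rebuild the pv index.
bustype[to_pq] .= PQ
pv = findall(==(PV), bustype)

# gens still references bus 3, which is no longer in pv: the lookup is
# empty, and indexing its first element would throw a BoundsError.
stale = [b for b in gens if isempty(findall(==(b), pv))]

# Suggested fix: also drop the converted buses from gens.
gens_fixed = filter(b -> bustype[b] == PV, gens)

@assert pv == [1, 4]
@assert stale == [3]
@assert gens_fixed == [1, 4]
```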

ExaPF 0.6

This issue lists the remaining TODOs required before releasing ExaPF 0.6

New features

Multiple generators per bus (#168)
Merge new overlapping Schwarz preconditioner (#86)
Integration of batch Hessian (PR #179, #185)
Figure out new interface for LinearSolvers (PR #176)

Refactoring

Move all Evaluators to a separate package (#191)
Move all CUDA-related code to a subpackage in ExaPF (#175)
Clean API of ExaPF (operational constraints in Polar)

Fixes

Fix invalidations (PR #182)
Solve issue with allowscalar on CUDA.jl v3.3 (#80)

Warning issued when running unit-tests on the GPU

I was testing ExaPF and got a few warnings when running the tests on the GPU.

┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called atan(x::T) where T<:Union{Float32, Float64} in Base.Math at special/trig.jl:519, maybe you intended to call atan(x::Float64) in CUDA at /home/fpacaud/.julia/packages/CUDA/42B9G/src/device/cuda/math.jl:32 instead?

I guess you may be aware of this issue, as it arises when running the test. It looks like the issue is related to GPUArrays.

My system is:

julia> versioninfo()
Julia Version 1.4.1
Commit 381693d3df* (2020-04-14 17:20 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Lack of community guidelines

When reviewing your JuliaCon submission, JuliaCon/proceedings-review#72, I submitted a fix, #136, for an issue that was apparently already fixed on the develop branch. It would be helpful if there were clear guidelines for third parties wishing to 1) contribute to the software, 2) report issues or problems with the software, and 3) seek support.

Non-deterministic test failure on a GPU system

With the changes in #136 I was able to successfully run the tests on my local computer using CUDA.jl. However, one time a test failed (see transcript below). At first glance I didn't see any non-determinism in the test, so I wonder if the failure is related to issue #110, where other non-deterministic behaviour is described.
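
For reference, the failing comparison in the transcript can be narrowed down to a single component. The two vectors below are copied from the "Evaluated" line of the failed test; the variable name `mismatch` is introduced here for illustration. The snippet reruns the comparison element by element with the same relative tolerance.

```julia
# Finite-difference gradient and adjoint gradient, as printed in the failed test.
grad_fd = [-9.935589542501313, 83.29418125602783, 176.58606788964127, 0.0,
           88.79764713761492, 61.87925810156231, 5.407209203348859,
           -16.0610823203679, 21.697626222124487, -5.750990592075815,
           -11.972362422178843]
g = [-9.935589556165517, 83.29418101143494, 176.5860680653392,
     54.61617710821696, 88.79764714293702, 61.87925824943278,
     5.407209536445407, -16.061082207200116, 21.697626170063813,
     -5.75099045219747, -11.97236266954647]

# Indices of the components that disagree at the tolerance used by the test.
mismatch = findall(.!isapprox.(grad_fd, g; rtol=1e-4))

@assert mismatch == [4]
```

Only the fourth component differs, and it is exactly 0.0 in the finite-difference gradient. The failing run also prints a "Newton-Raphson algorithm failed to converge" warning that does not appear in the passing run, which may or may not be related.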

❯ julia --project=.                                        
               _                                           
   _       _ _(_)_     |  Documentation: https://docs.julialang.org                                                    
  (_)     | (_) (_)    |                                   
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.                                                        
  | | | | | | |/ _` |  |                                   
  | | |_| | | | (_| |  |  Version 1.6.0 (2021-03-24)       
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release                                                      
|__/                   |                                   

julia> versioninfo()                                       
Julia Version 1.6.0                                        
Commit f9720dc2eb (2021-03-24 12:55 UTC)                   
Platform Info:                                             
  OS: Linux (x86_64-pc-linux-gnu)                          
  CPU: AMD Ryzen 7 2700X Eight-Core Processor              
  WORD_SIZE: 64                                            
  LIBM: libopenlibm                                        
  LLVM: libLLVM-11.0.1 (ORCJIT, znver1)                    
Environment:                                               
  JULIA_MPI_BINARY = system                                

julia> using CUDA

julia> CUDA.versioninfo()
CUDA toolkit 11.0.3, artifact installation
CUDA driver 11.0.0
NVIDIA driver 450.102.4

Libraries: 
- CUBLAS: 11.2.0
- CURAND: 10.2.1
- CUFFT: 10.2.1
- CUSOLVER: 10.6.0
- CUSPARSE: 11.1.1
- CUPTI: 13.0.0
- NVML: 11.0.0+450.102.4
- CUDNN: 8.10.0 (for CUDA 11.2.0)
- CUTENSOR: 1.2.2 (for CUDA 11.1.0)

Toolchain:
- Julia: 1.6.0
- LLVM: 11.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
- Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80

1 device:
  0: GeForce RTX 2060 (sm_75, 4.622 GiB / 5.792 GiB available)

(ExaPF) pkg> test                                          
     Testing ExaPF                                         
      Status `/tmp/jl_8ipLGV/Project.toml`                 
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [052768ef] CUDA v2.6.2                                   
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [093fc24a] LightGraphs v1.3.5                            
  [b8f27783] MathOptInterface v0.9.20                      
  [2679e427] Metis v1.0.0                                  
  [47a9eef4] SparseDiffTools v1.13.0                       
  [a759f4b9] TimerOutputs v0.5.8                           
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [de0858da] Printf `@stdlib/Printf`                       
  [9a3f8284] Random `@stdlib/Random`                       
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [8dfed614] Test `@stdlib/Test`                           
      Status `/tmp/jl_8ipLGV/Manifest.toml`                
  [621f4979] AbstractFFTs v1.0.1                           
  [79e6a3ab] Adapt v3.2.0                                  
  [ec485272] ArnoldiMethod v0.1.0                          
  [4fba245c] ArrayInterface v3.1.6                         
  [ab4f0b2a] BFloat16s v0.1.0                              
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [b99e7846] BinaryProvider v0.5.10                        
  [fa961155] CEnum v0.4.1                                  
  [052768ef] CUDA v2.6.2                                   
  [7057c7e9] Cassette v0.3.5                               
  [d360d2e6] ChainRulesCore v0.9.33                        
  [523fee87] CodecBzip2 v0.7.2                             
  [944b1d66] CodecZlib v0.7.0                              
  [bbf7d656] CommonSubexpressions v0.3.0                   
  [34da2185] Compat v3.25.0                                
  [864edb3b] DataStructures v0.18.9                        
  [163ba53b] DiffResults v1.0.3                            
  [b552c78f] DiffRules v1.0.2                              
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [e2ba6199] ExprTools v0.1.3                              
  [9aa1b823] FastClosures v0.3.2                           
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [0c68f7d7] GPUArrays v6.2.0                              
  [61eb1bfa] GPUCompiler v0.10.0                           
  [cd3eb016] HTTP v0.9.5                                   
  [615f187c] IfElse v0.1.0                                 
  [d25df0c9] Inflate v0.1.2                                
  [83e8ac13] IniFile v0.5.0                                
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [692b3bcd] JLLWrappers v1.2.0                            
  [682c06a0] JSON v0.21.1                                  
  [7d188eb4] JSONSchema v0.3.3                             
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [929cbde3] LLVM v3.6.0                                   
  [093fc24a] LightGraphs v1.3.5                            
  [5c8ed15e] LinearOperators v1.1.0                        
  [1914dd2f] MacroTools v0.5.6                             
  [b8f27783] MathOptInterface v0.9.20                      
  [fdba3010] MathProgBase v0.7.8                           
  [739be429] MbedTLS v1.0.3                                
  [c03570c3] Memoize v0.4.4                                
  [2679e427] Metis v1.0.0                                  
  [d8a4904e] MutableArithmetics v0.2.14                    
  [872c559c] NNlib v0.7.17                                 
  [77ba4419] NaNMath v0.3.5                                
  [bac558e1] OrderedCollections v1.4.0                     
  [69de0a69] Parsers v1.1.0                                
  [3cdcf5f2] RecipesBase v1.1.1                            
  [189a3867] Reexport v1.0.0                               
  [ae029012] Requires v1.1.3                               
  [6c6a2e73] Scratch v1.0.3                                
  [699a6c99] SimpleTraits v0.9.3                           
  [47a9eef4] SparseDiffTools v1.13.0                       
  [276daf66] SpecialFunctions v1.3.0                       
  [aedffcd0] Static v0.2.4                                 
  [90137ffa] StaticArrays v1.0.1                           
  [a759f4b9] TimerOutputs v0.5.8                           
  [3bb67fe8] TranscodingStreams v0.9.5                     
  [5c2747f8] URIs v1.2.0                                   
  [19fa3120] VertexSafeGraphs v0.1.2                       
  [a5390f91] ZipFile v0.9.3                                
  [ae81ac8f] ASL_jll v0.1.1+4                              
  [6e34b625] Bzip2_jll v1.0.6+5                            
  [9cc047cb] Ipopt_jll v3.13.4+0                           
  [d00139f3] METIS_jll v5.1.0+5                            
  [d7ed1dd3] MUMPS_seq_jll v5.2.1+4                        
  [656ef2d0] OpenBLAS32_jll v0.3.12+1                      
  [efe28fd5] OpenSpecFun_jll v0.5.3+4                      
  [0dad84c5] ArgTools `@stdlib/ArgTools`                   
  [56f22d72] Artifacts `@stdlib/Artifacts`                 
  [2a0f44e3] Base64 `@stdlib/Base64`                       
  [ade2ca70] Dates `@stdlib/Dates`                         
  [8bb1440f] DelimitedFiles `@stdlib/DelimitedFiles`       
  [8ba89e20] Distributed `@stdlib/Distributed`             
  [f43a241f] Downloads `@stdlib/Downloads`                 
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`   
  [4af54fe1] LazyArtifacts `@stdlib/LazyArtifacts`         
  [b27032c2] LibCURL `@stdlib/LibCURL`                     
  [76f85450] LibGit2 `@stdlib/LibGit2`                     
  [8f399da3] Libdl `@stdlib/Libdl`                         
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [56ddb016] Logging `@stdlib/Logging`                     
  [d6f4376e] Markdown `@stdlib/Markdown`                   
  [a63ad114] Mmap `@stdlib/Mmap`                           
  [ca575930] NetworkOptions `@stdlib/NetworkOptions`       
  [44cfe95a] Pkg `@stdlib/Pkg`                             
  [de0858da] Printf `@stdlib/Printf`                       
  [3fa0cd96] REPL `@stdlib/REPL`                           
  [9a3f8284] Random `@stdlib/Random`                       
  [ea8e919c] SHA `@stdlib/SHA`                             
  [9e88b42a] Serialization `@stdlib/Serialization`         
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`           
  [6462fe0b] Sockets `@stdlib/Sockets`                     
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [10745b16] Statistics `@stdlib/Statistics`               
  [fa267f1f] TOML `@stdlib/TOML`                           
  [a4e569a6] Tar `@stdlib/Tar`                             
  [8dfed614] Test `@stdlib/Test`                           
  [cf7118a7] UUIDs `@stdlib/UUIDs`                         
  [4ec0a83e] Unicode `@stdlib/Unicode`                     
  [e66e0078] CompilerSupportLibraries_jll `@stdlib/CompilerSupportLibraries_jll`                                       
  [deac9b47] LibCURL_jll `@stdlib/LibCURL_jll`             
  [29816b5a] LibSSH2_jll `@stdlib/LibSSH2_jll`             
  [c8ffd9c3] MbedTLS_jll `@stdlib/MbedTLS_jll`             
  [14a3606d] MozillaCACerts_jll `@stdlib/MozillaCACerts_jll`                                                           
  [83775a58] Zlib_jll `@stdlib/Zlib_jll`                   
  [8e850ede] nghttp2_jll `@stdlib/nghttp2_jll`             
  [3f19e933] p7zip_jll `@stdlib/p7zip_jll`                 
  Progress [========================================>]  22/22                                                          
22 dependencies successfully precompiled in 42 seconds (53 already precompiled)                                        
     Testing Running tests...                              
Reading PSSE format                                        
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`                                                                                                      
└ @ GPUArrays ~/.julia/packages/GPUArrays/WV76E/src/host/indexing.jl:43                                                
Test Summary:        | Pass  Total                         
Problem formulations |   85     85                         
Test Summary:     | Pass  Total                            
Iterative solvers |   36     36                            
Reading PSSE format                                        
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: Newton-Raphson algorithm failed to converge (182.29864183960703)                                            
└ @ ExaPF ~/research/code/ExaPF.jl/src/Evaluators/reduced_evaluator.jl:105                                             
Test API on CUDADevice(): Test Failed at /home/lucas/research/code/ExaPF.jl/test/reduced_evaluator.jl:65               
  Expression: isapprox(grad_fd, g, rtol = 0.0001)          
   Evaluated: isapprox([-9.935589542501313, 83.29418125602783, 176.58606788964127, 0.0, 88.79764713761492, 61.87925810156231, 5.407209203348859, -16.0610823203679, 21.697626222124487, -5.750990592075815, -11.972362422178843], [-9.935589556165517, 83.29418101143494, 176.5860680653392, 54.61617710821696, 88.79764714293702, 61.87925824943278, 5.407209536445407, -16.061082207200116, 21.697626170063813, -5.75099045219747, -11.97236266954647]; rtol = 0.0001)                      
Stacktrace:                                                
 [1] macro expansion                                       
   @ ~/research/code/ExaPF.jl/test/reduced_evaluator.jl:65 [inlined]                                                   
 [2] macro expansion                                       
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1226 [inlined]             
 [3] macro expansion                                       
   @ ~/research/code/ExaPF.jl/test/reduced_evaluator.jl:22 [inlined]                                                   
 [4] top-level scope                                       
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1226                       
┌ Warning: Passing optimizer attributes as keyword arguments to                                                        
│ `Ipopt.Optimizer` is deprecated. Use                     
│     MOI.set(model, MOI.RawParameter("key"), value)       
│ or                                                       
│     JuMP.set_optimizer_attribute(model, "key", value)    
│ instead.                                                 
└ @ Ipopt ~/.julia/packages/Ipopt/P1XLY/src/MOI_wrapper.jl:88                                                          

******************************************************************************                                         
This program contains Ipopt, a library for large-scale nonlinear optimization.                                         
 Ipopt is released as open source code under the Eclipse Public License (EPL).                                         
         For more information visit https://github.com/coin-or/Ipopt                                                   
******************************************************************************                                         

Test Summary:                       | Pass  Fail  Total    
Optimization evaluators             |  192     1    193    
  Powerflow solver                  |   36           36    
  Compute reduced gradient on CPU   |    8            8    
  ReducedSpaceEvaluators (case9.m)  |   62           62    
  ReducedSpaceEvaluators (case30.m) |   61     1     62    
    Constructor                     |   10           10    
    Constructor                     |   10           10    
    Test API on CPU()               |   21           21    
    Test API on CUDADevice()        |   20     1     21    
  PenaltyEvaluators                 |   10           10    
  AugLagEvaluators                  |   13           13    
  MOI wrapper                       |    2            2    
ERROR: LoadError: Some tests did not pass: 192 passed, 1 failed, 0 errored, 0 broken.                                  
in expression starting at /home/lucas/research/code/ExaPF.jl/test/runtests.jl:29                                       
ERROR: Package ExaPF errored during testing                

(ExaPF) pkg> test                                          
     Testing ExaPF                                         
      Status `/tmp/jl_xEjmzT/Project.toml`                 
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [052768ef] CUDA v2.6.2                                   
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [093fc24a] LightGraphs v1.3.5                            
  [b8f27783] MathOptInterface v0.9.20                      
  [2679e427] Metis v1.0.0                                  
  [47a9eef4] SparseDiffTools v1.13.0                       
  [a759f4b9] TimerOutputs v0.5.8                           
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [de0858da] Printf `@stdlib/Printf`                       
  [9a3f8284] Random `@stdlib/Random`                       
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [8dfed614] Test `@stdlib/Test`                           
      Status `/tmp/jl_xEjmzT/Manifest.toml`                
  [621f4979] AbstractFFTs v1.0.1                           
  [79e6a3ab] Adapt v3.2.0                                  
  [ec485272] ArnoldiMethod v0.1.0                          
  [4fba245c] ArrayInterface v3.1.6                         
  [ab4f0b2a] BFloat16s v0.1.0                              
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [b99e7846] BinaryProvider v0.5.10                        
  [fa961155] CEnum v0.4.1                                  
  [052768ef] CUDA v2.6.2                                   
  [7057c7e9] Cassette v0.3.5                               
  [d360d2e6] ChainRulesCore v0.9.33                        
  [523fee87] CodecBzip2 v0.7.2                             
  [944b1d66] CodecZlib v0.7.0                              
  [bbf7d656] CommonSubexpressions v0.3.0                   
  [34da2185] Compat v3.25.0                                
  [864edb3b] DataStructures v0.18.9                        
  [163ba53b] DiffResults v1.0.3                            
  [b552c78f] DiffRules v1.0.2                              
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [e2ba6199] ExprTools v0.1.3                              
  [9aa1b823] FastClosures v0.3.2                           
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [0c68f7d7] GPUArrays v6.2.0                              
  [61eb1bfa] GPUCompiler v0.10.0                           
  [cd3eb016] HTTP v0.9.5                                   
  [615f187c] IfElse v0.1.0                                 
  [d25df0c9] Inflate v0.1.2                                
  [83e8ac13] IniFile v0.5.0                                
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [692b3bcd] JLLWrappers v1.2.0                            
  [682c06a0] JSON v0.21.1                                  
  [7d188eb4] JSONSchema v0.3.3                             
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [929cbde3] LLVM v3.6.0                                   
  [093fc24a] LightGraphs v1.3.5                            
  [5c8ed15e] LinearOperators v1.1.0                        
  [1914dd2f] MacroTools v0.5.6                             
  [b8f27783] MathOptInterface v0.9.20                      
  [fdba3010] MathProgBase v0.7.8                           
  [739be429] MbedTLS v1.0.3                                
  [c03570c3] Memoize v0.4.4                                
  [2679e427] Metis v1.0.0                                  
  [d8a4904e] MutableArithmetics v0.2.14                    
  [872c559c] NNlib v0.7.17                                 
  [77ba4419] NaNMath v0.3.5                                
  [bac558e1] OrderedCollections v1.4.0                     
  [69de0a69] Parsers v1.1.0                                
  [3cdcf5f2] RecipesBase v1.1.1                            
  [189a3867] Reexport v1.0.0                               
  [ae029012] Requires v1.1.3                               
  [6c6a2e73] Scratch v1.0.3                                
  [699a6c99] SimpleTraits v0.9.3                           
  [47a9eef4] SparseDiffTools v1.13.0                       
  [276daf66] SpecialFunctions v1.3.0                       
  [aedffcd0] Static v0.2.4                                 
  [90137ffa] StaticArrays v1.0.1                           
  [a759f4b9] TimerOutputs v0.5.8                           
  [3bb67fe8] TranscodingStreams v0.9.5                     
  [5c2747f8] URIs v1.2.0                                   
  [19fa3120] VertexSafeGraphs v0.1.2                       
  [a5390f91] ZipFile v0.9.3                                
  [ae81ac8f] ASL_jll v0.1.1+4                              
  [6e34b625] Bzip2_jll v1.0.6+5                            
  [9cc047cb] Ipopt_jll v3.13.4+0                           
  [d00139f3] METIS_jll v5.1.0+5                            
  [d7ed1dd3] MUMPS_seq_jll v5.2.1+4                        
  [656ef2d0] OpenBLAS32_jll v0.3.12+1                      
  [efe28fd5] OpenSpecFun_jll v0.5.3+4                      
  [0dad84c5] ArgTools `@stdlib/ArgTools`                   
  [56f22d72] Artifacts `@stdlib/Artifacts`                 
  [2a0f44e3] Base64 `@stdlib/Base64`                       
  [ade2ca70] Dates `@stdlib/Dates`                         
  [8bb1440f] DelimitedFiles `@stdlib/DelimitedFiles`       
  [8ba89e20] Distributed `@stdlib/Distributed`             
  [f43a241f] Downloads `@stdlib/Downloads`                 
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`   
  [4af54fe1] LazyArtifacts `@stdlib/LazyArtifacts`         
  [b27032c2] LibCURL `@stdlib/LibCURL`                     
  [76f85450] LibGit2 `@stdlib/LibGit2`                     
  [8f399da3] Libdl `@stdlib/Libdl`                         
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [56ddb016] Logging `@stdlib/Logging`                     
  [d6f4376e] Markdown `@stdlib/Markdown`                   
  [a63ad114] Mmap `@stdlib/Mmap`                           
  [ca575930] NetworkOptions `@stdlib/NetworkOptions`       
  [44cfe95a] Pkg `@stdlib/Pkg`                             
  [de0858da] Printf `@stdlib/Printf`                       
  [3fa0cd96] REPL `@stdlib/REPL`                           
  [9a3f8284] Random `@stdlib/Random`                       
  [ea8e919c] SHA `@stdlib/SHA`                             
  [9e88b42a] Serialization `@stdlib/Serialization`         
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`           
  [6462fe0b] Sockets `@stdlib/Sockets`                     
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [10745b16] Statistics `@stdlib/Statistics`               
  [fa267f1f] TOML `@stdlib/TOML`                           
  [a4e569a6] Tar `@stdlib/Tar`                             
  [8dfed614] Test `@stdlib/Test`                           
  [cf7118a7] UUIDs `@stdlib/UUIDs`                         
  [4ec0a83e] Unicode `@stdlib/Unicode`                     
  [e66e0078] CompilerSupportLibraries_jll `@stdlib/CompilerSupportLibraries_jll`                                       
  [deac9b47] LibCURL_jll `@stdlib/LibCURL_jll`             
  [29816b5a] LibSSH2_jll `@stdlib/LibSSH2_jll`             
  [c8ffd9c3] MbedTLS_jll `@stdlib/MbedTLS_jll`             
  [14a3606d] MozillaCACerts_jll `@stdlib/MozillaCACerts_jll`                                                           
  [83775a58] Zlib_jll `@stdlib/Zlib_jll`                   
  [8e850ede] nghttp2_jll `@stdlib/nghttp2_jll`             
  [3f19e933] p7zip_jll `@stdlib/p7zip_jll`                 
     Testing Running tests...                              
Reading PSSE format                                        
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`                                                                                                      
└ @ GPUArrays ~/.julia/packages/GPUArrays/WV76E/src/host/indexing.jl:43                                                
Test Summary:        | Pass  Total                         
Problem formulations |   85     85                         
Test Summary:     | Pass  Total                            
Iterative solvers |   36     36                            
Reading PSSE format                                        
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: Passing optimizer attributes as keyword arguments to                                                        
│ `Ipopt.Optimizer` is deprecated. Use                     
│     MOI.set(model, MOI.RawParameter("key"), value)       
│ or                                                       
│     JuMP.set_optimizer_attribute(model, "key", value)    
│ instead.                                                 
└ @ Ipopt ~/.julia/packages/Ipopt/P1XLY/src/MOI_wrapper.jl:88                                                          

******************************************************************************                                         
This program contains Ipopt, a library for large-scale nonlinear optimization.                                         
 Ipopt is released as open source code under the Eclipse Public License (EPL).                                         
         For more information visit https://github.com/coin-or/Ipopt                                                   
******************************************************************************                                         

Test Summary:           | Pass  Total                      
Optimization evaluators |  193    193                      
Test Summary:            | Pass  Total                     
Reduced space algorithms |    2      2                     
     Testing ExaPF tests passed                            

(ExaPF) pkg> test                                          
     Testing ExaPF                                         
      Status `/tmp/jl_rbh8TS/Project.toml`                 
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [052768ef] CUDA v2.6.2                                   
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [093fc24a] LightGraphs v1.3.5                            
  [b8f27783] MathOptInterface v0.9.20                      
  [2679e427] Metis v1.0.0                                  
  [47a9eef4] SparseDiffTools v1.13.0                       
  [a759f4b9] TimerOutputs v0.5.8                           
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [de0858da] Printf `@stdlib/Printf`                       
  [9a3f8284] Random `@stdlib/Random`                       
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [8dfed614] Test `@stdlib/Test`                           
      Status `/tmp/jl_rbh8TS/Manifest.toml`                
  [621f4979] AbstractFFTs v1.0.1                           
  [79e6a3ab] Adapt v3.2.0                                  
  [ec485272] ArnoldiMethod v0.1.0                          
  [4fba245c] ArrayInterface v3.1.6                         
  [ab4f0b2a] BFloat16s v0.1.0                              
  [6e4b80f9] BenchmarkTools v0.5.0                         
  [b99e7846] BinaryProvider v0.5.10                        
  [fa961155] CEnum v0.4.1                                  
  [052768ef] CUDA v2.6.2                                   
  [7057c7e9] Cassette v0.3.5                               
  [d360d2e6] ChainRulesCore v0.9.33                        
  [523fee87] CodecBzip2 v0.7.2                             
  [944b1d66] CodecZlib v0.7.0                              
  [bbf7d656] CommonSubexpressions v0.3.0                   
  [34da2185] Compat v3.25.0                                
  [864edb3b] DataStructures v0.18.9                        
  [163ba53b] DiffResults v1.0.3                            
  [b552c78f] DiffRules v1.0.2                              
  [0cf0e50c] ExaPF v0.4.0 `~/research/code/ExaPF.jl`       
  [e2ba6199] ExprTools v0.1.3                              
  [9aa1b823] FastClosures v0.3.2                           
  [6a86dc24] FiniteDiff v2.8.0                             
  [f6369f11] ForwardDiff v0.10.17                          
  [0c68f7d7] GPUArrays v6.2.0                              
  [61eb1bfa] GPUCompiler v0.10.0                           
  [cd3eb016] HTTP v0.9.5                                   
  [615f187c] IfElse v0.1.0                                 
  [d25df0c9] Inflate v0.1.2                                
  [83e8ac13] IniFile v0.5.0                                
  [b6b21f68] Ipopt v0.6.5                                  
  [42fd0dbc] IterativeSolvers v0.8.5                       
  [692b3bcd] JLLWrappers v1.2.0                            
  [682c06a0] JSON v0.21.1                                  
  [7d188eb4] JSONSchema v0.3.3                             
  [63c18a36] KernelAbstractions v0.4.6                     
  [ba0b0d4f] Krylov v0.5.5                                 
  [929cbde3] LLVM v3.6.0                                   
  [093fc24a] LightGraphs v1.3.5                            
  [5c8ed15e] LinearOperators v1.1.0                        
  [1914dd2f] MacroTools v0.5.6                             
  [b8f27783] MathOptInterface v0.9.20                      
  [fdba3010] MathProgBase v0.7.8                           
  [739be429] MbedTLS v1.0.3                                
  [c03570c3] Memoize v0.4.4                                
  [2679e427] Metis v1.0.0                                  
  [d8a4904e] MutableArithmetics v0.2.14                    
  [872c559c] NNlib v0.7.17                                 
  [77ba4419] NaNMath v0.3.5                                
  [bac558e1] OrderedCollections v1.4.0                     
  [69de0a69] Parsers v1.1.0                                
  [3cdcf5f2] RecipesBase v1.1.1                            
  [189a3867] Reexport v1.0.0                               
  [ae029012] Requires v1.1.3                               
  [6c6a2e73] Scratch v1.0.3                                
  [699a6c99] SimpleTraits v0.9.3                           
  [47a9eef4] SparseDiffTools v1.13.0                       
  [276daf66] SpecialFunctions v1.3.0                       
  [aedffcd0] Static v0.2.4                                 
  [90137ffa] StaticArrays v1.0.1                           
  [a759f4b9] TimerOutputs v0.5.8                           
  [3bb67fe8] TranscodingStreams v0.9.5                     
  [5c2747f8] URIs v1.2.0                                   
  [19fa3120] VertexSafeGraphs v0.1.2                       
  [a5390f91] ZipFile v0.9.3                                
  [ae81ac8f] ASL_jll v0.1.1+4                              
  [6e34b625] Bzip2_jll v1.0.6+5                            
  [9cc047cb] Ipopt_jll v3.13.4+0                           
  [d00139f3] METIS_jll v5.1.0+5                            
  [d7ed1dd3] MUMPS_seq_jll v5.2.1+4                        
  [656ef2d0] OpenBLAS32_jll v0.3.12+1                      
  [efe28fd5] OpenSpecFun_jll v0.5.3+4                      
  [0dad84c5] ArgTools `@stdlib/ArgTools`                   
  [56f22d72] Artifacts `@stdlib/Artifacts`                 
  [2a0f44e3] Base64 `@stdlib/Base64`                       
  [ade2ca70] Dates `@stdlib/Dates`                         
  [8bb1440f] DelimitedFiles `@stdlib/DelimitedFiles`       
  [8ba89e20] Distributed `@stdlib/Distributed`             
  [f43a241f] Downloads `@stdlib/Downloads`                 
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`   
  [4af54fe1] LazyArtifacts `@stdlib/LazyArtifacts`         
  [b27032c2] LibCURL `@stdlib/LibCURL`                     
  [76f85450] LibGit2 `@stdlib/LibGit2`                     
  [8f399da3] Libdl `@stdlib/Libdl`                         
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`         
  [56ddb016] Logging `@stdlib/Logging`                     
  [d6f4376e] Markdown `@stdlib/Markdown`                   
  [a63ad114] Mmap `@stdlib/Mmap`                           
  [ca575930] NetworkOptions `@stdlib/NetworkOptions`       
  [44cfe95a] Pkg `@stdlib/Pkg`                             
  [de0858da] Printf `@stdlib/Printf`                       
  [3fa0cd96] REPL `@stdlib/REPL`                           
  [9a3f8284] Random `@stdlib/Random`                       
  [ea8e919c] SHA `@stdlib/SHA`                             
  [9e88b42a] Serialization `@stdlib/Serialization`         
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`           
  [6462fe0b] Sockets `@stdlib/Sockets`                     
  [2f01184e] SparseArrays `@stdlib/SparseArrays`           
  [10745b16] Statistics `@stdlib/Statistics`               
  [fa267f1f] TOML `@stdlib/TOML`                           
  [a4e569a6] Tar `@stdlib/Tar`                             
  [8dfed614] Test `@stdlib/Test`                           
  [cf7118a7] UUIDs `@stdlib/UUIDs`                         
  [4ec0a83e] Unicode `@stdlib/Unicode`                     
  [e66e0078] CompilerSupportLibraries_jll `@stdlib/CompilerSupportLibraries_jll`                                       
  [deac9b47] LibCURL_jll `@stdlib/LibCURL_jll`             
  [29816b5a] LibSSH2_jll `@stdlib/LibSSH2_jll`             
  [c8ffd9c3] MbedTLS_jll `@stdlib/MbedTLS_jll`             
  [14a3606d] MozillaCACerts_jll `@stdlib/MozillaCACerts_jll`                                                           
  [83775a58] Zlib_jll `@stdlib/Zlib_jll`                   
  [8e850ede] nghttp2_jll `@stdlib/nghttp2_jll`             
  [3f19e933] p7zip_jll `@stdlib/p7zip_jll`                 
     Testing Running tests...                              
Reading PSSE format                                        
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`                                                                                                      
└ @ GPUArrays ~/.julia/packages/GPUArrays/WV76E/src/host/indexing.jl:43                                                
Test Summary:        | Pass  Total                         
Problem formulations |   85     85                         
Test Summary:     | Pass  Total                            
Iterative solvers |   36     36                            
Reading PSSE format                                        
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: PowerSystem: cost is not specified in PowerNetwork dataset                                                  
└ @ ExaPF.PowerSystem ~/research/code/ExaPF.jl/src/PowerSystem/power_network.jl:211                                    
┌ Warning: Passing optimizer attributes as keyword arguments to                                                        
│ `Ipopt.Optimizer` is deprecated. Use                     
│     MOI.set(model, MOI.RawParameter("key"), value)       
│ or                                                       
│     JuMP.set_optimizer_attribute(model, "key", value)    
│ instead.                                                 
└ @ Ipopt ~/.julia/packages/Ipopt/P1XLY/src/MOI_wrapper.jl:88                                                          

******************************************************************************                                         
This program contains Ipopt, a library for large-scale nonlinear optimization.                                         
 Ipopt is released as open source code under the Eclipse Public License (EPL).                                         
         For more information visit https://github.com/coin-or/Ipopt                                                   
******************************************************************************                                         

Test Summary:           | Pass  Total                      
Optimization evaluators |  193    193                      
Test Summary:            | Pass  Total                     
Reduced space algorithms |    2      2                     
     Testing ExaPF tests passed                            

(ExaPF) pkg>

Change indexing of control variable

Currently, the control u is indexed as:
u = [vmag[ref]; pg[pv]; vmag[pv]]
I believe it would be more consistent to group the voltage magnitudes together, such as
u = [vmag[ref]; vmag[pv]; pg[pv]].
That would avoid ugly expressions like

and simplify some parts of the code.
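To make the two orderings concrete, here is a minimal sketch on a made-up 5-bus network. The bus partition, the values, and the per-bus storage of pg are assumptions for illustration only and are not taken from ExaPF.

```julia
# Hypothetical bus partition for a 5-bus network (illustration only).
ref = [1]        # slack bus
pv  = [2, 4]     # PV (generator) buses

vmag = [1.00, 1.02, 0.98, 1.01, 0.97]   # voltage magnitude at every bus
pg   = [0.0, 0.5, 0.0, 0.8, 0.0]        # active generation, stored per bus in this sketch

nref, npv = length(ref), length(pv)

# Current ordering: the two voltage blocks are separated by the active power block.
u_current = [vmag[ref]; pg[pv]; vmag[pv]]
# Proposed ordering: all voltage magnitudes first, then active power.
u_proposed = [vmag[ref]; vmag[pv]; pg[pv]]

# With the proposed ordering, each physical quantity is one contiguous range.
vmag_range = 1:(nref + npv)
pg_range   = (nref + npv + 1):(nref + 2npv)

# With the current ordering, the voltages need two disjoint ranges glued together.
vmag_idx_current = vcat(1:nref, (nref + npv + 1):(nref + 2npv))
```

With the proposed ordering, u_proposed[vmag_range] and u_proposed[pg_range] are plain slices, whereas the current ordering needs the concatenated index vector to recover the voltages.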

Implement ramping constraints for active power generation

To support time-linking constraints in ProxAL, ExaPF should internally support ramping constraints on the active power generation p_g.
For each generator g, the ramping constraint reads:

| pg - pg_previous | <= ramp_agc

with ramp_agc a fixed constant, and pg_previous a value we should be able to update.

The absolute-value constraint is equivalent to the two linear constraints:

  pg - pg_previous  <= ramp_agc
- pg + pg_previous  <= ramp_agc

There are several ways to implement this in ExaPF:

implement a new set of constraints called ramping_constraints that adds all the ramping constraints at once. Although the ramping constraints are purely linear, this adds 2 * n_g constraints to the model and therefore directly impacts the solution algorithm
alternatively, update the bounds pg_min and pg_max on the active power generation pg directly. This requires being able to modify the bounds through ExaPF's API, but it does not add any new constraint to the optimization problem.
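The two options can be sketched side by side as follows. The function names ramping_constraints and ramped_bounds and all numerical values are hypothetical; this is not ExaPF's API, only an illustration of the arithmetic.

```julia
# Option 1: evaluate the 2 * n_g linear ramping constraints in the form g(pg) <= 0.
function ramping_constraints(pg, pg_previous, ramp_agc)
    return [pg .- pg_previous .- ramp_agc;     #   pg - pg_previous <= ramp_agc
            pg_previous .- pg .- ramp_agc]     # - pg + pg_previous <= ramp_agc
end

# Option 2: fold the ramping limits into the box constraints on pg.
function ramped_bounds(pg_min, pg_max, pg_previous, ramp_agc)
    lower = max.(pg_min, pg_previous .- ramp_agc)
    upper = min.(pg_max, pg_previous .+ ramp_agc)
    return lower, upper
end

# Made-up data for two generators.
pg_min, pg_max = [0.0, 0.0], [1.0, 2.0]
pg_previous = [0.5, 1.9]
ramp_agc = 0.2

lower, upper = ramped_bounds(pg_min, pg_max, pg_previous, ramp_agc)

pg_ok  = [0.6, 2.0]    # within the ramping limits
pg_bad = [0.8, 1.8]    # first generator moves by 0.3 > ramp_agc

g_ok  = ramping_constraints(pg_ok, pg_previous, ramp_agc)
g_bad = ramping_constraints(pg_bad, pg_previous, ramp_agc)
```

On this data both options agree: pg_ok satisfies g <= 0 and lies inside [lower, upper], while pg_bad violates both. The second option only requires pg_previous to be updated between time steps, at the price of recomputing the bounds each time.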

[GPU] Non-deterministic behavior with residual function

With multiple instances on a single GPU we observe non-deterministic results. The issue can be pinned down to the residual function in the Newton-Raphson solver, which yields either a wrong Jacobian or a wrong step update.

Architecture and SIMD agnostic

Right now, kernel.jl implements a CPU and GPU abstraction. It takes care of generating either CUDA kernels or regular functions that run on the CPU. There are also other details that make target = "cpu"/"cuda" work. Macros are widely used, and the current design allows only one instantiation of either the CUDA or the CPU code.

This should be revisited with a more flexible abstraction. Also, what is decided here affects what a framework for nonlinear equations or optimization would look like.

Refactoring of Newton-Raphson solve function: parsing and data structures.

In its current form, the Newton-Raphson routine includes code that needs to be externalized to accommodate optimization algorithms. This is a partial list:

Initial guess (V) should be provided through the function API.
Indexing structures (e.g. pv, pq, npv, npq) should be created externally and be part of a PSYSTEM object. These structures are permanent and will be re-used each time the non-linear solve is called.
Creation of vectors x, u, p; and mapping between these and V, VANG, P, Q, etc.
Update xk accordingly.

Reviewers: @michel2323 , @frapac

Implement reduced Hessian on GPU

This issue tracks the implementation of the reduced Hessian on the GPU. The reduced Hessian is computed in two steps:

Compute the Hessians of the objective and the constraints in the full-space.
The Hessians are computed in the full-space using AD (forward over reverse), with hand-coded adjoints. Hessians are implemented as adjoint-Hessian-vector product routines.
Compute the second-order adjoints by solving two linear systems (see document)
The second-order adjoints are computed by solving two linear systems with different RHS (each RHS corresponding to a given vector v). We should find an efficient way to do that on the GPU.

Current state:

Move the optimization evaluators in a separate package

ExaPF should focus only on the resolution of the power flow.

Sparsity of design Jacobian

So far the design Jacobian is dense, as the AD needs a sparsity pattern before doing the coloring.

Implement a new structure NewtonRaphson to store options of the algorithm

As we did in the LinearSolvers submodule, it may be worthwhile to implement a NewtonRaphson structure that stores all the options of the algorithm in one place and eases the implementation of other non-linear algorithms (such as the decoupled formulation introduced in the article Fast decoupled flows).

A prototype could be:

abstract type AbstractNonLinearSolver end 

struct NewtonRaphson <: AbstractNonLinearSolver 
    linear_solver::AbstractLinearSolver # direct or indirect 
    tolerance::Float64
    max_iter::Int
end

and we could dispatch the resolution of the powerflow equations with:

powerflow(
    polar::PolarForm, 
    buffer::PolarNetworkState,
    algo::NewtonRaphson,
)
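As a rough illustration of how such an options structure could drive the iteration, here is a self-contained sketch on a toy 2x2 system. The function name nlsolve! is hypothetical, the linear_solver field of the prototype is left out for brevity, and a dense backslash solve stands in for whatever linear solver ExaPF would use.

```julia
using LinearAlgebra

abstract type AbstractNonLinearSolver end

# Options only; the linear_solver field of the prototype is omitted in this sketch.
struct NewtonRaphson <: AbstractNonLinearSolver
    tolerance::Float64
    max_iter::Int
end

# Hypothetical entry point: solve F(x) = 0 in place, starting from the content of x.
function nlsolve!(algo::NewtonRaphson, F, J, x)
    for iter in 0:algo.max_iter
        r = F(x)
        # Stop as soon as the infinity norm of the residual is small enough.
        norm(r, Inf) <= algo.tolerance && return (converged = true, iterations = iter)
        iter == algo.max_iter && break
        x .-= J(x) \ r          # Newton step with a dense direct solve
    end
    return (converged = false, iterations = algo.max_iter)
end

# Toy system: x1^2 + x2^2 = 4 and x1 = x2, whose positive solution is (sqrt(2), sqrt(2)).
F(x) = [x[1]^2 + x[2]^2 - 4.0, x[1] - x[2]]
J(x) = [2x[1] 2x[2]; 1.0 -1.0]

x = [1.0, 2.0]
status = nlsolve!(NewtonRaphson(1e-10, 20), F, J, x)
```

Another algorithm (for instance a decoupled scheme) would then only need its own options type and its own method of the same function.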

[Evaluator] Allow to evaluate reduced Jacobian on the GPU

Currently, in ReducedSpaceEvaluator, the reduced Jacobian computation uses SparseMatrixCSC matrices and concatenates them before computing the adjoint.

We should find a proper way to solve linear systems with multiple RHS on the GPU.

CUDA.jl 1.2

CUDA.jl 1.2 breaks the code in the BiCGSTAB solver at the point where the sparse matrix P is multiplied by a vector.

 x0 .= P * b

This returns zeros in x0.

CUDA.CUSPARSE.mul!(x0, P, b)

This breaks the code generation.

The cause needs to be pinned down.

designJacobianAD: First call is slow on large instances (with > 1000 buses)

The first call to designJacobianAD tends to be very slow for large instances (with more than 1,000 buses).
https://github.com/exanauts/ExaPF.jl/blob/dev/rgm/src/ExaPF.jl#L688L689
I suspect it has something to do with precompilation. Once the function is compiled, the evaluation of designJacobianAD becomes fast again.

Some observations:

this is not due to the closure: if I call AD.designJacobianAD directly at the end of the function solve, the first call is as slow as when calling through the closure
I am wondering whether this is an issue with KernelAbstractions.jl

Julia 1.6: unexpected error during compilation of overdub

When trying to run the tests on my local machine (Julia 1.6) on ExaPF#develop, I get:

Internal error: encountered unexpected error during compilation of overdub:
ErrorException("unsupported or misplaced expression "return" in function overdub")
jl_errorf at /buildworker/worker/package_linux64/build/src/rtutils.c:77
emit_expr at /buildworker/worker/package_linux64/build/src/codegen.cpp:4581
emit_ssaval_assign at /buildworker/worker/package_linux64/build/src/codegen.cpp:4020
emit_stmtpos at /buildworker/worker/package_linux64/build/src/codegen.cpp:4262 [inlined]
<...>
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2238 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2420
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:839
unknown function (ip: (nil))
Computing Jacobian of residuals: Error During Test at /home/frapac/dev/anl/ExaPF.jl/test/powersystem.jl:103
  Got exception outside of a @test
  TaskFailedException
  Stacktrace:
    [1] wait
      @ ./task.jl:317 [inlined]
    [2] wait
      @ ~/.julia/packages/KernelAbstractions/jAutM/src/backends/cpu.jl:65 [inlined]
    [3] wait
      @ ~/.julia/packages/KernelAbstractions/jAutM/src/backends/cpu.jl:29 [inlined]
    [4] ExaPF.AutoDiff.Jacobian(structure::ExaPF.StateJacobianStructure{Vector{Float64}}, F::Vector{Float64}, vm::Vector{Float64}, va::Vector{Float64}, ybus_re::ExaPF.Spmat{Vector{Int64}, Vector{Float64}}, ybus_im::ExaPF.Spmat{Vector{Int64}, Vector{Float64}}, pinj::Vector{Float64}, qinj::Vector{Float64}, pv::Vector{Int64}, pq::Vector{Int64}, ref::Vector{Int64}, type::ExaPF.AutoDiff.StateJacobian)
      @ ExaPF.AutoDiff ~/dev/anl/ExaPF.jl/src/autodiff.jl:125
    [5] macro expansion
      @ ~/dev/anl/ExaPF.jl/test/powersystem.jl:116 [inlined]
    [6] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
    [7] macro expansion
      @ ~/dev/anl/ExaPF.jl/test/powersystem.jl:104 [inlined]
    [8] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
    [9] top-level scope
      @ ~/dev/anl/ExaPF.jl/test/powersystem.jl:12

My current setup:

(ExaPF) pkg> st
     Project ExaPF v0.4.0
      Status `~/dev/anl/ExaPF.jl/Project.toml`
  [052768ef] CUDA v2.3.0
  [6a86dc24] FiniteDiff v2.7.2
  [f6369f11] ForwardDiff v0.10.14
  [42fd0dbc] IterativeSolvers v0.8.4
  [63c18a36] KernelAbstractions v0.4.5
  [ba0b0d4f] Krylov v0.6.0
  [093fc24a] LightGraphs v1.3.4
  [b8f27783] MathOptInterface v0.9.19
  [2679e427] Metis v1.0.0
  [47a9eef4] SparseDiffTools v1.10.2
  [a759f4b9] TimerOutputs v0.5.7
  [e88e6eb3] Zygote v0.6.0
  [37e2e46d] LinearAlgebra
  [de0858da] Printf
  [2f01184e] SparseArrays

Store factorization in LinearSolver.DirectSolver

Now that we have implemented a wrapper to CUSOLVERRF, we should store the factorization of the powerflow matrix J inside the direct solver, to avoid refactorizing the matrix from scratch each time we solve a linear system:

struct DirectSolver <: AbstractLinearSolver
    A_factorized::LinearAlgebra.Factorization
end

Preliminary results show that we could get roughly a 10x speed-up when solving the power flow with CUSOLVERRF.
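A CPU-only sketch of the idea, using a dense LU factorization from LinearAlgebra in place of CUSOLVERRF: the matrix is factorized once and the stored factors are reused for every right-hand side. The helper solve!, the field name, and the type parameter are assumptions for illustration. Updating the factors when the values of J change (the refactorization step that CUSOLVERRF provides) is not shown.

```julia
using LinearAlgebra

abstract type AbstractLinearSolver end

# The type parameter keeps the stored factorization concretely typed.
struct DirectSolver{F<:Factorization} <: AbstractLinearSolver
    factorization::F
end

# Factorize once when the solver is built.
DirectSolver(J::AbstractMatrix) = DirectSolver(lu(J))

# Hypothetical helper: solve J * x = b with the stored factors, writing into x.
solve!(x, solver::DirectSolver, b) = ldiv!(x, solver.factorization, b)

J = [4.0 1.0; 2.0 3.0]
solver = DirectSolver(J)

x = zeros(2)
b1 = [1.0, 2.0]
b2 = [5.0, 5.0]

solve!(x, solver, b1)    # first right-hand side
x1 = copy(x)
solve!(x, solver, b2)    # second right-hand side, no new factorization
```

Each call to solve! costs only the triangular solves, which is where the saving comes from when the same matrix is used repeatedly.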

Iterative solvers are broken

It appears that changing the indexing of the Jacobian broke the iterative solvers.

Some Matpower test cases break the parser

The parser breaks on cases case2383wp.m, case30pwl.m, case33bw.m, case5.m. The log with the errors is attached.
ExaPFerr.txt

Fix divergent behavior on ACTIV70k system

I'll look into what is causing divergence and come up with possible ways to address it.

Reviewers: @michel2323 , @frapac

Time to first power flow

Currently, we are spending almost a minute to solve the first power flow. On Julia 1.6 and my local machine, I get in a fresh Julia session:

julia> include("tmp/launch_powerflow.jl")
 42.329781 seconds (50.81 M allocations: 2.880 GiB, 10.27% gc time)

This is mostly due to type inference issues. On Julia 1.6, fixing the type inference in ReducedSpaceEvaluator reduced the first compile time of ExaPF.hessprod! from 3 min to 10 s (see #98). We should be able to do the same for powerflow and for most functions exposed to users.
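One common source of inference problems in Julia is a struct field with an abstract type. Whether this is what happened in ReducedSpaceEvaluator is an assumption; the sketch below only shows how such an issue can be detected with Base.return_types, using two made-up buffer types.

```julia
# Field with an abstract type: the compiler cannot know the element type of x.
struct LooseBuffer
    x::AbstractVector
end

# Parametric version: the concrete array type is part of the struct type.
struct TightBuffer{VT<:AbstractVector}
    x::VT
end

total(b) = sum(b.x) + 1.0

# Inferred return types of total for each buffer type.
loose_type = first(Base.return_types(total, (LooseBuffer,)))
tight_type = first(Base.return_types(total, (TightBuffer{Vector{Float64}},)))

println("LooseBuffer: ", loose_type)
println("TightBuffer: ", tight_type)
```

The same check can be done interactively with @code_warntype, or enforced in the test suite with Test.@inferred on the functions exposed to users.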

Improve the naming of the API

At some point, it would be nice to settle on better naming for the API.
For instance:

take a look at the abstract functions in src/models/models.jl and decide whether their names are explicit enough
we could do the same for the abstract attributes implemented in src/PowerSystem/PowerSystem.jl
a more thorough discussion should take place for the functions implemented in src/evaluators.jl. Some of the function names in this file are too concise, and it is not clear what exactly they correspond to (take update! for instance).

In my opinion, naming is a non-trivial task, and we could benefit a lot from discussing it together 😸

Implement line power constraints

To integrate ExaPF with ProxAL.jl (see exanauts/ProxAL.jl#3), we need to implement line power constraints in the polar formulation implemented in ExaPF.jl. The procedure is:

determine how to formulate the line flow constraints. For information, ProxAL.jl is using the following formulation:

        #branch apparent power limits (from bus)
        Yff_abs2=YffR[l]^2+YffI[l]^2; Yft_abs2=YftR[l]^2+YftI[l]^2
        Yre=YffR[l]*YftR[l]+YffI[l]*YftI[l]; Yim=-YffR[l]*YftI[l]+YffI[l]*YftR[l]
        @NLconstraint(opfmodel,
            Vm[from]^2 *
            ( Yff_abs2*Vm[from]^2 + Yft_abs2*Vm[to]^2
            + 2*Vm[from]*Vm[to]*(Yre*cos(Va[from]-Va[to])-Yim*sin(Va[from]-Va[to]))
            )
            - flowmax
            - (sigma_lineFr[l]/baseMVA)
            <=0
        )

        #branch apparent power limits (to bus)
        Ytf_abs2=YtfR[l]^2+YtfI[l]^2; Ytt_abs2=YttR[l]^2+YttI[l]^2
        Yre=YtfR[l]*YttR[l]+YtfI[l]*YttI[l]; Yim=-YtfR[l]*YttI[l]+YtfI[l]*YttR[l]
        @NLconstraint(opfmodel,
            Vm[to]^2 *
            ( Ytf_abs2*Vm[from]^2 + Ytt_abs2*Vm[to]^2
            + 2*Vm[from]*Vm[to]*(Yre*cos(Va[from]-Va[to])-Yim*sin(Va[from]-Va[to]))
            )
            - flowmax
            - (sigma_lineTo[l]/baseMVA)
            <=0
        )

implement the new constraint in ExaPF's polar formulation. New constraints could be added in a modular fashion, provided we implement all the methods required (see https://github.com/exanauts/ExaPF.jl/blob/develop/src/models/polar/constraints.jl#L5L42 for an example). A prototype is:

function line_power_constraint(polar::PolarForm, g, buffer)
    ...
    return
end
is_constraint(::typeof(line_power_constraint)) = true
size_constraint(polar::PolarForm{T, IT, VT, AT}, ::typeof(line_power_constraint)) where {T, IT, VT, AT} = ...
function bounds(polar::PolarForm, ::typeof(line_power_constraint))
    ...
end

function jacobian(polar::PolarForm, ::typeof(line_power_constraint), i_cons, ∂jac, buffer)
    ...
end
function jtprod(polar::PolarForm, ::typeof(line_power_constraint), ∂jac, buffer, v)
    ...
end
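As a sanity check of the ProxAL formulation quoted above, the sketch below evaluates the squared apparent power at the "from" end of a single branch in polar coordinates and compares it with the same quantity computed with complex arithmetic. The function names and the admittance values are made up; the slack term sigma_lineFr is left out.

```julia
# Squared apparent power at the "from" end, following the polar expression above.
function line_flow_from_sq(vm_f, vm_t, va_f, va_t, yff, yft)
    yff_abs2 = abs2(yff)
    yft_abs2 = abs2(yft)
    yre = real(yff) * real(yft) + imag(yff) * imag(yft)
    yim = -real(yff) * imag(yft) + imag(yff) * real(yft)
    dva = va_f - va_t
    return vm_f^2 * (yff_abs2 * vm_f^2 + yft_abs2 * vm_t^2 +
                     2 * vm_f * vm_t * (yre * cos(dva) - yim * sin(dva)))
end

# Reference value: |S_f|^2 with S_f = V_f * conj(yff * V_f + yft * V_t).
function line_flow_from_sq_complex(vm_f, vm_t, va_f, va_t, yff, yft)
    v_f = vm_f * cis(va_f)
    v_t = vm_t * cis(va_t)
    return abs2(v_f * conj(yff * v_f + yft * v_t))
end

# Made-up branch data (per unit).
vm_f, vm_t = 1.02, 0.98
va_f, va_t = 0.05, -0.02
yff, yft = 2.0 - 8.0im, -2.0 + 8.1im
flowmax = 1.0

s2_polar   = line_flow_from_sq(vm_f, vm_t, va_f, va_t, yff, yft)
s2_complex = line_flow_from_sq_complex(vm_f, vm_t, va_f, va_t, yff, yft)

# Constraint value in the form g <= 0 (without the slack term).
g = s2_polar - flowmax
```

The "to" end follows the same pattern with Ytf and Ytt and the leading factor Vm[to]^2. Inside line_power_constraint, the loop over branches would write one such value per branch end into g.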

Preconditioner weights

The preconditioner currently works without edge weights (all weights = 1). It would make sense to use the electrical distance as an edge weight. The Julia package Metis.jl only allows vertex weights, whereas the underlying Metis library supports edge weights. A PR to Metis.jl would make sense.

Implement getters/setters

To integrate with ProxAL.jl (ref exanauts/ProxAL.jl#3), we need to implement proper getters/setters at the API level. The goal is to get/modify the values of the problem in place, à la JuMP:

value.(model[:Pg]), value.(model[:Qg])

where model is a proper ExaPF object. We could do that at different levels (level 1: PowerNetwork; level 2: PolarForm; level 3: AbstractNLPFormulation). I think we should discuss where it makes the most sense to implement the getters/setters. I see two solutions:
1- integrate them directly at the 2nd level (PolarForm)
2- create a new object OptimizationModel that would mimic a JuMP.Model. A prototype could be:

abstract type AbstractOptimizationModel end
struct OptimizationModel <: AbstractOptimizationModel
    solution::Dict
    evaluator::AbstractNLPEvaluator
end
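# Extend Base.getindex so that model[:Pg] works; the builtin getfield cannot be overloaded.
function Base.getindex(model::OptimizationModel, symbol::Symbol)
    ...
end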
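A runnable variant of this prototype could look as follows. It is only a sketch: the evaluator field is dropped, the solution is assumed to be a dictionary of Float64 vectors keyed by symbols, and the setter semantics (in-place copy) are an assumption.

```julia
abstract type AbstractOptimizationModel end

# Hypothetical container; the evaluator field of the prototype is left out here.
struct OptimizationModel <: AbstractOptimizationModel
    solution::Dict{Symbol, Vector{Float64}}
end

# Getter: model[:Pg] returns the stored array itself (no copy).
Base.getindex(model::OptimizationModel, name::Symbol) = model.solution[name]

# Setter: model[:Pg] = values overwrites the stored array in place.
function Base.setindex!(model::OptimizationModel, values, name::Symbol)
    copyto!(model.solution[name], values)
    return model
end

model = OptimizationModel(Dict(:Pg => [0.5, 0.8], :Qg => [0.1, 0.2]))

pg = model[:Pg]          # alias of the stored array
model[:Pg] = [0.6, 0.7]  # in-place update, visible through pg as well
```

Because the setter copies into the existing array, any code holding a reference to model[:Pg] sees the new values without further bookkeeping.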

A point to discuss is how to remove a line properly in the network.

Implement a logger

At some point, we will need a logger to avoid being flooded by log output.

We could use Julia's built-in logging system or a dedicated logger such as Memento. What do you think?
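For the built-in option, a minimal sketch with the Logging standard library: messages are emitted with @debug, @info and @warn, and the caller chooses the minimum level. The function run_iteration and its messages are made up for illustration.

```julia
using Logging

# Hypothetical solver step that logs at three different levels.
function run_iteration(k)
    @debug "residual details" iteration = k
    @info "iteration finished" iteration = k
    if k == 3
        @warn "slow convergence" iteration = k
    end
    return k
end

# Capture the log in a buffer and keep only warnings and above.
io = IOBuffer()
with_logger(SimpleLogger(io, Logging.Warn)) do
    foreach(run_iteration, 1:3)
end
log_text = String(take!(io))
print(log_text)
```

With this setup the per-iteration @info messages are filtered out and only the warning reaches the output. Lowering the level to Logging.Debug restores the full trace without touching the solver code.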

Release 0.5.0

This issue is opened to track the remaining points to address before the release 0.5.0:

Discuss together the naming of the functions exposed in the API, and check the consistency of the signatures (#146)
Change signature of DirectSolver to allow storing the Factorization (breaking change) (#144)
Fix warning issued when loading ExaPF: Warning: Replacing docs for ExaPF.bounds :: Union{} in module ExaPF (#146)
Check that all Julia scripts in scripts/ are working (#146)
Add in benchmarks the script to reproduce the results presented at JuliaCon
Check that ProxAL is working on ExaPF#develop (https://github.com/exanauts/ProxAL.jl/runs/2331420226)
Finish updating the documentation (#148)
Update README (#148)
