Comments (4)
Performance evaluation of a batched run of TRON (calling the dspcg() routine) looks promising. Times were averaged over 10 randomly generated problems of size n=8, where n is the number of variables. The CPU runs used 70 threads and TronDenseMatrix. Each run corresponds to one Newton step.
Batch size | CPU | GPU | Ratio (CPU/GPU) |
---|---|---|---|
5,120 | 1.76739e-01 | 7.92999e-04 | 2.27874e+02 |
10,240 | 3.84099e-01 | 1.49459e-03 | 2.56993e+02 |
20,480 | 7.87158e-01 | 8.77802e-04 | 8.96737e+02 |
The GPU seems to scale quite well. Note that the ACOPF problem structure differs from that of these randomly generated problems, so the results could be different. Once I implement the evaluation routines for ADMM on the GPU, we will see how it performs.
from exatron.jl.
The numbers reported above were incorrect: I didn't synchronize the GPU run. Because a kernel is launched asynchronously, I should have used CUDA.@sync to measure its actual run time. After adding the CUDA.@sync macro, the gap has reduced by half. The experimental settings were:
- 70 threads and TronDenseMatrix were used for CPU run.
- the number of variables is n=8.
- Tron was run over 30 randomly generated QP problems, and its time was averaged.
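The timing pitfall described above can be sketched as follows. This is a minimal illustration, assuming CUDA.jl and a working GPU; `scale_kernel!` is a hypothetical stand-in and not ExaTron's actual batched solve. Without synchronization, `@elapsed` mostly measures the time to enqueue the kernel rather than the time to run it.

```julia
using CUDA

# Hypothetical stand-in kernel (not ExaTron's): y[i] = a * x[i]
function scale_kernel!(y, x, a)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(y)
        @inbounds y[i] = a * x[i]
    end
    return nothing
end

n = 1 << 20
x = CUDA.rand(Float32, n)
y = CUDA.zeros(Float32, n)
threads = 256
blocks = cld(n, threads)

# Warm-up launch so that JIT compilation is not included in the timings.
CUDA.@sync @cuda threads=threads blocks=blocks scale_kernel!(y, x, 2f0)

# Launch only: the call returns as soon as the kernel is queued.
t_launch = @elapsed begin
    @cuda threads=threads blocks=blocks scale_kernel!(y, x, 2f0)
end
CUDA.synchronize()  # drain the queue before the next measurement

# Launch and wait: CUDA.@sync blocks until the kernel has finished.
t_synced = @elapsed begin
    CUDA.@sync @cuda threads=threads blocks=blocks scale_kernel!(y, x, 2f0)
end

println("launch only: ", t_launch, " s, synchronized: ", t_synced, " s")
```

Only the synchronized measurement reflects the kernel's execution time, so it is the one to compare against CPU timings.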
Batch size | CPU | GPU | Ratio (CPU/GPU) |
---|---|---|---|
5,120 | 1.86585e-02 | 1.46632e-03 | 1.27247e+01 |
10,240 | 3.99126e-02 | 2.75390e-03 | 1.44931e+01 |
20,480 | 1.37620e-01 | 5.26920e-03 | 2.61179e+01 |
The GPU time makes more sense now: it increases with the batch size. But I wonder why the CPU time has dropped compared to the previous results. Did I forget to set JULIA_NUM_THREADS? ...
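One way to rule out a thread-count mismatch between runs is to log the thread configuration next to each timing. The snippet below is a small sketch using only Julia's standard library; `threads_used` is a hypothetical helper introduced here for illustration. Note that Julia can also be started with the `-t`/`--threads` flag, in which case the environment variable may be unset even though several threads are active.

```julia
# Report the thread configuration of the current Julia session.
env_setting = get(ENV, "JULIA_NUM_THREADS", "(unset)")
println("JULIA_NUM_THREADS  = ", env_setting)
println("Threads.nthreads() = ", Threads.nthreads())

# Hypothetical helper: count the distinct thread ids observed in a threaded loop.
# A short loop may not touch every available thread, so treat this as a lower bound.
function threads_used(n)
    ids = zeros(Int, n)
    Threads.@threads for i in 1:n
        ids[i] = Threads.threadid()
    end
    return length(unique(ids))
end

println("distinct threads seen: ", threads_used(10_000))
```

If `Threads.nthreads()` prints 1, the CPU baseline ran single-threaded regardless of what was intended.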
from exatron.jl.
Here are more intermediate runtime results (in seconds) on case9241pegase for a direct GPU implementation of the function evaluation and the generator/bus/rho update routines. 70 threads were used for the CPU run. These routines are expected to take a much smaller portion of the overall runtime than the branch solve. However, considering the high number of GPU threads, the occupancy doesn't seem good. We may need to revisit this at some point.
Component | CPU | GPU | Ratio (CPU/GPU) | # GPU threads |
---|---|---|---|---|
Generator | 0.00004 | 0.00012 | 3.32e-01 | 1,472 |
Bus | 0.00083 | 0.00031 | 2.66e+00 | 9,248 |
Function evaluation (branch) | 0.00016 | 0.00011 | 1.44e+00 | 16,064 |
Gradient evaluation (branch) | 0.00025 | 0.00011 | 2.36e+00 | 16,064 |
Hessian evaluation (branch) | 0.00019 | 0.00011 | 1.73e+00 | 16,064 |
Rho update | 0.00068 | 0.00015 | 4.52e+00 | 99,200 |
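As a starting point for looking into occupancy, CUDA.jl's occupancy API can suggest a block size for a compiled kernel. The sketch below assumes CUDA.jl and a working GPU; `update_kernel!` is a hypothetical per-element update rather than ExaTron's rho-update routine, and using one GPU thread per element with n = 99,200 (the thread count from the table) is an assumption made for illustration.

```julia
using CUDA

# Hypothetical per-element update (not ExaTron's kernel): rho[i] *= factor
function update_kernel!(rho, factor)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(rho)
        @inbounds rho[i] *= factor
    end
    return nothing
end

n = 99_200  # assumed: one GPU thread per element
rho = CUDA.ones(Float32, n)

# Compile without launching, then query a suggested launch configuration.
kernel = @cuda launch=false update_kernel!(rho, 2f0)
config = launch_configuration(kernel.fun)
threads = min(n, config.threads)
blocks = cld(n, threads)

# Launch with the suggested block size and wait for completion.
CUDA.@sync kernel(rho, 2f0; threads=threads, blocks=blocks)

println("threads per block: ", threads, ", blocks: ", blocks)
```

The suggested configuration is a heuristic for theoretical occupancy; a profiler would still be needed to see the achieved occupancy of the real routines.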
from exatron.jl.
The above might not fit in this channel, since it is about the ADMM implementation.
from exatron.jl.