Comments (8)
Following the ABACUS convergence troubleshooting manual (ABACUS收敛性问题解决手册), I tried reducing mixing_beta from 0.8 to 0.4/0.2 and increasing mixing_ndim from 8 to 15. You can check the results in the link.
In fact, I tried 4 combinations:
- mixing_beta=0.4 and mixing_ndim=8
- mixing_beta=0.4 and mixing_ndim=15
- mixing_beta=0.2 and mixing_ndim=8
- mixing_beta=0.2 and mixing_ndim=15
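The four combinations listed above form a small parameter grid. A minimal sketch of how one might enumerate them is shown here; it assumes the usual keyword-value line layout of the ABACUS INPUT file, and the helper names `mixing_grid` and `input_overrides` are hypothetical, not part of ABACUS.

```python
from itertools import product

# Values tried in this thread (the starting values were 0.8 and 8).
MIXING_BETA = (0.4, 0.2)
MIXING_NDIM = (8, 15)


def mixing_grid():
    """Enumerate every (mixing_beta, mixing_ndim) pair, in the order listed above."""
    return list(product(MIXING_BETA, MIXING_NDIM))


def input_overrides(beta, ndim):
    """Return the two INPUT lines that differ between runs (hypothetical helper)."""
    return f"mixing_beta {beta}\nmixing_ndim {ndim}\n"


if __name__ == "__main__":
    for beta, ndim in mixing_grid():
        print(f"# run with mixing_beta={beta}, mixing_ndim={ndim}")
        print(input_overrides(beta, ndim))
```

Each printed fragment would replace the corresponding two lines of an otherwise unchanged INPUT file, so that only the mixing settings differ between runs.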
The results for each system are:
- Ni-hcp: converges in all 4 combinations.
- Mn-bcc: converges in all 4 combinations.
- Fe-fcc: converges in 3 combinations; it fails to converge only for mixing_beta=0.4 and mixing_ndim=8.
- Cr-bcc: converges in all 4 combinations.
- Co-bcc: converges in all 4 combinations.
- Ce-bcc: converges for mixing_beta=0.2 and mixing_ndim=8.
Actually, Ce-bcc is not hard to converge. On the contrary, it converges very easily at first, as the drho shows: drho decreases very quickly to 1e-7 but then fails to reach 1e-8. These results indicate that the Ce calculation is numerically unstable. This instability might be caused by the pseudopotential. It can also lead to numerical errors in iterative solvers (such as the Davidson method); this is not a bug but a property of this numerical technique. You can see more discussion in Issue #4068.
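The pattern described above (drho dropping quickly and then stalling just short of the target) can be told apart from slow but steady convergence with a simple check on the drho history. The following is only an illustrative sketch: the function name, the window size and the improvement factor are assumptions, and the sample history is synthetic, not taken from the Ce run.

```python
def classify_drho(history, threshold=1e-8, window=5, min_gain=0.5):
    """Classify a drho history as 'converged', 'stalled' or 'running'.

    'stalled' means: the best value in the last `window` steps is not at least
    a factor `min_gain` smaller than the best value seen before that window.
    """
    if history and history[-1] < threshold:
        return "converged"
    if len(history) > window:
        recent_best = min(history[-window:])
        earlier_best = min(history[:-window])
        if recent_best > min_gain * earlier_best:
            return "stalled"
    return "running"


if __name__ == "__main__":
    # Synthetic history: fast drop to about 1e-7, then no further progress.
    synthetic = [1e-1, 1e-3, 1e-5, 1e-7, 2e-7, 1.5e-7, 3e-7, 1.2e-7, 2e-7]
    print(classify_drho(synthetic))
```

A history flagged as stalled near the threshold points toward a numerical noise floor rather than a charge-mixing problem, which is the distinction drawn in the comment above.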
from abacus-develop.
I have checked some examples calculated previously (ecutwfc is also 100 Ry).
example | natom | nbands | nelec | kpoints | bohrium_machine (parallel core) | cpu | ave scf_time |
---|---|---|---|---|---|---|---|
041_ZnMnGa | 49 | 290 | 481 | 63 | c32_m128_cpu(32) | Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz | 305 |
043_RuSc | 30 | 223 | 370 | 112 | c32_m128_cpu(32) | Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz | 530 |
055_ErAlNi | 24 | 217 | 360 | 152 | c32_m128_cpu(32) | Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz | 278 |
V(this issue) | 32 | 250 | 416 | 112 | c64_m128_cpu_H(64) | AMD EPYC 7452 32-Core Processor | 605 |
Sm(this issue) | 24 | 159 | 264 | 172 | c64_m128_cpu_H(64) | AMD EPYC 7452 32-Core Processor | 1568 |
from abacus-develop.
Is the average scf_time too high for V and Sm? Is there any way to solve this?
from abacus-develop.
Yes, the scf_time seems abnormal for these two examples. I suspect the performance of c64_m128_cpu_H(64) is poor. I will test them on c32_m128_cpu (paratera).
from abacus-develop.
I used c32_m128_cpu (paratera) to run the V example in parallel on 32 cores. The first 3 SCF steps are:
ITER ETOT(eV) EDIFF(eV) DRHO TIME(s)
DA1 -6.424983e+04 0.000000e+00 2.174e+00 7.761e+02
DA2 -6.425507e+04 -5.239252e+00 2.121e+00 4.531e+02
DA3 -6.425613e+04 -1.065634e+00 8.683e+00 6.127e+02
While the results in this issue are:
ITER ETOT(eV) EDIFF(eV) DRHO TIME(s)
DA1 -6.424983e+04 0.000000e+00 2.174e+00 8.604e+02
DA2 -6.425507e+04 -5.239372e+00 2.121e+00 5.164e+02
DA3 -6.425613e+04 -1.063900e+00 8.689e+00 7.088e+02
As we can see, c32_m128_cpu (paratera) with 32 cores takes less time per SCF step than c64_m128_cpu_H(64) for this example.
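To put a number on the comparison, the TIME(s) column of the two V tables above can be compared step by step. This is a small sketch; the function names are hypothetical, and the rows are copied from the output quoted in this comment.

```python
# First three SCF steps of V, copied from the two tables above.
C32_ROWS = """
DA1 -6.424983e+04 0.000000e+00 2.174e+00 7.761e+02
DA2 -6.425507e+04 -5.239252e+00 2.121e+00 4.531e+02
DA3 -6.425613e+04 -1.065634e+00 8.683e+00 6.127e+02
"""
C64_ROWS = """
DA1 -6.424983e+04 0.000000e+00 2.174e+00 8.604e+02
DA2 -6.425507e+04 -5.239372e+00 2.121e+00 5.164e+02
DA3 -6.425613e+04 -1.063900e+00 8.689e+00 7.088e+02
"""


def step_times(block):
    """Extract TIME(s), the last column, from rows that start with 'DA'."""
    times = []
    for line in block.strip().splitlines():
        fields = line.split()
        if fields and fields[0].startswith("DA"):
            times.append(float(fields[-1]))
    return times


def relative_saving(fast, slow):
    """Fraction of the slower machine's step time that the faster one saves."""
    return [(s - f) / s for f, s in zip(fast, slow)]


if __name__ == "__main__":
    savings = relative_saving(step_times(C32_ROWS), step_times(C64_ROWS))
    for step, saving in enumerate(savings, start=1):
        print(f"step {step}: {saving * 100:.1f}% less time on c32_m128_cpu")
```

For these three steps the saving works out to roughly 10% to 14% per step, even though the c32 run uses half as many cores.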
from abacus-develop.
Update: the first 3 SCF steps of Sm on c32_m128_cpu are:
ITER ETOT(eV) EDIFF(eV) DRHO TIME(s)
DA1 -2.616618e+04 0.000000e+00 7.877e-02 4.249e+03
DA2 -2.616630e+04 -1.143728e-01 1.732e-02 1.742e+03
DA3 -2.616623e+04 6.750572e-02 1.187e-01 1.925e+03
The results in this issue are:
ITER ETOT(eV) EDIFF(eV) DRHO TIME(s)
DA1 -2.616618e+04 0.000000e+00 7.877e-02 3.811e+03
DA2 -2.616630e+04 -1.143728e-01 1.732e-02 1.372e+03
DA3 -2.616623e+04 6.750572e-02 1.187e-01 1.624e+03
from abacus-develop.
Please see the latest V case, which failed to finish the SCF calculation due to KILLED BY SIGNAL: 6 (Aborted). Maybe this implies something:
V_failed_signal_6.zip
from abacus-develop.
The error in this test comes from SchmitOrth in the Davidson solver. The log prints the following assertion four times:
abacus: /abacus-develop/source/module_hsolver/diago_david.cpp:947: void hsolver::DiagoDavid<>::SchmitOrth(const int &, const int, const int, psi::Psi<T, Device> &, const T *, T *, const int, const int) [T = std::complex<double>, Device = psi::DEVICE_CPU]: Assertion `psi_norm > 0.0' failed.
This is usually caused by numerical instability.
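To see why a Schmidt orthogonalization step can end with a non-positive norm, here is a minimal standalone sketch of the technique. It is not the ABACUS implementation: the function `schmidt_orth` and its error message are illustrative only. When the new vector lies (numerically) in the span of the vectors already in the basis, nothing is left after the projections are removed and the norm check fails.

```python
import math


def schmidt_orth(vec, basis):
    """Orthonormalize `vec` against an orthonormal `basis` (lists of complex numbers).

    Raises ValueError when the remainder has no positive norm, which is the
    situation guarded by a check of the form psi_norm > 0.
    """
    out = list(vec)
    for b in basis:
        # Projection coefficient <b|out>, then remove that component.
        coeff = sum(bi.conjugate() * oi for bi, oi in zip(b, out))
        out = [oi - coeff * bi for oi, bi in zip(out, b)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in out))
    if not norm > 0.0:  # also catches NaN
        raise ValueError("psi_norm is not positive: vector is in the span of the basis")
    return [x / norm for x in out]


if __name__ == "__main__":
    basis = [[1 + 0j, 0j]]
    print(schmidt_orth([1 + 0j, 1 + 0j], basis))  # independent vector: succeeds
    try:
        schmidt_orth([2 + 0j, 0j], basis)  # dependent vector: nothing remains
    except ValueError as err:
        print("failed:", err)
```

In a real calculation the vectors are rarely exactly dependent; round-off in a noisy or ill-conditioned problem can produce the same effect, which is consistent with the numerical instability mentioned above.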
from abacus-develop.