
Comments (4)

pxlxingliang commented on July 21, 2024

As the results show, although kpar can speed up the cg calculation with less memory, cg still takes much longer than dav. For cg, going from kpar=1 to kpar=2 gives a speedup of about 10%.

from abacus-develop.

pxlxingliang commented on July 21, 2024

I also ran the test with QE on the example mp-1067451.
For the CG method, kpar slows down the calculation, which seems strange.
For the david method, the calculation runs normally with kpar=2, which indicates that QE uses less memory than ABACUS; this is because pw_diag_ndim is 4 in ABACUS and 2 in QE.

CG:

           kpar ks_solver  scf_time  scf_steps  normal_end  ibzk  scf_time_each_step
qe/qe-nk1  None      None   27244.7          2       False     8  [21035.1, 6209.6]
qe/qe-nk2  None      None   28781.4          2       False     8  [22302.3, 6479.1]
qe/qe-nk4  None      None   40043.1          2       False     8  [31618.5, 8424.6]

DAV:

                             kpar ks_solver  scf_time  scf_steps  normal_end  ibzk  scf_time_each_step
qe-dav-nk1/mp-1067451/00000  None      None    4854.9          2       False     8  [3558.7, 1296.2]
qe-dav-nk2/mp-1067451/00000  None      None    4368.9          2       False     8  [3181.4, 1187.5]
qe-dav-nk4/mp-1067451/00000  None      None       NaN          1       False     8  None


pxlxingliang commented on July 21, 2024

I set pw_diag_ndim to 2 in ABACUS and ran the kpar test on mp-1067451 on a larger machine, c64_m520_cpu (MPI parallel with 32 cores).
Compared to kpar=1, kpar=2 speeds up the SCF calculation by about 20%, while kpar=4 is slower than kpar=2.

So a larger kpar does not necessarily give higher efficiency.

Memory cost grows with kpar. In this case, kpar=1/2 can finish the SCF/FORCE/STRESS calculation, but kpar=4/6 can only finish the SCF/FORCE calculation, because the STRESS calculation needs more than 520G of memory.

The memory needed by STRESS seems to be about 1.5 times that of the SCF/FORCE calculation.

                      kpar ks_solver  scf_time  scf_steps  normal_end  ibzk  scf_time_each_step
mp-1067451-new/00000     1       dav   5704.52        2.0        True     8  [4708.34, 996.18]
mp-1067451-new/00001     2       dav   4321.94        2.0        True     8  [3541.35, 780.59]
mp-1067451-new/00002     4       dav   4567.33        2.0       False     8  [3726.31, 841.02]
mp-1067451-new/00003     6       dav   4913.62        2.0       False     8  [3997.89, 915.73]
mp-1067451-new/00004     8       dav       NaN        NaN       False     8  None
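
As a rough check on the numbers in the table above, the speedup relative to kpar=1 can be computed directly from the scf_time column (the total over the two recorded SCF steps). This is only a sketch: the helper name `relative_to_baseline` is made up for illustration, and the kpar=8 run is left out because it has no timing.

```python
# scf_time values (s) copied from the table above, keyed by kpar.
# kpar=8 is omitted because its scf_time is NaN.
scf_time = {1: 5704.52, 2: 4321.94, 4: 4567.33, 6: 4913.62}


def relative_to_baseline(times, baseline_kpar=1):
    """Return {kpar: (speedup factor, fraction of time saved)} vs. the baseline run."""
    t0 = times[baseline_kpar]
    return {k: (t0 / t, (t0 - t) / t0) for k, t in times.items()}


summary = relative_to_baseline(scf_time)
for kpar, (speedup, saved) in sorted(summary.items()):
    print(f"kpar={kpar}: speedup x{speedup:.2f}, time saved {saved:.1%}")
```

With these values every kpar > 1 run is faster than kpar=1, but the gain shrinks from kpar=2 to kpar=6, which matches the observation that a larger kpar is not automatically better.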

[Figure: the memory cost for kpar=1]


pxlxingliang commented on July 21, 2024

Performance cannot be judged from a single example, and the performance of c64_m520_cpu is unstable. I reran mp-1067451-new/00002, and this time the first two SCF steps took 3427.07 s and 764.52 s, which is faster than in the previous test (3726.31 s and 841.02 s).
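
To put the rerun in context, the sketch below compares the two runs of mp-1067451-new/00002 (kpar=4) and sets that run-to-run spread against the kpar=2 vs. kpar=4 gap from the earlier table. The helper `relative_change` is a made-up name, and with only two runs this is an illustration of the noise level, not a statistical estimate.

```python
# Per-step SCF times (s) for mp-1067451-new/00002 (kpar=4), from this thread.
first_run = [3726.31, 841.02]
rerun = [3427.07, 764.52]
# Total scf_time (s) of the kpar=2 run (mp-1067451-new/00001), for comparison.
kpar2_total = 4321.94


def relative_change(old, new):
    """Fractional change of `new` relative to `old` (negative means faster)."""
    return (new - old) / old


per_step = [relative_change(a, b) for a, b in zip(first_run, rerun)]
total = relative_change(sum(first_run), sum(rerun))
# Gap between the first kpar=4 run and the kpar=2 run, relative to kpar=2.
kpar_gap = relative_change(kpar2_total, sum(first_run))

print("per-step change on rerun:", [f"{c:+.1%}" for c in per_step])
print(f"total change on rerun:    {total:+.1%}")
print(f"kpar=4 vs kpar=2 gap:     {kpar_gap:+.1%}")
```

The rerun changes the kpar=4 total by more than the original gap between kpar=2 and kpar=4, so the ordering of those two settings cannot be settled from single runs on this machine.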
