Rewrite interfaces to dft calculators (e.g. castep.evaluate_op) into classes inheritin

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

Refactor calculators about workflow HOT 11 CLOSED

gelzinyte commented on August 10, 2024

Refactor calculators

from workflow.

Comments (11)

bernstei commented on August 10, 2024 1

I've had mixed success contributing to ASE. If you think it's stuff other people might use it's probably worth it, because it's not very much work to start (clone the gitlab repo, add the changes, make an MR), but we'll see how the developers respond. I've had some times when they are basically reasonable, some times when the changes they wanted were not (in my opinion) excessive (e.g. modernize/fix other bugs in the same class just because I was fixing one thing), and most recently I just got no response.

from workflow.

bernstei commented on August 10, 2024

This is written like it's already a PR ("An example is in..."). If that's intended, can you create the PR (I think you can link it to this issue with closes #17 in the description.

We could also consider combining it with my new iloop wrapper that's a bit more like a decorator that the current approach (although not using python @ notation), if we decide it's a nicer interface.

from workflow.

gelzinyte commented on August 10, 2024

@gelzinyte For now I'll take another look at the ASE-extended ORCA calculator, and see if I have any comments relevant to doing the something similar to the other DFT calculators. [edited] I'll comment in #17. Also, can you take another look at #16 , and see if it can be closed?

Great, let me know what you think. Note that we added a bunch of features to ORCA, on top of file handling, so implementing file handling for other DFT calculators would be a bit less of a hassle. All the relevant changes should only be in ORCA.__init__() and ORCA.calculate().

from workflow.

bernstei commented on August 10, 2024

One thing that might be nice (although maybe beyond the scope of this issue) is to have a way to have multiple reusable directories. If you are doing something like NEB, you might want each image to re-use its own the saved charge density/wavefunctions from the previous iteration. That would require passing to the calculator (probably the constructor) something, e.g. a suffix, to define the run directory (and of course the NEB code knowing to pass it).

from workflow.

gelzinyte commented on August 10, 2024

yes, I was thinking that too. Another example - imagine only some of the autoparallelized DFT evaluations have succeeded. Then it would be nice to re-run the same script, but if the results are already there they would be read in. And only the non-succeeded calculations would have to be restarted.

I don't think it would be very difficult? Although probably better done as a separate issue/PR if there's need. First, we could make the temporary directories' names be Atoms-dependent hashes (for example hash of (atomic numbers, positions, calc_settings) or a pre-calculated hash in Atoms.info in NEB's case where the positions change). In addition, keep_files should be set to True (or to whatever files must be kept for a restart). Finally, you'd only need to set a flag for "reuse previous calc results".

from workflow.

bernstei commented on August 10, 2024

Auto-parallelization is a whole can of worms. First, there's the question of why you think a rerun will work? Have you done something to change at the expyre level, e.g. a longer runtime? Have you changed some DFT input param (max # of SCF steps)? Is it just that the node crashed and you want to try again without changing anything? All of that will affect what you think you can retain from the previous run.

I agree that it's worth thinking about, but it's not trivial. When it's just something like runtime, or number of nodes, I tend to manually log into the HPC machine, go to the relevant directory under ~/run_expyre, edit the job file by hand, qsub by hand, wait for it to finish, then rerun the wfl task so it gathers the results. Note that right now you can't rerun the wfl before the job is done, because it'll be confused about the different jobid when doing qstat. I want to fix that (maybe not even store the remote jobid, but just extract it from the output of qstat filtered by job name, which is already a unique string from the same hash as the job run directory), but haven't had time to think about the details. Maybe I should open an expyre issue. [done: https://github.com/libAtoms/ExPyRe/issues/23]

from workflow.

bernstei commented on August 10, 2024

The derived class ORCA looks mostly fine, although I don't really understand why it seems to be reimplementing stuff that I'd expect to be in the ASE original. Should we split the class into a different source file than the associated utility functions?

However, we should also think about whether we can make a more generalized structure, since I'm assuming that things like doing the mkdir will be needed for all of the DFT calculators, but there will probably always be some things that are really one-off, like the VASP pbc mangling. Should we do something similar for another calculator and see what they have in common?

from workflow.

gelzinyte commented on August 10, 2024

Maybe I should open an expyre issue

Thanks, I think this will be very handy - re-running failed jobs in _expyre seems intuitive, if a bit hack-y.

Should we do something similar for another calculator and see what they have in common?

I will try and implement a Castep example today/tomorrow and open a PR.

although I don't really understand why it seems to be reimplementing stuff that I'd expect to be in the ASE original

We have added a lot of extra functionality, in some cases very specific and in some general but not contributed back to ASE. We had to over-write some of the ASE-orca functions to extend:

write_input: automatically pick default multiplicity, option for geometry optimisation in addition to single point.
read_results: raise error if wavefunction isn't converged (instead of silently skipping); read dipole, optimised geometry, trajectory, frequencies

And the rest is just not implemented in ASE-Orca.

from workflow.

bernstei commented on August 10, 2024

I'm not sure exactly when I'll have time, but I'll try to prioritize ExPyRe issue 23 when I have time to code.

I'm find with extending ORCA relative to ASE. Do you think it's worth it to have two separate classes, one derived from the other. The first with just extensions over the ASE interface (i.e. that parts that might in principle be worth contributing back), and the second that just contains the wfl-specific implementation?

from workflow.

gelzinyte commented on August 10, 2024

I'm not sure exactly when I'll have time, but I'll try to prioritize ExPyRe issue 23 when I have time to code.

I misread your comment and thought that that issue is an already merged PR that I have missed. It would be a very convenient feature, but definitely not at all urgent or even high on priority list.

Do you think it's worth it to have two separate classes

I would rather not split it, because we would end up triplicating some things and I am not sure if the exercise is wroth it? For example, dipole calculations might be useful to contribute to ASE, but geometry optimisation isn't compatible with ASE. So the both of the classes would need to overwrite the inherited read_results. Instead of splitting into two classes I would rather try and contribute the relevant bits to ASE, if you think that's worthwhile?

from workflow.

bernstei commented on August 10, 2024

VASP, QE, and ORCA are now derived from the underlying ASE Calculator classes. Castep is a separate issue, because it seems more nonstandard, so will be dealt with in a separate issue.

from workflow.

Refactor calculators about workflow HOT 11 CLOSED

Comments (11)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

Jobs