amrex-astro / Microphysics

common astrophysical microphysics routines with interfaces for the different AMReX codes

Home Page: https://amrex-astro.github.io/Microphysics

License: Other

Makefile 0.30% Fortran 0.02% Python 1.42% Shell 0.01% C++ 96.83% CSS 0.01% HTML 0.01% CMake 0.42% Batchfile 0.08% BrighterScript 0.01% Starlark 0.07% Cuda 0.83%
equation-of-state reactions nuclear-reactions conductivity stars microphysics-routines


Microphysics

A collection of astrophysical microphysics routines for stellar explosions

There are several core types of microphysics routines hosted here:

  • conductivity/: stellar conductivities needed for modeling thermal diffusion processes.

  • constants/: fundamental physical constants.

  • EOS/: these are the equations of state. All of them accept a struct called eos_t to pass the thermodynamic state information in and out, though in C++ they are templated such that they can accept other objects with members of the same name.

  • integration/: this holds the various ODE integrators. VODE is the primary integrator for production use, but other integrators are provided for experimentation.

  • interfaces/: this holds the structs used to interface with the EOS and networks.

  • networks/: these are the reaction networks. They serve both to define the composition and its properties, as well as describe the reactions and energy release when reactions occur.

  • neutrinos/: this holds the plasma neutrino cooling routines used in the reaction networks.

  • nse_solver/: a solver for nuclear statistical equilibrium that finds the equilibrium state for the nuclei represented by the network.

  • nse_tabular/: a tabulation of the NSE state from a large network that can be used together with the aprox19 network.

  • opacity/: radiative opacities used for radiation solvers.

  • rates/: this contains some common rate routines used by the various aprox networks, and could be expanded to contain other collections of rates in the future.

  • screening/: the screening routines for nuclear reactions. These are called by the various networks.

  • unit_test/: a collection of unit tests that exercise the different pieces of Microphysics.

  • util/: linear algebra routines for the integrators (specifically a linear system solver from LINPACK), the hybrid Powell solver, other math routines, and build scripts.

AMReX-Astro Codes

At the moment, these routines are written to be compatible with the AMReX-Astro codes, Maestro and Castro.

To use this repository with AMReX codes, set MICROPHYSICS_HOME to point to the Microphysics/ directory.

There are various unit tests that work with the AMReX build system to test these routines.

Other Simulation Codes

The interfaces are fairly general, so they can be expanded to other codes. This will require adding any necessary make stubs for the code's build system as well as writing unit tests for that build system to ensure the interfaces are tested.

Documentation

A user's guide for Microphysics is available at: http://amrex-astro.github.io/Microphysics/docs/

The sphinx source for the documentation is in Microphysics/sphinx_docs/

Development Model

Development generally follows the following ideas:

  • New features are committed to the development branch.

    Nightly regression testing is used to ensure that no answers change (or if they do, that the changes were expected).

    If a change is critical, we can cherry-pick the commit from development to main.

  • Contributions are welcomed from anyone. Any contributions that have the potential to change answers should be done via pull requests. A pull request should be generated from your fork of Microphysics and target the development branch. (If you mistakenly target main, we can change it for you.)

    Please add a line to CHANGES summarizing your change if it is a bug fix or new feature. Reference the PR or issue as appropriate. Additionally, if your change fixes a bug (or if you find a bug but do not fix it), and there is no current issue describing the bug, please file a separate issue describing the bug, regardless of how significant the bug is. If possible, in both the CHANGES file and the issue, please cite the pull request numbers or git commit hashes where the problem was introduced and fixed, respectively.

    If there are a number of small commits making up the PR, we may wish to squash commits upon merge to have a clean history. Please ensure that your PR title and first post are descriptive, since these will be used for a squashed commit message.

  • On the first workday of each month, we perform a merge of development into main, in coordination with AMReX, Maestro, and Microphysics. For this merge to take place, we need to be passing the regression tests.

    To accommodate this need, we close the merge window into development a few days before the merge day. While the merge window is closed, only bug fixes should be pushed into development. Once the merge from development -> main is done, the merge window reopens.

Core Developers

People who make a number of substantive contributions will be named "core developers" of Microphysics. The criteria for becoming a core developer are flexible, but generally involve one of the following:

  • 10 non-merge commits to Microphysics/ (including Docs/) or one of the problems that is not your own science problem or

  • addition of a new algorithm / module or

  • substantial input into the code design process or testing

Core developers will be recognized in the following ways:

  • invited to the group's slack team

  • listed in the User's Guide and website as a core developer

  • invited to co-author general code papers / proceedings describing Microphysics, its performance, etc. (Note: science papers will always be left to the science leads to determine authorship).

If a core developer is inactive for 3 years, we may reassess their status as a core developer.

Getting help

We use github discussions for requesting help and interacting with the community:

https://github.com/amrex-astro/Microphysics/discussions

Contributors

abigailbishop, adam-m-jcbs, aisclark91, ajnonaka, asalmgren, benwibking, biboyd, brady-ryan, cmsquared, dependabot[bot], doreenfan, dwillcox, harpolea, jaharris87, jmsexton03, kissformiss, maxpkatz, psharda, shardi2, simonguichandut, weiqunzhang, xinlongsbu, yut23, zhichen3, zingale


Issues

need to reevaluate the tolerances

We ask for species to be evolved to a tolerance of 1.d-12 (in integration/_parameters).

This is pretty tight. We need to check whether it can be relaxed. We can relax on a network-by-network basis (using priorities in the _parameter files).

It seems that the original aprox13 and aprox19 networks used tolerances of 1.e-6.

Compare VODE to VODE90. Determine if a switch to VODE90 is in order.

At the moment, we have two VODE-style integrators, VODE and VODE90.

Earlier testing indicates they yield identical integration answers but there may be performance differences. We should compare the performance of each and determine whether it is worth switching to VODE90.

Issues to consider:

  • VODE90's use of derived types may slow down the GPU. We may need to refactor a bit to eliminate derived types.
  • VODE90 seems a bit slower than VODE, but we should check this and figure out why.

remove integrate_molar_fraction option

We should completely remove the integrate_molar_fraction option and instead rely on networks always returning results in terms of dX/dt. This will cut back on the complexity of the code a lot, eliminating many unnecessary conversions.

make a tabular EOS in terms of (rho, e)

We should investigate making a tabular EOS in (rho, e) -- this would be especially useful for SDC. Perhaps the thing to tabulate is entropy, then we can express it in terms of rho, e and get p, T via partial derivatives.
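One way to make the entropy suggestion concrete: if s is tabulated as a function of (rho, e), the first law de = T ds - p d(1/rho) yields both T and p directly from the table's partial derivatives:

```latex
\left(\frac{\partial s}{\partial e}\right)_{\rho} = \frac{1}{T},
\qquad
\left(\frac{\partial s}{\partial \rho}\right)_{e} = -\frac{p}{\rho^2 T}
\;\;\Longrightarrow\;\;
T = \left[\left(\frac{\partial s}{\partial e}\right)_{\rho}\right]^{-1},
\quad
p = -\rho^2\, T \left(\frac{\partial s}{\partial \rho}\right)_{e}.
```

So a single tabulated quantity, evaluated with differentiable interpolation, would recover the (rho, e) inputs SDC needs.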

Switch CUDA VODE90 to use cuBLAS

Testing has shown system implementations of BLAS are much more efficient than compiling in BLAS ourselves. We should switch the CUDA version of VODE90 to use cuBLAS and check performance.

In particular, since cuBLAS calls require an on-device kernel launch, it will be interesting to see whether the overall performance gains from cuBLAS are worthwhile.

test suite should include some GPU tests

To my knowledge, the PGI test suite doesn't do any tests that utilize the GPU. I recommend adding a test using the ignition_simple network, BS integrator, and the GPU (ACC=t). Something along the lines of:

$ cd $MICROPHYSICS_HOME/networks/ignition_simple/test
$ make COMP=PGI ACC=t
$ ./testburn.Linux.PGI.debug.acc.exe

Something like this should serve as a minimal verification that basic GPU code is working. As more integrators and/or networks are robustly utilizing the GPU, we can add similar tests to test them (in this case, the default GNUMakefile has already chosen the BS integrator for us).

esum is slow

We use esum() to do exact sums of specific terms in the RHS of the ODEs to prevent roundoff. But esum() is slow. At the moment, we have a general routine with a large max_esum_size -- this also causes trouble on the GPUs.

We should experiment with creating specific esum() routines for the number of terms involved, e.g., esum3(), esum4(), esum5(), ...

We know this ahead of time, since we are explicitly calling esum() on specific combinations of terms in the rate equations.

Include fundamental constants in this repo?

Currently this would be inconsistent with some EoS tables that are pre-generated with a specific set of constants, but could still be useful in the long-run for separate codes to have a shared set of constants.

Migrate Test Suite to C++

Since MAESTRO is now moving to the C++ AMReX, we should migrate the test suite drivers in Microphysics to use the C++ AMReX as well.

A good starting point is test_react in Castro.

add neutrino losses to aprox13

This comes out of discussions with Sam Jones, Aron Michel, and @carlnotsagan

We should implement the neutrino losses from the weak reactions. This would mean keeping track of each reaction and what the actual Q value is (subtracting neutrino losses), and evolving an enuc equation that used these Q values.

From Sam:

I think we estimated the neutrino energy losses, and even though they were smaller than 
I had expected, I agree that they're still important.
...
The way I would implement it would be to introduce the Q value (binding energy difference 
between products and reactants) for each reaction, and additionally a Q_neu for the weak 
reactions, which is the average neutrino energy per reaction, Q_neu = eps_neu/lambda, 
where eps_neu and lambda are the neutrino luminosity [MeV/s] and the rate [/s] from the 
LMP tables, respectively. Q_neu is of course 0 for the reactions involving the strong 
nuclear force. Then the energy generation is the sum of the number of times a reaction 
takes place multiplied by (Q-Qneu).

vbdf should carry its own burn_t

We should modify the bdf_t to include a burn_t directly, eliminating much of the work done in bdf_to_burn. This will mirror what is done with BS.

make a VODE SDC integrator

We want to make VODE work with the SDC interface. Unlike the BS integrator, there is no VODE analog to the bs_t type. We need to do the following:

  • we need to create a version of vode_type.F90 for SDC.

    • This will need to have a clean_state that fixes up the internal energy and a fill_unevolved_variables routine.

    • There will be no update_thermodynamics routine.

    • There are no vode_to_eos or eos_to_vode routines.

    • We need vode_to_sdc and sdc_to_vode routines.

  • the general rpar.F90 that lives in integrator/ will need some different components -- see the BS/ version for comparison. In particular, it will need the rho and momentum indices. We will probably also need to store the advective sources here.

Basic GPU test fails

The basic GPU test described in Issue #15 fails. On my local machine, I get

[ajacobs@xrb test](development *)$ ./testburn.Linux.PGI.acc.exe

 Initializing Helmholtz EOS and using Coulomb corrections.

FATAL ERROR: data in update device clause was not found on device 1: name=pi
 file:/home/ajacobs/Codebase/Microphysics/networks/ignition_simple/test/../../../EOS/helmholtz/actual_eos.F90 actual_eos_init line:1327

On Stony Brook's bender, I get what may be an error in the system configuration:

[ajacobs@bender test](development)$ ./testburn.Linux.PGI.acc.exe 

 Initializing Helmholtz EOS and using Coulomb corrections.

modprobe: FATAL: Module nvidia-uvm not found in directory /lib/modules/4.7.5-200.fc24.x86_64
call to cuInit returned error 999: Unknown

The error happens with and without debug symbols.

The error seems to be saying pi isn't initialized, but in actual_eos.F90 it is declared. I'm investigating the error now.

Bad GPU results

Many of the results from GPU-accelerated unit-test code appear to be wrong. As a concrete example, I've built an accelerated and CPU-only executable of the test_react unit test.

Build and execute an accelerated binary, then move the output for later comparison (note that I've suppressed the output of commands):

cd $MICROPHYSICS_HOME/unit_test/test_react
make COMP=PGI NETWORK_DIR=ignition_simple ACC=t -j6
./main.Linux.PGI.acc.exe inputs_ignition.BS
mv react_ignition_test_react.BS react_ignition_test_react.BS.ACC

Build and execute a CPU-only binary:

make COMP=PGI NETWORK_DIR=ignition_simple -j6
./main.Linux.PGI.exe inputs_ignition.BS

If we now compare the two output files, we see they're very different:

fcompare.Linux.gfortran.exe --infile1 react_ignition_test_react.BS --infile2 react_ignition_test_react.BS.ACC

            variable name            absolute error            relative error
                                        (||A - B||)         (||A - B||/||A||)
 ----------------------------------------------------------------------
 level =  1
 density                           0.2384185791E-06          0.1192092896E-15
 temperature                       0.6854534149E-06          0.9792191642E-15
 Xnew_carbon-12                    0.9999999997              0.9999999999    
 Xnew_oxygen-16                    0.7999999999              0.9999999999    
 Xnew_magnesium-24                 0.9999999997               9.999436761    
 Xold_carbon-12                    0.9999999997              0.9999999999    
 Xold_oxygen-16                    0.7999999999              0.9999999999    
 Xold_magnesium-24                 0.9999999997               9.999999997    
 wdot_carbon-12                    0.2812178371E-03           1.000000000    
 wdot_oxygen-16                    0.1110223025E-14           1.000000000    
 wdot_magnesium-24                 0.2812178371E-03           1.000000000    
 rho_Hnuc                          0.3150192097E+24           1.000000000 

So while many networks and integrators seem to be able to compile and run without crashing, it's not clear how many are generating correct physical results. I've seen a similar issue with the VBDF integrator, so it doesn't appear to be specific to an integrator or network. These results are from bender, which has PGI 16.9 and a GeForce GTX 960 GPU (with CUDA 8.0 drivers and CUDA 7.5 compilers).

create an EXTRA_THERMO preprocessor

At the moment, the EOS returns all possible thermodynamic quantities, but sometimes we don't need all of these. We should create an EXTRA_THERMO preprocessor flag that will turn off some of the less-needed quantities. This should also be hooked into the eos_t type in the application codes.

OpenACC F90 test_react w/ ignition_simple & VBDF giving ptx errors

Building test_react with

make COMP=PGI NDEBUG= OMP= NETWORK_DIR=ignition_simple INTEGRATOR_DIR=VBDF ACC=t

Errors like the following come up:

ptxas /tmp/pgaccBw5JrAtcYokR.ptx, line 1842; fatal   : Parsing error near '-': syntax error
ptxas fatal   : Ptx assembly aborted due to errors
PGF90-S-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (../../integration/VBDF/actual_integrator.F90: 1)
  0 inform,   0 warnings,   1 severes, 0 fatal for 
make: *** [t/Linux.PGI.debug.acc/o/actual_integrator.o] Error 2
make: *** Waiting for unfinished jobs....

Through commenting out and slowly uncommenting, I've traced at least one triggering of the error to a derived type assignment in Microphysics/integration/VBDF/actual_integrator.F90 in the initial_timestep() subroutine: ts_temp = ts.

However, after writing and using a copy subroutine for bdf_ts types, the error continues. It seems any use of ts_temp triggers the error, even ts_temp%neq = 1.

scaling of Jacobian elements is not right

Applying the temp_scale and ener_scale to the Jacobian elements after they are filled doesn't seem right for the derivative wrt T. E.g., we do:

bs % jac(net_itemp,:) = bs % jac(net_itemp,:) * inv_temp_scale               

but that shouldn't apply to bs % jac(net_itemp, net_itemp)

BS integrator uses a single rtol

The BS integrator does not allow for different tolerances on each component, like we do with VODE. We should generalize it so that we can specify a separate rtol for each integration variable.

reset of integration needs to reset T_old

If integration failed and we reset to the initial state to try again, we need to reset T_old and the cv/cp too, for consistency. Perhaps this would be easier with a bs_init variable so we can just do bs = bs_init and go.

SDC integrators don't support nspec_evolve != nspec

The SDC integrators don't currently handle how we update species where nspec_evolve < nspec. Since these still have advective terms, we would still need to do some integration. But the current update_unevolved_species mechanism is probably not enough.

Vectorize helmholtz EOS!

The helmholtz EOS can represent a significant computational cost. We could consider vectorizing it.

add eos_finalize

We should add eos_finalize() and actual_eos_finalize() functionality.

Profile the SDC implementation in the Microphysics integrators

Max has suggested we profile the SDC integration to determine how expensive the EOS calls really are.

The motivation for this is that the EOS calls use rho, e as input variables and it may be worthwhile to think about how to formulate T integration source terms so we could use rho, T as input variables to the EOS instead.

The cost of the EOS should be more apparent using tabulated rates, so this is related to issue #12

decouple from amrex

With AMReX coming online, we need to decouple these routines from the boxlib AMReX dependency.

The main place this comes in is through calls to bl_error and using bl_constants_module.

We can instead provide a microphysics_error and microphysics_constants. These can simply wrap the BoxLib or AMReX routines, assuming that they provide the necessary info. We need to then have a build-time way of letting Microphysics know which of the libraries to link in.

BS SDC dimensioning

In bs_type_sdc we dimension:

     real(kind=dp_t) :: u(n_rpar_comps), u_init(n_rpar_comps), udot_a(n_rpar_comps)

but these should really be dimensioned as SVAR-SVAR_EVOLVE

reintroduce parameters into helmholtz/actual_eos.F90

When playing with OpenACC, there were compiler issues with Fortran parameters on GPUs. We got rid of the parameters to make things play nice. With our new CUDA methodology, we should go back to parameters. E.g., in helmholtz/actual_eos.F90, the variable pi

aprox21 missing rates (reported by Sam Jones)

from Sam:

I found a bug in your implementation of approx21 in BoxLib. The
jacobian is fine but the rhss do not include terms for fe56 and cr56
(i.e. they are zero). Looks like it was copied from approx19 but not
modified for approx21.

BS scaling method

we scale based on abs(y) + dt abs(ydot), but shouldn't we try abs(y + dt*ydot) too? maybe scaling_method = 3?

BS SDC uses SVAR instead of SVAR_EVOLVE

The size of the system allocated in the BS actual_integrator_sdc.F90 is SVAR, but shouldn't it really be SVAR_EVOLVE? This affects, for example, the tolerances.

For aprox13, rate tabulation should be the default

This gives fairly accurate results relative to the direct rate evaluation method, but is much faster on CPUs and essentially necessary for GPUs.

This can be done by setting use_tables to .true. in the aprox13/_parameters file.

VBDF fails for some networks on CPU

A table has been started to keep track of which integrators are able to integrate the different networks on the CPU (space is also available for a similar GPU table, but it isn't populated yet; we should get things working on the CPU before trying the GPU anyway).

This issue addresses VBDF failures on the CPU. As the table shows, VBDF fails for the aprox13 and aprox19 networks using the configuration and input found in the unit test.

I'm currently comparing the integration of VBDF with VODE, which in theory implement the same algorithms. For aprox19 I've isolated the cell that fails for VBDF, which VODE seems fine with. I'm currently working to find where the algorithms deviate such that VODE is able to converge to a result while VBDF is not.
