# yt-project / libyt

In-situ analysis with yt

Home Page: https://libyt.readthedocs.io/
License: BSD 3-Clause "New" or "Revised" License
### libyt memory profiling

Profile `libyt` API memory consumption and check whether there is any memory leakage. Run with `sh run.sh <log_folder_name>`:

```bash
# naming
LOG_FOLDER=$1
EXE=API-test.out

# execute
export LD_PRELOAD=/home/calab912/software/valgrind-3.19.0/lib/valgrind/libmpiwrap-amd64-linux.so
export PYTHONMALLOC=malloc
mpirun -np 4 --output-filename $LOG_FOLDER /home/calab912/software/valgrind-3.19.0/bin/valgrind -v --tool=massif --time-unit=B --detailed-freq=1 ./$EXE
```
The accompanying `Makefile`:

```makefile
# path
MPI_PATH   := /home/calab912/software/openmpi/3.1.5-gnu
LIBYT_PATH := /home/calab912/Documents/GitHub/libyt

# output name
BIN      := API-test.out
FILE     := main.cpp
COMPILER := $(MPI_PATH)/bin/mpic++

# command
$(BIN): $(FILE)
	$(COMPILER) -o $(BIN) $(FILE) -I$(LIBYT_PATH)/include -L$(LIBYT_PATH)/lib -lyt

clean:
	rm -f $(BIN)
	rm -rf log
	rm -rf __pycache__
	rm -f *.png *.gif RecordTime_* massif.*
```
### libyt reference counts

Check each `libyt` data member's reference count using `sys.getrefcount`. Note that `sys.getrefcount` increases the original count by 1, since it refers to that value in order to get the count (link), so the minimum reported count is 2.
Test cases cover the following API combinations:

- `yt_init` and `yt_finalize`
- `yt_set_parameter`
- `yt_add_user_parameter_*`
- `yt_set_parameter` and `yt_get_gridsPtr`
- `yt_set_parameter`, `yt_get_gridsPtr`, `yt_get_fieldsPtr`, and `yt_commit_grids`
- `yt_set_parameter`, `yt_get_fieldsPtr`, and `yt_commit_grids`
- `yt_set_parameter`, `yt_get_fieldsPtr`, `yt_get_gridsPtr`, `yt_commit_grids`, and the RMA process
- `yt_set_parameter`, `yt_get_fieldsPtr`, `yt_get_gridsPtr`, `yt_commit_grids`, and load yt
- `yt_set_parameter`, `yt_get_fieldsPtr`, `yt_get_gridsPtr`, `yt_commit_grids`, and load yt with the GAMER hierarchy
- `yt` operations: `yt_set_parameter`, `yt_get_fieldsPtr`, `yt_get_gridsPtr`, `yt_commit_grids`, and then draw `covering_grid`s
- `yt_set_parameter`, `yt_get_fieldsPtr`, `yt_get_gridsPtr`, `yt_commit_grids`, load the `gamer` hierarchy, and then draw `covering_grid`s
- `yt_set_parameter`, `yt_get_fieldsPtr`, `yt_get_gridsPtr`, `yt_commit_grids`, load the `gamer` hierarchy, and then slice the target `covering_grid` into smaller units

### yt_init()
`yt_init()` calls `PyInit_libyt` (`init_libyt_module.cpp`), which creates the `libyt` Python module containing the dictionaries `grid_data`, `hierarchy`, `param_yt`, and `param_user`, plus the methods:

- `libyt_field_derived_func`
- `libyt_particle_get_attr`
- `libyt_field_get_field_remote`: fills `data[gid]["field"]` with a 3d NumPy array
- `libyt_particle_get_attr_remote`: fills `data[gid]["ptype"]["attr"]` with a 1d NumPy array
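The nested dictionary layout described above can be sketched in plain Python; the grid id, field name, particle type, and array shapes below are made-up illustrations, not the real `libyt` contents:

```python
import numpy as np

# Hypothetical sketch of the libyt grid_data dictionary layout.
# gid -> field name -> 3d array, and gid -> ptype -> attribute -> 1d array.
grid_data = {
    0: {  # grid id (gid)
        "density": np.zeros((8, 8, 8)),          # 3d field data
        "io": {"particle_mass": np.zeros(16)},   # 1d particle attribute
    }
}

# Accessing a field the way the remote-data methods fill it in:
field = grid_data[0]["density"]
print(field.shape)  # (8, 8, 8)
```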
### yt_set_parameter()

`yt_set_parameter()` fills `param_yt` via `add_dict_string` (`add_dict.cpp`) with the strings `frontend` and `fig_basename`, and via `add_dict_scalar` (`add_dict.cpp`) with `current_time`, `current_redshift`, `omega_lambda`, `omega_matter`, `hubble_constant`, `length_unit`, `mass_unit`, `time_unit`, `magnetic_unit`, `cosmological_simulation`, `dimensionality`, `refine_by`, `num_grids`, `domain_left_edge`, `domain_right_edge`, `periodicity`, and `domain_dimensions`.
### allocate_hierarchy()

`allocate_hierarchy()` (`allocate_hierarchy.cpp`) allocates the `hierarchy` arrays `grid_left_edge`, `grid_right_edge`, `grid_dimensions`, `grid_particle_count`, `grid_parent_id`, `grid_levels`, and `proc_num`.
### yt_add_user_parameter_*

Uses `add_dict_string`, `add_dict_scalar`, and `add_dict_vector3`.
### yt_commit_grids()

- `add_dict_field_list` (`add_dict.cpp`) fills `param_yt["field_list"]`; each field entry contains:
  - `"attribute"`: `[ <field_unit>, [<field_name_alias>,], <field_display_name> ]`
  - `"field_define_type"`: `<field_define_type>`
  - `"swap_axes"`: `true`/`false`
  - `"ghost_cell"`: `[ ]`
- `add_dict_particle_list` (`add_dict.cpp`) fills `param_yt["particle_list"]`; each particle entry contains:
  - `"attribute"`: a dictionary
  - `"particle_coor_label"`: `[ <coor_x>, <coor_y>, <coor_z> ]`
- `append_grid` (`append_grid.cpp`) fills `grid_data`, e.g. `grid_data[gid]["field"]`.

Related teardown APIs: `yt_free()`, `yt_finalize()`.
Under this structure, the `sys.getrefcount` results of the contained data are all 2, while for the module-level objects the counts are:

- `libyt`: 9
- `grid_data`: 3
- `hierarchy`: 3
- `param_yt`: 3
- `param_user`: 3
- `derived_func`: 3
- `get_attr`: 3
- `get_field_remote`: 3
- `get_attr_remote`: 3

### Chunked derived functions

`derived_func` and `derived_func_with_name` prepare one data grid at a time when the simulation code (e.g. `gamer`) generates a derived field. This per-grid preparation is a bottleneck of `libyt`'s overall performance for now, so the callback should instead prepare whatever chunk size `libyt` likes. Proposal: add `derived_func_chunk` and `derived_func_with_name_chunk` to the `yt_field` struct:

```cpp
void (*derived_func_chunk) (int list_length, long *list_gid, yt_array *data);
void (*derived_func_with_name_chunk) (int list_length, long *list_gid, char *field, yt_array *data);
```
Also change `get_attr` in the `yt_particle` struct to:

```cpp
void (*get_attr) (int list_length, long *list_gid, char *attr_name, yt_array *data_array);
```
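To illustrate the intent of the chunked interface, here is a small Python analogue (names and shapes are assumptions; the real callbacks are C functions operating on `yt_array` buffers):

```python
import numpy as np

# Illustrative Python analogue of the proposed chunked callback: instead of
# being called once per grid, the callback receives a list of grid ids and
# fills one pre-allocated buffer per grid in a single call.
def derived_func_chunk(list_gid, data):
    for gid, buf in zip(list_gid, data):
        # Pretend the derived field is just the gid broadcast over the grid.
        buf[...] = gid

# The caller allocates the buffers (as libyt would via yt_array) and invokes
# the callback once for the whole chunk.
list_gid = [3, 7, 9]
data = [np.empty((4, 4, 4)) for _ in list_gid]
derived_func_chunk(list_gid, data)
print(data[1][0, 0, 0])  # 7.0
```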
Places to change for chunked data access:

- `yt_rma_field.cpp`: when preparing grid data for remote ranks.
- `init_libyt_module.cpp`: the function `libyt_field_get_field_remote`, so that it asks for many grids at a time.
- `io.py`: the places that store grids, namely `_get_field_from_libyt`, `_read_chunk_data`, and `_read_fluid_selection`.
- `_validate_parent_children_relasionship` in `yt/frontends/gamer/data_structures.py`, but make it more general.
Miscellaneous notes:

- `yt_check_grid`: check that grid ids lie within `0` ~ `num_grids - 1`, that levels start at `0`, and that root grids use a parent id of `-1`.
- Pass the `libyt.param_user` dictionary to the class `libytDataset`.
- Support `dimensionality < 3`.
- `num_grids` was an `int`, so `libyt` aborts when `long num_grids` exceeds the size of `int`.
- Add `periodic[3]`, `grid_left_edge[3]`, and `grid_right_edge[3]`. ( #5 )
- Split `yt_get_gridsPtr` and `yt_add_grids`, and change `yt_add_grids` to `yt_commit_grids`. The workflow becomes `yt_get_gridsPtr` -> `yt_commit_grids()` -> `yt_inline` -> `yt_free_grids()`, instead of calling `yt_add_grid()` `num_grids_local` times -> `yt_inline`.
- `yt_inline_ProjectionPlot()`: make it able to pass input parameters to the function, like `yt_inline_ProjectionPlot(a,b,c)`, etc.
- `yt_add_user_parameter_*` can add types other than a scalar or a 3-dim vector.
- Rename `data_dim` to `data_dimensions` in the `libyt` API.

Inline-analysis went wrong when plotting with selected data, even if the `yt` operations use the `parallel_objects` function.
reminder: https://calab-ntu.slack.com/archives/CSNGU2B4L/p1610709131006300
### libyt milestone: yt functionalities

- `OffAxisProjectionPlot`
- `SlicePlot`
- `OffAxisSlicePlot`
- Halo Analysis
- Isocontours
- `volume_render` (works only if the MPI size is even)
- `ParticlePlot`
- `ParticleProjectionPlot`
- `LinePlot`

Some `yt` operations should be inside an `if yt.is_root():` clause ( #26 ), while others must stay outside the `if yt.is_root()` clause ( #35 ). `Halo Analysis` and `Isocontours` have not been tested yet.

Initially, particle plots could generate false figures if the MPI size was too large. After testing on calab912 and eureka, this doesn't seem to be a `libyt` issue; instead, it is more or less related to the memory the machine has. We still don't know why and where the bug is, so we moved it to another issue.

✔️ Particle plots (`ParticlePlot`, `ParticleProjectionPlot`) can successfully run in parallel in inline-analysis, though some memory-related issues may occur.

All of the images are results from inline-analysis; `MPI=1` is the correct and expected outcome.
`gamer` Plummer:

```python
import yt
yt.enable_parallelism()

def yt_inline():
    ds = yt.frontends.libyt.libytDataset()
    par = yt.ParticlePlot(ds, 'particle_position_x', 'particle_position_y', "particle_mass", center='c')
    if yt.is_root():
        par.save()
```

```python
import yt
yt.enable_parallelism()

def yt_inline():
    ds = yt.frontends.libyt.libytDataset()
    par = yt.ParticleProjectionPlot(ds, "z")
    if yt.is_root():
        par.save()
```
If `yt` can connect to ParaView (link), then maybe `libyt` can bypass Catalyst, which is a tool for simulation codes to do inline (in situ) analysis in ParaView. Still, there are things worth noticing in real-time volume rendering.

Annotation callbacks:

- `annotate_marker`
- `annotate_text`
- `annotate_arrow`
- `annotate_clumps` (failed with an odd MPI size, succeeded with an even MPI size ❌)
- `annotate_contour`
- `annotate_quiver`
- `annotate_cquiver`
- `annotate_grids`
- `annotate_cell_edges`
- `annotate_streamlines`
- `annotate_velocity`
- `annotate_scale`
- `annotate_timestamp`
- `annotate_ray`
- `annotate_line`
- `annotate_particles` (figures are slightly different for MPI = 1 and MPI > 1 ❌)
- `annotate_halo`
- `annotate_magnetic_field`
- `annotate_sphere`
- `annotate_line_integral_convolution`
- `annotate_title`
### save() and operations inside the yt.is_root() clause

The following do data IO through `IOHandlerlibyt` (`io.py`), so `save()` must be called outside of the `if yt.is_root()` clause:

- `annotate_streamlines`
- `annotate_velocity`
- `annotate_line_integral_convolution`
- `annotate_particles`
- `annotate_quiver`
- `annotate_cquiver`
- `annotate_magnetic_field`
- `annotate_clumps` (failed with MPI = 3, succeeded with MPI = 1 and 4)
- Merge the `libyt frontend` into the `yt` main branch.
- Fix the `libyt` unknown message `[DEBUG]`.
- Make `libyt` get non-local grids from other ranks and pass them back to `yt`, just like derived fields do.

### Pass data to yt through wrapping the existing array

If the particle data is stored in a contiguous memory block, we can directly wrap it and pass it to `yt`, so that we don't have to copy it again. This new API should co-exist with the original one (the user inputs `get_attr`, which returns the particle attributes array).

In order to merge enzo, I introduced the API for wrapping particle data arrays in `libyt` ( #79 ). That PR doesn't support particle arrays just yet; still fixing some bugs.
Plot time series datasets. This can already be done in the inline script, but the script will be a little bit messy.

### YT_Inline()

Currently, using MPI size N in simulations makes the `yt` in-situ analysis run with MPI size N as well. We want to make this more flexible by giving users the freedom to choose which MPI ranks run the simulation and which ranks run the in-situ analysis.

Add a temperature field in `GAMER` for `libyt` by calling `Hydro_Con2Temp()`. Run the `ClusterMerger` test problem and compare with the post-processing script `gamer/example/test_problem/Hydro/ClusterMerger/yt_script/plot_slice-z.py`:

- Check the `libyt` derived function functionality.
- Compare the `libyt` derived function output against the temperature output post-processed directly by `gamer`.

This document is more like a user guide, plus how readers can reach the developer guide. It will be put inside `README.md` and the `libyt/doc` folder.
- Support more `yt` functionality in `libyt`. (See #26 )
- Make `yt_rma_field` and `yt_rma_particle` faster.

### Ghost cells in the yt_field struct

`short field_ghost_cell[6]` is defined as the number of cells to ignore at the beginning and at the end of the data in each dimension. That is, `field_ghost_cell[0]` is the number of cells to ignore at the beginning of the 0-dim of the data, and `field_ghost_cell[1]` is the number of cells to ignore at the end of the 0-dim. Also add `ghost_cell` to the `field_list` dictionary.

The difference between `grid_dimensions` and `data_dim`:

- `grid_dimensions`: the [x][y][z] dimensions read by `yt` (`yt_getGridInfo_Dimensions` API); they are just what `yt` sees.
- `data_dim`: defined in `yt_data`, the actual data dimensions of the `data_ptr` to be wrapped; it contains the ghost cells (`yt_getGridInfo_FieldData` API).
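As a sanity check of the `field_ghost_cell` / `data_dim` / `grid_dimensions` relationship above, here is a small NumPy sketch (the dimensions are made up):

```python
import numpy as np

# field_ghost_cell = [beg0, end0, beg1, end1, beg2, end2]: cells to ignore
# at the beginning/end of each dimension, per the convention above.
field_ghost_cell = [1, 1, 2, 0, 0, 0]

# data_dim: the actual dimensions of the wrapped buffer, ghost cells included.
data = np.arange(6 * 8 * 4).reshape(6, 8, 4)

b0, e0, b1, e1, b2, e2 = field_ghost_cell
# Slicing away the ghost cells leaves the region yt should see.
interior = data[b0:data.shape[0] - e0,
                b1:data.shape[1] - e1,
                b2:data.shape[2] - e2]

print(interior.shape)  # grid_dimensions as read by yt: (4, 6, 4)
```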
Tasks:

- Add `field_ghost_cell` in `yt_type_field.h`, with a `new` call for the constructor on initialization.
- Wrap only the region excluding ghost cells. (`append_grid.cpp`)
- Add `ghost_cell` to `field_list` in the `libyt.param_yt` dictionary.
- `derived_func` should only generate grids without the ghost zone.
- Update `yt_getGridInfo_*`, since we define `grid_dimensions` and `data_dim` in `yt_data` separately:
  - `grid_dimensions`: the [x][y][z] dimensions read by `yt` (`yt_getGridInfo_Dimensions` API).
  - `data_dim`: the actual data dimensions of the pointer (`yt_getGridInfo_FieldData` API).
- Update the `gamer` derived function. (No need to change, but still ...)
- `yt_rma_field` transfers the full grid, including the ghost zone, so `yt_rma_grid_info` should change as well. In `yt_rma_field::prepare_data`, the data dimension should include the ghost cells.
, the data dimension should include ghost cell.MPI_PATH
to Makefile
.libyt yt frontend
.libyt/example
ProjectionPlot
SlicePlot
libyt
FieldsFrontend Native Fields: Fields defined in XXXFieldInfo
in yt frontends. They can be fields derived from other existing fields with function defined inside XXXFieldInfo
.
Field defined in XXXFieldInfo
might not have the same name used inside libyt
. For example, MagX
and CCMagX
in gamer
. The first one is used inside inline-analysis, while the second one used in post-processing, even though both of them represent the same field.
Function defined inside XXXFieldInfo
which derived these added fields don't know they are the same, and it should get MagX
instead of CCMagX
in inline-analysis. So it ends up an error.
Should add another data member in yt_field
to match to an already exist definition inside XXXFieldInfo
.
Support new features:

- Support `python3.8`.
- Support `libyt` running parallel inline-analysis with `yt`.
- Convert `face-centered` magnetic fields to `cell-centered` data when `yt` needs them. This can save memory.
- Change `grid_levels` from NPY_LONG to NPY_INT. (allocate_hierarchy.cpp, append_grid.cpp)
- Make `yt_getGridInfo_*` look up data in the `libyt` Python module, e.g. `yt_getGridInfo_FieldData` using `PyArray_DATA`, `PyArray_DIMS`, and `PyArray_DESCR`.
- `particle_count_list`: load each ptype separately, and then sum them up in the libyt frontend. Get `grid_particle_count` through the frontend. (data_structures.py, yt_commit.cpp)
- Update `libyt.h`.
- Free `grids_local` under `g_param_yt` once `yt_commit` is done.
- Update `libyt.grid_data`.
- Use `libyt`'s hierarchy directly in `yt` (`_initialize_grid_arrays`, `_parse_index`), pointing to `libyt`'s newly allocated hierarchy buffer. Use `long`, since we now assign the libyt-allocated array in the yt frontend.
- Initialize `yt_hierarchy_mpi_type` once only. (yt_commit_grids.cpp)
- Initialize `yt_rma_grid_info_mpi_type` once only. (yt_rma_field.cpp)
- Initialize `yt_rma_particle_info_mpi_type` once only. (yt_rma_particle.cpp)
- Make `example` simpler.
- Clean up the `libyt` API.
- `YT_ERROR`: if you want to implement your new yt_dtype, you should modify both the yt_dtype Enum and the get_npy_dtype function.
- Make `MyRank` and `MySize` global.
- `no_locks`: do I even need that?
- `libyt` is unable to finalize successfully on twnia3. (`check_data` at `yt_param_libyt`)
- `gamer` itself supports OpenMP, but `libyt` does not; `libyt` only uses OpenMPI. Maybe we can use OpenMP when preparing grid data and particle data, etc.
- Test on halo analysis.
- `MPI_Gatherv` does not support send counts > INT_MAX.
- Distinguish `data_dim` from `grid_dimensions`, and `data_dtype` from `field_dtype`.
- When `data_ptr = NULL`, `libyt` shouldn't abort when `data_dtype` or `field_dtype` is not set and `check_data==false`. We don't need to wrap this array, hence there is no need to set the data type.
- Update `append_grid.cpp`.
- Call `YT_Inline()` in the sub-step updates (i.e., in `EvolveLevel()`).
- Pass simulation data (`amr->patch->fluid[]`) to `libyt`, then call `YT_Inline()`.
- Use `libyt` in scientific production runs.
### yt_field: derived_func_with_name

In the `yt_field` struct, add the data member `derived_func_with_name`:

```cpp
void (*derived_func_with_name) (long, char *, double *);
```

Use `func(gid, "Dens", data)` and `func(gid, "MomX", data)` to get different derived fields by passing different field names. When `field_define_type == "derived_field"`, the order in which `libyt` will use the derived functions is:

1. `derived_func`
2. `derived_func_with_name`
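For illustration only, here is a Python analogue of dispatching on the field name inside a single callback (the field names and formulas are made up):

```python
import numpy as np

# Illustrative Python analogue of derived_func_with_name: one callback that
# fills the output buffer differently depending on the requested field name.
def derived_func_with_name(gid, field, data):
    if field == "Dens":
        data[...] = 1.0          # pretend density
    elif field == "MomX":
        data[...] = 2.0 * gid    # pretend x-momentum
    else:
        raise ValueError(f"unknown field: {field}")

data = np.empty((4, 4, 4))
derived_func_with_name(5, "MomX", data)
print(data[0, 0, 0])  # 10.0
```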
### yt save function

In the inline script, we need `save()` outside the `if yt.is_root():` clause, because `annotate_cquiver` (and other annotations) makes `save` do data IO. When doing data IO in `libyt` (using functions inside `io.py`), each MPI rank must call the same method. See:

```python
import yt
yt.enable_parallelism()

def yt_inline():
    ds = yt.frontends.libyt.libytDataset()
    slc = yt.OffAxisSlicePlot(ds, [1, 1, 0], [("gas", "density")], center="c")
    slc.annotate_cquiver(("gas", "cutting_plane_velocity_x"), ("gas", "cutting_plane_velocity_y"), factor=10, plot_args={"color": "orange"})
    slc.save()
```
Sometimes there will be some missing figures in the output series of figures. This may happen when each rank writes and creates a file with an identical name. (link)
### Interactive mode

We might want to analyze the data dynamically and get responses from the inline analysis directly, just like using `ipython` during runtime.

- [ ] Colorful Python prompt terminal
- [ ] Indentation support

Stop conditions: stop if a `try` around the script detects that it goes wrong, or if a `LIBYT_STOP` file is detected. Inline function execution status:

```
* Inline function execute status:
* yt_inline() ...... finished!
* yt_inline_arg() .. failed
* Traceback message ...
>>>
```

The prompt accepts multi-line input:

```
>>> if a == "something":
...     print("run something")
```

Every user input runs in the `inline_script` namespace: use `exec` and pass in `sys.modules["inline_script"].__dict__`. Note that `yt.enable_parallelism()` might run multiple times if reloading the script is not set up correctly.
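A minimal sketch of that `exec`-into-a-module-namespace mechanism (the `inline_script` module is created here for demonstration; in `libyt` it would be the already-imported inline script):

```python
import sys
import types

# Create a stand-in for the inline script module (illustrative only).
mod = types.ModuleType("inline_script")
sys.modules["inline_script"] = mod

user_input = "a = 1 + 2"
# exec with the module's __dict__ makes the assignment land in that module,
# so later inputs and the inline functions see the same globals.
exec(user_input, sys.modules["inline_script"].__dict__)

print(mod.a)  # 3
```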
Implementation notes:

- Use `ncurses`.
- Add an `INTERACTIVE_MODE` option in the `Makefile`.
- `try`: execute the inline script.
- `except`: only root prints the full traceback message; the other ranks print no error message. (`yt_interactive_mode`)
- `finally`: sync status?
- Magic commands start with `%libyt`, like `%libyt ...`; `%libyt exit` (temporary) exits interactive mode and enters the next iteration of the simulation.

Initially, particle plots may generate false figures if the MPI size is too large. After testing on different machines, this doesn't seem to be a `libyt` issue; it is more or less related to how much memory one machine has. But we still cannot find where this issue actually is. (I haven't reproduced the issue.)

### Tasks
Dask is a flexible library for parallel computing in Python, and its popularity is growing in the Python ecosystem. Because `libyt` does in-situ analysis by running a Python script, it is important to support this feature as well.

### libyt structure

Each MPI rank initializes a Python interpreter, and they work together through `mpi4py`:
| MPI 0 ~ (N-1)       |
| ------------------- |
| Python              |
| libyt Python Module |
| libyt C/C++ library |
| Simulation          |
Can `dask` be set up inside embedded Python? We can dedicate two additional ranks to the scheduler and the client (not necessarily MPI 0 and 1), and use the rest of the MPI nodes as workers. Each simulation also runs inside the workers. By following how `dask-mpi`'s `initialize()` sets up the scheduler, client, and workers, it should be possible to wrap this inside `libyt`:
| MPI 0               | MPI 1               | MPI 2               | ...                 | MPI (N-1)           |
| ------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
| Scheduler           | Client              | Worker              | Worker              | Worker              |
| libyt Python Module | libyt Python Module | libyt Python Module | libyt Python Module | libyt Python Module |
| libyt C/C++ library | libyt C/C++ library | libyt C/C++ library | libyt C/C++ library | libyt C/C++ library |
| Empty               | Empty               | Simulation          | Simulation          | Simulation          |
Because we use Remote Memory Access (one-sided MPI) with settings that require every rank to participate in the procedure ( #26 ), `libyt` suffers from the data exchange process between MPI nodes: every time `yt` reads data, all ranks must wait for each other and synchronize. However, if we move this data exchange from the C/C++ side to the Python side, then it becomes possible to exchange data with more flexibility, and asynchronously, using `dask`. By encoding what each MPI rank should get into a Dask graph, asking workers to prepare their local grid data, and exchanging data between workers, it will be much easier. (At least much easier than using C/C++. 😅)
Documentation outline:

- `yt` supported operations
- `libyt` API
- `libyt python module`
- Passing data from `libyt` to `yt` via the `libyt frontend` (`mpi4py`, `libytGrid`)
- `libyt` API internals (future, if we have time; I think the comments in the source code are clear enough.)
- `libyt` data types
- `yt_set_UserParameter`: for other yt code frontends to use their own definition of fields in in-situ analysis, we need to create APIs that accept all kinds of parameters, but I'm not sure what to implement yet.
- `yt.ParticlePlot` with different `ptype` particles. (Originally, we thought this issue was related to particle functionalities; since it's not, we moved it here.)
- `PlotCallback` (`annotate_particle`)
- `ParticlePlot`, `ParticleProjectionPlot`
- Update `example` in `libyt`.

`gamer` MHD Vortex:

```python
import numpy as np
import yt
from yt.data_objects.level_sets.api import Clump, find_clumps

yt.enable_parallelism()

def yt_inline():
    ds = yt.frontends.libyt.libytDataset()
    data_source = ds.all_data()
    c_min = 10 ** np.floor(np.log10(data_source[("gas", "density")]).min())
    c_max = 10 ** np.floor(np.log10(data_source[("gas", "density")]).max() + 1)
    master_clump = Clump(data_source, ("gas", "density"))
    master_clump.add_validator("min_cells", 20)
    find_clumps(master_clump, c_min, c_max, 2.0)
    leaf_clumps = master_clump.leaves
    prj = yt.ProjectionPlot(ds, "z", ("gamer", "Dens"), center="c")
    prj.annotate_clumps(leaf_clumps)
    # With or without this if clause, it still fails when MPI = 3.
    if yt.is_root():
        prj.save()
```
These are some TODOs from ( #11 ), but we found it unnecessary to accomplish them:

- `yt_add_user_parameter_*`: support more input types.
- Set `XXXDataset` attributes in `yt`.