yt-project / libyt

In-situ analysis with yt

Home Page: https://libyt.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Languages: Shell 0.01%, C++ 89.26%, Python 0.28%, C 7.68%, Makefile 1.60%, CMake 1.16%

libyt's People

Contributors: cindytsai, hyschive, matthewturk, pre-commit-ci[bot]

libyt's Issues

Memory Consumption Check

Memory Consumption/Leakage Check

  • Check the memory consumption of each libyt API and see if there is any memory leakage.
  • Memory Consumption Check:
    • Run each API inside large for loops.
    • Use the valgrind tool massif.
    • Check with MPI.
    • Execute the bash script:
      # run with "sh run.sh <log_folder_name>"
      # naming
      LOG_FOLDER=$1
      EXE=API-test.out
      # execute
      export LD_PRELOAD=/home/calab912/software/valgrind-3.19.0/lib/valgrind/libmpiwrap-amd64-linux.so
      export PYTHONMALLOC=malloc
      mpirun -np 4 --output-filename $LOG_FOLDER /home/calab912/software/valgrind-3.19.0/bin/valgrind -v --tool=massif --time-unit=B --detailed-freq=1 ./$EXE 
    • Makefile:
      # path
      MPI_PATH   := /home/calab912/software/openmpi/3.1.5-gnu
      LIBYT_PATH := /home/calab912/Documents/GitHub/libyt
      
      # output name
      BIN  := API-test.out
      FILE := main.cpp
      COMPILER := $(MPI_PATH)/bin/mpic++
      
      # command
      $(BIN): $(FILE)
          $(COMPILER) -o $(BIN) $(FILE) -I$(LIBYT_PATH)/include -L$(LIBYT_PATH)/lib -lyt
      
      clean:
          rm -f $(BIN)
          rm -rf log
          rm -rf __pycache__
          rm -f *.png *.gif RecordTime_* massif.*
  • Python Memory Leakage Check:
    • Count the reference count of every libyt data member using sys.getrefcount (see the sketch below).
    • sys.getrefcount reports one more than the actual count, since the call itself holds a temporary reference to the value. (link)
      • Every value should have reference count 2.
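
A minimal sketch of the check, assuming it runs inside the embedded interpreter where the libyt module is importable; the loop below inspects only libyt.param_yt as an example:

import sys
import libyt

# sys.getrefcount(obj) reports one more than the actual count, because the
# call itself holds a temporary reference to obj while counting.
for key, value in libyt.param_yt.items():
    print(key, sys.getrefcount(value))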

Memory Consumption Caused By Each API

Other yt Operations

Covering Grid

Python API check

  • yt_init()
    • PyInit_libyt (init_libyt_module.cpp)
      • libyt, grid_data, hierarchy, param_yt, param_user
    • libyt_field_derived_func
      • Return 3d NumPy array owned by Python.
    • libyt_particle_get_attr
      • Return 1d NumPy array owned by Python.
    • libyt_field_get_field_remote
      • Return data[gid]["field"][3d-NumPyArray]
    • libyt_particle_get_attr_remote
      • Return data[gid]["ptype"]["attr"][1d-NumPyArray]
  • yt_set_parameter()
    • add_dict_string (add_dict.cpp)
      • param_yt
        • frontend, fig_basename
    • add_dict_scalar (add_dict.cpp)
      • param_yt
        • current_time, current_redshift, omega_lambda, omega_matter, hubble_constant, length_unit, mass_unit, time_unit, magnetic_unit, cosmological_simulation, dimensionality, refine_by, num_grids, domain_left_edge, domain_right_edge, periodicity, domain_dimensions
    • allocate_hierarchy() (allocate_hierarchy.cpp)
      • hierarchy
        • grid_left_edge, grid_right_edge, grid_dimensions, grid_particle_count, grid_parent_id, grid_levels, proc_num
  • yt_add_user_parameter_*
    • add_dict_string
    • add_dict_scalar
    • add_dict_vector3
  • yt_commit_grids()
    • add_dict_field_list (add_dict.cpp)
      • Dictionary param_yt["field_list"] (see the sketch after this list)
        • Key-Value: field name-Dictionary
          • "attribute": [ <field_unit>, [<field_name_alias>,], <field_display_name>]
          • "field_define_type": <field_define_type>
          • "swap_axes": true/false
          • "ghost_cell": [ ]
    • add_dict_particle_list (add_dict.cpp)
      • Dictionary param_yt["particle_list"]
        • Key-Value: ptype-Dictionary
          • "attribute": Dictionary
            • <attr_name1> : [ <attr_unit>, [<attr_name_alias>], <attr_display_name> ]
          • "particle_coor_label": [ <coor_x>, <coor_y>, <coor_z> ]
    • append_grid (append_grid.cpp)
      • grid_data
        • grid_data[gid]["field"]
  • yt_free()
  • yt_finalize()
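
A hedged sketch of the dictionary layout built by add_dict_field_list and add_dict_particle_list; the field name "Dens", particle type "io", and attribute "ParMass" are hypothetical examples:

param_yt = {
    "field_list": {
        "Dens": {
            "attribute": ["code_mass/code_length**3", ["density"], None],
            "field_define_type": "derived_field",   # one possible value
            "swap_axes": True,
            "ghost_cell": [0, 0, 0, 0, 0, 0],
        },
    },
    "particle_list": {
        "io": {
            "attribute": {
                "ParMass": ["code_mass", ["particle_mass"], None],
            },
            "particle_coor_label": ["ParPosX", "ParPosY", "ParPosZ"],
        },
    },
}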

Python Reference Count Check

Under the structure below, sys.getrefcount of every stored value is 2. The counts for libyt and its members are:

  • libyt: 9
    • Dictionary
      • grid_data: 3
      • hierarchy: 3
      • param_yt: 3
      • param_user: 3
    • Method
      • derived_func: 3
      • get_attr: 3
      • get_field_remote: 3
      • get_attr_remote: 3

Support Derived Function and Particle Get Attribute Function Prepare Multiple Data Chunks At A Time

Support Derived Function and Get Attribute Function Prepare Multiple Data Chunks At A Time

Problem

  • Currently, derived_func and derived_func_with_name prepare one data grid at a time.
  • This may cause poor performance for codes that rely heavily on hybrid OpenMP/MPI (e.g., gamer) to generate derived fields.
  • It also constrains the derived function's style: one must write the derived function in the form libyt expects.
  • That said, this problem does not actually affect libyt's overall performance for now.

Solution

libyt

Data Structure

  • #56
  • Update data members derived_func_chunk and derived_func_with_name_chunk in yt_field struct.
    • void (*derived_func_chunk) ( int list_length, long *list_gid, yt_array *data )
    • void (*derived_func_with_name_chunk) ( int list_length, long *list_gid, char *field, yt_array *data )
  • Update data member get_attr in yt_particle struct.
    • void (*get_attr) ( int list_length, long *list_gid, char *attr_name, yt_array *data_array)

Derived Function/Get Attribute Function C++ Extended Python Method

  • Use a Python dictionary to wrap the results and store them as NumPy arrays (see the sketch below).
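
One possible shape of that wrapper, assuming float64 grids with hypothetical ids and dimensions:

import numpy as np

# Each requested grid id maps to the NumPy array that wraps the buffer
# the chunked derived function filled in.
list_gid = [0, 1, 2]
chunk_result = {gid: np.empty((16, 16, 16), dtype=np.float64) for gid in list_gid}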

RMA

  • Remote memory access is only needed when MPI size > 1. Do we really need to prepare many grids at a time if the work is already parallelized across MPI ranks?
  • Update yt_rma_field.cpp when preparing grid data for remote ranks.
  • Update the init_libyt_module.cpp function libyt_field_get_field_remote so that it requests many grids at a time.

yt

  • Update io.py, find places to store grids.
  • Redesign _get_field_from_libyt, _read_chunk_data, and _read_fluid_selection.

Polishment

Tasks

  • Validate hierarchy: similar to _validate_parent_children_relasionship in yt/frontends/gamer/data_structures.py, but make it more general
    • Probably embed it in yt_check_grid
    • The grid id should be in the range 0 ~ num_grids - 1
    • The root level starts at 0.
    • If there is no parent grid, the parent grid id should be set to -1.
  • Communicate between the libyt frontend and simulation frontends
    • Already supports loading fields from a simulation frontend
    • Chosen option: load every key-value pair in the libyt.param_user dictionary into the libytDataset class (see the sketch after this list).
  • Support dimensionality < 3
  • MPI can't transfer arrays longer than int can hold, so libyt aborts when the long num_grids exceeds INT_MAX
  • Update the edge check under periodic boundary conditions: periodic[3], grid_left_edge[3], and grid_right_edge[3] ( #5 )
  • yt inline python script
    • The file name is fixed; should we make it changeable in every round?
    • #45
  • Rename the confusing functions yt_get_gridsPtr and yt_add_grids; change yt_add_grids to yt_commit_grids
  • Determine the API procedure
    • Option 1 (now): yt_get_gridsPtr -> yt_commit_grids() -> yt_inline -> yt_free_grids()
    • Option 2: call yt_add_grid() num_grids_local times -> yt_inline
  • Let the user specify which Python function to call in every round of inline analysis
  • Users must input a different fig_basename each round, otherwise figures are overwritten. Solution: append the number of calls to the inline script at the end of fig_basename.
  • Support inline analysis beyond yt_inline_ProjectionPlot(). Allow passing input parameters to functions, e.g., yt_inline_ProjectionPlot(a, b, c), etc.
  • Let yt_add_user_parameter_* add types other than a scalar or a 3-dim vector
  • Check the Python reference counting
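
A minimal sketch of the chosen option, assuming it runs inside the embedded interpreter; load_user_parameters is a hypothetical helper, not an actual libyt function:

import libyt

def load_user_parameters(ds):
    # Attach every libyt.param_user key-value pair to a libytDataset
    # instance, mimicking loading attributes from an on-disk parameter file.
    for key, value in libyt.param_user.items():
        setattr(ds, key, value)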

Polishment and Optimization

  • Set the MPI root rank; for now, it is fixed to rank 0.
  • Error message format
  • Names of input parameters; some might be a bit confusing
    • Change data_dim to data_dimensions.
  • Name of the libyt API

Update `libyt` Milestone

Update libyt Milestone

  • Support getting non-local grids
  • Support ghost zone
  • Minor changes and restrictions to field names.
  • A workaround for MPI send counts that exceed int.
  • Supported yt functionalities.

Extend yt support

Tasks

  • Support the following yt functionalities
    • OffAxisProjectionPlot
    • SlicePlot
    • OffAxisSlicePlot
    • Halo Analysis
    • Isocontours
    • volume_render (only works if the MPI size is even)
    • ParticlePlot
    • ParticleProjectionPlot
    • LinePlot
  • Distinguish which yt operations should go inside the if suite:
    if yt.is_root():
    • I think the core of the parallelism is data access, so probably every operation that has nothing to do with accessing data should be put inside this clause. (But this is my guess and should be checked further.)
    • For volume rendering, saving the rendered figure should NOT be inside the if yt.is_root(): clause. #26
    • For some of the annotations, saving the figure should NOT be inside if yt.is_root(). #35

Notes

  • Better to work with Matt on this.
  • Some of the above functionalities have not been parallelized with grid decomposition in yt, so they will request grids that don't exist on the local rank.
  • Halo Analysis and Isocontours have not been tested yet.
  • Enzo embedded python analysis may not support particles?
  • Related issue #14

Support Particle Functionalities

Support Particle Functionalities

Initially, particle plots may generate a false figure if the MPI size is too large. After testing on different machines, this does not seem to be an issue in libyt; it is more or less related to how much memory the machine has. Since we still cannot find where the issue actually lies, we moved it to another issue.

ℹ️ After testing on calab912 and eureka, it doesn't seem to be an issue. Instead, it is more or less related to the memory on the machine. Though we still don't know why, or where the bug is.

✔️ Particle plots (ParticlePlot, ParticleProjectionPlot) can successfully run in parallel in inline analysis, though some memory-related issues may occur.


Test Run on My Laptop with 16 GB RAM

All of the images are results from inline analysis. MPI=1 gives the correct and expected outcome.

  • Test Problem: gamer Plummer
  • Machine: My Laptop
ParticlePlot
import yt
yt.enable_parallelism()
def yt_inline():
    ds = yt.frontends.libyt.libytDataset()
    par = yt.ParticlePlot( ds, 'particle_position_x', 'particle_position_y', "particle_mass", center='c' )
    if yt.is_root():
        par.save()
  • MPI=1
    Fig000000000_Particle_z_particle_mass
  • MPI=2
    Fig000000000_Particle_z_particle_mass
ParticleProjectionPlot
import yt
yt.enable_parallelism()
def yt_inline():
    ds = yt.frontends.libyt.libytDataset()
    par = yt.ParticleProjectionPlot( ds, "z")
    if yt.is_root():
        par.save()
  • MPI=1
    Fig000000000_Particle_z_particle_ones
  • MPI=2
    Fig000000000_Particle_z_particle_ones
  • The upper-right cluster is different.

Fortran API

Tasks

  • Add a Fortran API, which is necessary for FLASH

Extend to ParaView

Extend libyt to ParaView

If yt can connect to ParaView (link), then maybe libyt can bypass Catalyst, a tool for simulation codes to do inline (in situ) analysis in ParaView.

There are, however, things worth noting for real-time volume rendering:

  • If we wish to do volume rendering in ParaView, we need NVIDIA IndeX. But it only supports serial processing, and it charges additional fees when run on a multi-node system. (link)
  • This means that if libyt really wants to support ParaView real-time volume rendering, only one node can be in charge of the inline analysis, and libyt is not designed for this kind of workflow yet.

Plot Modifications / Annotations Test

Plot Modifications Test

Cookbook Example

  • annotate_marker
  • annotate_text
  • annotate_arrow
  • annotate_clumps (fails at odd MPI sizes, succeeds at even MPI sizes ❌ )
  • annotate_contour
  • annotate_quiver
  • annotate_cquiver
  • annotate_grids
  • annotate_cell_edges
  • annotate_streamlines
  • annotate_velocity
  • annotate_scale
  • annotate_timestamp
  • annotate_ray
  • annotate_line
  • annotate_particles (figures are slightly different between MPI = 1 and MPI > 1 ❌ )
  • annotate_halo
  • annotate_magnetic_field
  • annotate_sphere
  • annotate_line_integral_convolution
  • annotate_title

Should Not Put save() and the Operation Inside the yt.is_root() Clause

  • annotate_streamlines
  • annotate_velocity
  • annotate_line_integral_convolution
  • annotate_particles
  • annotate_quiver
  • annotate_cquiver
  • annotate_magnetic_field

Error Message

Failed Somewhere Other than IOHandlerlibyt

  • annotate_clumps (fails at MPI = 3, succeeds at MPI = 1 and 4)

Failed at Opening the Saved Figure

  • This happens randomly.
  • Not sure if it is caused by moving save() outside of the if yt.is_root() clause.
  • #38

Miscellaneous issues

Tasks

  • libyt unknown message [DEBUG]
  • Code units and CGS units are mixed when plotting derived fields

Load Particle Data to `yt` through Wrapping the Existing Array

Load Particle Data to yt through Wrapping the Existing Array

If the particle data is stored in a contiguous memory block, we can wrap it directly and pass it to yt, so that we don't have to copy it again (see the sketch below).
This new API should coexist with the original one (where the user inputs get_attr, which returns the particle attribute array).
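
A NumPy sketch of the zero-copy idea, using a plain buffer to stand in for the simulation-owned particle array:

import numpy as np

# A contiguous buffer owned by "the simulation" (here just a bytearray).
raw = bytearray(8 * 1000)                 # room for 1000 float64 values

# np.frombuffer creates a view, not a copy: the ndarray shares the
# underlying memory, which is the behavior the proposed API needs.
par_mass = np.frombuffer(raw, dtype=np.float64)
assert par_mass.base is not None          # confirms no copy was made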

Notes

In order to merge enzo, I introduced the API for wrapping particle data arrays in libyt (#79). That PR doesn't support particle arrays just yet; I am still fixing some bugs.

Support more codes

Tasks

  • FLASH
  • Athena++
  • Enzo-e
  • ...

Notes

  • For FLASH, we need to support Fortran first (#6)

Support Time Series Figure

Support Time Series Figure

Plot time series datasets. This can already be done in the inline script, but the script becomes a little messy.

Timing libyt in GAMER

Tasks

  • Add a timer in GAMER for YT_Inline()
  • Measure its performance in real applications (e.g., cluster merger and isolated FDM halo)

Related Tasks

Group Different MPI Size or Nodes for In-Situ Analysis

Group Different MPI Size or Nodes for In-Situ Analysis

Currently, running a simulation with MPI size N makes the yt in-situ analysis also run with MPI size N.
We want to make this more flexible by letting users choose which MPI ranks run the simulation and which ranks run the in-situ analysis.

Feature

  • Use different number of MPI processes for simulations and in-situ analysis.

Demonstrate derived field with EoS

Tasks

  • Add a temperature derived field in GAMER for libyt by calling Hydro_Con2Temp()
  • Compute gas temperature in the ClusterMerger test problem and compare with the post-processing script gamer/example/test_problem/Hydro/ClusterMerger/yt_script/plot_slice-z.py
  • If possible, compute temperature/pressure/entropy in the CCSN simulations. (We tested entropy instead)

Check on libyt derived function functionality

  • Compare the temperature calculated through the EoS libyt derived function with the post-processing temperature output directly by gamer.

Related tasks

`libyt` Document

Document

This document is more of a user guide, plus pointers on how to reach the developer guide. It will be put inside README.md and the libyt/doc folder.

  • User Guide
    • Welcome message
      • Pointers for reaching further information.
    • Supported yt functionality
      • Which functionalities will fail. (Even though we already support getting non-local grids and particle data using libyt; see #26.)
    • User Guide table of contents in the main README.
    • Example code.

Code Optimize

  • Make searching in yt_rma_field and yt_rma_particle faster.

Support Ghost Zone

Support Ghost Zone

Definitions and Terms

  • Ghost cells are defined inside the yt_field struct.
    • We assume that different fields can have different numbers of ghost cells in each dimension of the data array.
    • A field must have the same number of ghost cells in every grid.
    • short field_ghost_cell[6] is defined as the number of cells to ignore at the beginning and the end of the data in each dimension: field_ghost_cell[0] is the number of cells to ignore at the beginning of the 0th dimension, and field_ghost_cell[1] the number to ignore at its end (see the sketch after this list).
  • We don't pass ghost cells in the hierarchy.
  • We load them to Python along with the field_list dictionary.
  • grid_dimensions and data_dim:
    • Ghost cells are not included in grid_dimensions; those are just the dimensions read by yt.
    • data_dim defined in yt_data is the actual dimension of the data_ptr to be wrapped; it includes ghost cells.
    • API:
      • grid_dimensions: [x][y][z] dimensions read by yt. (yt_getGridInfo_Dimensions API)
      • data_dim: the actual data dimensions of the pointer. (yt_getGridInfo_FieldData API)
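
A NumPy sketch of the relation between data_dim and grid_dimensions, using hypothetical ghost-cell counts:

import numpy as np

gc = [1, 1, 2, 2, 0, 0]          # field_ghost_cell[6]: [beg0, end0, beg1, end1, beg2, end2]
data = np.zeros((10, 12, 8))     # data_dim: the wrapped data_ptr, ghost cells included

# Trimming the ghost cells on both ends of each axis recovers what yt reads:
view = data[gc[0]:data.shape[0] - gc[1],
            gc[2]:data.shape[1] - gc[3],
            gc[4]:data.shape[2] - gc[5]]
print(view.shape)                # (8, 8, 8) == grid_dimensions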

TODOs

  • Define ghost cells inside yt_field.
    • Each side can have a different number of ghost zones.
    • Update yt_type_field.h.
  • Remove redundant assignments, since the constructor is now called on initialization.
  • Wrap the data array correctly. (append_grid.cpp)
  • Pass in ghost cells via field_list in the libyt.param_yt dictionary.
  • derived_func should generate grids without ghost zones only.
    • No need to update yt_getGridInfo, since we define grid_dimensions and data_dim in yt_data separately.
      • grid_dimensions: [x][y][z] dimensions read by yt. (yt_getGridInfo_Dimensions API)
      • data_dim: the actual data dimensions of the pointer. (yt_getGridInfo_FieldData API)
    • Check the gamer derived function. (No need to change, but still ...)
  • yt_rma_field transfers the full grid, including ghost zones.
    • yt_rma_grid_info should change as well.
    • Be aware of yt_rma_field::prepare_data; the data dimensions should include ghost cells.
  • Add MPI_PATH to the Makefile.
  • Update the libyt yt frontend.

Test Run

  • libyt/example
    • ProjectionPlot
    • SlicePlot

Related Issue

Naming in `libyt` Fields

Naming in libyt Fields

Frontend Native Fields: fields defined in XXXFieldInfo in yt frontends. They can be fields derived from other existing fields via functions defined inside XXXFieldInfo.

Things Should be Aware of

A field defined in XXXFieldInfo might not have the same name used inside libyt. For example, MagX and CCMagX in gamer: the first is used in inline analysis, while the second is used in post-processing, even though both represent the same field.
The function defined inside XXXFieldInfo that derives these added fields doesn't know they are the same, and it should fetch MagX instead of CCMagX in inline analysis, so it ends up raising an error.

Enhancement

We should add another data member in yt_field that matches a field to an already existing definition inside XXXFieldInfo.

Support python3, parallelization, derived fields

Support new features:

  • Work with Python 3.8
  • Simulation codes can use libyt to run parallel inline analysis with yt.
  • Let users input their own derived functions
    • For example, convert face-centered magnetic fields to cell-centered data only when yt needs them. This can save memory (see the sketch below).
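
A NumPy sketch of that example: averaging the two faces adjacent to each cell turns a face-centered field into a cell-centered one. Shapes here are hypothetical:

import numpy as np

bx_face = np.random.rand(17, 16, 16)    # face-centered Bx: one extra sample along x
bx_cc = 0.5 * (bx_face[:-1, :, :] + bx_face[1:, :, :])
print(bx_cc.shape)                      # (16, 16, 16), cell-centered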

Improve Memory Usage Efficiency and Other Miscellaneous

Improve Memory Usage Efficiency and Other Miscellaneous

Improve Memory Usage Efficiency

  • Change grid_levels from NPY_LONG to NPY_INT. (allocate_hierarchy.cpp, append_grid.cpp)

Look up grid info API using NumPy API

  • Always wrap the data passed in, so we can look up data info using the NumPy API instead of searching libyt's yt_grid array (see the sketch after this list).
  • Make yt_getGridInfo_* look up data in the libyt Python module.
    • I should use this; otherwise we are always searching for gid in the local grids array.
  • Return the data buffer from the libyt Python module in yt_getGridInfo_FieldData.
  • Dealing with particle_count_list: load each ptype separately, and then sum them up in the libyt frontend.
    • Loading particles. (allocate_hierarchy.cpp, append_grid.cpp)
    • Create grid_particle_count through the frontend. (data_structures.py, yt_commit.cpp)
    • Look-up API.
    • Get each ptype count through the ptype count array, and determine whether the RMA process is needed.
  • Add prototype to libyt.h.
  • Make sure every search uses this look up API.
  • Check RMA process.
  • Clean up: delete unused resource.
  • Check setting data dimension in append_grid.cpp. Do I need this?
  • Remember to free grids_local under g_param_yt, once yt_commit is done.
  • Update paper section 2.6.
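
A sketch of the look-up idea from the Python side, reading dimensions and dtype off the wrapped array itself; the grid id 0 and field name "Dens" are hypothetical:

import libyt

# Because the field buffer is already wrapped as an ndarray, its info can be
# read directly instead of searching the yt_grid array.
arr = libyt.grid_data[0]["Dens"]
data_dim, data_dtype = arr.shape, arr.dtype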

A Better Way to Gather and Pass Hierarchy to Python

  • Always wrap the data passed in
  • When passing data to Python
    • Do not create a key-value pair if there is no field data pointer to wrap when creating libyt.grid_data.
    • Check the RMA process; don't create a key-value pair if some data was not found on some rank.
      • It looks like I did this already...
  • When passing hierarchy to Python
    • Make Python stores one copy of full hierarchy only.
      • Assign libyt's hierarchy directly in yt.
        • Override _initialize_grid_arrays
        • Rewrite abstract _parse_index
      • Reexamine libyt's newly allocated hierarchy buffer.
      • Reexamine particle count data type.
        • Stay with long, since we now assign libyt allocated array in yt frontend.

Other Miscellaneous

  • Make MPI_Datatype yt_hierarchy_mpi_type initialize once only. (yt_commit_grids.cpp)
  • Make MPI_Datatype yt_rma_grid_info_mpi_type initialize once only. (yt_rma_field.cpp)
  • Make MPI_Datatype yt_rma_particle_info_mpi_type initialize once only. (yt_rma_particle.cpp)
  • Make the GitHub Action generate data through code instead of storing data directly in a txt file.
  • Simplify the example.
  • Print INFO logs only on the root rank.
  • Mark in-memory field data read-only when wrapping the data array in the API.
  • #62
  • Change user guide overview to a list of libyt API.
  • YT_ERROR: if you want to implement your new yt_dtype, you should modify both the yt_dtype enum and the get_npy_dtype function.
    • Caused by get_dtype_property printing an error log when it cannot get the NumPy data type.
    • libyt looks through data_dtype first and then field_dtype; it is OK if data_dtype is not set.
  • Make general info like MyRank and MySize global.
  • Use function templates in big_mpi.cpp (not feasible; we would have to declare it every time we use it).
  • Avoid copying in append_grid.cpp.

RMA

Bug

  • libyt is unable to finalize successfully on twnia3.

Add Timer for Performance Test

Add Timer for Performance Test

Section

  • Loading parameters: adding Python variables from the Python C/C++ API.
  • Wrapping grids into NumPy arrays.
  • Executing the inline script.
  • Getting data from derived fields.
  • Data transfer in RMA operations (one-sided MPI).
  • Clean up.

Links

  • A really ugly timer at cindytsai/yt branch libyt-timer: link
  • To use timer in libyt, see: link

Test GAMER

Tasks

  • Test libyt with GAMER
    • Various yt functionalities
      • Check table
      • When selecting data. (For example, sphere objects)
    • Derived fields
    • MHD
    • Particles
    • Check-data test under periodic boundary conditions. (Remember to turn on check_data in yt_param_libyt.)
      • The checks do not consider periodic conditions; test whether it works.
    • MPI parallelization
    • OpenMP parallelization
      • gamer itself supports OpenMP, but not libyt.
    • Performance
    • Memory consumption and deallocation

Test on Gradient Functionality

Test on Gradient Functionality

  • We expect it to fail, since this functionality requires nearby grids, so yt will definitely ask for non-local grids.
  • Related Issue: #26

TODO

  • Note in the documentation that this would fail. (Moved to #23.)

Code release

Tasks

  • Official repo
  • Documentation
  • License
  • Release Tag (?)

Update Minor Issues

Update libyt

  • MPI_Gatherv does not support send counts > INT_MAX.
  • Simplify choosing the data dimensions and data type between data_dim and grid_dimensions, and data_dtype and field_dtype.
  • If data_ptr == NULL, libyt shouldn't abort when data_dtype or field_dtype is not set and check_data == false: we don't need to wrap this array, hence there is no need to set the data type.
    • This is done inside append_grid.cpp.
  • Support more particle data types, e.g., YT_LONG.
    • Enum loop.

Invoke libyt in GAMER substeps

Tasks

  • Call YT_Inline() in the sub-step updates (i.e., in EvolveLevel())
  • May need to perform temporal interpolation on lower levels
    • Need to allocate additional arrays, perform the interpolation, and then pass these arrays (instead of amr->patch->fluid[]) to libyt
    • Add a runtime option for it
  • Add new criteria for determining when to call YT_Inline()

Production runs

Tasks

  • Test libyt in scientific production runs
    • Cluster merger
    • MW-sized FDM halo
    • FDM soliton random walk
    • CCSN?
  • What metrics should be collected?
  • Enzo too?

Add New Prototype For Derived Field Function

Add New Prototype For Derived Field Function in Struct yt_field

  • Under the yt_field struct, add a data member derived_func_with_name
    • void (*derived_func_with_name) (long, char *, double *);
    • This stores a universal derived field function, so that one can call func(gid, "Dens", data) and func(gid, "MomX", data) to get different derived fields by passing different field names (see the Python analogue after this list).
    • If one sets field_define_type == "derived_field", libyt will try the derived functions in this order:
      • derived_func
      • derived_func_with_name
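
A Python analogue of the name-dispatch idea behind the C prototype above; the field names and array shapes are hypothetical:

import numpy as np

def derived_func_with_name(gid, field_name):
    # One function serves every derived field by dispatching on the name.
    generators = {
        "Dens": lambda g: np.full((16, 16, 16), 1.0),
        "MomX": lambda g: np.zeros((16, 16, 16)),
    }
    return generators[field_name](gid)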

Check `yt` `save()` Function

Check yt Save Function

Description

In the inline script, we need save() outside the if yt.is_root() clause, because annotate_cquiver (and other annotations) makes save() do data I/O. When doing data I/O in libyt (using the functions inside io.py), each MPI rank must call the same method. See:

import yt
yt.enable_parallelism()
def yt_inline():
    ds = yt.frontends.libyt.libytDataset()
    slc = yt.OffAxisSlicePlot(ds, [1, 1, 0], [("gas", "density")], center="c")
    slc.annotate_cquiver(("gas", "cutting_plane_velocity_x"), ("gas", "cutting_plane_velocity_y"), factor=10, plot_args={"color":"orange"}, )
    slc.save()

Sometimes there will be a missing figure in the output series of figures. This may happen when each rank writes and creates a file with an identical name. (link)

Reload/Refresh Inline Script During Runtime

Reload/Refresh Inline Script During Runtime

We might want to analyze the data dynamically and get responses from the inline analysis directly, just like using IPython at runtime.

Features

  • Get error messages from Python and inform users, instead of terminating the whole process.
  • Load the script and interact with it dynamically during the code's runtime.
  • Changes persist throughout the rest of the process.
  • Export the current functions.
  • Determine and see which functions will run in the following steps.

Enhancement

- [ ] Colorful python prompt terminal
- [ ] Indent

  • Parsing traceback errors
  • string or char *? Change to string.

Working Procedure

When to enter and exit interactive mode

  1. Run user script in try --> stop if it goes wrong
  2. Detect LIBYT_STOP file --> stop if detected

In interactive mode

  • Each inline function execute results/status:
    * Inline function execute status:
      * yt_inline() ...... finished!
      * yt_inline_arg() .. failed
        * Traceback message ...
    
  • Enter interactive mode >>>:
    >>> if a == "something":
    ...     print("run something")
    

TODOs

Problems

  • Scope problem: originally, every inline function runs in the inline script's namespace. When we dynamically load and update these functions, we are in the global namespace, but we need to be in the module's namespace (see the sketch after this list).
    • Possible solution:
      • Use the inline_script namespace for every input from the user.
      • Use exec and pass in sys.modules["inline_script"].__dict__
  • yt.enable_parallelism() might run multiple times if reloading the script is not set up correctly.
    • It's OK to run it multiple times, but it wastes time.
    • Possible solution:
  • Indentation: everything must have the same indentation as the inline script.
    • Possible solution:
      • If we test each line of code to see whether it compiles, then we don't have this problem.
  • How to tell if an input is valid?
    • Possible solution:
      • Test whether these lines of code compile.
      • IPython?
  • Cannot change previously entered lines. (Put this last to solve.)
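
A minimal sketch of the exec-based solution; "inline_script" is the module name assumed above, and it must already be imported for sys.modules to contain it:

import sys

# Execute user input inside the inline script's module namespace, so that a
# redefined function replaces the one libyt will call in the next round.
code = 'def yt_inline():\n    print("updated")'
exec(code, sys.modules["inline_script"].__dict__)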

Makefile

  • Add compile option INTERACTIVE_MODE
    - [ ] Cleaner Makefile.

yt Finalize

  • Python objects (freed by Py_Finalize())
  • Function Status Vector (Don't need to.)

Try, Except, Finally

  • try: execute the inline script
  • except: only the root rank prints the full traceback; the other ranks print no error message.
    • Store the error message somewhere else, so that we can print it in yt_interactive_mode.
    • Parse the traceback message to make it more readable for users.
  • finally: sync status?
  • Use %libyt exit to exit. (temp)

libyt command

  • Should start with %libyt, like %libyt ...
  • %libyt exit: exit interactive mode, and enter next iteration of simulation.

Tests

  • Errors raised by the libyt module
    • C
    • Python
  • Errors raised by yt
  • Test on a cluster

Particle functionalities may generate false figure if memory is almost full

Particle Functionalities May Generate False Figure if Memory is Almost Full

Related Issue

Description

Initially, particle plots may generate a false figure if the MPI size is too large. After testing on different machines, this does not seem to be an issue in libyt; it is more or less related to how much memory the machine has. But we still cannot find where the issue actually lies.

(I haven't reproduced the issue.)

Paper

Tasks

  • Release the code (#8)
  • Work with Matt
  • Where to submit (e.g., ApJS)?

Performance

Performance

  • #43
    • Scaling is bad with OpenMP and OpenMPI in volume rendering.

Support Dask

Support Dask

Dask is a flexible library for parallel computing in Python, and it is growing in popularity in the Python ecosystem. Because libyt does in-situ analysis by running a Python script, it is important to support this as well.

Current libyt structure

Each MPI rank initializes a Python interpreter, and they work together through mpi4py.

| MPI 0 ~ (N-1)       |
| ------------------- |
| Python              |
| libyt Python Module |
| libyt C/C++ library |
| Simulation          |

How should dask be set up inside embedded Python?

We can dedicate two additional ranks to the scheduler and the client (not necessarily MPI 0 and 1), and use the rest of the MPI ranks as workers. The simulation also runs inside the workers. By following how dask-mpi's initialize() sets up the scheduler, client, and workers, it should be possible to wrap this inside libyt (see the sketch after the table below).

| MPI 0               | MPI 1               | MPI 2               | ...                 | MPI (N-1)           |
| ------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
| Scheduler           | Client              | Worker              | Worker              | Worker              |
| libyt Python Module | libyt Python Module | libyt Python Module | libyt Python Module | libyt Python Module |
| libyt C/C++ library | libyt C/C++ library | libyt C/C++ library | libyt C/C++ library | libyt C/C++ library |
| Empty               | Empty               | Simulation          | Simulation          | Simulation          |
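
A sketch of the dask-mpi pattern referenced above; whether libyt can drive this from the embedded interpreter is exactly what this issue explores:

from dask_mpi import initialize
from dask.distributed import Client

initialize()        # rank 0 -> scheduler, rank 1 -> client code, the rest -> workers
client = Client()   # connects to the scheduler that initialize() started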

Solve data exchange problem

Because we use remote memory access (one-sided MPI) with settings that require every rank to participate in the procedure (#26), libyt suffers during the data exchange process between MPI nodes: every time yt reads data, all ranks must wait for each other and synchronize.
However, if we move this data exchange process from the C/C++ side to the Python side, it becomes possible to exchange data more flexibly and asynchronously using dask. By encoding what each MPI rank should get into a Dask graph, asking workers to prepare their local grid data, and exchanging data between workers, it will be much easier.
(At least much easier than using C/C++. 😅 )

Update milestones

Tasks

  • Update Milestones
    • Contents
    • Overview
      • Basic Idea
      • Support Python Versions
      • yt Supported Operations
      • Procedure and libyt API
      • Inline Python Script
      • Example
    • Embedded Python in C/C++ Application
      • Create libyt python module
      • Initialize Embedded Python and Load User Script
      • Load Data Through Python API
      • Load Data Through NumPy API
      • Free Resource in Python and Finalize Embedded Python
      • Connect libyt to yt by libyt frontend
    • Parallelization
      • Install mpi4py
      • Enable Parallelism and Inline-Analysis in yt
      • Parallel Process in yt
      • Record MPI Rank in libytGrid
    • Support In-Memory Fields and Collect Hierarchy
      • Load Hierarchy Information and Field Data at Local Rank
      • Collect Hierarchy in Each Rank
      • Connect In-Memory Field Data and Information to Python
    • Support Derived Fields
      • Set Field Information for Derived Fields
      • Connect Derived Field Data to Python
    • Support Particles
      • Set Particle Information
      • Connect Particle Data to Python
    • Code Structure
      • Configuration
        • Check Data or Not
      • libyt API (future, if we have time; I think the comments in the source code are clear enough.)
      • libyt Data Type
      • libyt python module
  • Examine all links
  • Review each section.
  • Book mode
  • GitHub wiki

Links

https://hackmd.io/@Viukb0eMS-aeoZQudVyJ2w/ryCYwu0xF

Support more `yt_set_UserParameter*`.

Support more yt_set_UserParameter

For other yt code frontends to use their own field definitions in in-situ analysis, we need to create APIs that can input all kinds of parameters. But I'm not sure what to implement yet.

Support Enzo

Tasks

  • Support libyt in Enzo
  • Test on yt.ParticlePlot with different ptype particles.
  • Test ghost zone.
  • Test on face-centered data.

Notes

  • Work with Matt on this

Annotate Particles Generate False Figure

Annotate Particles Generate False Figure

Originally, we thought this issue was related to the particle functionalities. Since it's not, we moved it here.

  • Annotations in a plot (annotate_particle)
    • Subclass of PlotCallback
    • Create plot via this function.
  • Particle functionalities (ParticlePlot, ParticleProjectionPlot)
    • Different class hierarchy from PlotCallback

Particles

Tasks

  • Small module: call a C function in the Python script to collect particle data
    • libyt should provide a function pointer for this function
    • Simulation codes should fill in this function
  • Additionally, check how yt operates on particle objects
    • Parent grids might not have their children's particle data; we need to get it from other MPI ranks.
    • When not at the highest AMR level, how does yt get the particles at that level?
      • Try covering_grid for example.
  • Support multiple particle types
    • libyt
      • Read: GAMER particle in HDF5
      • Bridge _read_particle_fields in io.py with simulation code using libyt
        • Add read particle function when initializing libyt python module
        • Determine what the user should input to libyt
        • Update yt_type_grid.h, so that it stores particle count of different ptype.
        • Update example in libyt.
        • Check procedure
    • yt
      • Read: _read_particle_fields in io.py
      • Connect libyt frontend with this function
  • Test with GAMER
  • Test particle filter (e.g., see this example)

Annotate Clumps Not Working in Odd MPI Size

Annotate Clumps Not Working in Odd MPI Size

  • Test Problem: gamer MHD Vortex
  • Inline Script:
    import numpy as np
    import yt
    from yt.data_objects.level_sets.api import Clump, find_clumps
    yt.enable_parallelism()
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        data_source = ds.all_data()
    
        c_min = 10 ** np.floor(np.log10(data_source[("gas", "density")]).min())
        c_max = 10 ** np.floor(np.log10(data_source[("gas", "density")]).max() + 1)
    
        master_clump = Clump(data_source, ("gas", "density"))
        master_clump.add_validator("min_cells", 20)
    
        find_clumps(master_clump, c_min, c_max, 2.0)
        leaf_clumps = master_clump.leaves
    
        prj = yt.ProjectionPlot(ds, "z", ("gamer","Dens"), center="c")
        prj.annotate_clumps(leaf_clumps)
    
        # With or without this if clause, it still fails when MPI = 3.
        if yt.is_root():
            prj.save()
  • Description:
    • FAILED at MPI = 3; it gets stuck somewhere other than class IOHandlerlibyt.
    • SUCCEEDED at MPI = 1 and MPI = 4.
      • The result looks just like the projection plot, because I didn't set a range that actually grabs a clump.
        Fig000000001_Projection_z_Dens

Polishment and Enhancement

Polishment and Enhancement

These are some TODOs from ( #11 ) that we found unnecessary to accomplish.

Support dimensionality < 3

  • It only supports dimensionality = 3.

Make Inline Python Script Changeable

  • The file name is fixed throughout the whole runtime and cannot be altered.

Make yt_add_user_parameter_* Support More Input Types

  • Currently supports only scalars and 3-dim vectors.
  • This API is used when adding frontend-specific XXXDataset attributes in yt.

Set MPI Root Rank

  • We assume root rank is 0.
  • Since in most cases all nodes should have similar performance, setting root rank is unnecessary.
