openshmem-org / specification
OpenSHMEM Application Programming Interface
Home Page: http://www.openshmem.org
The semantics of shmem_realloc on a buffer that was allocated using shmem_calloc are unspecified.
Unix realloc has the following semantics:
When extending a region allocated with calloc(3), realloc(3) does not guarantee that the additional memory is also zero-filled.
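To make the quoted Unix behavior concrete, here is a minimal sketch in plain C (local heap only, not OpenSHMEM). The helper name realloc_zeroed is hypothetical; it only shows what a caller has to do today if it wants the extended region zero-filled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of any library): resize a buffer with
 * realloc and zero any newly added bytes, so that a buffer obtained from
 * calloc stays fully zero-initialized after it grows. */
void *realloc_zeroed(void *ptr, size_t old_size, size_t new_size)
{
    assert(new_size > 0);            /* zero-size resize is outside this sketch */

    unsigned char *p = realloc(ptr, new_size);
    if (p == NULL)
        return NULL;                 /* the original block is still valid */

    if (new_size > old_size)
        memset(p + old_size, 0, new_size - old_size);  /* realloc does not do this */

    return p;
}
```

A symmetric-heap analogue would presumably do the same after shmem_realloc on every PE; whether the specification should require that is the open question here.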
Now that shmem_wait has been deprecated, should we rename the section (it appears in the index and the section heading) from "SHMEM_WAIT" to "SHMEM_WAIT_UNTIL"?
Is there a reason why bitwise atomic operations include two (2) signed types (int32 and int64) out of the seven (7) types?
uint
ulong
ulonglong
int32
int64
uint32
uint64
This seems inconsistent. I'm not sure what it means to have a bitwise operation on signed data.
Also, it probably doesn't make sense to add size, despite being unsigned. If others concur, I will submit the pull request to remove int32 and int64.
For reference:
https://github.com/openshmem-org/specification/blob/osh_spec_next/content/atomics_intro.tex
For consideration: Section committee may want to merge the new threading routines into Section 8.1, so that shmem_init_thread appears immediately after shmem_init. If this is done, the preceding "Thread Support" text (which defines threading semantics) could be moved to an earlier section, such as to "Execution Model" or to a separate subsection before API.
This issue spins off the removal aspect of revising the IN, OUT, and INOUT "modes" of function arguments from #147, explicitly targeted for OpenSHMEM 1.5.
I am adding this as an Issue to open it up for discussion.
The non-blocking RMA routines provide operations for updating memory locations on a remote PE. All explicit RMA routines either return a handle to a request object if the input handle is set to NULL, or will merge the request into the request object given to the function as a parameter.
Mergeable explicit request handles provide a way to group related RMA operations into a single request object. This approach enables the OpenSHMEM implementation to better allocate the available resources to different streams of RMA operations. Another advantage of request objects is the possibility to separate communication streams in an environment where a PE is running multiple threads. The introduction of shmem_request_wait will remove the need to call shmem_quiet to complete outstanding RMA operations. Using shmem_request_wait can enable a finer granularity of progressing RMA operations in the runtime. Furthermore, the status of request objects can be tracked using the newly introduced shmem_request_test API function and can be used to improve the overlap between communication and computation.
There are two ways of creating request objects. The first is to pass a NULL handle into an explicit RMA function. The runtime will create the request object and return the handle. The second is to use the newly introduced shmem_request_alloc function. Using the explicit allocation method will provide means to pass hints to the runtime during request creation that can be used to optimize the communication further.
The paper Evaluating OpenSHMEM Explicit Remote Memory Access Operations and Merged Requests discusses an experimental implementation of mergeable requests.
There is a ticket in redmine that is related to this work (see Extension #113).
shmem_TYPE_put_nbe (TYPE *target, const TYPE *source, size_t nelems, int pe, void **request);
shmem_putSIZE_nbe (TYPE *target, const TYPE *source, size_t nelems, int pe, void **request);
shmem_TYPE_get_nbe (TYPE *target, const TYPE *source, size_t nelems, int pe, void **request);
shmem_getSIZE_nbe (TYPE *target, const TYPE *source, size_t nelems, int pe, void **request);
shmem_request_wait (void **request);
shmem_request_test (void **request);
shmem_request_alloc (shmem_request_params_t params, void **request);
shmem_request_free (void **request);
shmem_merge_request (size_t count, void *request, void **requests);
The shmem_align function does not place any requirements on the alignment argument. posix_memalign has the following requirement: "The requested alignment must be a power of 2 at least as large as sizeof(void *)." Should shmem_align have this same requirement?
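For reference, the posix_memalign rule quoted above is cheap to check. The following is a minimal sketch in plain C; the function name alignment_is_valid is hypothetical, and nothing here is current shmem_align behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The quoted minimum, sizeof(void *), is itself expected to be a power of
 * two; this sketch assumes so and checks it at compile time. */
static_assert((sizeof(void *) & (sizeof(void *) - 1)) == 0,
              "sketch assumes pointer size is a power of two");

/* Hypothetical predicate: true when `alignment` is a power of two that is
 * at least as large as sizeof(void *), i.e. the posix_memalign rule. */
bool alignment_is_valid(size_t alignment)
{
    bool power_of_two = alignment != 0 && (alignment & (alignment - 1)) == 0;
    return power_of_two && alignment >= sizeof(void *);
}
```

If shmem_align adopted the same requirement, an implementation could apply a check like this before touching the symmetric heap; what happens on failure (return a null pointer, undefined behavior) would still need to be decided.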
(This chiefly affects Section 8.2, although there are some references to "NULL pointers" in Annex C.)
The specification is inconsistent in its use of "null pointer" and "NULL pointer" (or "pointers"). For example, the C Standard says about malloc that:
The malloc function returns either a null pointer or a pointer to the allocated space.
while shmem_malloc says:
The shmem_malloc routine returns a pointer to the allocated space; otherwise, it returns a NULL pointer.
To be pedantically aligned with the C Standard -- always my preference -- we should be using:
NULL when referring to the macro for a null pointer constant provided in stddef.h and elsewhere
In general, this means the OpenSHMEM Specification should use "null pointer" (or "pointers") except when we specifically mean something else.
Relevant C Standard references:
<stddef.h>: provides the macro NULL
Before we completely move away from Redmine and use only GitHub for proposals, we need to back up the Issues, PRs, and Conversations. This issue is to capture the available solutions and track issues related to this.
This issue was brought up in nspark#5.
@nspark said:
Do we typically favor \emph over \textit?
@BryantLam said:
It depends on whether you/we intend the text to be italicized, or simply emphasized. You can redefine \emph to be another style (e.g., underlining) and can nest \emph.
Intent is not the same as favor though. Search results for \textit and \emph indicate that we use both about equally and both "incorrectly" in a few places if you accept the definition of italicization vs. emphasis. Def macros (VAR, CONST, TYPE, TYPENAME) should probably be \textit because that's the styling that we intended those to be and textual emphasis should be \emph. I would argue that this line should be nested \emph-ed, or should just be a singular \emph since there is no other emphasized text around it.
figures/quiet.pdf has puts from PE0 to PE1 mislabeled with destination "PE 0" instead of correct "PE 1". I don't have OmniGraffle to fix source file figures/quiet.graffle.
This image appears in "Synchronization and Communication Ordering in OpenSHMEM".
A query on handling a corner case while forming active sets: should we validate the logPE_stride value against the PE_size?
I suppose the following triplet values are considered valid for forming an active set.
PE_start = 0; logPE_stride = 1; PE_size = 1
In the above example, the logPE_stride doesn't matter since the active set size is just 1.
PE_start = 0; logPE_stride = 25; PE_size = 1
Is this a valid triplet value?
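One way to frame the question is to write the check down. The sketch below is plain C under one possible reading: the active set is the PEs PE_start + i * 2^logPE_stride for i in [0, PE_size), and the triplet is valid when every member is an existing PE. The function name and the extra npes parameter are hypothetical, and the specification may well intend something stricter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical validity check for an active-set triplet, given the total
 * number of PEs. Under this reading the stride only matters when the set
 * has more than one member. Assumes a 32-bit int. */
bool active_set_is_valid(int PE_start, int logPE_stride, int PE_size, int npes)
{
    assert(npes > 0);                       /* there is always at least one PE */

    if (PE_start < 0 || PE_start >= npes)
        return false;
    if (logPE_stride < 0 || PE_size < 1)
        return false;
    if (PE_size == 1)
        return true;                        /* the stride is never applied */
    if (logPE_stride >= 31)
        return false;                       /* stride alone exceeds any int PE count */

    long long stride = 1LL << logPE_stride;
    long long last = PE_start + (long long)(PE_size - 1) * stride;
    return last < npes;                     /* highest member must exist */
}
```

Under this reading both triplets above are accepted, because a single-member set never uses the stride. A stricter rule that bounds logPE_stride regardless of PE_size would reject the second one.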
Some comments from Huansong.
I do not have major issues to bring up but only have the following minor issues. Therefore, I am going to give my approval to the draft (#118 (comment)).
In Annex B.2: "The program arguments for oshrun are..."
The term "program arguments" here, in the same paragraph, is also used to indicate the parameters for the executable ("... "). I think the former one could be changed to "The arguments for oshrun are..." for clarification.
In Annex C: "Non-symmetric Memory Allocation"
I found this term a little bit confusing as it could also mean the allocation of non-symmetric memory, which is a totally different thing. I think "Non-symmetric Allocation of Symmetric Memory" might be more accurate here.
In Annex F.1: "For OpenSHMEM API library users, said API must be..."
I am not sure what "said API" means here. Might be "deprecated API" or "all API"?
In Annex F.2: "has been deprecated...", "has been replaced...", "were deprecated..."
This might be totally unnecessary but the tense here could be the same. Change all to "has been"?
The following sentence should be updated (or better yet, deleted) now that thread safety (#43) has introduced shmem_init_thread:
shmem_barrier_all.tex:27 "This routine must be used with \acp{PE} started by \FUNC{shmem_init}."
As we frequently discuss, the Fortran API is not used and not particularly conformant Fortran anyway. The recommendation of those in the community with Fortran expertise is to use Fortran's bind(C) features to wrap the C API for OpenSHMEM. The Fortran API should be deprecated.
Both variants are found:
Although all Fortran APIs are marked as deprecated, the related text description is still there. E.g., "When using Fortran, it must be of default kind" in Section 8.8.1 shmem_wait_until - Arguments.
We need to figure out the way to mark a sentence as deprecated, and make the change for the whole spec.
The following constants are introduced in 1.4 and should be added to the constants table:
The following existing constants should also be added to the constants table:
Annex F table has "Implicit finalization" listed as still supported under the current version of OpenSHMEM. This is clearly not intended behavior since implicit finalization was removed in #37 in favor of explicit finalization. Does this fall within the scope of a DocEdit (change "Current" to "1.3")?
Ref: #112
Please apply the attached patch (e.g. using 'git am') to the back matter.
From Khaled: I have a minor comment/question regarding Section 8.3.1. The text for the allocation functions does not explicitly say that these operations are collective operations. It is implicit when it says that they call barrier, but I think it would be better if the text explicitly mentioned that these are collective operations.
Library constants --> \CONST
Library variables --> \VAR
Environment variables?
Only existing actual use is for SHMEM_SYMMETRIC_SIZE, which does not feel correct.
We probably need a new macro.
$ grep 'SHMEM\\_SYMMETRIC\\_SIZE' * -R
content/environment_variables.tex:\texttt{SHMEM\_SYMMETRIC\_SIZE} & non-negative integer & number of bytes to
content/shmem_malloc.tex: adjust the size of the heap using the \CONST{SHMEM\_SYMMETRIC\_SIZE} environment
content/shpalloc.tex: adjust the size of the heap using the \CONST{SHMEM\_SYMMETRIC\_SIZE} environment
Affects deprecated SMA_* environment variables in Annex F.
I'm currently LaTeX-less (at least before the next specification cutoff), and would like to submit a change that adds a return value to shmem_init() to indicate whether the call was successful or not.
The rationale, in case it is not clear, is for libraries that build upon OpenSHMEM to detect a failure to initialize the OpenSHMEM library and fail gracefully. For example, OSHMEM, Cray SHMEM, and Sandia OpenSHMEM all abort or seg-fault if the OpenSHMEM program was not launched with an appropriate launcher (e.g., oshrun, aprun, srun).
Improve formatting for keywords in thread levels and ensure that they appear in the index.
Section committees: update text indicating that routines call shmem_barrier_all to be consistent with the new text used in shmem_malloc: call to a procedure that is semantically equivalent to \FUNC{shmem_barrier_all}.
With #14, we deprecated the SMA_* environment variables and added equivalent ones with the SHMEM_* prefix. Since this change, the precedence of SMA_* vs. SHMEM_* env-vars is unspecified and should be clarified. From a practical perspective, this is particularly important for SHMEM_SYMMETRIC_SIZE vs SMA_SYMMETRIC_SIZE.
(This issue was split off from #98, where it was originally identified.)
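To illustrate what a precedence rule could look like, here is a minimal sketch in plain C. The policy shown (SHMEM_* wins, SMA_* is only a fallback) is purely an assumption for discussion, since the specification does not currently say; the function name pick_symmetric_size and the used_deprecated flag are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy: prefer the SHMEM_* value and fall back to the
 * deprecated SMA_* value only when the SHMEM_* variable is unset.
 * The string arguments are getenv() results, so NULL means "unset".
 * *used_deprecated is set when the deprecated variable was the one chosen,
 * so the caller can emit a warning. */
const char *pick_symmetric_size(const char *shmem_value,
                                const char *sma_value,
                                bool *used_deprecated)
{
    assert(used_deprecated != NULL);

    *used_deprecated = (shmem_value == NULL && sma_value != NULL);
    return shmem_value != NULL ? shmem_value : sma_value;  /* may be NULL */
}
```

A caller would pass getenv("SHMEM_SYMMETRIC_SIZE") and getenv("SMA_SYMMETRIC_SIZE") and apply its own default when the result is a null pointer.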
What, if any, alignment requirements are there in a call to shmem_realloc when the input buffer was allocated using shmem_align? The posix_memalign function has the following caveat; should SHMEM also have it?
memory that is allocated via posix_memalign() can be used as an argument in subsequent calls to realloc(3), reallocf(3), and free(3). (Note however, that the allocation returned by realloc(3) or reallocf(3) is not guaranteed to preserve the original alignment).
Hi all,
consider the following (pseudo) code running on two PEs:
int reduction_arg = 1, dest = 0;
int just_two = 2;
shmem_int_sum_to_all(&dest, &reduction_arg, 1, ...);
if (ownPE == 1)
shmem_int_put(&reduction_arg, &just_two, 1, 0);
else
printf("%d", dest);
Will this always print 2 according to the spec? Or might it print 3 sometimes?
Consider the following scenario: both PEs enter the reduction at nearly the same time. At the start of the reduction processing they send the value of reduction_arg (1) to the respective other PE.
Then, for some reason, PE 0 is delayed. Meanwhile, PE 1 receives the value of PE 0, adds it to its own reduction_arg, stores it to dest and thus can complete the reduction and leave. Afterwards it puts 2 in the reduction_arg of PE 0. This seems to constitute a race, because if PE 0 now resumes its execution, it finds a value of 2 in its own reduction_arg, which it then uses to calculate the result.
Storing the original values would help, but the worker array is too small for this.
Is there something I have missed in the spec?
Thank you for any clarification
Olaf Krzikalla
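The interleaving described in the question above can be written out as a toy, sequential model in plain C. This is not OpenSHMEM code and says nothing about how any implementation actually performs the reduction; the function name pe0_sum_result is hypothetical, and the only thing modeled is whether PE 1's put lands before PE 0 reads its own reduction_arg.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two-PE scenario from PE 0's point of view.
 * PE 1 has already sent its contribution (1) and left the reduction;
 * the flag selects whether its later put overwrites PE 0's reduction_arg
 * before PE 0 uses it. Returns what PE 0 would store into dest. */
int pe0_sum_result(bool put_lands_before_pe0_reads)
{
    int pe0_reduction_arg = 1;       /* PE 0's source value */
    const int from_pe1 = 1;          /* value PE 1 sent during the reduction */
    const int just_two = 2;          /* value PE 1 later puts to PE 0 */

    if (put_lands_before_pe0_reads)
        pe0_reduction_arg = just_two;        /* the racing shmem_int_put */

    int result = pe0_reduction_arg + from_pe1;
    assert(result == 2 || result == 3);      /* the two outcomes in question */
    return result;
}
```

In this model the result is 2 without the early put and 3 with it. One conservative way to rule out the second case, assuming the specification does not already forbid it, would be a barrier on all PEs between the reduction and the put.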
For CMake users out there, it would be great to have a FindOpenSHMEM.cmake package locator to help find platform/vendor-specific paths.
We can upstream it to the CMake repository once it's solidified. Note that recent CMake (~3.6) package files are transitioning to using imported targets which were introduced in CMake ~3.0, so that may end up being the minimum CMake version.
The "Environment Variables" section in OpenSHMEM 1.3 says:
The OpenSHMEM specification provides a set of environment variables that allows users to configure the OpenSHMEM implementation, and receive information about the implementation. The implementations of the specification are free to define additional variables. [...]
Currently, the specification is not explicit whether SHMEM_SYMMETRIC_SIZE is an input (configuration variable), an output (an informational variable), or both. While it is consistently handled as an input, it is inconsistently handled as an output across implementations, especially if an additional, related environment variable is provided by an implementation.
For example:
SMA_SYMMETRIC_SIZE in shmem_init if its (non-standard) alternate env-var SHMEM_SYMMETRIC_HEAP_SIZE is set
XT_SYMMETRIC_HEAP_SIZE
Points for potential clarification:
SHMEM_* environment variables?
shmem_init?
SHMEM_SYMMETRIC_SIZE be set during shmem_init to whatever the heap size actually is, assuming the implementation uses a fixed-size symmetric heap?
(Moved precedence of SMA_* vs. SHMEM_* env-vars to #111.)
This repository uses osh_spec_next to track what other projects refer to as master. We should move to master as the devel branch for OpenSHMEM 1.5.
The behavior of shmem_malloc with zero size argument is currently undefined.
Update the shmem_test example added by #32 to add a main function and make the example complete.
With the deprecation of Fortran (and start_pes, etc), there are a number of examples that are demonstrating deprecated usage.
Deprecated examples should be removed and/or substituted with C11 examples.
Other members of the OpenSHMEM community and I are in favor of adding a glossary to the 1.4 spec. Assuming that one is added, could I get some feedback as to what we'd like to see in said glossary?
Changes to notes from this pull request have been overwritten in a later commit.
Document edits to the \openshmem LaTeX macro have broken the rendering of PDF bookmarks. Text "OpenSHMEM" is now blank. (E.g., "History of" instead of "History of OpenSHMEM")
The compatibility note in OpenSHMEM 1.0 for header files mpp/shmem.h and mpp/shmem.fh is not present in 1.1 and newer. I don't mind that it's gone, but this change is not documented in the changelog for 1.1. Are implementations expected to maintain this backwards compatibility?
@Min-Si observed that the IN and OUT argument "modes" are inconsistently used and submitted a set of changes (nspark#3) to the RMA/AMO Section Committee working draft. While this primarily affects the RMA and AMO sections, it also affects others; e.g., collectives, synchronization. I think this inconsistency is pervasive (and confusing) enough to need consideration by the whole committee.
In general, I think the whole IN/OUT distinction is superfluous (at best) and confusing (at worst). C already has a way of expressing the contract of an API that a function shall not change the object pointed to by a pointer-type argument; it's called const. I would love to see them removed entirely from the specification, but I think that will need to wait for 1.5.
Assuming we live with them for 1.4, I think there are three models for these sorts of "modes", as could be (or are) used by OpenSHMEM.
Specifically, a non-const pointer argument to a function call has a mode of:
When the object pointed to by the argument... | Model 1 | Model 2 | Model 3 | Example |
---|---|---|---|---|
...will only be modified if the target PE is the calling PE | IN | IN/OUT | OUT | dest of shmem_put |
...is unconditionally modified by the call | OUT | OUT | OUT | dest of shmem_get |
A const-modified pointer argument to a function call always has a mode of IN.
On review, I think that the OpenSHMEM RMA and AMO routines are pretty evenly split between Model 1 and Model 3, which only complicates deciding which one should be favored for the sake of consistency. The only reference to IN/OUT that I see in the specification is in shpclmove.
In addition, there are strange exceptions to these models. For example, the dest argument of shmem_broadcast will only be modified when the calling PE is not the root.
This won't be fixed in time for the RCM draft, but I think we should fix before ratification.
We've merged the overhaul of the point-to-point API #32, which deprecated shmem_TYPE_wait routines, specifically, shmem_wait, shmem_short_wait, shmem_int_wait, shmem_long_wait, and shmem_longlong_wait. And #32 also expanded the type support for the point-to-point routines, shmem_TYPE_wait_until and added shmem_TYPE_test.
The documentation for shmem_wait states that the deprecated shmem_TYPE_wait routines support the same (newly expanded) point-to-point types. So did we expand support for new deprecated routines? Specifically, must an implementation of OpenSHMEM 1.4 support the following?
shmem_ushort_wait
shmem_uint_wait
shmem_ulong_wait
shmem_ulonglong_wait
shmem_int32_wait
shmem_int64_wait
shmem_uint32_wait
shmem_uint64_wait
shmem_size_wait
shmem_ptrdiff_wait
pp.20 l.20: "as if the calls executed in some order even if their execution is interleaved."
This block of text may be read as condoning that overlapping shmem_put calls to the same target memory region would be "atomic" in the sense that you'd read the exact result of the first put for the entire target buffer, OR the second put for the entire buffer. I do not think this is the intended semantic; if multiple puts target the same region without barriers, the output buffer is undefined.
The spec is not clear enough about what should be contained in shmem.fh. At present, shmem.fh varies across implementations, falling roughly into two camps:
The requirements on shmem.fh should be clarified.
OSSS in the text is correct. Open Source Software Solutions, Inc.
It is incorrect in the expansion in the cover page.