pnnl / chgl Goto Github PK

View Code? Open in Web Editor NEW

28.0 9.0 8.0 57.88 MB

Chapel HyperGraph Library (CHGL) - HPC-class Hypergraphs in Chapel

Home Page: https://pnnl.github.io/chgl/

License: MIT License

Shell 1.76% Chapel 87.37% Dockerfile 0.05% Python 3.25% Jupyter Notebook 0.25% C++ 7.33%

hypergraph hypergraphs distributed-datastructure partitioned-global-address-space chapel graph

chgl's Introduction

Chapel Hypergraph Library

The Chapel Hypergraph Library (CHGL) is a library for hypergraph computation in the emerging Chapel language. Hypergraphs generalize graphs, where a hypergraph edge can connect any number of vertices. Thus, hypergraphs capture high-order, high-dimensional interactions between multiple entities that are not directly expressible in graphs. CHGL is designed to provide HPC-class computation with high-level abstractions and modern language support for parallel computing on shared memory and distributed memory systems.

License

CHGL is developed under the MIT license. See the LICENSE file in this directory for further details.

chgl's People

Contributors

Stargazers

Watchers

Forkers

mandysack louisjenkinscs yoichiozaki tcarneirop tubbz-alt hikylemorris chqlee ricklentz

chgl's Issues

Use 'walk` to determine duplicates in `collapse`

Currently I iterate through all edges and vertices and add it to a hash map to determine duplicates. This is complicated as I need my own custom hashing mechanism and collision detection, this can be avoided and made embarrassingly parallel if we just check edges that we can walk to with s set to the cardinality of the current vertex or edge. Representative can just be the lowest and first we find.

Compute s-closeness via Dynamic Programming

Given the definition, to compute d(f,g), if we have already encountered d(h,g) then we can compute d(f,h) + d(h,g) to save a substantial amount of time.

Introduce a more efficient `getToplexes` algorithm

Currently, computing toplexes is performed in an O(N^2) way. This can be vastly improved by simply determining if we can s-walk from an edge e to e', where s is |e|. This easily identifies edges which are toplexes as those without any edges to s-walk to will be by-definition toplex edges as no other edges share all of the same vertices as it.

One alternative that @marcinz suggested was to make s=1 and then segment edges into edges that were toplexes and those that are not. This definitely is an optimization as duplicate work can be avoided since we prune off searches for edges that can easily be determined by a toplex edge to be non-toplexes.

Add a 'PropertyMap'

One thing that is required to read in the DNS dataset is a property map as the data represents mappings of domains (hyperedges) to their registered IP addresses, and this is performed by-name. The PropertyMap will be a distributed hash table which maps generic properties defined by the user to vertex or hyperedge identifier (note, not their wrappers but the integer id).

For a property to be a hashable, the user must define the method chgl_hash on it which will return a uint(64). This uint(64) can be used to determine the locale that it will be allocated on it via modulus division. The hash will be defined for standard primitive types like such...

proc string.chgl_hash() : uint(64) { }
proc int(64).chgl_hash() : uint(64) { }
proc uint(64).chgl_hash() : uint(64) { }
// ...

These will make it easy to use for the DNS dataset as well as a vast majority of user-data. The PropertyMap will be privatized such that there will be a hash table allocated on each locale. The hash table will be protected by a very coarse-grained lock, but to speed things up we will use the Aggregator to batch up fine-grained communication and excessive locking. To speed up the locking, we will use CC-Synch algorithm described in "Revisiting the combining synchronization technique" for higher performance.

The property map will maintain a dual relationship for each locale, so if the property "google.com" maps to hyperedge id 123, then there will be a mapping from 123 to "google.com". This will enable usage of the PropertyMap like such...

forall v in graph.getVertices() do
   writeln(graph.getProperty(v));

Where the user can query the property via iterating over the hypergraph. As well it will be possible to construct a graph from this PropertyMap as that is how the graph will obtain the properties to begin with.

Implement a Domain-Specific-Language with a REPL for CHGL

As Chapel currently lacks its own REPL, and since CHGL only needs to keep track of a much smaller subset of operations that a full on Chapel REPL would, I'm thinking that I can create a DSL that the user can invoke to easily process their data. Such a REPL could be build via make and its potential usage can be seen below...

CHGL> graph = load("dataset.txt")
CHGL> vDegDist = vertexDegreeDistribution(graph)
CHGL> eDegDist = edgeDegreeDistribution(graph)
CHGL> graph2 = makeGraph(numVertices(graph), numEdges(graph))
CHGL> generateChungLu(graph2, vDegDist, eDegDist)
CHGL> plot(graph2)

@marcinz In case you wanted to see this for yourself.

Implement Leader-Follower iterators, not just standalone

Currently I have implemented standalone iterators which work fine for a forall, but in the case where we attempt to zip two parallel iterators, a leader-follower is required.

Fix Error when trying to compile test.chpl

Original Error
chpl -o test --module-dir chgl/src/modules/ --module-dir chgl/src/aggregation/ test.chpl
chgl/src/modules//AdjListHyperGraph.chpl:489: In function 'this':
chgl/src/modules//AdjListHyperGraph.chpl:491: error: can't apply '#' to a range with idxType int(64) using a count of type AtomicT(int(64))

*Fix* 
@@ -1493,7 +1496,7 @@ module AdjListHyperGraph {
           var n = getEdge(e).degree;
           assert(n > 0, e, " has no neighbors... n=", n);
           if n == 1 {
-            var v = getEdge(e)[0]; (Original)
+           var v = getEdge(e).incident[0]; (Fix)

Local Neighborhood produces duplicates

When printing out local neighborhood of blacklisted vertices and edges, it will print duplicates due to it not marking/keeping track of vertices and edges that it has already visited. Should handle this by using an associate domain.

Convert normal arrays to custom Vectors as they show order of magnitude improvement

The Vector implementation in Vector.chpl outperforms Chapel Array push_back by 10x and so we should see significant improvement once this conversion has completed.

Push-Back: 8.92253 seconds
Append: 0.91275 seconds

implement converting a degree sequence to a probability distribution (as well as converting back)

a degree sequence is an array holding the degree for each vertex/edge in the graph

a probability distribution is an array with domain {0..maxDegreeValue}. Its values are the probability of a vertex/edge having a degree of that value.

To convert from sequence to distribution is pretty simple.

To convert from distribution to sequence, we will have to sample from the probability distribution (using getRandomElement() perhaps).

Investigate issue where obtaining intersection after sorting results in non-deterministic behavior

For some reason, sorting both neighborList will result in undefined behavior and causes the Connected Components code to fail but without it and it works just fine. Weird.

Implement a parallel distributed CSV reader

Right now only the BinReader is parallel and distributed, but in reality it shouldn't be too difficult to make the CSV reader parallel and distributed as well. I had a decent sketch on how to do this in a way that is rather intuitive and makes sense (to me at least)…

iter readCSV(file : string) : string {
   var chunk : atomic int;
   coforall loc in Locales do on loc {
      coforall tid in 1..#here.maxTaskPar {
         var f = open(file, iomode.r).reader();
         var currentIdx = 0;
         label readChunks while true {
            // Claim a chunk...
            var ix = chunk.fetchAdd(chunkSize);
            // Skip ahead to chunk we claimed...
            var tmp : string;
            for ix - currentIdx do f.readline(tmp);
            // Begin processing our chunk...
            for 1..#chunkSize {
               if f.readline(tmp) then yield tmp;
               else break readChunks;
            }
         }
      }
   }
}

In the above, we create a file handle per task per locale and claim chunks via atomics (network atomics are also extremely quick on Gemini/Aries + uGNI + hugepages*). It also yields lines so that it maintains maximum reusability.

Add a 'FrozenAdjListHyperGraph' that performs non-thread safe operations

We should implement a FrozenAdjListHyperGraph record-wrapper that takes the original AdjListHyperGraph's privatization id and instance and then provides a method for all mutating operations that just halt and it forwards to the rest of them. forwarding has an explicit clause except that allows us to not forward to these mutating methods and halt instead. The FrozenAdjListHyperGraph can contain the non-thread safe operations such as collapse and removeIsolatedComponents so that the user kind of makes an explicit contract that they will not use the original AdjListHyperGraph (nothing is stopping them from doing so)

include per-degree metamorphosis coef in BTER verification test

Implement a cache for metrics

A lot of the times, metrics are reused and can therefore be recycled when used later. It would be nice to have something like, say, memcached which maps keys to values in a way that is relatively efficient. That way it be something like...

const componentsKey = "connected components";
var components : ComponentsMetric;
if metrics.contains(componentsKey) {
   components = metrics.get(componentsKey);
} else {
   components = getComponents();
   metrics.put(componentsKey, components);
}

That can be relatively useful in the future and even in the present, as testing on DNS data commonly requires metrics to be passed around, which is very awkward right now.

do performance testing of BTER

DNS Test should only perform BFS once and count vertices and edges in components

Can perform BFS over just edges, no vertices required.

Chapel Graph Library - Generalizations of HyperGraphs to the Extreme

I think that the new Chapel Graph Library (CGL) should definitely create all graphs in terms of hypergraphs, and should take this to the extreme! I think that CHGL should be optimized as much as we can such that we can create some interesting data structures from it...

Undirected Graph is composed of HyperGraph...
BinaryTree is composed of UndirectedGraph...

All of the underlying logic, aggregation, privatization, performance features, etc., all transfer over!

Add an 'Intersects(other, s)' method for short-circuit intersection

Right now, S-Walking performs a full on intersection even if its an S-Walk of size 1. We can significantly improve performance by just checking for an intersection of s without copying.

Change 'numNeighbors' to 'degree'

I believe degree is more intuitive to data scientists.

var deg = [v in graph.getVertices()] graph.degree(v);

Looks a lot more aesthetically pleasing and more intuitive than...

var deg = [v in graph.getVertices()] graph.numNeighbors(v);

Implement Object-Based MCAS Transactions

Currently, in my study group at Rochester, one of my peers is working on an object-based transaction interface using a software implementation of MCAS (Multiword Compare-And-Swap). It actually be made more intuitive and easy to use once we make use of Chapel's features. Basically it involves wrapping transactional objects in forwarding records, Task-Private Variables, defined with the intent with (var taskPrivateVariable : tpvType), can also handle the issue of thread-local storage.

Add operators for `Wrapper` types.

Need operators so that Wrapper types can be treated more first-class like which will make it significantly easier to use.

AdjListHyperGraph Correctness Tests

'- [x] addInclusion

Make a variadic 'addInclusion` and `+=` overloads

While writing up a hard-coded hyper graph, it is becoming rather tedious. For example, the following hypergraph...

var graph = new AdjListHyperGraph(9, 10);

graph += (0, 0);
graph += (0, 1);
graph += (0, 2);
graph += (1, 0);
graph += (1, 1);
graph += (2, 0);
graph += (2, 2);
graph += (3, 1);
graph += (3, 2);
graph += (4, 2);
graph += (4, 3);
graph += (5, 3);
graph += (5, 4);

graph += (6, 6);
graph += (7, 6);
graph += (7, 7);
graph += (7, 8);
graph += (8, 8);
graph += (9, 7);
graph += (9, 8);

This ideally could be done like such...

var graph = new AdjListHyperGraph(9, 10);

graph += (0, (0, 1, 2));
graph += (1, (0, 1));
graph += (2, (0, 2));
graph += (3, (1, 2));
graph += (4, (2, 3));
graph += (5, (3, 4));

graph += (6, 6);
graph += (7, (6, 7, 8));
graph += (8, 8);
graph += (9, (7, 8));

As well, it should be flipped to have hyperedges first, then vertices. That way graph[e] = X makes sense, and graph.addInclusion(e, v) becomes more intuitive when adding multiple vertices to an edge.

Implement metric wrappers for metrics

I believe that there should be a wrapper for metrics that allows intuitive retrieval of information from them. This can also be used for the cache in #38 as Chapel's generic types are very weird for arrays sometimes.

implement "create new" and "append to existing graph" versions of Erdos-Renyi and Chung-Lu

Currently, these procedures append inclusions to an existing graph. But we should have versions that create a new graph as well.

Add 'On-Statement'

chgl/src/modules/AdjListHyperGraph.chpl

Line 688 in 2768bc1

getPrivatizedInstance().collapse();

Explicit Undirected and Directed Hypergraph Support

Common Classes

class Vertex {
   var edge : Set(Edge);
   // Base Class Methods
}

class Edge {
   // Base Class Methods
}

Directed and Undirected Hypergraphs

Document describing Directed Hypergraphs: Gallo, Giorgio, et al. "Directed hypergraphs and applications." Discrete applied mathematics 42.2-3 (1993): 177-201.

Summary: Need a new hypergraph where hyperedges are composed of sets of tail vertices and head vertices.

class DirectedHyperEdge : Edge {
   var tail : Set(Vertex);
   var head : Set(Vertex);
   // Directed Edge Specific Methods
}

class UndirectedHyperEdge : Edge {
   var vertices : Set(Vertex);
   // Undirected Edge Specific Methods
}

Refactor AdjListHyperGraph to make use of UndirectedHyperEdge, create new one to make use of DirectedHyperEdge

Create a Distributed, Privatized, and Parallel RandomStream

Implement the concept described in issue #10741. Will allow querying random values from a locale-private RandomStream which shares the same seed as all of the other locale-private RandomStream but have different offsets, where offsets get coordinated over network atomic 'fetch-and-add' counter. Offsets can be incremented in a fixed batched amount to allow prefetching of multiple values.

.member deprecated, replace with .contains

Hi Louis,

While trying to compile the test.chpl file there were warnings for the ".member" function being deprecated. I modified the files to replace ".member" with ".contains" on my end and tried to commit them but it seems as though I cannot push them to the master branch.

Fix PDMC_test

This unit test needs to be fixed. Currently the acceptable threshold (+/- 0.0001) for the difference in per-degree metamorphosis coefficients is insufficient. @hmedal May be able to suggest a better unit test for testing vertex and hyperedge PDMC, as I believe there will be some variance between Sinan's output and from what we obtain, but we need a way to automate this testing (I.E non-visual). I'll help Chris out with this a bit.

Implement parallel and distributed-efficient Connected Components

Currently we process things serially due to the fact that we need to keep an associative domain to keep track of two separate types of wrappers. I can remedy this as per @marcinz suggestion, by keeping two different arrays of vertices and edges and processing them based on the type of the vertex or edge. I need two recursive functions, visitVertex and visitEdge that are called in an intertwining manner so that the type isn't an actual issue.

Convert 'neighbors' array to unrolled linked list

'Benefits:

Less space requirement - no need to increase the size of each block by some constant factor (1.5x), we can increase by a constant amount (the size of each unroll block).
No need to allocate huge pages
Might be easier to sort in parallel
Faster insertion (possibly parallel), may even be able to eliminate coarse-grained lock for a finer-grained lock.
Might be easier to distribute in a '1.5D' fashion by allocating blocks on different nodes.

Fix remove seg fault from having more than 2 data files

In file src/modules/Components.chpl

in function:

proc getEdgeComponentMappings(graph, s = 1)

There is recursive function that gets called visit (below) that tries to read in the component. However, there is no check to make sure there is another component to read in. I added a check to remove the seg fault error:

proc visit(e : graph._value.eDescType, id) : int {
        var currId = id;

        while true {
            var eid = components[e.id].read();

Fix:

      proc visit(e : graph._value.eDescType, id) : int {
        var currId = id;

        while true {
            if components[e.id].read() == max(int){
                return currId;
            }
            var eid = components[e.id].read();

Implement toplex search and toplex intersection sizes

Implement this similarly to our C++ code.

Fix Array out of bounds error

Problem:
In the test.chpl file would go to Collapse Vertices, it would be looking for duplicates.

    writeln("Marking and Deleting Vertices...");
            var vertexSetDomain : domain(ArrayWrapper);
            var vertexSet : [vertexSetDomain] int;
            var l$ : sync bool;
            var numUnique : int;
            forall v in _verticesDomain with (+ reduce numUnique, ref vertexSetDomain, ref vertexSet) {
              var tmp = [e in _vertices[v].incident[0..#_vertices[v].degree]] e.id;
              var vertexArr = new ArrayWrapper();
              vertexArr.dom = {0..#_vertices[v].degree};
              **vertexArr.arr = tmp;**

When it would try to set the vertexArr.arr to tmp it would get an error chgl/src/modules/AdjListHyperGraph.chpl:521: error: halt reached - array index out of bounds: (3)
To solve this issue, we added a check to see if the a.arr.size was >= than the b.arr.size to remove the error.

   proc ==(a: ArrayWrapper, b: ArrayWrapper) {
   +    if a.arr.size >= b.arr.size {
   +       return || reduce (b.arr == a.arr) ;
   +    }
        return && reduce (a.arr == b.arr);
   }

For DNS Test, should output to individual files

Can concatenate back into a single file if needed later, but having individual file for each stage and components can be very helpful.

Custom error messages no longer propagate down to first non-inline calls

Custom error messages, generated via compilerError, allowed us to provide the user more helpful error messages on how to use our data structure to get around the ambiguity that arise from the default compiler error messages. These custom error messages used to show the callsite (file name and line number) of the first non-inlined function, usually where it was called by the user. Now compilerError stops at the first parent, despite it being inlined or not, meaning the custom error messages are no longer helpful but harmful...

I need to create an issue on this or adapt to the new changes (which may mean no function that generates errors for me but good-old-fashioned copy-paste)

Redirect collapsed vertices and edges

It appears I do not do this correctly.

compute numerical value for measuring the difference of two distributions

'Use case: comparing the similarity of the vertex/edge degree distributions for an input graph vs an output graph for an algorithm. This will allow us to automate the verification testing for Chung-Lu and BTER. It would also us to report a numerical value in papers/slides/etc.

There are a number of different ways of doing this. One is K-L divergence, which is implemented in Python (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.entropy.html).

@saksoy mentioned a few measures also that we could consider.

As far as I am concerned, we can start by just implementing this as a post-processing python script and then calling that script from a Chapel unit test. I don't think it needs to be in our Chapel library.

After we implement this, it would be good to add it to the C-L and BTER unit tests that compare the COND-MAT dataset to the C-L/BTER output.

Pipeline BTER to generate affinity blocks in parallel W.R.T their calculation

Currently, BTER will calculate an affinity block and then generate it, and then calculate the next, and generate that, and so on.

chgl/src/modules/Generation.chpl

Lines 496 to 542 in 47dd023

 while (idV <= numV && idE <= numE){ 

 var (dV, dE) = (vd[idV], ed[idE]); 

 var (mV, mE) = (vmc[dV - 1], emc[dE - 1]); 

 (nV, nE, rho) = computeAffinityBlocks(dV, dE, mV, mE); 

 var nV_int = nV:int; 

 var nE_int = nE:int; 

 blockID += 1; 

 // Check to ensure that blocks are only applied when it fits 

 // within the range of the number of vertices and edges provided. 

 // This avoids processing a most likely "wrong" value of rho as 

 // mentioned by Sinan. 

 if (((idV + nV_int) <= numV) && ((idE + nE_int) <= numE)) { 

 if nE_int < 0 || nV_int < 0 { 

 writeln("idV = ", idV); 

 writeln("idE = ", idE); 

 writeln("numV = ", numV); 

 writeln("numE = ", numE); 

 writeln("nV = ", nV); 

 writeln("nV : int = ", nV : int); 

 writeln("nE = ", nE); 

 writeln("nE : int = ", nE : int); 

 writeln("rho = ", rho); 

 writeln("dV = ", dV); 

 writeln("dE = ", dE); 

 writeln("mV = ", mV); 

 writeln("mE = ", mE); 

 writeln("blockID = ", blockID); 

 halt("Bad idx"); 

 } 

 const ref verticesDomain = graph.verticesDomain[idV..#nV_int]; 

 const ref edgesDomain = graph.edgesDomain[idE..#nE_int]; 

 expectedDuplicates += round((nV_int * nE_int * log(1/(1-rho))) - (nV_int * nE_int * rho)) : int; 

 // Compute affinity blocks 

 var rng = new owned RandomStream(int, parSafe=true); 

 forall v in verticesDomain { 

 for (e, p) in zip(edgesDomain, rng.iterate(edgesDomain)) { 

 if p > rho then graph.addInclusion(v,e); 

 } 

 } 

 idV += nV_int; 

 idE += nE_int; 

 } else { 

 break; 

 } 

 }

The issue here is that the calculation of the affinity blocks are orthogonal to their generation, and since affinity blocks are by definition disjoint from each other, it would be safe to generate them in parallel. One optimization is that we calculate all affinity blocks in advance and then handle dispatching of them in parallel. This optimization should yield some result but is still restricted by the fact that after generating the affinity block, we have to await termination to begin processing the next affinity block.

The optimization I am thinking of involves a work queue where we have one thread generating the affinity blocks (asynchronously), which will then add meta data about the affinity block to be generated to a work queue. The work queue will be setup such that we have one task per core per locale, and each locale will handle generating the portion of the hypergraph that is allocated on that locale. This has the benefit of adding parallelism to generating these affinity blocks, but also prevents the issue where we have a lot of time spent spawning a task on each core per locale and then waiting for termination over and over. Its from O(NumAffinityBlocks) to O(1).

Add "count of 4-cycles" to BTER unit test

The number of 4-cycles for our output graph should not be too much more or less than the number of 4-cycles in COND-MAT. You could check that:

(1/5)num4CyclesInCondMat < num4CyclesInOutput < 5num4CyclesInCondMat

Unable to push changes to mandysack-patch branch

Hi Louis,

It seems as though I am unable to push changes to mandysack-patch branch. Is it locked from being written to.

NodeData `neighborList` is not thread-safe to access

Technically, it is possible to iterate through the neighbors of a vertex or edge while it is being modified, such as when an inclusion is added... For example, is this really safe?

for neighbor in graph.getNeighbors(graph.toVertex(0)) {
   graph.addInclusion(0, 0);
}

In the above, we add an inclusion to the neighborList of vertex#0 while we are iterating over it. This is likely not thread-safe; one immediate solution is to acquire the lock, make a copy of the current neighborList, release the lock, and then iterate over that. If we do not release the lock before iterating, then the above code would deadlock as its not reentrant. However, its possible to enter a deadlock scenario if we are not careful here. This is a huge motivator for STM.

Revised Leader-Follower interface

In the future, it would be nice to be able to zip a finite stream with an infinite one. Attempt the problem that @bradcray had mentioned on gitter...

"If you want a challenge to work on, invent a revised leader/follower interface that efficiently supports (a) a reasonable error message for forall (i,j) in zip(1..3, 1..6) and (b) the ability to maintain a cursor for a conceptually infinite follower like a random stream or file channel (i.e., one that supports forall (i,j) in zip(1..n, myStream) followed by forall (i,j) in zip(1..n, myStream) and doesn’t return the same n values each time, but items n+1..2*n the second time)."

"The challenge is this: You’re writing a follower iterator with the current interface. Four leader tasks call it, each requesting its corresponding n/4 chunk of items. How/where/when do you update the stream’s global cursor to indicate that n elements have been consumed so that the next time k leader tasks call it requesting n/k chunks, it continues where it left off? The current interface makes this somewhere between hard and impossible because there’s no cooperation between individual calls to the follower iterators.

(Ditto for case (a) where each follower yields what was requested, but nobody ever has the global picture to see that only a subset of the finite items were requested and that a size mismatch error should be generated)."

"Create a counter() iterator that when zipped with 1..n will return the integers 1..n in sorted order (maybe assign them to an n-element array to be sure). And then when zipped a second time returns the integers n+1..2*n in sorted order."

Something like this is relevant to CHGL in that zippered iteration is a fundamental parallel language construct, and any improvements on this area will help not only CHGL but Chapel in general.

Paper on "User-Defined Parallel Zippered Iterators" can be seen here: http://pgas11.rice.edu/papers/ChamberlainEtAl-Chapel-Iterators-PGAS11.pdf

Traversal algorithms should only check intersection sizes after checking if explored

This will ensure that we can quickly avoid re-computing thrown-away results, especially useful for high S-connectivity

Implement parallel and distributed DFS for traversal to be used in Connected Components

This can alleviate the bottleneck and allow fast distributed metrics. My idea is to do parallel DFS per component which should do the trick rather well, especially once PropertyMap becomes distributed.

Vector has 'ambiguous call' issue for 'append' with '_owned' type

Investigate the report this bug: sometimes when using Vector.append the compiler complains that the call is ambiguous, but it is resolved when I make it borrowed or unmanaged.

Sketch for RCUArray incident lists

Currently, the incident lists (formerly referred to as neighborList) is not thread-safe to access while being resized, where resizing occurs during addInclusion or makeDistinct (formerly known as removeDuplicateNeighbors). Now, this can be made thread-safe again by borrowing concepts from RCUArray and once #36 is completed. The way to do this is to not privatize the array but instead just use a single-node snapshot. The lifetime of the snapshot can be managed by Interval-Based Reclamation, but resizing can easily append blocks distributed across multiple nodes to make the 1.5D adjacency list idea work for large vertices and hyperedges.

Implement Interval-Based Memory Reclamation

Interval-Based Reclamation is a new memory reclamation technique that is actually developed and created in my current study group at Rochester. One way to get around the lack of TLS is by using the new "task-private variables", usable with the following intent: with (var taskPrivateVariable : tpvType). Extremely useful, and with forwarding we can wrap the objects we are interested in protecting in them.

	while (idV <= numV && idE <= numE){
	var (dV, dE) = (vd[idV], ed[idE]);
	var (mV, mE) = (vmc[dV - 1], emc[dE - 1]);
	(nV, nE, rho) = computeAffinityBlocks(dV, dE, mV, mE);
	var nV_int = nV:int;
	var nE_int = nE:int;
	blockID += 1;

	// Check to ensure that blocks are only applied when it fits
	// within the range of the number of vertices and edges provided.
	// This avoids processing a most likely "wrong" value of rho as
	// mentioned by Sinan.
	if (((idV + nV_int) <= numV) && ((idE + nE_int) <= numE)) {
	if nE_int < 0 \|\| nV_int < 0 {
	writeln("idV = ", idV);
	writeln("idE = ", idE);
	writeln("numV = ", numV);
	writeln("numE = ", numE);
	writeln("nV = ", nV);
	writeln("nV : int = ", nV : int);
	writeln("nE = ", nE);
	writeln("nE : int = ", nE : int);
	writeln("rho = ", rho);
	writeln("dV = ", dV);
	writeln("dE = ", dE);
	writeln("mV = ", mV);
	writeln("mE = ", mE);
	writeln("blockID = ", blockID);
	halt("Bad idx");
	}
	const ref verticesDomain = graph.verticesDomain[idV..#nV_int];
	const ref edgesDomain = graph.edgesDomain[idE..#nE_int];
	expectedDuplicates += round((nV_int * nE_int * log(1/(1-rho))) - (nV_int * nE_int * rho)) : int;
	// Compute affinity blocks
	var rng = new owned RandomStream(int, parSafe=true);
	forall v in verticesDomain {
	for (e, p) in zip(edgesDomain, rng.iterate(edgesDomain)) {
	if p > rho then graph.addInclusion(v,e);
	}
	}

	idV += nV_int;
	idE += nE_int;
	} else {
	break;
	}
	}