Comments (11)
Have you tried:
double maxVal = eval(max(a,max(b,SomeFunction(a,b))));
?
I must admit I have never tried this, but as far as I can tell it should work.
from aboria.
Thanks! That works!
However, it seems to be running in serial and extremely slowly...
Do you think I should give up on it and switch to the lower level API for this part of the code or is there something I can do to make it run faster?
from aboria.
You are right: the outer max() isn't being parallelised, which is an oversight on my part. I should be able to fix that up.
By "extremely slowly", do you mean slowly compared to running in parallel, or very slow even for serial code?
from aboria.
OK, there might be other problems with this then, I'll look into it. In the meantime, you could swap to the lower level interface. This would just involve a double loop to evaluate all the particle pairs. You could make this parallel with OpenMP.
    double my_max = 0;
    // "parallel for" starts the thread team; a bare "omp for" outside a
    // parallel region would run serially
    #pragma omp parallel for reduction(max:my_max)
    for (size_t i = 0; i < particles.size(); ++i) {
        for (size_t j = 0; j < particles.size(); ++j) {
            const double val = function(particles[i], particles[j]);
            if (val > my_max) {
                my_max = val;
            }
        }
    }
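If the compiler's OpenMP support is too old for this, a manual reduction is one possible fallback. The sketch below is an assumption-laden illustration rather than Aboria code: `pair_value` and `max_over_pairs` are hypothetical names, and a plain `std::vector<double>` stands in for the particle container. It avoids `reduction(max:...)` (which needs OpenMP 3.1 in C/C++) and uses a signed loop index (older OpenMP versions do not accept unsigned ones). Each thread keeps a private maximum and merges it once inside a critical section.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pair function, standing in for function(particles[i], particles[j]).
double pair_value(double a, double b) { return a * b; }

// Maximum of pair_value over all ordered pairs, without reduction(max:...).
double max_over_pairs(const std::vector<double>& particles) {
    double my_max = 0;  // same starting value as the loop above
    const long n = static_cast<long>(particles.size());
    #pragma omp parallel
    {
        double local_max = 0;  // private to each thread
        #pragma omp for
        for (long i = 0; i < n; ++i) {
            for (long j = 0; j < n; ++j) {
                const double val = pair_value(particles[i], particles[j]);
                if (val > local_max) {
                    local_max = val;
                }
            }
        }
        // Merge the per-thread results one thread at a time.
        #pragma omp critical
        {
            if (local_max > my_max) {
                my_max = local_max;
            }
        }
    }
    return my_max;
}
```

Without an OpenMP flag the pragmas are ignored and the function runs serially with the same result. As in the loop above, starting from 0 assumes the pair values of interest are non-negative.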
from aboria.
Thanks! That works. However, the OpenMP loop control predicate does not seem to work in my case, and the code would not compile. This is not a huge deal here, since this loop only adds about 1% to the total run time, but it would of course be nice to be able to run such loops in parallel in case they need to do more work.
For your reference, I was trying to calculate the viscous diffusion constraint on the time step for SPH according to Monaghan. Below is the code I ended up with.
    for (auto n = container.get_query().get_subtree(); n != false; ++n) {
        for (auto i = container.get_query().get_bucket_particles(*n);
             i != false; ++i) {
            // i is the first particle, the loop for which is here
            auto& Xi = get<position>(*i);
            auto& Vi = get<cVelocity>(*i);
            auto& RHOi = get<cDensity>(*i);
            for (auto j = i + 1; j != false; ++j) {
                auto& Xj = get<position>(*j);
                auto& Vj = get<cVelocity>(*j);
                auto dx = (Xj - Xi);
                auto Vij = (Vj - Vi);
                double ldxl = dx.squaredNorm();
                if (ldxl < 2 * h) {
                    thisMaxVisc = h * (Vj - Vi).dot(dx) / (ldxl * ldxl + epsilon);
                    thisMaxVisc > maxVisc ? maxVisc = thisMaxVisc : 0;
                }
            }
            thisMINcv = h / (ss0 * std::pow(RHOi / rho0, eta) + maxVisc);
            thisMINcv < MINcv ? MINcv = thisMINcv : 0;
        }
    }
Any optimization suggestions or comments are welcomed!
Also, I noticed something else that you might find interesting or want to look into. An expression like eval(max1(a,max2(b,SomeFunction(a,b)))) would not work if max1 was an Accumulate object and max2 was an AccumulateWithinDistance object.
from aboria.
I notice that you are only looping through particles within each bucket, so you are not taking into account interactions between particles that are in different buckets.
You could write something like this to loop through all neighbour pairs within a given radius:
    for (auto i: particles) {
        for (auto j = euclidean_search(particles.get_query(), get<position>(i), radius);
             j != false; ++j) {
            // do something with i and j
        }
    }
Note that OpenMP will only be able to parallelise the outer loop if it is an index loop, so you can change to:
    for (int ii = 0; ii < particles.size(); ++ii) {
        auto& i = particles[ii];
        for (auto j = euclidean_search(particles.get_query(), get<position>(i), radius);
             j != false; ++j) {
            // do something with i and j
        }
    }
Note that if you want to do something on a per-bucket basis, and you don't want to double evaluate pairs of particles, and you are using the cell list data structure, then you can use the fast cell list neighbour search described here: https://martinjrobins.github.io/Aboria/aboria/neighbourhood_searching.html#aboria.neighbourhood_searching.fast_cell_list_neighbour_search
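To illustrate why looping only within each bucket misses interactions, here is a minimal sketch in plain C++. It is not Aboria's implementation: it is one-dimensional, all names (`Cells`, `build_cells`, `count_pairs_*`) are hypothetical, and it assumes the cell width is at least the search radius so that only adjacent cells need to be checked.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Cell index -> coordinates of the points that fall in that cell ("bucket").
using Cells = std::map<long, std::vector<double>>;

// Sort 1D points into cells of width cell_size.
Cells build_cells(const std::vector<double>& points, double cell_size) {
    Cells cells;
    for (double x : points) {
        cells[static_cast<long>(std::floor(x / cell_size))].push_back(x);
    }
    return cells;
}

// Reference answer: test every unordered pair.
int count_pairs_brute(const std::vector<double>& points, double radius) {
    int count = 0;
    for (std::size_t i = 0; i < points.size(); ++i)
        for (std::size_t j = i + 1; j < points.size(); ++j)
            if (std::abs(points[j] - points[i]) < radius) ++count;
    return count;
}

// Only pairs inside the same cell: misses pairs that straddle a cell boundary.
int count_pairs_same_cell(const Cells& cells, double radius) {
    int count = 0;
    for (const auto& cell : cells) {
        const std::vector<double>& pts = cell.second;
        for (std::size_t i = 0; i < pts.size(); ++i)
            for (std::size_t j = i + 1; j < pts.size(); ++j)
                if (std::abs(pts[j] - pts[i]) < radius) ++count;
    }
    return count;
}

// Same-cell pairs plus pairs with the next cell along the axis, so every
// pair is visited exactly once (valid when cell_size >= radius).
int count_pairs_with_neighbours(const Cells& cells, double radius) {
    int count = count_pairs_same_cell(cells, radius);
    for (const auto& cell : cells) {
        const auto next = cells.find(cell.first + 1);
        if (next == cells.end()) continue;
        for (double a : cell.second)
            for (double b : next->second)
                if (std::abs(b - a) < radius) ++count;
    }
    return count;
}
```

For the points 0.1, 0.9, 1.1 and 2.5 with cell width 1.0 and radius 0.5, the only close pair is (0.9, 1.1), and its two points sit in different cells. The same-cell count therefore finds nothing, while the version that also checks the adjacent cell agrees with the brute-force count.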
from aboria.
Thank you for spotting the mistake! It works well now.
I am not sure I understand the concept of a "bucket". Is it the same as a cell in a cell list data structure, a list of all particles that are within some radius of each other, or something else? It is honestly the most confusing bit in the documentation; the rest is pretty well written, I must say.
I tried to use the fast cell-list neighbour search you recommended, but I seem to get faster results with the second code you recommended in your last post, with OpenMP.
from aboria.
Yes, "bucket" is the same as "cell". A cell list is sometimes referred to as a "bucket-search" algorithm, and I started using the bucket terminology, then swapped to "cells" once I realised this was more prevalent. I need to go through and be consistent in the terminology!
I'd say go for the second code with OpenMP then. It takes extra operations since you are doing each pair twice, but you can make it parallel, which is good. It's also a lot easier to read :)
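The trade-off can be sketched in plain C++, again without Aboria. Everything here is a hypothetical stand-in: `min_over_particles` is an invented name, positions are one-dimensional, the neighbour search is a brute-force distance test, and 1 / (1 + max) is only a placeholder for a per-particle quantity built from a maximum over neighbours. Because each iteration of the outer loop writes only its own local variable, visiting every pair from both sides lets the outer index loop run in parallel without any locking. The `reduction(min:...)` clause assumes OpenMP 3.1 or later.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// For each particle i, take the maximum distance to any neighbour within
// radius, then return the minimum over i of 1 / (1 + that maximum).
double min_over_particles(const std::vector<double>& x, double radius) {
    double result = std::numeric_limits<double>::max();
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for reduction(min:result)
    for (long i = 0; i < n; ++i) {
        double local_max = 0;  // reset for every particle i
        for (long j = 0; j < n; ++j) {
            if (j == i) continue;
            const double d = std::abs(x[j] - x[i]);
            // Each pair (i, j) is seen twice, once from each side.
            if (d < radius && d > local_max) {
                local_max = d;
            }
        }
        const double candidate = 1.0 / (1.0 + local_max);
        if (candidate < result) {
            result = candidate;
        }
    }
    return result;
}
```

Visiting each pair only once would halve the distance tests, but the inner loop would then have to update both particles of a pair, which needs atomics or a different decomposition to stay thread-safe.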
from aboria.
I see now!
I look forward to trying out new features in Aboria, especially running it on GPU! :)
from aboria.
The code eval(max1(a,max2(b,SomeFunction(a,b)))) with max2 being an AccumulateWithinDistance object works now, and is run in parallel if OpenMP is used. Fixed in 8080b04.
from aboria.