Comments (8)
terrific! It seems to be working well
Thank you!
from parallel-hashmap.
Hi @gf777 ,
Thanks for the kind words, and also for using phmap.
You have the right idea, and indeed for your purpose this would work very well. The two tables to merge would have to have the same N template parameter (which is 4 by default), so that the two maps to merge have the same number of submaps, and the values are inserted into the submaps with the same index.
So say you are using N=4 and therefore you have 16 submaps, you could create 16 threads and have each thread merge the two matching submaps (with the same index). To access the submaps, you can use the get_inner function:
auto& inner = map1.get_inner(0); // to retrieve the submap at index 0
auto& submap1 = inner.set_; // can be a set or a map, depending on the type of map1
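To make the one-thread-per-submap idea concrete, here is a minimal sketch. It does not use phmap at all: ShardedMap, shardOf, add, get and mergeInto are hypothetical names, and std::unordered_map stands in for the real submaps. It only assumes what is described above, namely that both maps have the same N (so 2^N submaps) and that a given key always lands in the submap with the same index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a parallel hash map: 2^N independent submaps,
// with each key routed to a submap by its hash.
template <std::size_t N>
struct ShardedMap {
    static constexpr std::size_t kShards = std::size_t{1} << N;
    std::array<std::unordered_map<std::uint64_t, std::uint64_t>, kShards> shards;

    static std::size_t shardOf(std::uint64_t key) {
        return std::hash<std::uint64_t>{}(key) & (kShards - 1);
    }
    void add(std::uint64_t key, std::uint64_t count) {
        const std::size_t i = shardOf(key);
        assert(i < kShards);
        shards[i][key] += count;
    }
    std::uint64_t get(std::uint64_t key) const {
        const auto& shard = shards[shardOf(key)];
        auto it = shard.find(key);
        return it == shard.end() ? 0 : it->second;
    }
};

// Merge src into dst using one thread per submap index. Thread i only
// reads src.shards[i] and only writes dst.shards[i], so no locking is needed.
template <std::size_t N>
void mergeInto(ShardedMap<N>& dst, const ShardedMap<N>& src) {
    std::vector<std::thread> workers;
    workers.reserve(ShardedMap<N>::kShards);
    for (std::size_t i = 0; i < ShardedMap<N>::kShards; ++i) {
        workers.emplace_back([&dst, &src, i] {
            for (const auto& kv : src.shards[i]) {
                dst.shards[i][kv.first] += kv.second;  // insert or accumulate
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

With N=4 this spawns 16 threads, one per pair of matching submaps. The merge is only safe because both maps route a key to the same index; two maps with different N would need a different approach.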
Thanks for the prompt feedback @greg7mdp
Dumb question then, how do I achieve this:
auto& inner = map1->get_inner(subMapIndex); // to retrieve the submap at given index
auto& submap1 = inner.set_; // can be a set or a map, depending on the type of map1
inner = map2->get_inner(subMapIndex);
auto& submap2 = inner.set_;
for (auto pair : submap1) { // for each element in map1, find it in map2 and increase its value
DBGkmer &dbgkmerMap = submap2[pair.first]; // insert or find this kmer in the hash table
Error:
src//kreeq.cpp:944:38: error: type 'EmbeddedSet' (aka 'raw_hash_set<phmap::priv::FlatHashMapPolicy<unsigned long long, DBGkmer>, phmap::Hash<uint64_t>, phmap::EqualTo, std::allocator<std::pair<const unsigned long long, DBGkmer>>>') does not provide a subscript operator
I tried a number of things with .insert and .find but to no avail..
Thank you for the help!
auto& inner = map1->get_inner(subMapIndex); // to retrieve the submap at given index
auto& submap1 = inner.set_; // can be a set or a map, depending on the type of map1
auto& inner2 = map2->get_inner(subMapIndex);
auto& submap2 = inner2.set_;
for (const auto& pair : submap1) { // for each element in map1, find it in map2 and increase its value
    auto got = submap2.find(pair.first); // look this kmer up in the hash table
    if (got == submap2.end()) {
        submap2.insert(pair); // not present yet: copy the entry over
    } else {
        DBGkmer& dbgkmerMap = got->second; // already present: update the existing value
        // ...
    }
}
Apparently this works (verbose though)
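A slightly less verbose variant is sketched below. It assumes the table's insert returns the usual (iterator, bool) pair, as standard containers do, so one call both inserts and reports whether the key was already there. The sketch uses std::unordered_map as a stand-in for the inner table. Counter, Table and mergeTable are hypothetical names, and Counter merely stands in for DBGkmer.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical value type standing in for DBGkmer.
struct Counter {
    std::uint64_t count = 0;
};

using Table = std::unordered_map<std::uint64_t, Counter>;

// Copy every entry of src into dst; when the key already exists in dst,
// add the counts instead of overwriting.
void mergeTable(Table& dst, const Table& src) {
    assert(&dst != &src);  // inserting into a table while iterating it is not safe
    for (const auto& entry : src) {
        auto result = dst.insert(entry);  // result.second is false if the key existed
        if (!result.second) {
            result.first->second.count += entry.second.count;
        }
    }
}
```

This does a single lookup per element instead of a find followed by an insert.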
One more question: if I now want to increase the number of maps in the template, say when I declare it here:
std::vector<phmap::parallel_flat_hash_map<uint64_t, VALUE>*> maps; // all hash maps where VALUES are stored
Is there an easy way to declare it?
Hum, did you find the answer to your question? I was looking in Kreeq and I see that you figured out how to declare parallelMap with 256 submaps (using N=8), and also your code for merging the submaps looks good.
Let me know if you have any other questions, I'm happy to help.
This seems to work like a charm, thanks! I would go as far as saying that it should be a core function XD
While I have your attention, could I pick your brain on how best to estimate the map's memory footprint in constant time? Currently I have this from another project:
uint64_t mapSize(parallelMap& m) {
return (m.size() * (sizeof(DBGkmer) + sizeof(void*)) + // data list
m.bucket_count() * (sizeof(void*) + sizeof(uint64_t))) // bucket index
* 1.3; // estimated allocation overheads
}
I noticed you have some calculations in the readme as well. Thoughts?
Correct size is:
uint64_t mapSize(parallelMap& m) {
return m.capacity() * (sizeof(parallelMap::value_type) + 1) + sizeof(parallelMap);
}
Map is in essence an array of values + an array of bytes (hence the + 1). I think the above is pretty accurate.
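As a worked example of that formula, here is a small sketch with the arithmetic pulled out into a plain function, so it can be checked without the library. flatTableBytes and its parameter names are hypothetical. The numbers in the comment are made up for illustration, and allocator overhead is not counted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Estimated footprint in bytes of a flat table: one value slot plus one
// control byte per unit of capacity, plus the size of the map object itself.
constexpr std::uint64_t flatTableBytes(std::uint64_t capacity,
                                       std::size_t slotSize,
                                       std::size_t objectSize) {
    return capacity * (slotSize + 1) + objectSize;
}

// Example: capacity 1000, 16-byte slots, 48-byte map object:
// 1000 * (16 + 1) + 48 = 17048 bytes.
static_assert(flatTableBytes(1000, 16, 48) == 17048, "worked example");
```

For a real map you would pass m.capacity(), sizeof(parallelMap::value_type) and sizeof(parallelMap), as in the function above.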