orcaman / concurrent-map
A thread-safe concurrent map for Go
License: MIT License
concurrent_map_bench_test.go
function BenchmarkMultiInsertDifferentSyncMap exits with status 1
jeff@jeff-ryzen:~/go/go_concurrent-map/concurrent-map$ go version
go version go1.9 linux/amd64
$ go test -v -run="none" -bench="BenchmarkMultiInsert"
--- FAIL: BenchmarkMultiInsertDifferentSyncMap
goos: linux
goarch: amd64
BenchmarkMultiInsertDifferent_1_Shard-16 300000 11303 ns/op
BenchmarkMultiInsertDifferent_16_Shard-16 2000000 857 ns/op
BenchmarkMultiInsertDifferent_32_Shard-16 2000000 904 ns/op
BenchmarkMultiInsertDifferent_256_Shard-16 1000000 1556 ns/op
BenchmarkMultiInsertSame-16 500000 7450 ns/op
BenchmarkMultiInsertSameSyncMap-16 500000 9161 ns/op
FAIL
exit status 1
FAIL _/home/jeff/go/go_concurrent-map/concurrent-map 20.217s
Enhancement:
If concurrent-map is supposed to behave the same as the plain map, then the allowed key types should be the same and the type map[interface{}]interface{} should be supported.
In fnv32(), len(key) is evaluated on every loop iteration.
It might be slightly faster if we move it out of the loop:
func fnv32(key string) uint32 {
	hash := uint32(2166136261)
	const prime32 = uint32(16777619)
	keysLen := len(key)
	for i := 0; i < keysLen; i++ {
		hash *= prime32
		hash ^= uint32(key[i])
	}
	return hash
}
Most go projects add tags at various points in the development so that consumers of libraries can guarantee that they are getting a specific version that they know works with their code.
This project looks interesting and I was going to use it, however the fact that every time I build I am pulling from master and just hoping that nothing has changed that makes it incompatible with my application is a deal-breaker for me.
Please consider adding tags for stable versions of the library.
Go 1.18 has been released.
Will type parameters be supported?
Hi, folks.
I've decided to use your CHM implementation, but I have a specific use case. I use
the map for counting, specifically the frequency of words. You guarantee synchronization of each individual
operation. How can I reach consistency when I
Of course I can use my own global lock, but that would totally destroy the idea of map concurrency.
Should I take ConcurrentShard and use its internal RWLock?
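One way to keep a counter consistent without a global lock is to perform the whole read-modify-write while holding only the lock of the shard that owns the key. The following is a minimal, self-contained sketch of that idea, not the library's API: `counterMap`, `newCounterMap`, `Incr` and `countWords` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
	"sync"
)

const shardCount = 32

// counterShard is one lock-protected slice of the key space.
type counterShard struct {
	sync.Mutex
	counts map[string]int
}

// counterMap is a hypothetical sharded word counter (not part of concurrent-map).
type counterMap []*counterShard

func newCounterMap() counterMap {
	m := make(counterMap, shardCount)
	for i := range m {
		m[i] = &counterShard{counts: make(map[string]int)}
	}
	return m
}

// shard picks the shard for key using FNV-1 from the standard library.
func (m counterMap) shard(key string) *counterShard {
	h := fnv.New32()
	h.Write([]byte(key))
	return m[h.Sum32()%shardCount]
}

// Incr adds delta to the count for key while holding only that key's shard lock,
// so the read and the write cannot be interleaved with another Incr on the same key.
func (m counterMap) Incr(key string, delta int) int {
	s := m.shard(key)
	s.Lock()
	defer s.Unlock()
	s.counts[key] += delta
	return s.counts[key]
}

// Get returns the current count for key.
func (m counterMap) Get(key string) int {
	s := m.shard(key)
	s.Lock()
	defer s.Unlock()
	return s.counts[key]
}

// countWords counts the words of each line in its own goroutine.
func countWords(lines []string) counterMap {
	m := newCounterMap()
	var wg sync.WaitGroup
	for _, line := range lines {
		wg.Add(1)
		go func(line string) {
			defer wg.Done()
			for _, w := range strings.Fields(line) {
				m.Incr(w, 1)
			}
		}(line)
	}
	wg.Wait()
	return m
}

func main() {
	m := countWords([]string{"a b a", "b a c"})
	fmt.Println(m.Get("a"), m.Get("b"), m.Get("c")) // 3 2 1
}
```

Goroutines counting different words mostly touch different shards, so they rarely wait on each other.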
Hello, community maintainers:
There has not been a release for a long time. Could you publish a new version?
I have personally had a requirement to do multiple consecutive operations using the same key with the requirement that nothing can change from start to end. To enable this, I propose explicitly named versions of the Get, Set and Has functions which do not take out locks on the shard. Instead, it is up to the user to use the GetShard method and lock/unlock it correctly.
The three functions would look like
func (m ConcurrentMap) UnlockedSet(key string, value interface{})
func (m ConcurrentMap) UnlockedGet(key string) (interface{}, bool)
func (m ConcurrentMap) UnlockedHas(key string) bool
I'm not sure about the naming. Maybe UnsafeGet, UnsafeSet and UnsafeHas would be better?
Here's an example of how I think this could be useful
shard := conMap.GetShard(key)
shard.RLock()
pointerToMyStruct, ok := conMap.UnlockedGet(key)
// Some other thread might be trying to write to this same key but, we still have a lock!
if ok {
pointerToMyStruct.ReadSomeCoolInfo()
}
shard.RUnlock()
// Now the write on the other thread will go-ahead
In my case, I'm developing a cache where entries expire. I want to ensure that if the value has expired when the Get call is made, it is still expired when the ReadSomeCoolInfo call is made. If another thread were allowed to refresh the value between the two calls, it would cause me to return the wrong information.
The alternative to these functions would be to expose items on the shards to other packages and let the developer dig in and do it all manually.
for iter := range cMap.Iter() {
	cMap.Remove(iter.Key)
}
Removing items inside the loop like this will deadlock.
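A likely cause, assuming Iter holds a shard's read lock while it feeds a channel, is that Remove then waits for the write lock on that same shard. A common way around it is to copy the keys first and remove them only after the read lock has been released. The sketch below uses a hypothetical `lockedMap` that stands in for a single shard; it is not the library's code.

```go
package main

import (
	"fmt"
	"sync"
)

// lockedMap stands in for a single shard: one map guarded by one RWMutex.
type lockedMap struct {
	mu    sync.RWMutex
	items map[string]int
}

// Keys copies the keys under the read lock and releases the lock before returning.
func (m *lockedMap) Keys() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	keys := make([]string, 0, len(m.items))
	for k := range m.items {
		keys = append(keys, k)
	}
	return keys
}

// Remove deletes key under the write lock.
func (m *lockedMap) Remove(key string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.items, key)
}

// Count returns the number of stored entries.
func (m *lockedMap) Count() int {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return len(m.items)
}

// clearAll removes every key that existed when the snapshot was taken.
// No lock is held between Keys and Remove, so the two cannot block each other.
func clearAll(m *lockedMap) {
	for _, k := range m.Keys() {
		m.Remove(k)
	}
}

func main() {
	m := &lockedMap{items: map[string]int{"a": 1, "b": 2, "c": 3}}
	clearAll(m)
	fmt.Println(m.Count()) // 0
}
```

Keys inserted by other goroutines after the snapshot are not removed, which is the trade-off of this approach.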
Can't help asking this: why is the shard count fixed at 32 rather than configurable? :)
Hi, this is my code.
package main
import "net/http"
import "fmt"
import "github.com/streamrail/concurrent-map"
import (
"crypto/rand"
"encoding/hex"
"log"
_ "net/http/pprof"
"runtime"
"strconv"
)
var myMap = cmap.New()
func main() {
runtime.GOMAXPROCS(runtime.NumCPU())
log.Println("here start test")
http.HandleFunc("/", myHandler)
http.HandleFunc("/delete", deHandler)
log.Fatal(http.ListenAndServe(":9090", nil))
}
func myHandler(w http.ResponseWriter, r *http.Request) {
str := "aaaaaaaaaaaaaaaaaaaaaaaa"
for i := 0; i < 10000000; i++ {
key := str + strconv.Itoa(i)
myMap.Set(key, "8ff98326-2187-4de2-924e-af5098921aba")
}
// myMap.Set(randString(), "1000")
fmt.Println(myMap.Count())
w.Write([]byte(strconv.Itoa(myMap.Count())))
}
func deHandler(w http.ResponseWriter, r *http.Request) {
fmt.Println("delete")
fmt.Println(myMap.Count())
str := "aaaaaaaaaaaaaaaaaaaaaaaa"
for i := 0; i < 10000000; i++ {
key := str + strconv.Itoa(i)
myMap.Remove(key)
}
runtime.GC()
fmt.Println("after delete")
fmt.Println(myMap.Count())
w.Write([]byte(strconv.Itoa(myMap.Count())))
}
func randString() string {
b := make([]byte, 10)
if _, err := rand.Read(b); err != nil {
panic(err)
}
return hex.EncodeToString(b)
}
The server consumes about 1.4 GB of memory. When I request "localhost:9090/delete", myMap.Count() is 0, but the memory usage does not go down.
I used pprof; the results are:
go tool pprof http://localhost:9090/debug/pprof/heap
Fetching profile from http://localhost:9090/debug/pprof/heap
Saved profile in /home/user/pprof/pprof.for_go.localhost:9090.alloc_objects.alloc_space.inuse_objects.inuse_space.018.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) top
582.01MB of 582.01MB total ( 100%)
Dropped 9 nodes (cum <= 2.91MB)
Showing top 10 nodes out of 11 (cum >= 582.01MB)
flat flat% sum% cum cum%
544MB 93.47% 93.47% 544MB 93.47% runtime.hashGrow
32.51MB 5.59% 99.05% 582.01MB 100% runtime.mapassign
5.50MB 0.95% 100% 5.50MB 0.95% runtime.evacuate
0 0% 100% 582.01MB 100% github.com/streamrail/concurrent-map.(*ConcurrentMap).Set
0 0% 100% 582.01MB 100% main.myHandler
0 0% 100% 582.01MB 100% net/http.(*ServeMux).ServeHTTP
0 0% 100% 582.01MB 100% net/http.(*conn).serve
0 0% 100% 582.01MB 100% net/http.HandlerFunc.ServeHTTP
0 0% 100% 582.01MB 100% net/http.serverHandler.ServeHTTP
0 0% 100% 582.01MB 100% runtime.goexit
I have read some posts suggesting that a map may not be good at storing a huge number of key/value pairs. Is that the case?
The current implementation of IterCb is quite simple:
// Callback based iterator, cheapest way to read
// all elements in a map.
func (m ConcurrentMap) IterCb(fn IterCb) {
for idx := range m {
shard := (m)[idx]
shard.RLock()
for key, value := range shard.items {
fn(key, value)
}
shard.RUnlock()
}
}
We could create SHARD_COUNT goroutines, one for each shard, and get some parallelism here.
If one callback holds a shard's lock for a long time, the other shards would remain unlocked.
Is there any specific reason why it is implemented this way?
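The per-shard goroutine idea could look roughly like the sketch below. It is self-contained and uses hypothetical names (`shardedMap`, `parallelIterCb`, `sumAll`); it is not the library's implementation. The catch is that the callback is now invoked from several goroutines at once, so it has to be safe for concurrent use.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// shard is one lock-protected part of the map.
type shard struct {
	sync.RWMutex
	items map[string]int
}

// shardedMap is a hypothetical stand-in for a sharded concurrent map.
type shardedMap []*shard

// parallelIterCb runs fn over every entry, using one goroutine per shard.
// fn may be called concurrently and must therefore be safe for concurrent use.
func (m shardedMap) parallelIterCb(fn func(key string, value int)) {
	var wg sync.WaitGroup
	for _, s := range m {
		wg.Add(1)
		go func(s *shard) {
			defer wg.Done()
			s.RLock()
			defer s.RUnlock()
			for k, v := range s.items {
				fn(k, v)
			}
		}(s)
	}
	wg.Wait()
}

// sumAll adds up all values; the callback uses an atomic add because it runs in parallel.
func sumAll(m shardedMap) int64 {
	var total int64
	m.parallelIterCb(func(_ string, v int) {
		atomic.AddInt64(&total, int64(v))
	})
	return total
}

func main() {
	m := shardedMap{
		{items: map[string]int{"a": 1, "b": 2}},
		{items: map[string]int{"c": 3}},
	}
	fmt.Println(sumAll(m)) // 6
}
```

For cheap callbacks the cost of starting the goroutines may outweigh the gain, which is one possible reason to keep the sequential version as the default.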
func (m *ConcurrentMap) Upsert(key string, value interface{}, cb UpsertCb) (res interface{})
Is the value here the updated value, or the value already stored under the key?
And in order to update the value at a given key, can we pass nil as the UpsertCb?
Like this:
func (m *ConcurrentMap) Upsert(key string, value interface{}, nil) (res interface{})
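To illustrate the semantics being asked about, here is a minimal sketch of an Upsert-style method. It assumes the callback receives three things: whether the key exists, the value currently stored, and the new value passed by the caller. It also assumes that whatever the callback returns is what gets stored. The `lockedMap` type and `appendCb` are hypothetical; check the library source for the real behaviour. In this sketch a nil callback would panic when called, so a plain Set is the better fit for an unconditional overwrite.

```go
package main

import (
	"fmt"
	"sync"
)

// upsertCb decides what to store. It sees whether the key exists, the value
// currently stored (if any), and the new value passed to Upsert.
type upsertCb func(exist bool, valueInMap, newValue interface{}) interface{}

// lockedMap is a hypothetical single-lock stand-in for one shard.
type lockedMap struct {
	mu    sync.Mutex
	items map[string]interface{}
}

// Upsert calls cb while holding the lock and stores whatever cb returns.
func (m *lockedMap) Upsert(key string, value interface{}, cb upsertCb) interface{} {
	m.mu.Lock()
	defer m.mu.Unlock()
	old, ok := m.items[key]
	res := cb(ok, old, value)
	m.items[key] = res
	return res
}

// appendCb appends newValue to the stored slice, creating the slice on first use.
func appendCb(exist bool, valueInMap, newValue interface{}) interface{} {
	if !exist {
		return []string{newValue.(string)}
	}
	return append(valueInMap.([]string), newValue.(string))
}

func main() {
	m := &lockedMap{items: make(map[string]interface{})}
	m.Upsert("topic", "alice", appendCb)
	res := m.Upsert("topic", "bob", appendCb)
	fmt.Println(res) // [alice bob]
}
```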
For example, with a regular map:
var data = map[string]map[string]string{}
data["a"] = map[string]string{}
data["b"] = make(map[string]string)
data["c"] = make(map[string]string)
data["a"]["w"] = "x"
data["b"]["w"] = "x"
data["c"]["w"] = "x"
How can this kind of nested map be built using the concurrent map? Any ideas?
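One possible approach is to store an inner thread-safe map as the value of the outer one, and to create the inner map atomically the first time a key is used. The sketch below is self-contained and uses a hypothetical `safeMap` type with a `getOrCreate` helper; neither is part of concurrent-map.

```go
package main

import (
	"fmt"
	"sync"
)

// safeMap is a hypothetical stand-in for a concurrent map: a map behind a lock.
type safeMap struct {
	mu    sync.RWMutex
	items map[string]interface{}
}

func newSafeMap() *safeMap {
	return &safeMap{items: make(map[string]interface{})}
}

// Set stores v under key.
func (m *safeMap) Set(key string, v interface{}) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.items[key] = v
}

// Get returns the value stored under key, if any.
func (m *safeMap) Get(key string) (interface{}, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.items[key]
	return v, ok
}

// getOrCreate returns the inner map stored under key. The check and the
// creation happen under one lock, so two goroutines cannot both create it.
func (m *safeMap) getOrCreate(key string) *safeMap {
	m.mu.Lock()
	defer m.mu.Unlock()
	if v, ok := m.items[key]; ok {
		return v.(*safeMap)
	}
	inner := newSafeMap()
	m.items[key] = inner
	return inner
}

func main() {
	data := newSafeMap()
	for _, k := range []string{"a", "b", "c"} {
		data.getOrCreate(k).Set("w", "x") // same effect as data[k]["w"] = "x"
	}
	v, _ := data.getOrCreate("a").Get("w")
	fmt.Println(v) // x
}
```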
Hi, I use buffered channels as map values for thread safety. When testing with 10 goroutines, the value received from the channel was not the same as the one sent in. Any suggestions?
testmap := cmap.New()
fmt.Println("SyncMapNew: ", TestInParallel(&testmap, 10))
func TestInParallel(g *cmap.ConcurrentMap, n int) time.Duration {
start := time.Now()
var wait sync.WaitGroup
for i := 0; i < n; i++ {
wait.Add(1)
go func() {
TheTest(g, rand.New(rand.NewSource(int64(i*500))))
wait.Done()
}()
}
wait.Wait()
return time.Now().Sub(start)
}
func TheTest(g *cmap.ConcurrentMap, rnd *rand.Rand) time.Duration {
start := time.Now()
var key string
var value time.Time
//var got time.Time
for i := 0; i < 10000; i++ {
key = strconv.Itoa(int(rnd.Int31n(50000)))
if g.Has(key) == false {
g.Set(key, make(chan time.Time, 100))
}
tchan, _ := g.Get(key)
castchan := tchan.(chan time.Time)
//castchan := make(chan time.Time, 100)
value = time.Now()
castchan <- value
got := <-castchan
g.Set(key, castchan)
if value != got {
panic(fmt.Sprintf("ERROR: expected %v, got %v", value, got))
}
}
return time.Now().Sub(start)
}
Context: concurrent-map uses FNV-1 (see: http://www.isthe.com/chongo/tech/comp/fnv/) to consistently map keys to one of its internal shards.
Surprisingly, it does not make use of hash/fnv, but instead relies on a custom implementation (see: https://github.com/orcaman/concurrent-map/blob/master/concurrent_map.go#L285).
I was about to make a PR to correct this, but I noticed that in the tests we actually check the behavior of the custom implementation against hash/fnv (see: https://github.com/orcaman/concurrent-map/blob/master/concurrent_map_test.go#L389).
What is the reason for that? I see that the native implementation is stateful, but this can be worked around easily:
// Returns shard under given key
func (m ConcurrentMap) GetShard(key string) *ConcurrentMapShared {
return m[uint(fnv32(key))%uint(SHARD_COUNT)]
}
would become:
// Returns shard under given key
func (m ConcurrentMap) GetShard(key string) *ConcurrentMapShared {
hash := fnv.New32()
hash.Write([]byte(key))
hashKey := hash.Sum32()
return m[hashKey%uint32(SHARD_COUNT)]
}
which is slightly more verbose, but makes us rely on a maintained implementation of FNV-1. Is there a reason I am not seeing for this?
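For reference, the two approaches can be compared directly. The sketch below places a hand-written 32-bit FNV-1 loop (multiply, then XOR) next to a helper built on hash/fnv, whose New32 is also FNV-1, and checks that they agree on a few sample keys. The helper names `stdFNV32` and `sameHash` are made up for this example. One practical difference is that the standard-library route goes through a hasher object and a byte-slice conversion on every call, which may allocate.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fnv32 is a hand-written 32-bit FNV-1 hash: multiply by the prime, then XOR the byte.
func fnv32(key string) uint32 {
	hash := uint32(2166136261)
	const prime32 = uint32(16777619)
	for i := 0; i < len(key); i++ {
		hash *= prime32
		hash ^= uint32(key[i])
	}
	return hash
}

// stdFNV32 computes the same hash through hash/fnv (fnv.New32 is FNV-1).
// Creating the hasher and converting the string may allocate on each call.
func stdFNV32(key string) uint32 {
	h := fnv.New32()
	h.Write([]byte(key))
	return h.Sum32()
}

// sameHash reports whether both implementations agree on every key.
func sameHash(keys []string) bool {
	for _, k := range keys {
		if fnv32(k) != stdFNV32(k) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(sameHash([]string{"", "a", "10.0.0.1", "concurrent-map"})) // true
}
```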
I have a use case where I need to get the value stored in the map, and depending on the value I might insert (if it didn't exist already), update, or remove it entirely. Currently this isn't possible atomically, but it's not hard to implement and then Upsert and RemoveCb can be re-implemented to just call UpsertOrRemove.
test code:
`
package main
import (
"fmt"
"github.com/streamrail/concurrent-map"
"sync"
)
var m = cmap.New()
func fun(wg *sync.WaitGroup) {
v := 1
if value, ok := m.Get("key"); ok {
v += value.(int)
}
m.Set("key",v)
wg.Done()
}
func main(){
var wg sync.WaitGroup
for i := 0; i < 10; i++ {
wg.Add(1)
go fun(&wg)
}
wg.Wait()
fmt.Println(m.Get("key"))
}
`
output:
Correct me if I'm wrong, but aren't there race conditions on lines 63, 109 and 127?
Since items can be inserted into and/or removed from shards before or after your current shard during any of those for loops, you can end up with a conflicting size, and your iterator may return fewer or more items than were in your cmap at the time of your call.
This doesn't look thread safe.
https://github.com/streamrail/concurrent-map/blob/master/concurrent_map.go#L30
Is bigger better, or what strategy should be used?
If you declare a ConcurrentMap variable without creating it with New(), then calling Items() blocks forever.
var work cmap.ConcurrentMap
for k, _ := range work.Items() {
fmt.Println(k)
}
//Never reached
I think this should be treated as a bug, it should panic just like how other methods panic when the map hasn't been created.
Thanks for your project!
I have a question: why did you write your own fnv32 function when there is already a hash/fnv package?
would be nice to have them, keep up the good work, thanks !
Why is ConcurrentMap sometimes passed by value and sometimes by reference?
// Returns shard under given key
func (m ConcurrentMap) GetShard(key <KEY>) *ConcurrentMapShared {
hasher := fnv.New32()
hasher.Write([]byte(key))
return m[int(hasher.Sum32())%SHARD_COUNT]
}
// Sets the given value under the specified key.
func (m *ConcurrentMap) Set(key <KEY>, value <VAL>) {
// Get map shard.
shard := m.GetShard(key)
shard.Lock()
defer shard.Unlock()
shard.items[key] = value
}
When we iterate frequently, IterBuffered() generates a lot of garbage, which puts a heavy burden on the program's GC. In that case we need to iterate externally, so the Items method has to be provided.
A deadlock is detected in the following test case.
I think it's because m.Count() and its count value are not protected by a lock in m.IterBuffered.
func TestBufferedIterator(t *testing.T) {
m := New()
// Insert 100 elements.
for i := 0; i < 100; i++ {
m.Set(strconv.Itoa(i), Animal{strconv.Itoa(i)})
}
// Iterate over elements.
//iter while put
tcounter := new(int32)
wg := &sync.WaitGroup{}
var ct <-chan Tuple
for j := 0; j < 10000; j++ {
i := j
if i == 999 {
wg.Add(1)
go func() {
defer wg.Done()
ct = m.IterBuffered()
/*
for item := range m.IterBuffered() {
val := item.Val
if val == nil {
t.Error("Expecting an object.")
}
atomic.AddInt32(tcounter, 1)
}
*/
}()
} else {
wg.Add(1)
go func() {
defer wg.Done()
m.Set(strconv.Itoa(i), Animal{strconv.Itoa(i)})
}()
}
}
wg.Wait()
fin := atomic.LoadInt32(tcounter)
fmt.Println("tmp counter", fin)
counter := 0
for item := range m.IterBuffered() {
val := item.Val
if val == nil {
t.Error("Expecting an object.")
}
counter++
}
if counter != 10000-1 {
t.Error("We should have counted 10000-1 elements.", counter)
} else {
fmt.Println("iter buffered count", counter)
}
wg = &sync.WaitGroup{}
for j := 0; j < 1000; j++ {
i := j
wg.Add(1)
go func() {
defer wg.Done()
m.Set(strconv.Itoa(i), Animal{strconv.Itoa(i)})
}()
}
wg.Wait()
ctcount := 0
for item := range ct {
val := item.Val
if val == nil {
t.Error("Expecting an object.")
}
ctcount++
}
fmt.Println("ct count", ctcount)
}
Sorry for my English.
When I try
map.Set("10.0.0.1", "bar")
I get panic: runtime error: index out of range
because in concurrent_map.go:36
int(hasher.Sum32())%SHARD_COUNT
evaluates to -31,
so use this:
diff --git a/concurrent_map.go b/concurrent_map.go
index f044dee..97e7942 100644
--- a/concurrent_map.go
+++ b/concurrent_map.go
@@ -33,7 +33,7 @@ func New() ConcurrentMap {
func (m ConcurrentMap) GetShard(key string) *ConcurrentMapShared {
hasher := fnv.New32()
hasher.Write([]byte(key))
- return m[int(hasher.Sum32())%SHARD_COUNT]
+ return m[uint(hasher.Sum32())%uint(SHARD_COUNT)]
}
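The negative index is what one would expect on a platform where int is 32 bits wide: converting a uint32 with the top bit set to a signed 32-bit integer gives a negative number, and Go's % operator keeps the sign of the dividend. The sketch below reproduces this with int32 so that it behaves the same on any platform. The hash value used is an illustrative constant, not the hash of any particular key, and the function names are made up for this example.

```go
package main

import "fmt"

const shardCount = 32

// signedIndex mimics int(hash) % shardCount on a platform where int is 32 bits.
// For hashes with the top bit set, the result is negative.
func signedIndex(hash uint32) int32 {
	return int32(hash) % shardCount
}

// unsignedIndex keeps the arithmetic unsigned, so the result is always in [0, shardCount).
func unsignedIndex(hash uint32) uint32 {
	return hash % shardCount
}

func main() {
	var h uint32 = 0xFFFFFFE1 // illustrative value with the top bit set
	fmt.Println(signedIndex(h), unsignedIndex(h)) // -31 1
}
```

On 64-bit platforms int(hash) cannot be negative, which would explain why the panic only shows up on some machines.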
This is more of a question than an issue. Let's say each key of the map falls in a different shard and each key is read/written by a unique goroutine (i.e. there is a 1:1 mapping between keys and goroutines). Then this implementation implies that we don't need locking.
But in Go 1.6, if multiple goroutines write to the same map, even though they are all reading/writing different keys, it results in a "Concurrent write error" and the program crashes.
In fact, I have written a sample program that reproduces this: https://gist.github.com/gnufied/402523b0601f008beb8d627899c73ef8
By this definition, how is this map thread safe?
I found that a large part of the time is spent in the hash function, so please try MemHash; it may be much faster.
Hi, first of all, thanks for the good map code for Go. We love it and use it in our project:
https://github.com/runner365/livego
We have one question about IterBuffered performance in concurrent-map.
func (m ConcurrentMap) IterBuffered() <-chan Tuple {
chans := snapshot(m)
....
}
and in function snapshot:
func snapshot(m ConcurrentMap) (chans []chan Tuple) {
.....
for index, shard := range m {
go func(index int, shard *ConcurrentMapShared) {
// Foreach key, value pair.
shard.RLock()
chans[index] = make(chan Tuple, len(shard.items))
wg.Done()
for key, val := range shard.items {
chans[index] <- Tuple{key, val}
}
shard.RUnlock()
close(chans[index])
}(index, shard)
}
......
}
In function snapshot, there is a loop that starts 32 goroutines on every call, even if there are only one or two items in the ConcurrentMap. I don't think the 32 goroutines help it run faster.
When I use the IterBuffered function in my project, it costs a lot of CPU even if there are only a few items in the map.
So I modified it:
func snapshot(m ConcurrentMap) (chans []chan Tuple) {
.....
for index, shard := range m {
go func(index int, shard *ConcurrentMapShared) {
// Foreach key, value pair.
shard.RLock()
chans[index] = make(chan Tuple, len(shard.items))
wg.Done()
for key, val := range shard.items {
chans[index] <- Tuple{key, val}
}
shard.RUnlock()
close(chans[index])
}(index, shard)
}
}
After the modification, performance increased a lot (more than 50% in my project).
My forked repository: https://github.com/runner365/concurrent-map
Please let us know whether it is right to modify it in the way above.
Thanks again, and best regards.
Hi,
I noticed that the template based generator has been removed from this lib. All I could find was this https://github.com/streamrail/concurrent-map/pull/40 . Do you mind elaborating a bit?
I have some code that uses the generated libs from before. So, I am curious.
Thanks.
Hi, I use cmap in my app; I store 20,000+ key-value pairs in the cmap.
My app is go-socket.io; it maintains 10,000+ socket connections.
Could this be causing high memory consumption?
My app now consumes 14 GB of memory.
That is terrible; I am looking for the cause.
Hello, as the title says, can you add a clone function to deep copy a cmap to a new one?
Also, for values such as strings, could you add a function, maybe named append, to append a new string to the one already stored? If the string stored in the cmap is large, getting it and concatenating it with another large string costs a lot of memory.
I noticed a generic version was merged to main, but the tag created is v1.18 instead of v1.18.0/v1.1.8/etc., which is required by go get [-u].
For example:
go get -u github.com/orcaman/concurrent-map
go get -u github.com/orcaman/concurrent-map@latest
...will always return v1.0.0 as that is the only valid semver2 tag available in the repository.
Additionally, if the tag is renamed from v1.18 to v1.1.8/v1.18.0, it will start upgrading people from v1.0.0 even though the APIs are not compatible.
I suggest changing the v1.18 tag to v2.0.0 so that the major version change stops auto upgrades from v1.0.0 but allows users to get v2.0.0 if they want it via go get -u github.com/orcaman/concurrent-map@v2.0.0. This does have some other effects. See: https://go.dev/ref/mod under "Major version suffixes" or https://go.dev/blog/v2-go-modules . The short of it is that the go.mod module path needs to have v2 added to it (i.e. module github.com/orcaman/concurrent-map/v2 or module github.com/orcaman/concurrent-map.v2).
As an undesirable workaround, to get the generics version (v1.18) one can reference the commit SHA.
go get -u github.com/orcaman/concurrent-map@b1f44ce2372495e453fe5ff13e5cf54c855a9b4c
However, this has other version implications as the go.mod file will reference it as v1.0.1-<x-y> as a pseudo version.
I can create a PR to change the go.mod file, but the tagging is what really matters and can't be represented in a PR.
Can I set the map's key to interface{}? So I can use any type to be the key.
Got index out of range panic, https://github.com/streamrail/concurrent-map/blob/master/concurrent_map_template.txt#L34
No idea how that happened, afaik it should not.
Marvell PJ4Bv7 Processor rev 2 (v7l)
There is trailing whitespace in README.md in three places... Will submit a cleanup PR shortly.
clear all key-value
Hello,
Thanks for this package that has been useful for years.
Golang 1.9 now provides a concurrent map in the standard library. Consider mentioning it in the README for those using the latest version of Golang.
Though, the package is still very useful for older versions :)
Sometimes computing the value costs too much CPU, so we cache the value in a map.
So I want to change SetIfAbsent like this:
func (m *ConcurrentMap) SetIfAbsent(key string, fn func() interface{}) bool {
// Get map shard.
shard := m.GetShard(key)
shard.Lock()
_, ok := shard.items[key]
if !ok {
shard.items[key] = fn()
}
shard.Unlock()
return !ok
}
Would you mind giving me some suggestions?
Is Go 1.17 supported?
func (m ConcurrentMap) Keys() []string {
count := m.Count()
time.Sleep(5*time.Second)
ch := make(chan string, count)
go func() {
// Foreach shard.
wg := sync.WaitGroup{}
wg.Add(SHARD_COUNT)
for _, shard := range m {
go func(shard *ConcurrentMapShared) {
// Foreach key, value pair.
shard.RLock()
for key := range shard.items {
ch <- key
}
shard.RUnlock()
wg.Done()
}(shard)
}
wg.Wait()
close(ch)
}()
// Generate keys
keys := make([]string, 0, count)
for k := range ch {
keys = append(keys, k)
}
return keys
}
As the function shows, m.Count() is called before looping over the keys of the shards.
However, there is no direct connection between the result of Count() and the number of keys collected.
So why is the function designed to count first?
Hi all,
It looks like returning from Iter() and IterBuffered() before they have finished iterating results in a deadlock. Any idea of what is the suggested way to return properly (I would like to avoid looping through all elements)?
Example for deadlock:
for val := range yourmap.Iter() {
data := val.Val...
if data == 10 {
return true
}
}
Thanks,
Eugen
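Assuming the channel-based iterators are fed by background goroutines, abandoning the loop early can leave those senders blocked. A callback-based iteration that lets the caller signal "stop" avoids the channel altogether. The sketch below shows that idea with hypothetical names (`shardedMap`, `rangeUntil`, `contains`); it is not an API offered by the library.

```go
package main

import (
	"fmt"
	"sync"
)

// shard is one lock-protected part of the map.
type shard struct {
	sync.RWMutex
	items map[string]int
}

// shardedMap is a hypothetical stand-in for a sharded concurrent map.
type shardedMap []*shard

// rangeUntil calls fn for each entry and stops as soon as fn returns false.
// No goroutine or channel is involved, so stopping early leaves nothing blocked.
func (m shardedMap) rangeUntil(fn func(key string, value int) bool) {
	for _, s := range m {
		if !visitShard(s, fn) {
			return
		}
	}
}

// visitShard iterates one shard under its read lock.
// The deferred unlock also runs when the iteration stops early.
func visitShard(s *shard, fn func(string, int) bool) bool {
	s.RLock()
	defer s.RUnlock()
	for k, v := range s.items {
		if !fn(k, v) {
			return false
		}
	}
	return true
}

// contains reports whether any entry holds target, stopping at the first match.
func contains(m shardedMap, target int) bool {
	found := false
	m.rangeUntil(func(_ string, v int) bool {
		found = v == target
		return !found
	})
	return found
}

func main() {
	m := shardedMap{
		{items: map[string]int{"a": 1, "b": 10}},
		{items: map[string]int{"c": 3}},
	}
	fmt.Println(contains(m, 10), contains(m, 7)) // true false
}
```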
Hello
Thank you for this library.
I am trying to use a concurrent map to write from go routines, here is a test
func TestCMap(t *testing.T) {
var wg sync.WaitGroup
cm := cmap.New()
go func() {
defer wg.Done()
wg.Add(1)
for i := 0; i < 10; i++ {
cm.Set(strconv.Itoa(i), i)
}
}()
wg.Wait()
fmt.Println(cm.Items())
fmt.Println(cm.Items())
}
Output of this:
=== RUN TestCMap
map[0:0 2:2 3:3 4:4]
map[0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 8:8 9:9]
--- PASS: TestCMap (0.00s)
The map seems to be eventually consistent. Am I using this the right way?
Thanks
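The output above is consistent with a race in the test itself rather than in the map. Because wg.Add(1) runs inside the goroutine, wg.Wait() can return before the goroutine has registered itself, so the first print may run while the writes are still in progress. Calling Add before starting the goroutine makes Wait block until the writes are done. A minimal sketch, using a plain mutex-protected map instead of cmap to stay self-contained (the `fill` helper is hypothetical):

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// fill writes n entries from a separate goroutine and returns only after it finishes.
func fill(n int) map[string]int {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	m := make(map[string]int)

	wg.Add(1) // register the goroutine before starting it, so Wait cannot return early
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			mu.Lock()
			m[strconv.Itoa(i)] = i
			mu.Unlock()
		}
	}()
	wg.Wait()
	return m
}

func main() {
	fmt.Println(len(fill(10))) // 10
}
```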
I see the code,
is this plagiarism ?
Upsert allows you to execute a callback and, depending on the value, update/set/insert a new one, but there's no way to atomically remove elements from the map depending on their values.
We have a small pub-sub implementation based on this map, each map element contains a slice of subscribers for a given topic (key). I don't see any way to remove a topic once everybody de-subscribed from it without accidentally removing new subscribers.