Crash in remove_node (uhashring, closed)

ultrabug avatar ultrabug commented on May 11, 2024
Crash in remove_node

from uhashring.

Comments (18)

bjhockley avatar bjhockley commented on May 11, 2024

The following patch works around the crash, but a better fix may be possible/desirable:

diff --git a/uhashring/ring.py b/uhashring/ring.py
index 5ef4a13..1c94fec 100644
--- a/uhashring/ring.py
+++ b/uhashring/ring.py
@@ -233,7 +233,10 @@ class HashRing(object):
                 nodename, self._nodes.keys()))
 
         for h in self._hashi_weight_generator(nodename):
-            del self._ring[h]
+            try:
+                del self._ring[h]
+            except KeyError:
+                pass
             index = bisect_left(self._keys, h)
             del self._keys[index]
         self._distribution.pop(nodename)
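
For readers who want to see why the unpatched loop can fail, here is a minimal sketch of the suspected mechanism. It is a toy model, not uhashring itself: the node names and hash points are made up, and it assumes the crash comes from two nodes sharing one hash point in a plain dict.

```python
from bisect import bisect_left, insort

# Toy ring: hash point -> node name, plus a sorted list of points.
ring = {}
keys = []

# Hypothetical hash points; 42 is shared by both nodes (a collision).
points = {"node-a": [10, 42], "node-b": [42, 77]}

for node, hashes in points.items():
    for h in hashes:
        ring[h] = node      # the second write to 42 silently overwrites the first
        insort(keys, h)     # but the sorted key list records 42 twice


def remove_node(node):
    """Mimics the pre-patch loop: assumes every point is still in the dict."""
    for h in points[node]:
        del ring[h]         # KeyError if a colliding node already removed h
        del keys[bisect_left(keys, h)]


remove_node("node-a")       # deletes points 10 and 42
try:
    remove_node("node-b")   # 42 is already gone from the dict
    crashed = False
except KeyError:
    crashed = True
```

In this toy model the second removal fails on the shared point, which is the situation the try/except in the patch above papers over.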

ultrabug avatar ultrabug commented on May 11, 2024

Hey @bjhockley thanks a lot for reporting this. I'm puzzled by this I must say 🤕

Let me look into it; I'll report back here as quickly as I can.

ultrabug avatar ultrabug commented on May 11, 2024

Ok quick overview shows that you found how to abuse a weak point of mine which is precisely related to ring growth / shrink :)

The saner fix for now would be to regenerate the ring when we remove a node from it (just like we do when adding a node) but I hate the performance impact this has and I always wanted to address this issue once and for all.

With your permission, I'd like to take this opportunity to work on the overall fix instead of introducing a patch that only covers the issue you're pointing out.

How does that sound to you, mate?

bjhockley avatar bjhockley commented on May 11, 2024

That sounds absolutely perfect - I had an intuition that that sort of fix might be preferable.
Thank-you @ultrabug !

ultrabug avatar ultrabug commented on May 11, 2024

Great. I'm of course open to proposals @bjhockley ;) (hint)

bjhockley avatar bjhockley commented on May 11, 2024

The other approach I was considering was making the values stored in self._ring a list (so we could cope with this kind of clash).

However I do very much like the elegance of your suggestion of making addition/removal more symmetrical and doing ring regeneration in both cases.

I'll have a play in the next few days and see where I get to - and let you know!

ultrabug avatar ultrabug commented on May 11, 2024

Thanks @bjhockley.

After playing a bit with this, I noticed that the problem is only present when using the ketama algorithm and that using compat=False doesn't trigger the error.

I'm gonna dig into the ketama hash implementation since I guess something may be wrong there.

bjhockley avatar bjhockley commented on May 11, 2024

@ultrabug I also noticed that compat=False sidesteps the issue for this set of node names (I guessed the "longer" md5 hash has a lower chance of collision than the much shorter ketama hash? Not sure if that's a bug in this ketama hash implementation or just an algorithmic limitation of ketama itself?).
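
As a rough illustration of that intuition, the birthday bound estimates how likely it is that at least two ring points coincide. The sketch below assumes ketama ring positions are 32-bit values and that the md5 mode uses the full 128-bit digest; the point count (10 nodes with 160 points each) is an arbitrary example, not a figure taken from uhashring.

```python
import math


def collision_probability(num_points, hash_bits):
    """Birthday-bound approximation: chance that at least two of
    num_points uniformly random hash values coincide."""
    space = 2.0 ** hash_bits
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-num_points * (num_points - 1) / (2.0 * space))


# Assumed example size: 10 nodes with 160 points each.
points = 10 * 160
p32 = collision_probability(points, 32)     # short, ketama-style positions
p128 = collision_probability(points, 128)   # full md5 digest
```

Under these assumptions the 32-bit case comes out on the order of 3 in 10,000, while the 128-bit case is negligible. The 32-bit figure grows roughly with the square of the number of points, so larger rings collide far more often.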

Anyway, I've tried two fixes today:

Simple fix to _remove_node (simple and nicely symmetrical with adding nodes, but the downside is that removing nodes becomes much slower):

    def _remove_node(self, nodename):
        """Remove the given node from the continuum/ring.

        :param nodename: the node name.
        """
        if nodename not in self._nodes:
            raise KeyError('node \'{}\' not found, available nodes: {}'.format(
                nodename, self._nodes.keys()))
        self._nodes.pop(nodename)
        self._create_ring()

I also tried an alternative fix - more complex, but slightly more performant when removing nodes. I'm not entirely sure that there wouldn't be unwanted side effects, but the unit tests pass:

diff --git a/uhashring/ring.py b/uhashring/ring.py
index 3d8ea2c..998be13 100644
--- a/uhashring/ring.py
+++ b/uhashring/ring.py
@@ -8,7 +8,11 @@
 # Disable 'Used builtin '                     pylint: disable=W0141
 
 from bisect import bisect, bisect_left, insort
-from collections import Counter
+from collections import Counter, defaultdict
 from hashlib import md5
 from sys import version_info
 
@@ -35,7 +39,7 @@ class HashRing(object):
         self._keys = []
         self._nodes = {}
         self._replicas = 4 if compat else replicas
-        self._ring = {}
+        self._ring = defaultdict(list)
 
         if weight_fn and not hasattr(weight_fn, '__call__'):
             raise TypeError('weight_fn should be a callable function')
@@ -128,10 +132,10 @@ class HashRing(object):
 
         _distribution = Counter()
         _keys = []
-        _ring = {}
+        _ring = defaultdict(list)
         for nodename in self._nodes:
             for h in self._hashi_weight_generator(nodename):
-                _ring[h] = nodename
+                _ring[h].append(nodename)
+                _ring[h].sort()
                 insort(_keys, h)
                 _distribution[nodename] += 1
         self._distribution = _distribution
@@ -160,7 +164,7 @@ class HashRing(object):
         if what == 'pos':
             return pos
 
-        nodename = self._ring[self._keys[pos]]
+        nodename = self._ring[self._keys[pos]][0]
         if what in ['hostname', 'instance', 'port', 'weight']:
             return self._nodes[nodename][what]
         elif what == 'dict':
@@ -232,21 +236,15 @@ class HashRing(object):
             raise KeyError('node \'{}\' not found, available nodes: {}'.format(
                 nodename, self._nodes.keys()))
 
-        for h in self._hashi_weight_generator(nodename):
-            del self._ring[h]
-            index = bisect_left(self._keys, h)
-            del self._keys[index]
-        self._distribution.pop(nodename)
-        self._weight_sum -= self._nodes[nodename]['weight']
-        self._nodes.pop(nodename)
-
+        for h in self._hashi_weight_generator(nodename):
+            self._ring[h].remove(nodename)
+            if not self._ring[h]:
+                del self._ring[h]
+            index = bisect_left(self._keys, h)
+            del self._keys[index]
+        self._distribution.pop(nodename)
+        self._weight_sum -= self._nodes[nodename]['weight']
         self._nodes.pop(nodename)
 
     def __delitem__(self, nodename):
         """Remove the given node.
@@ -368,7 +366,7 @@ class HashRing(object):
     def get_points(self):
         """Returns a ketama compatible list of (position, nodename) tuples.
         """
-        return [(k, self._ring[k]) for k in self._keys]
+        return [(k, self._ring[k][0]) for k in self._keys]
 
     def get_server(self, key):
         """Returns a ketama compatible (position, nodename) tuple.
@@ -421,7 +419,7 @@ class HashRing(object):
 
         pos = self._get_pos(key)
         for key in self._keys[pos:]:
-            nodename = self._ring[key]
+            nodename = self._ring[key][0]
             if unique:
                 if nodename in all_nodes:
                     continue
@@ -434,7 +432,7 @@ class HashRing(object):
         else:
             for i, key in enumerate(self._keys):
                 if i < pos:
-                    nodename = self._ring[key]
+                    nodename = self._ring[key][0]
                     if unique:
                         if nodename in all_nodes:
                             continue
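
The idea behind the diff above can be tried in isolation with a small stand-alone sketch. ToyRing and its methods are hypothetical names used only for illustration, and the hash points are passed in by hand rather than computed, so this is not the library's API.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class ToyRing:
    """Hypothetical minimal ring: each hash point maps to a sorted list of nodes."""

    def __init__(self):
        self._ring = defaultdict(list)
        self._keys = []

    def add(self, node, hashes):
        for h in hashes:
            insort(self._ring[h], node)   # keep colliding owners in a stable order
            insort(self._keys, h)

    def remove(self, node, hashes):
        for h in hashes:
            self._ring[h].remove(node)
            if not self._ring[h]:         # last owner gone: drop the point itself
                del self._ring[h]
            del self._keys[bisect_left(self._keys, h)]

    def owner(self, h):
        """Node owning the first point at or after h, wrapping around."""
        pos = bisect_left(self._keys, h) % len(self._keys)
        return self._ring[self._keys[pos]][0]


ring = ToyRing()
ring.add("node-a", [10, 42])
ring.add("node-b", [42, 77])      # 42 collides with node-a
before = ring.owner(42)
ring.remove("node-a", [10, 42])
after = ring.owner(42)
ring.remove("node-b", [42, 77])   # no KeyError: the shared point survived
```

Because the shared point keeps a list of owners, removing one node leaves the other node's entry in place, and both removals complete without an exception.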

ultrabug avatar ultrabug commented on May 11, 2024

Thanks for digging in @bjhockley, very much appreciated!

Yes, I still need to check the C implementation of the ketama hash algo, but I did a lot of hash comparison tests, so I guess collisions are indeed more frequent.

Indeed the first solution is not what we want. Our goal is to avoid recalculating the whole ring when adding or removing a node. Those operations should be atomic to have a constant performance.

Your second solution looks like an elegant way of handling collisions to avoid the KeyError exception 👍 I also thought of something else yesterday night that I will get back to you about today.

ultrabug avatar ultrabug commented on May 11, 2024

Ok, I tested the C implementation of libketama and the hash function is good. Then I dug deeper and found out that libketama does indeed have collisions, but we don't see them because of its ring implementation.

The other idea I tested was silly and didn't work.

So the conclusion imo is that uhashring should handle collisions gracefully because it doesn't care for the hash algorithm used by the user.

This means several things to me:

  • We should abstract the md5 and ketama algos from the ring implementation code (aka modules)
  • We should still make the ring implementation more efficient when adding / removing nodes
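
One way to picture both points is a ring that takes the hash function as a parameter and stores (hash, nodename) pairs, so colliding hashes simply sit next to each other. This is a hedged sketch, not the design that ended up in uhashring: PluggableRing, md5_hash, short_hash, demo and the "%s-%d" replica naming are all hypothetical, and short_hash is only a stand-in for a short hash, not the real ketama function.

```python
from bisect import bisect_left, insort
from hashlib import md5


def md5_hash(key):
    """Full 128-bit md5 digest as an integer."""
    return int(md5(key.encode("utf-8")).hexdigest(), 16)


def short_hash(key):
    """Stand-in for a short hash: low 32 bits of md5 (not real ketama)."""
    return md5_hash(key) & 0xFFFFFFFF


class PluggableRing:
    """Hypothetical ring that never looks inside the hash function it is given."""

    def __init__(self, hash_fn, replicas=4):
        self._hash_fn = hash_fn
        self._replicas = replicas
        self._points = []   # sorted (hash, nodename) tuples; duplicate hashes allowed

    def _hashes(self, node):
        return [self._hash_fn("%s-%d" % (node, i)) for i in range(self._replicas)]

    def add_node(self, node):
        for h in self._hashes(node):
            insort(self._points, (h, node))

    def remove_node(self, node):
        # Only this node's own points are touched; the ring is not rebuilt.
        for h in self._hashes(node):
            del self._points[bisect_left(self._points, (h, node))]

    def get_node(self, key):
        # (h,) sorts before any (h, name), so this finds the first point >= h.
        pos = bisect_left(self._points, (self._hash_fn(key),)) % len(self._points)
        return self._points[pos][1]


def demo(hash_fn):
    ring = PluggableRing(hash_fn)
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    keys = ["key-%d" % i for i in range(100)]
    before = {k: ring.get_node(k) for k in keys}
    ring.remove_node("node-b")
    after = {k: ring.get_node(k) for k in keys}
    # Keys that were not on node-b should keep their owner after the removal.
    moved = [k for k in keys if before[k] != "node-b" and before[k] != after[k]]
    return moved, after


moved_md5, owners_md5 = demo(md5_hash)
moved_short, owners_short = demo(short_hash)
```

In this sketch adding or removing a node touches only that node's own points, and the same ring code runs unchanged with either hash function.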

ultrabug avatar ultrabug commented on May 11, 2024

Oh BTW, what I meant by:

"Then I dug deeper and found out that libketama does indeed have collisions, but we don't see them because of its ring implementation"

The libketama ring implementation is to regenerate the ring when the node list changes! Which is exactly what we're trying to run away from ;)

This means that there is no way to retain ketama compatibility while meeting our goal of steady performance when the ring topology changes.

So we need to abstract the topology change logic from the ring implementation when the ketama algorithm is used.

ultrabug avatar ultrabug commented on May 11, 2024

Ok @bjhockley I have quite a proposal on the table now on the branch modular_rings.

Please read the commit description referenced above. If you could be so kind as to test it and report what you think, that would be awesome!

Last but not least, I have quite a big dilemma that I would like your opinion on: should I preserve backward compatibility?

I'm hesitating because the ketama implementation has been the default one and we now know that it's not the most efficient one, so I am tempted to switch over to the new performant one, but it may impact users depending on how they use the ring atm...

bjhockley avatar bjhockley commented on May 11, 2024

@ultrabug Thank-you - I very much like the shape of what I'm seeing in the branch! I will try to make some time this week to try it in earnest.

Speaking personally, ketama compatibility is not an important consideration for the projects I'm working on (although I honestly have no idea where the balance lies with other users of the library).

ultrabug avatar ultrabug commented on May 11, 2024

Great, I'm eager to hear back from you!

For the second question, let's try to put it this way: what would be the impact on your applications if I changed the default hashing from ketama to the more performant md5 one?

bjhockley avatar bjhockley commented on May 11, 2024

I'd certainly be very happy for the default to be md5. I've just moved my code over from using ketama to using md5 - and initial testing suggests everything is working well.

(A bit of background here: the only reason I was using ketama compatible mode in the first place is because it was the library default and it seemed to work well enough :) )

ultrabug avatar ultrabug commented on May 11, 2024

Thanks for your answer mate.

Last question: How would you feel about me breaking the HashRing object signature and most probably breaking your code instead of silently changing from ketama to performant-md5 implementation?

bjhockley avatar bjhockley commented on May 11, 2024

ultrabug avatar ultrabug commented on May 11, 2024

I'm glad to announce 1.0, thanks to you @bjhockley. Thanks again!

Closing now :)
