Comments (15)

lemire avatar lemire commented on May 29, 2024 2

We don’t apologize each time a bug is introduced.

from roaring.

richardartoul avatar richardartoul commented on May 29, 2024

@lemire I'll keep digging, but if you have an idea let me know.

richardartoul avatar richardartoul commented on May 29, 2024

@lemire That is unclear from the documentation, but I don't think that's quite right. If you check the implementations, FromBuffer, ReadFrom, and UnmarshalBinary all have almost exactly the same implementation. They seem to be mostly convenience wrappers for slightly different APIs (maybe you were thinking of FrozenView?)

I also tried updating my reproducer to use ReadFrom and it fails the same way. Similarly, I've already tried using Or(bm1, bm2) and that fails too, so I think there is a real bug here.

richardartoul avatar richardartoul commented on May 29, 2024

@lemire Also I should add that I believe this bug is a regression as I only discovered it while testing the upgrade from 0.5.5 to 1.1 in our production shadows. Reverting to 0.5.5 immediately made the issue go away.

... which makes me realize we may be able to narrow down the issue with a git bisect.

richardartoul avatar richardartoul commented on May 29, 2024

@lemire OK, I did a quick manual git bisect and I can say with confidence the issue was introduced in this PR: #312

Before this commit the reproducer behaves properly; afterwards it fails.

EDIT1: Specifically the implementation change to orArray

richardartoul avatar richardartoul commented on May 29, 2024

OK, so I debugged it further and I think the problem is something like:

  1. (rc *runContainer16) orArray calls result.toArrayContainer()
  2. (rc *runContainer16) toArrayContainer() calls ac.iaddRange() in a loop.
  3. iaddRange may modify ac in place, or it may return a completely new bitmap container (which is completely ignored by the caller)

Something in the space between #2 and #3 is wrong, though, and breaks the Or functionality. My recommendation is to revert that whole PR, or at least the modification to the orArray method.
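The pattern in steps 2 and 3 can be illustrated with a small, self-contained sketch. This is not roaring's code: `container`, `arrayContainer`, `bitmapContainer`, `addRange`, and the two `build...` helpers are hypothetical stand-ins, assuming only that an add-range method either mutates the array in place or returns a brand-new bitmap container once the 4096-element limit would be exceeded.

```go
package main

import "fmt"

// arrayMaxSize mirrors the 4096-element limit mentioned in this thread.
const arrayMaxSize = 4096

// container is a hypothetical, heavily simplified stand-in for an internal
// container interface; it is not roaring's actual API.
type container interface {
	cardinality() int
	addRange(first, last int) container
}

type arrayContainer struct{ content []uint16 }

type bitmapContainer struct {
	card int
	bits [1024]uint64 // 1024 * 64 = 65536 bits
}

func (ac *arrayContainer) cardinality() int  { return len(ac.content) }
func (bc *bitmapContainer) cardinality() int { return bc.card }

func (bc *bitmapContainer) set(v uint16) {
	word, mask := v>>6, uint64(1)<<(v&63)
	if bc.bits[word]&mask == 0 {
		bc.bits[word] |= mask
		bc.card++
	}
}

func (bc *bitmapContainer) addRange(first, last int) container {
	for v := first; v <= last; v++ {
		bc.set(uint16(v))
	}
	return bc
}

// addRange appends [first, last] in place when the result still fits in an
// array. Otherwise it returns a brand-new bitmap container and leaves the
// receiver untouched, so the caller must use the return value.
// Assumes ranges arrive sorted and non-overlapping.
func (ac *arrayContainer) addRange(first, last int) container {
	if len(ac.content)+(last-first+1) > arrayMaxSize {
		bc := &bitmapContainer{}
		for _, v := range ac.content {
			bc.set(v)
		}
		return bc.addRange(first, last)
	}
	for v := first; v <= last; v++ {
		ac.content = append(ac.content, uint16(v))
	}
	return ac
}

// buildIgnoringResult drops addRange's return value, as described in step 3.
func buildIgnoringResult(ranges [][2]int) container {
	ac := &arrayContainer{}
	for _, r := range ranges {
		ac.addRange(r[0], r[1]) // return value silently dropped
	}
	return ac
}

// buildKeepingResult threads the returned container through the loop.
func buildKeepingResult(ranges [][2]int) container {
	var c container = &arrayContainer{}
	for _, r := range ranges {
		c = c.addRange(r[0], r[1])
	}
	return c
}

func main() {
	full := [][2]int{{0, 65535}}
	fmt.Println(buildIgnoringResult(full).cardinality()) // 0
	fmt.Println(buildKeepingResult(full).cardinality())  // 65536
}
```

In this model, a caller that drops the return value ends up with an empty array container for the full range, while small inputs that stay under the limit behave identically in both helpers. That would explain why such a bug only shows up on large inputs.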

lemire avatar lemire commented on May 29, 2024

Ping @jacksonrnewhouse

lemire avatar lemire commented on May 29, 2024

@richardartoul Thanks for the analysis. Let us first try to get @jacksonrnewhouse to comment.

richardartoul avatar richardartoul commented on May 29, 2024

@lemire No problem, thanks for responding so quickly!

jnewhouse avatar jnewhouse commented on May 29, 2024

Okay, found the bug. The method toArrayContainer() is unsafe if there are more than arrayDefaultMaxSize (4096) elements in it. The code tries very hard to not invoke it in such cases but there is an off-by-one error in runArrayUnionToRuns, introduced at https://github.com/RoaringBitmap/roaring/pull/312/files#diff-e78a8f6657508d16a490f61f55f8b9773edde74a3d36bc0a96686d402f4c5c31R2284. Within an interval the "length" field is really one less than the length, so we can use uint16 as the size. That function adds (previousInterval.length+1) together repeatedly, but then returns it as cardMinusOne. Switching that line to just cardMinusOne += previousInterval.length would avoid triggering the toArrayContainer() issue.

jnewhouse avatar jnewhouse commented on May 29, 2024

This only triggers when the run container and array container combine to be the full set, as the value wraps around and returns 0, causing it to try and pack the full set into an array container.
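The wrap-around can be reproduced in isolation with a minimal sketch. The names `interval16`, `buggyCardMinusOne`, and `trueCardinality` are illustrative assumptions rather than the library's actual functions; the only things taken from the discussion are that a run's length field stores one less than the number of values and that the sum is accumulated as a uint16.

```go
package main

import "fmt"

// interval16 is a simplified model of a run: length holds
// (number of values - 1), which is what lets the full range
// [0, 65535] fit in a uint16.
type interval16 struct {
	start  uint16
	length uint16
}

// buggyCardMinusOne adds (length + 1) per run yet reports the total as
// "cardinality minus one". In uint16 arithmetic a single full run wraps to 0.
func buggyCardMinusOne(runs []interval16) uint16 {
	var cardMinusOne uint16
	for _, iv := range runs {
		cardMinusOne += iv.length + 1
	}
	return cardMinusOne
}

// trueCardinality does the same sum in a wider type, so it cannot wrap.
func trueCardinality(runs []interval16) int {
	total := 0
	for _, iv := range runs {
		total += int(iv.length) + 1
	}
	return total
}

func main() {
	full := []interval16{{start: 0, length: 65535}}
	fmt.Println(buggyCardMinusOne(full)) // 0
	fmt.Println(trueCardinality(full))   // 65536

	small := []interval16{{start: 0, length: 9}}
	fmt.Println(buggyCardMinusOne(small)) // 10, although cardinality minus one is 9
}
```

A reported value of 0 is far below the 4096 threshold, so in this model a size check would wrongly conclude that the full set fits in an array container.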

jnewhouse avatar jnewhouse commented on May 29, 2024

It also needs the returned run container to be [0, 65535] so that it doesn't even populate the backing uint16 slice of the array container.

lemire avatar lemire commented on May 29, 2024

Who wants to take a crack at a PR to fix this? Thanks to @richardartoul we have the test.

richardartoul avatar richardartoul commented on May 29, 2024

I would be happy to do it, but TBH after reading @jnewhouse's explanation I think it would be better if someone familiar with the conventions in the codebase did it, since it sounds like the root cause was quite subtle.

jacksonrnewhouse avatar jacksonrnewhouse commented on May 29, 2024

Okay, I put together a fix. Sorry for introducing this bug.
