Comments (15)
We don’t apologize each time a bug is introduced.
from roaring.
@lemire I'll keep digging, but if you have an idea let me know.
from roaring.
@lemire That is unclear from the documentation, but I don't think that's quite right. If you check the implementations, FromBuffer, ReadFrom, and UnmarshalBinary all have almost exactly the same implementation. They seem to be mostly convenience wrappers for slightly different APIs (maybe you were thinking of frozenview?)
I also tried updating my reproducer to use ReadFrom and it fails the same way. Similarly, I've already tried using Or(bm1, bm2) and that fails too, so I think there is a real bug here.
from roaring.
@lemire Also I should add that I believe this bug is a regression as I only discovered it while testing the upgrade from 0.5.5 to 1.1 in our production shadows. Reverting to 0.5.5 immediately made the issue go away.
... which makes me realize we may be able to narrow down the issue with a git bisect.
from roaring.
@lemire OK, I did a quick manual git bisect and I can say with confidence that the issue was introduced in this PR: #312
Before this commit this reproducer behaves properly, afterwards it fails.
EDIT1: Specifically the implementation change to orArray
from roaring.
Ok so I debugged it further and I think the problem is something like:
1. (rc *runContainer16) orArray calls result.toArrayContainer()
2. (rc *runContainer16) toArrayContainer() calls ac.iaddRange() in a loop.
3. iaddRange may modify ac in place, or it may return a completely new bitmap container (which is completely ignored by the caller)
Something in the space between #2 and #3 is wrong though and breaks the Or functionality. My recommendation is to revert that whole PR, or at least the modification to the orArray method.
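To illustrate the pattern suspected in step 3, here is a minimal, self-contained sketch. It is not the library's code: the types arrayBox and bitmapBox, the method addRange, and the tiny maxArraySize cap are all hypothetical stand-ins, chosen only to show what happens when a method that may return a replacement container has its return value dropped.

```go
package main

import "fmt"

// container is a hypothetical stand-in for a container interface.
type container interface {
	cardinality() int
}

// arrayBox keeps values in a slice and is only valid up to maxArraySize values.
type arrayBox struct{ values []uint16 }

// bitmapBox is the fallback representation for larger sets.
type bitmapBox struct{ bits map[uint16]bool }

// maxArraySize is deliberately tiny for the demo (the thread mentions 4096).
const maxArraySize = 4

func (a *arrayBox) cardinality() int  { return len(a.values) }
func (b *bitmapBox) cardinality() int { return len(b.bits) }

// addRange adds [first, last] (assumed disjoint from existing values).
// If the result fits, it mutates a in place and returns a. Otherwise it
// returns a NEW bitmapBox and leaves a untouched.
func (a *arrayBox) addRange(first, last uint16) container {
	n := int(last-first) + 1
	if len(a.values)+n <= maxArraySize {
		for v := int(first); v <= int(last); v++ {
			a.values = append(a.values, uint16(v))
		}
		return a
	}
	b := &bitmapBox{bits: make(map[uint16]bool)}
	for _, v := range a.values {
		b.bits[v] = true
	}
	for v := int(first); v <= int(last); v++ {
		b.bits[uint16(v)] = true
	}
	return b
}

// buildIgnoringResult drops addRange's return value, so values are lost
// as soon as addRange switches to a new container.
func buildIgnoringResult(ranges [][2]uint16) container {
	a := &arrayBox{}
	for _, r := range ranges {
		a.addRange(r[0], r[1]) // return value ignored
	}
	return a
}

// buildUsingResult keeps whatever container addRange hands back.
func buildUsingResult(ranges [][2]uint16) container {
	var c container = &arrayBox{}
	for _, r := range ranges {
		switch t := c.(type) {
		case *arrayBox:
			c = t.addRange(r[0], r[1])
		case *bitmapBox:
			for v := int(r[0]); v <= int(r[1]); v++ {
				t.bits[uint16(v)] = true
			}
		}
	}
	return c
}

func main() {
	ranges := [][2]uint16{{0, 2}, {10, 12}} // 6 values in total
	fmt.Println(buildIgnoringResult(ranges).cardinality()) // prints 3
	fmt.Println(buildUsingResult(ranges).cardinality())    // prints 6
}
```

In this sketch the second range pushes the array past its cap, so the caller that ignores the return value silently keeps only the first three values.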
from roaring.
Ping @jacksonrnewhouse
from roaring.
@richardartoul Thanks for the analysis. Let us first try to get @jacksonrnewhouse to comment.
from roaring.
@lemire No problem, thanks for responding so quick!
from roaring.
Okay, found the bug. The method toArrayContainer() is unsafe if there are more than arrayDefaultMaxSize (4096) elements in it. The code tries very hard to not invoke it in such cases but there is an off-by-one error in runArrayUnionToRuns, introduced at https://github.com/RoaringBitmap/roaring/pull/312/files#diff-e78a8f6657508d16a490f61f55f8b9773edde74a3d36bc0a96686d402f4c5c31R2284. Within an interval the "length" field is really one less than the length, so we can use uint16 as the size. That function adds (previousInterval.length+1) together repeatedly, but then returns it as cardMinusOne. Switching that line to just cardMinusOne += previousInterval.length would avoid triggering the toArrayContainer() issue.
from roaring.
This only triggers when the run container and array container combine to be the full set, as the value wraps around and returns 0, causing it to try and pack the full set into an array container.
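The wraparound can be reproduced in isolation. The sketch below is not the library's code: interval16, sumWrapped, and cardinality are hypothetical names, and it assumes the running count is held in a uint16, which is what the wrap to 0 described above suggests.

```go
package main

import "fmt"

// interval16 is a hypothetical stand-in for a run. It covers
// [start, start+length], so it holds length+1 values.
type interval16 struct {
	start  uint16
	length uint16
}

// sumWrapped adds length+1 per run using uint16 arithmetic,
// so the total wraps around once it reaches 65536.
func sumWrapped(runs []interval16) uint16 {
	var total uint16
	for _, r := range runs {
		total += r.length + 1
	}
	return total
}

// cardinality computes the same sum in int, which cannot wrap here.
func cardinality(runs []interval16) int {
	total := 0
	for _, r := range runs {
		total += int(r.length) + 1
	}
	return total
}

func main() {
	full := []interval16{{start: 0, length: 65535}} // the full 16-bit set
	fmt.Println(sumWrapped(full))  // prints 0
	fmt.Println(cardinality(full)) // prints 65536

	small := []interval16{{start: 0, length: 9}} // values 0..9
	fmt.Println(sumWrapped(small)) // prints 10
}
```

A count of 0 looks like a tiny container, which would explain why the code then tries to pack the full set into an array container.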
from roaring.
It also needs the returned run container to be [0, 65535] so that it doesn't even populate the backing uint16 slice of the array container.
from roaring.
Who wants to take a crack at a PR to fix this? Thanks to @richardartoul we have the test.
from roaring.
I would be happy to do it, but TBH after reading @jnewhouse's explanation I think it would be better if someone familiar with the conventions in the codebase did it, since it sounds like the root cause was quite subtle.
from roaring.
Okay, I put together a fix. Sorry for introducing this bug.
from roaring.