GithubHelp home page GithubHelp logo

Cleanup headers about croaring HOT 25 OPEN

roaringbitmap avatar roaringbitmap commented on July 23, 2024
Cleanup headers

from croaring.

Comments (25)

lemire avatar lemire commented on July 23, 2024

Sure, but please wait a bit since I am doing some reengineering and if my work is fruitful, we could end up with a massive conflict from hell.

from croaring.

fsaintjacques avatar fsaintjacques commented on July 23, 2024

I am aware of the nasty conflicts it'll ensue and will wait for your work.

from croaring.

lemire avatar lemire commented on July 23, 2024

I'll know soon whether my idea helps or not... I am doing as fast as I can but it involves lots of code, sadly.

from croaring.

lemire avatar lemire commented on July 23, 2024

Some things to take into account...

  1. We'd like to keep the unity build in working order.
  2. The various wrappers,
    https://github.com/Ezibenroc/PyRoaringBitMap
    https://github.com/saulius/croaring-rs
    https://github.com/RoaringBitmap/gocroaring

should be preserved in working order as well.

from croaring.

lemire avatar lemire commented on July 23, 2024

Ok. I am done. Go ahead!

from croaring.

lemire avatar lemire commented on July 23, 2024

Do you have any idea when you'd like to work on this? We have this other issue (#47) which would require touching quite a bit of code. I'd like to avoid conflicts... so if you want to clean the headers soon, then we will delay work on issue 47 till you are done, otherwise, we may proceed with issue 47 first.

from croaring.

fsaintjacques avatar fsaintjacques commented on July 23, 2024

I'll try to tackle it today. I will do this in a separate branch. It is worth noting that it will probably break bindings?

from croaring.

fsaintjacques avatar fsaintjacques commented on July 23, 2024

Should I take the opportunity to change the signature of some functions, e.g. const parameters usually preceed non-const void roaring_bitmap_andnot_inplace(roaring_bitmap_t *x1, const roaring_bitmap_t *x2); would become void roaring_bitmap_andnot_inplace(const roaring_bitmap_t *x1, roaring_bitmap_t *x2);.

from croaring.

lemire avatar lemire commented on July 23, 2024

It is worth noting that it will probably break bindings?

Hopefully, it won't break the unity build.

It is probably best to aim to break as few things as possible, as a general rule, right?

separate concerns (basic operations, advanced operations, serializations) in multiple headers, such the user can only import what he needs.

That's a good idea but this does not prevent you from having a roaring.h file that offers the same API as it does now, just have it include all the subheaders so that users who do not need separation of concern can maintain the current convenience of having a single include.

hide definitions, does the user really need the full definition? We could probably hide in a forward declaration à la struct roaring_bitmap;

The inline definitions are a matter of performance. We had major performance issues due to lack of inlining. I did not move the definitions into header files for amusement.

 Remove unnecessary public that should really be private stuff.

Moving headers away from view is great. We should ensure that it does not come at the expense of performance. Being able to inline definitions is really important in some instances.

Should I take the opportunity to change the signature of some functions, e.g. const parameters usually preceed non-const

I have to disagree on this point. When we do in-place operations, the first parameter is usually the mutable one.

Here is the equivalent using operators:

x &= y

See how the x occurs before the y? I am sure that there are languages that read backward, but we are in C.

(C does not have a "andnot" operator unlike, say, Go, but it is not important.)

Here is how you'd implement it in C++:

type operator&(type & lhs, const type& rhs) {
   return lhs &= rhs;
}

See how the const parameter appears last?

from croaring.

lemire avatar lemire commented on July 23, 2024

Another wrapper (C#): https://github.com/RogueException/CRoaring.Net

from croaring.

fsaintjacques avatar fsaintjacques commented on July 23, 2024

Aren't inlining concerns fixed by proper LTO support? This small peculiarity forces some private header to be public.

I've come to understand that for a proper C library, parameter order should be: input, input-output, output. This is a matter of opinion.

I'm looking ahead, and this feels like wasted time. I'm just gonna drop this issue.

from croaring.

lemire avatar lemire commented on July 23, 2024

Aren't inlining concerns fixed by proper LTO support?

Good point. I think that this assumes that library users will rely on LTO. Otherwise, they will simply get subpar performance.

I've come to understand that for a proper C library, parameter order should be: input, input-output, output. This is a matter of opinion.

When you copy something you go

`destination = source;``

So it makes sense that we have

copy(destination, source);

Say you have a hash table and you insert something in it. Intuitively, you'd write

set[key] = value

And thus, I think you'd write
insert(hash_table *table, const int key, const int value )

I could be wrong, of course... that's just my intuition.

(You have just inspired a blog post.)

I'm looking ahead, and this feels like wasted time. I'm just gonna drop this issue.

AH!!!! I hope it is not because of the points I raised.

I will take note that we are delaying this issue.

from croaring.

lemire avatar lemire commented on July 23, 2024

Function signature: how do you order parameters? http://lemire.me/blog/2016/09/06/function-signature-how-do-you-order-parameters/

from croaring.

lemire avatar lemire commented on July 23, 2024

I did some minor cleaning... in commit 4fda6ba but it lead to a significant performance degradation (yet I just renamed the defines). I think that my experience illustrates the need to be careful.

from croaring.

lemire avatar lemire commented on July 23, 2024

I will painfully undo the commit.

from croaring.

lemire avatar lemire commented on July 23, 2024

I have undone my cleaning work. I have read and re-read the cleaned code and I do not see where it could have changed the generated code, but obviously, it did degrade the binaries quite a bit.

from croaring.

nkurz avatar nkurz commented on July 23, 2024

Which machine are you testing this on? I did upgrade the default compiler on Skylake (and maybe Haswell), and it's possible that might have effects on performance.

More generally, I think this is a reminder that performance in a high level language is fragile, and at the whim of the compiler. If we want robust performance, we probably need some way to lock things in once we have what we want.

--nate

from croaring.

lemire avatar lemire commented on July 23, 2024

@nkurz Nah. It is not a compiler issue.

With the magic of git, I can move back and forth in time and see exactly at which point the performance went down (using the same compiler and machine).

More generally, I think this is a reminder that performance in a high level language is fragile, and at the whim of the compiler. If we want robust performance, we probably need some way to lock things in once we have what we want.

My intention was just to rename the enums and the defines so that they are all prefixed by "ROARING_". This should not impact the generated binaries in the least. But it did something bad.

from croaring.

lemire avatar lemire commented on July 23, 2024

Here is the benchmark I use...

https://github.com/RoaringBitmap/CBitmapCompetition#results

(I published it so one could refer to it in the future.)

from croaring.

lemire avatar lemire commented on July 23, 2024

@nkurz There is one ill-effect from the compiler upgrade however... The sanitize mode is now broken. I doubt it is your doing per se, but I think it is related to the new compiler.

$ mkdir debug
dlemire@skylake:~/CVS/github/CRoaring$ cmake -DCMAKE_BUILD_TYPE=Debug -DSANITIZE=ON ..
CMake Error: The source directory "/home/dlemire/CVS/github" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.
dlemire@skylake:~/CVS/github/CRoaring$ cd debug/
dlemire@skylake:~/CVS/github/CRoaring/debug$ cmake -DCMAKE_BUILD_TYPE=Debug -DSANITIZE=ON ..
-- The C compiler identification is GNU 6.2.0
-- The CXX compiler identification is GNU 6.2.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- BENCHMARK_DATA_DIR: /home/dlemire/CVS/github/CRoaring/benchmarks/realdata/
-- TEST_DATA_DIR: /home/dlemire/CVS/github/CRoaring/tests/testdata/
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- CMAKE_BUILD_TYPE: Debug
-- DISABLE_AVX: OFF
-- BUILD_STATIC: OFF
-- BUILD_LTO: OFF
-- SANITIZE: ON
-- CMAKE_C_COMPILER: /usr/bin/cc
-- CMAKE_C_FLAGS: -std=c11 -fPIC -march=native  -Wall -Winline -Wshadow -Wextra -pedantic -fsanitize=address -fno-omit-frame-pointer -fsanitize=undefined
-- CMAKE_C_FLAGS_DEBUG: -ggdb
-- CMAKE_C_FLAGS_RELEASE: -O3
-- Configuring done
-- Generating done
-- Build files have been written to: /home/dlemire/CVS/github/CRoaring/debug
dlemire@skylake:~/CVS/github/CRoaring/debug$ make
Scanning dependencies of target roaring
[  3%] Building C object src/CMakeFiles/roaring.dir/array_util.c.o
[  6%] Building C object src/CMakeFiles/roaring.dir/bitset_util.c.o
[  9%] Building C object src/CMakeFiles/roaring.dir/containers/array.c.o
[ 12%] Building C object src/CMakeFiles/roaring.dir/containers/bitset.c.o
[ 16%] Building C object src/CMakeFiles/roaring.dir/containers/containers.c.o
[ 19%] Building C object src/CMakeFiles/roaring.dir/containers/convert.c.o
[ 22%] Building C object src/CMakeFiles/roaring.dir/containers/mixed_intersection.c.o
[ 25%] Building C object src/CMakeFiles/roaring.dir/containers/mixed_union.c.o
[ 29%] Building C object src/CMakeFiles/roaring.dir/containers/mixed_equal.c.o
[ 32%] Building C object src/CMakeFiles/roaring.dir/containers/mixed_negation.c.o
[ 35%] Building C object src/CMakeFiles/roaring.dir/containers/mixed_xor.c.o
[ 38%] Building C object src/CMakeFiles/roaring.dir/containers/mixed_andnot.c.o
[ 41%] Building C object src/CMakeFiles/roaring.dir/containers/run.c.o
[ 45%] Building C object src/CMakeFiles/roaring.dir/roaring.c.o
[ 48%] Building C object src/CMakeFiles/roaring.dir/roaring_priority_queue.c.o
[ 51%] Building C object src/CMakeFiles/roaring.dir/roaring_array.c.o
Linking C shared library ../libroaring.so
/usr/bin/ld: unrecognized option '--push-state'
/usr/bin/ld: use the --help option for usage information
collect2: error: ld returned 1 exit status
make[2]: *** [libroaring.so] Error 1
make[1]: *** [src/CMakeFiles/roaring.dir/all] Error 2
make: *** [all] Error 2

from croaring.

lemire avatar lemire commented on July 23, 2024

@nkurz (Thankfully I can run the tests with the sanitizer on other machines so it is not bad.)

from croaring.

nkurz avatar nkurz commented on July 23, 2024

With the magic of git, I can move back and forth in time and see exactly
at which point the performance went down (using the same compiler and
machine).

Yes, I didn't know if you'd tested the new version with the old compiler to be sure, and vice versa.

It is not a compiler issue.
...
My intention was just to rename the enums and the defines so that
they are all prefixed by "ROARING_". This should not impact the
generated binaries in the least. But it did something bad.

When you rename a variable and the performance drops, I tend to think of that as a compiler issue.

There is one ill-effect from the compiler upgrade however...
The sanitize mode is now broken. I doubt it is your doing per se,
but I think it is related to the new compiler.

Seems like Ubuntu's doing rather than mine, but I think I patched it with d5a081a. Separately, I noticed that the "make test" has failed tests on Skylake, but I presume you are already aware of this.

from croaring.

lemire avatar lemire commented on July 23, 2024

Yes, I was working on fixing a minor issue.

from croaring.

andreigudkov avatar andreigudkov commented on July 23, 2024
  1. Header files which are not part of public interface (cannot be reached from roaring.h) should be moved from include/ into src/ directory. Where possible, public interface should be reduced.
  2. Public interface should not depend on USEAVX and other similar definitions. For example, currently I see that run.h is referenced by roaring.h (roaring.h -> roaring_array.h -> containers.h -> run.h) and its body depends on USEAVX. Simple prerelase test: grep -R USEAVX include/ >/dev/null && echo "error".

from croaring.

lemire avatar lemire commented on July 23, 2024

@andreigudkov PR invited!!!

But care is needed because we have an outstanding unmerged PR.

from croaring.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.