boostorg / dynamic_bitset
Boost.org dynamic_bitset module
Home Page: http://boost.org/libs/dynamic_bitset
License: Boost Software License 1.0
The latest develop branch (commit 8e20aa1) fails to compile with a std::hash-related error when using the CI options below on AppVeyor:
To reproduce the issue, I created a redundant draft pull request that adds only a single-line comment (sorry for the noise), and the failures still occur; details can be found in these links.
The same failure has affected several recent pull requests.
Failure digest:
.\boost/container_hash/hash.hpp(669): error C2668: 'stdext::hash_value': ambiguous call to overloaded function
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include\xhash(29): note: could be 'size_t stdext::hash_value<T>(const _Kty &)' [found using argument-dependent lookup]
with
[
T=std::vector<unsigned long,std::allocator<unsigned long>>,
_Kty=std::vector<unsigned long,std::allocator<unsigned long>>
]
.\boost/container_hash/hash.hpp(406): note: or 'unsigned int boost::hash_value<T>(const T &)'
with
[
T=std::vector<unsigned long,std::allocator<unsigned long>>
]
.\boost/container_hash/hash.hpp(669): note: while trying to match the argument list '(const T)'
with
[
T=std::vector<unsigned long,std::allocator<unsigned long>>
]
In #25 support was added for hardware-assisted popcount.
In #33 it was noted that the implementation for MSVC failed to check for CPU support, and the MSVC documentation states the behavior is undefined in this case.
In #35 the MSVC implementation was disabled to resolve #33, which leaves the library performance in this respect the same as it was in 1.68.0.
This issue is a placeholder to come back around and clean up the implementation so it can leverage hardware assist safely with MSVC. There are some thoughts in the comments for #33 regarding alternate implementations that are useful when making a decision here.
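A minimal sketch of what a guarded implementation could look like, assuming an x86 target. The names popcount_fallback, cpu_has_popcnt and popcount_checked are hypothetical and are not part of the library; the check reads CPUID leaf 1, ECX bit 23, which is the POPCNT feature flag.

```cpp
#include <cstdint>
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>
#endif

// Portable fallback: clears the lowest set bit on each iteration.
inline int popcount_fallback(std::uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}

// Hypothetical helper: true if the CPU reports POPCNT support.
inline bool cpu_has_popcnt() {
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int info[4] = {0, 0, 0, 0};
    __cpuid(info, 1);                      // leaf 1: feature flags
    return (info[2] & (1 << 23)) != 0;     // ECX bit 23 = POPCNT
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("popcnt") != 0;
#else
    return false;                          // unknown target: stay on the fallback
#endif
}

// Uses the intrinsic only when the runtime check succeeded.
inline int popcount_checked(std::uint64_t x) {
    static const bool hw = cpu_has_popcnt();   // evaluated once
#if defined(_MSC_VER) && defined(_M_X64)
    if (hw) return static_cast<int>(__popcnt64(x));
#elif defined(__GNUC__) || defined(__clang__)
    if (hw) return __builtin_popcountll(x);
#endif
    (void)hw;
    return popcount_fallback(x);
}
```

The per-call branch on a cached flag is only one option; the comments in #33 discuss alternatives, and the extra branch would need to be measured against the plain software loop before committing to it.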
result_type result = 0;
3. assignment: Assigning: i = 0UL.
4. Condition i <= last_block, taking true branch.
6. incr: Incrementing i. The value of i is now 1.
7. Condition i <= last_block, taking true branch.
1319 for (size_type i = 0; i <= last_block; ++i) {
8. assignment: Assigning: offset = i * 64UL. The value of offset is now 64.
1320 const size_type offset = i * bits_per_block;
CID 120258 (#1-3 of 3): Bad bit shift operation (BAD_SHIFT)
9. large_shift: In expression static_cast<boost::dynamic_bitset::to_ulong() const::result_type (instance 42)>(this->m_bits[i]) << offset, left shifting by more than 63 bits has undefined behavior. The shift amount, offset, is 64.
1321 result |= (static_cast<result_type>(m_bits[i]) << offset);
5. Jumping back to the beginning of the loop.
1322 }
1323
1324 return result;
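One way to make the shift provably in range is to stop as soon as the offset reaches the width of the result type. The sketch below works on a plain std::vector of blocks instead of the real class; blocks_to_ulong is a hypothetical name. Unlike to_ulong(), it silently ignores higher blocks instead of reporting overflow.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the loop in to_ulong(): combines blocks into an
// unsigned long without ever shifting by the full width of the result type.
template <typename Block>
unsigned long blocks_to_ulong(const std::vector<Block>& blocks) {
    const std::size_t bits_per_block = std::numeric_limits<Block>::digits;
    const std::size_t result_bits = std::numeric_limits<unsigned long>::digits;
    unsigned long result = 0;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        const std::size_t offset = i * bits_per_block;
        if (offset >= result_bits)
            break;  // shifting by offset would be undefined behavior
        result |= static_cast<unsigned long>(blocks[i]) << offset;
    }
    return result;
}
```

With Block = unsigned long the loop body runs only for the first block, so the shift amount Coverity flags (offset equal to the width of the result type) is never reached.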
See the mailing list discussion thread; it covers many related points.
I was working with the Boost.DynamicBitset library and found that it doesn't use hardware-supported fast bit operations where it could, apparently because some of the functions were written about 15 years ago.
I found 2 bottlenecks so far:
__builtin_ctz(x) (on GCC).
I've made a pull request that addresses the first issue using compile-time selection. If the block type is smaller than what the popcount functions need, it uses the old approach (otherwise it obviously works slower).
I tested the builtin popcount function with example/timing_tests.cpp and got these results: before, after. As you can see, long types run faster. Testing only the .count(x) function (without creating many objects) shows a 2x speedup.
My computer runs Debian 9; its CPU is an AMD E1-2500 (old, weak mid-2013 hardware).
TODO:
I don't have Windows to test the code (I wrote the Windows code in the PR by looking at other libraries' code and using common sense). Apparently it needs runtime checks that call __cpuid. From the docs: "To determine hardware support <...> If you run code that uses this intrinsic on hardware that does not support the popcnt instruction, the results are unpredictable." I hope online build systems have different Windows systems to test on; otherwise I'll need help from people who have Windows.
Inline the lowest_bit function if possible. (The second bottleneck, after the first is verified.)
Have the compile-time logic proofread by people with rich experience in such things, who can spot a mistake like "On a Penryn, __AVX__ will give a false negative."
Avoid copy-pasting preprocessor directives - it's better to keep a short alias somewhere at the top of a header file.
Discussion is welcome.
example/Jamfile uses wrong char for comments
When using Boost v1.69, dynamic_bitset::count() will crash on CPUs without SSE4 (required for popcount) when compiled with MSVC v15.9.4 and run on Windows 7. The project was compiled as Release x64.
Attached is example project to recreate: PopcountCrash.zip
I've had a quick look at the PR, the issue, and the linked mailing list discussion.
It appears there was some concern about handling hardware support for MSVC, but I can't see anything in the PR or commit that would handle CPUs without SSE4 (i.e. a __cpuid check).
I'm not sure if this is a false positive or not.
Environment: Using docker build environment from boostorg/boost#184
If I build with valgrind I end up with some errors in test 4, for example:
boost@2525d5b3fe47:/boost/libs/dynamic_bitset/test$ VALGRIND_OPTS=--error-exitcode=1 ../../../b2 toolset=gcc variant=debug testing.launcher=valgrind
==298== Conditional jump or move depends on uninitialised value(s)
==298== at 0x590F3FA: __wmemchr_avx2 (memchr-avx2.S:66)
==298== by 0x4EEC94C: std::codecvt<wchar_t, char, __mbstate_t>::do_out(__mbstate_t&, wchar_t const*, wchar_t const*, wchar_t const*&, char*, char*, char*&) const (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==298== by 0x4F308A5: std::basic_filebuf<wchar_t, std::char_traits<wchar_t> >::_M_convert_to_external(wchar_t*, long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==298== by 0x4F30D21: std::basic_filebuf<wchar_t, std::char_traits<wchar_t> >::overflow(unsigned int) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==298== by 0x4F30A44: std::basic_filebuf<wchar_t, std::char_traits<wchar_t> >::_M_terminate_output() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==298== by 0x4F34A60: std::basic_filebuf<wchar_t, std::char_traits<wchar_t> >::close() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==298== by 0x4F36940: std::basic_ofstream<wchar_t, std::char_traits<wchar_t> >::close() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==298== by 0x11F8C9: void bitset_test<boost::dynamic_bitset<unsigned char, std::allocator<unsigned char> > >::stream_inserter<std::basic_ofstream<wchar_t, std::char_traits<wchar_t> > >(boost::dynamic_bitset<unsigned char, std::allocator<unsigned char> > const&, std::basic_ofstream<wchar_t, std::char_traits<wchar_t> >&, char const*) (bitset_test.hpp:1217)
==298== by 0x115DC6: void run_test_cases<unsigned char>() (dyn_bitset_unit_tests4.cpp:147)
==298== by 0x112936: test_main(int, char**) (dyn_bitset_unit_tests4.cpp:326)
==298== by 0x113BD1: boost::minimal_test::caller::operator()() (minimal.hpp:111)
==298== by 0x14102E: boost::detail::function::function_obj_invoker0<boost::minimal_test::caller, int>::invoke(boost::detail::function::function_buffer&) (function_template.hpp:138)
and
==298== Conditional jump or move depends on uninitialised value(s)
==298== at 0x590F617: __wmemchr_avx2 (memchr-avx2.S:282)
==298== by 0x4EE8B54: std::basic_istream<wchar_t, std::char_traits<wchar_t> >& std::getline<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >(std::basic_istream<wchar_t, std::char_traits<wchar_t> >&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >&, wchar_t) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
==298== by 0x12D6AE: void bitset_test<boost::dynamic_bitset<unsigned int, std::allocator<unsigned int> > >::stream_extractor<std::__cxx11::basic_istringstream<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >(boost::dynamic_bitset<unsigned int, std::allocator<unsigned int> >&, std::__cxx11::basic_istringstream<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >&, std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >&) (bitset_test.hpp:1374)
==298== by 0x119CAD: void run_test_cases<unsigned int>() (dyn_bitset_unit_tests4.cpp:305)
==298== by 0x112940: test_main(int, char**) (dyn_bitset_unit_tests4.cpp:328)
==298== by 0x113BD1: boost::minimal_test::caller::operator()() (minimal.hpp:111)
==298== by 0x14102E: boost::detail::function::function_obj_invoker0<boost::minimal_test::caller, int>::invoke(boost::detail::function::function_buffer&) (function_template.hpp:138)
==298== by 0x11D7E6: boost::function0<int>::operator()() const (function_template.hpp:769)
==298== by 0x114216: int boost::detail::do_invoke<boost::shared_ptr<boost::detail::translator_holder_base>, boost::function<int ()> >(boost::shared_ptr<boost::detail::translator_holder_base> const&, boost::function<int ()> const&) (execution_monitor.ipp:285)
==298== by 0x10F60B: boost::execution_monitor::catch_signals(boost::function<int ()> const&) (execution_monitor.ipp:874)
==298== by 0x10F78F: boost::execution_monitor::execute(boost::function<int ()> const&) (execution_monitor.ipp:1213)
==298== by 0x11245D: main (minimal.hpp:130)
If I add define=BOOST_DYNAMIC_BITSET_NO_WCHAR_T_TESTS then the problems go away. This looks like it might be an issue in libstdc++ with wchar_t support but I'm not sure. For now the valgrind tests are going to run without wide character tests. It's better than having none at all.
Ok, it can be used, but not in any reasonably efficient manner.
A natural way to use a bitset in a parallel program is to work on sections of the bitset.
This requires being able to see a "view" of the bitset.
An option to create a read-only view into a bitset in O(1) time would suffice in a lot of cases.
Consider the case where a set of bitsets is processed in parallel and various sections of the bitsets are processed. In such a scenario, the threads can have local bitsets as storage, but would need to read slices of shared bitsets.
Writing back to a bitset in parallel could be more problematic, but a slice API that snaps to the closest block size could be used.
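One possible shape for such a read-only view, sketched over a plain std::vector<std::uint64_t> instead of dynamic_bitset itself. const_bit_view and its members are hypothetical names, not an existing or proposed library API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical read-only view over a range of bits in shared block storage.
// Construction is O(1): it stores a pointer, a starting bit and a length.
class const_bit_view {
public:
    const_bit_view(const std::vector<std::uint64_t>& blocks,
                   std::size_t first_bit, std::size_t num_bits)
        : data_(blocks.data()), first_(first_bit), size_(num_bits) {}

    std::size_t size() const { return size_; }

    // Reads bit pos of the view; pos must be less than size().
    bool test(std::size_t pos) const {
        const std::size_t bit = first_ + pos;
        return ((data_[bit / 64] >> (bit % 64)) & 1u) != 0;
    }

private:
    const std::uint64_t* data_;  // not owned; storage must outlive the view
    std::size_t first_;
    std::size_t size_;
};
```

Because the view never writes, several threads can each hold views into the same storage as long as nobody modifies or reallocates that storage concurrently.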
The current implementation of is_proper_subset_of can be slightly improved by breaking out of the loop after proper has been set to true. From that point forward, the remaining words only have to be checked for the subset relation.
If the proper part is found, on average, after half of the currently used words, then in the remaining half of the words only half the number of tests need to be done. So the savings would on average amount to a quarter of the number of if-statements on random data.
The code would look like:
template <typename Block, typename Allocator>
bool dynamic_bitset<Block, Allocator>::
is_proper_subset_of(const dynamic_bitset<Block, Allocator>& a) const
{
    assert(size() == a.size());
    assert(num_blocks() == a.num_blocks());
    size_type i = 0;
    // First pass: check the subset relation until a block proves "proper".
    for (/* init-statement before loop */; i < num_blocks(); ++i) {
        if (m_bits[i] & ~a.m_bits[i])
            return false; // not a subset at all
        if (a.m_bits[i] & ~m_bits[i])
            break; // proper
    }
    if (i == num_blocks())
        return false; // not proper, because break-statement not hit
    ++i; // the break-statement short-circuited the increment
    // Second pass: only the subset relation remains to be checked.
    for (/* re-use i from previous loop */; i < num_blocks(); ++i)
        if (m_bits[i] & ~a.m_bits[i])
            return false; // not a subset
    return true;
}
If a PR along these lines would be welcome, then I'd be happy to submit one.
[phrase library..[@/libs/dynamic_bitset/ Dynamic Bitset]:]
Hello,
Consider the following code:
#include <boost/dynamic_bitset.hpp>
#include <iostream>
int main() {
    auto bitset = boost::dynamic_bitset<>(10, true);
    std::cout << bitset << std::endl;
    return 0;
}
The result I expected:
1111111111
But the result is
0000000001
Is this behavior normal?
On Debian with libboost1.74-dev:amd64.
Thanks in advance
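For context: the second constructor argument of dynamic_bitset is an unsigned long value used to initialize the low bits, not a fill flag, so true converts to 1 and only bit 0 is set. std::bitset behaves the same way for its integer constructor, so the sketch below uses it as a standard-library stand-in; from_bool and all_ones are hypothetical helper names. With dynamic_bitset, calling set() after constructing with a size is the usual way to get all ones.

```cpp
#include <bitset>
#include <string>

// The bool is converted to the integer 1, so only the lowest bit is set.
std::string from_bool() {
    return std::bitset<10>(true).to_string();
}

// Setting every bit explicitly gives the all-ones pattern.
std::string all_ones() {
    std::bitset<10> b;
    b.set();
    return b.to_string();
}
```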
After introducing #40, this test started failing.
It did not fail on trusty with gcc 4.8 from the ubuntu-toolchain-r-test.
It fails on xenial with gcc 4.8.5 (part of xenial).
It does not fail on anything else. I suspect the xenial environment has an issue.
The file has a comment in it regarding leveraging somewhat undocumented behavior.
I have allowed this job to fail in CI for the time being.
Here's a build where it fails:
https://travis-ci.org/boostorg/dynamic_bitset/jobs/523097058
Here's the failure output:
gcc.link ../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4
testing.capture-output ../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.run
====== BEGIN OUTPUT ======
../../boost/test/minimal.hpp(139): exception "std::ios_base::failure[abi:cxx11]: basic_ios::clear: iostream error" caught in function: 'int main(int, char**)'
**** Testing aborted.
**** 1 error detected
EXIT STATUS: 201
====== END OUTPUT ======
LD_LIBRARY_PATH="/home/travis/build/boostorg/boost-root/bin.v2/libs/filesystem/build/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden:/home/travis/build/boostorg/boost-root/bin.v2/libs/system/build/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden:/usr/bin:/usr/lib:/usr/lib32:/usr/lib64:$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH
status=0
if test $status -ne 0 ; then
echo Skipping test execution due to testing.execute=off
exit 0
fi
"../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4" > "../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.output" 2>&1 < /dev/null
status=$?
echo >> "../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.output"
echo EXIT STATUS: $status >> "../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.output"
if test $status -eq 0 ; then
cp "../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.output" "../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.run"
fi
verbose=0
if test $status -ne 0 ; then
verbose=1
fi
if test $verbose -eq 1 ; then
echo ====== BEGIN OUTPUT ======
cat "../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.output"
echo ====== END OUTPUT ======
fi
exit $status
...failed testing.capture-output ../../bin.v2/libs/dynamic_bitset/test/dyn_bitset_unit_tests4.test/gcc-4.8/release/cxxstd-03-iso/threading-multi/visibility-hidden/dyn_bitset_unit_tests4.run...
Details: #67
There are a few methods that do not need to be public, perhaps, so according to @glenfe :
Compiling the unit tests with gcc -W -Wall -Werror produces errors like:
../../../boost/dynamic_bitset/dynamic_bitset.hpp:1953:57: error: left shift of negative value [-Werror=shift-negative-value]
block_type const mask = (~static_cast<block_type>(0) << extra_bits);
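The warning most likely comes from integer promotion: when the block type is narrower than int (for example unsigned char), the operand of ~ is promoted to int, the complement is -1, and that negative value is then shifted. A sketch of one way to avoid it, assuming extra_bits is smaller than the block width; high_bits_mask is a hypothetical name and not the fix adopted by the library.

```cpp
#include <cstddef>

// Builds a mask with the low extra_bits bits cleared and the rest set.
template <typename Block>
Block high_bits_mask(std::size_t extra_bits) {
    // Cast the complement back to Block first, so the value that gets
    // shifted is non-negative even after promotion to int.
    const Block all_ones = static_cast<Block>(~static_cast<Block>(0));
    return static_cast<Block>(all_ones << extra_bits);
}
```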
Hi,
I am using Boost 1.66.0 with gcc/g++ 7, and I found that std::vector<boost::dynamic_bitset<>> does not free its memory on destruction. Calling swap(), resize(), or reset() does not help either.
The test program is shown below:
int main()
{
    cout << "[ memory start] " << getMemoryUsage();
    {
        std::vector<boost::dynamic_bitset<>> m_nodes;
        m_nodes.resize(1000000 + 1);
        uint64_t descriptor[4] = {1, 2, 3, 4};
        for(size_t i = 0; i < m_nodes.size(); i++)
        {
            m_nodes[i] = boost::dynamic_bitset<>(descriptor, descriptor + 4);
        }
    }
    cout << "[ memory end] " << getMemoryUsage() << endl;
}
The result is:
[ memory start ] 13
[ memory end ] 59
As the result shows, the memory used by boost::dynamic_bitset is not released, which is the problem.
Could you please confirm this and tell me how to solve it?
I want to use dynamic_bitset with boost::container::small_vector, but currently this is not possible because the buffer type cannot be changed.
Other adapter containers such as boost::container::flat_set allow passing an "allocator or container" template argument.
This modification can be made for dynamic_bitset too, in a backward-compatible way.
clang-linux.compile.c++ ../../../../bin.v2/libs/spirit/test/qi/qi_range_run.test/clang-linux-12/release/cxxstd-11-iso/stdlib-libc++/threading-multi/visibility-hidden/range_run.o
In file included from range_run.cpp:12:
In file included from ../../../../boost/dynamic_bitset.hpp:15:
../../../../boost/dynamic_bitset/dynamic_bitset.hpp:111:20: error: definition of implicit copy constructor for 'reference' is deprecated because it has a user-declared copy assignment operator [-Werror,-Wdeprecated-copy]
reference& operator=(const reference& rhs) { do_assign(rhs); return *this; } // for b[i] = b[j]
^
../../../../boost/dynamic_bitset/dynamic_bitset.hpp:306:16: note: in implicit copy constructor for 'boost::dynamic_bitset<unsigned int>::reference' first required here
return reference(m_bits[block_index(pos)], bit_index(pos));
^
range_run.cpp:67:13: note: in instantiation of member function 'boost::dynamic_bitset<unsigned int>::operator[]' requested here
bset[j-const_min] = set;
^
range_run.cpp:193:9: note: in instantiation of function template specialization 'acid_test<char>' requested here
acid_test<char>();
^
[phrase library..[@/libs/dynamic_bitset/ Dynamic Bitset]:]
I need the ability to cache computations made from various bitsets. However, dynamic_bitset is not usable as a key right now as std::hash does not support it. Please add support for this.
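Until such support exists, a user-side hasher is one workaround. The sketch below hashes a hypothetical stand-in type (block_bitset: a bit count plus its blocks) using a hash_combine-style mix; block_bitset and block_bitset_hash are illustrative names only. With the real class, the blocks could presumably be obtained through boost::to_block_range.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a bitset: a bit count plus its storage blocks.
struct block_bitset {
    std::size_t num_bits;
    std::vector<unsigned long> blocks;

    bool operator==(const block_bitset& other) const {
        return num_bits == other.num_bits && blocks == other.blocks;
    }
};

// Hasher that mixes the bit count and every block into one value.
struct block_bitset_hash {
    std::size_t operator()(const block_bitset& b) const {
        std::size_t seed = std::hash<std::size_t>()(b.num_bits);
        for (std::size_t i = 0; i < b.blocks.size(); ++i) {
            // hash_combine-style mixing step
            seed ^= std::hash<unsigned long>()(b.blocks[i])
                    + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

// Example cache type keyed by the stand-in bitset.
typedef std::unordered_map<block_bitset, int, block_bitset_hash> bitset_cache;
```

Including the bit count in the hash keeps bitsets of different lengths but identical blocks from hashing alike by construction.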