Comments (3)
FWIW, it isn't just the overhead of wasted space that is a potential downside here. Bumpalo also supports use cases where, if care is taken on the types of things allocated within the arena, you can read the allocated chunks out from the arena and use them directly for things like zero-copy serialization/deserialization. Inserting undefined padding bytes can break that use case.
I would expect that, for scenarios where we are allocating a type of a particular, fixed layout (e.g. using bump.alloc(value) and not bump.alloc_layout(some_dynamic_layout)), the cost of aligning the bump pointer would be minimal. Have you dug into any disassemblies to confirm that there isn't something somewhere that is not being inlined or something like that?
That said, I suppose shaving off ~2 instructions from a sequence of 10-15 instructions could very well lead to a 12% speed up.
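To make the cost under discussion concrete, here is a minimal sketch of the downward bump-and-align step, modeled on plain `usize` addresses rather than real pointers. The function names are hypothetical and are not bumpalo's API. The point it illustrates is that when the type is statically known, the size and alignment are compile-time constants, so the alignment step can reduce to a subtract followed by a single mask.

```rust
use std::mem::{align_of, size_of};

/// Round `addr` down to a multiple of `align`.
/// `align` must be a power of two, as layout alignments always are.
#[inline]
fn round_down_to(addr: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    addr & !(align - 1)
}

/// Downward bump for a statically known type (hypothetical helper).
/// `size_of::<T>()` and `align_of::<T>()` are constants here, so the
/// compiler can fold the mask into an immediate operand.
#[inline]
fn bump_down_for<T>(addr: usize) -> usize {
    round_down_to(addr.wrapping_sub(size_of::<T>()), align_of::<T>())
}

/// Downward bump for a layout only known at run time (hypothetical helper).
/// The mask has to be computed from `align` on every call.
#[inline]
fn bump_down_dyn(addr: usize, size: usize, align: usize) -> usize {
    round_down_to(addr.wrapping_sub(size), align)
}
```

Whether the compiler actually emits the shorter sequence depends on inlining, which is why checking the disassembly is still worthwhile.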
I'm not necessarily opposed to something like this, but at the same time I don't want to proliferate cargo features endlessly. They have a maintenance cost, especially with regard to testing and correctness (as you noted). One thing I'd like to see worked out before going too much deeper into this would be that story.
from bumpalo.
Those are all good considerations. Though, as I am sure you already know, using a minimum alignment does not preclude reading allocated chunks directly; it just means that if you do so, all allocations need to meet the minimum alignment.
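As a rough illustration of where padding would and would not appear, the sketch below uses the standard library's `Layout::align_to` and `Layout::pad_to_align` to estimate how many bytes one allocation occupies when every allocation is held to a minimum alignment. The helper names `footprint` and `padding_for` are made up for this example, and the model assumes the bump pointer is already aligned to `min_align` and the type's own alignment does not exceed it.

```rust
use std::alloc::Layout;
use std::mem::size_of;

/// Bytes consumed by one allocation of `layout` when every allocation is
/// raised to at least `min_align` (hypothetical model, not bumpalo code).
fn footprint(layout: Layout, min_align: usize) -> usize {
    layout
        .align_to(min_align)
        .expect("min_align must be a nonzero power of two")
        .pad_to_align()
        .size()
}

/// Padding bytes that a `T` allocation would carry under that model.
fn padding_for<T>(min_align: usize) -> usize {
    footprint(Layout::new::<T>(), min_align) - size_of::<T>()
}
```

Under this model a `u8` carries 7 bytes of padding with a minimum alignment of 8, while a `u64` carries none, which matches the observation that chunks stay directly readable as long as every allocated type already meets the minimum alignment.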
For maintainability, I was thinking we could define a constant that specifies the minimum alignment.
#[cfg(feature = "min_align")]
pub const MIN_ALIGN: usize = core::mem::align_of::<usize>();
#[cfg(not(feature = "min_align"))]
pub const MIN_ALIGN: usize = core::mem::align_of::<u8>();
The only spot in non-test code that we would change is in try_alloc_layout_fast, from this:
let ptr = ptr.wrapping_sub(layout.size());
let aligned_ptr = round_mut_ptr_down_to(ptr, layout.align());
to this:
#[cfg(not(feature = "min_align"))]
let aligned_ptr = {
    let ptr = ptr.wrapping_sub(layout.size());
    round_mut_ptr_down_to(ptr, layout.align())
};
#[cfg(feature = "min_align")]
let aligned_ptr = {
    let size = (layout.size() + MIN_ALIGN - 1) & !(MIN_ALIGN - 1);
    let ptr = ptr.sub(size);
    if layout.align() > MIN_ALIGN {
        round_mut_ptr_down_to(ptr, layout.align())
    } else {
        ptr
    }
};
Since the old behavior is equivalent to MIN_ALIGN = 1, we would make the tests and quickcheck "MIN_ALIGN aware" using that constant, so they could handle any valid value. In practice, though, it would only be 1 or word sized.
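One possible shape for a "MIN_ALIGN aware" check is sketched below. It models the two code paths on plain `usize` addresses and takes the minimum alignment as a run-time parameter so that one check covers every valid value. All names are hypothetical, and this is not bumpalo's actual test or quickcheck code. It assumes the starting bump address is itself aligned to `min_align`.

```rust
/// Model of the existing path: subtract the size, then round down.
fn alloc_old(bump: usize, size: usize, align: usize) -> usize {
    (bump - size) & !(align - 1)
}

/// Model of the proposed path with a minimum alignment.
fn alloc_min_align(bump: usize, size: usize, align: usize, min_align: usize) -> usize {
    // Round the size up so the bump address stays `min_align`-aligned.
    let size = (size + min_align - 1) & !(min_align - 1);
    let ptr = bump - size;
    if align > min_align { ptr & !(align - 1) } else { ptr }
}

/// Chain many allocations and check the invariants after each one.
fn check_min_align(min_align: usize) -> bool {
    assert!(min_align.is_power_of_two());
    let mut bump: usize = 1 << 16; // aligned to every `min_align` tried here
    for size in 0..=33usize {
        for shift in 0..=6 {
            let align = 1usize << shift;
            let ptr = alloc_min_align(bump, size, align, min_align);
            // The result honors both the requested and the minimum alignment.
            let aligned = ptr % align.max(min_align) == 0;
            // The allocation lies entirely below the previous bump address.
            let fits = ptr + size <= bump;
            // With a minimum alignment of 1 the two paths must agree.
            let matches_old = min_align != 1 || ptr == alloc_old(bump, size, align);
            if !(aligned && fits && matches_old) {
                return false;
            }
            bump = ptr;
        }
    }
    true
}
```

A real test would presumably read the crate's constant instead of taking a parameter, but parameterizing the model makes it easy to confirm that the old behavior is the `min_align == 1` special case.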
However, this feature does rely on the compiler constant-propagating the layout. If I use alloc_layout and black_box the Layout so that it can't be constant-propagated, we see about a 15% performance regression with the min_align feature.
fn alloc_layout<T: Default>(n: usize) {
    let arena = bumpalo::Bump::with_capacity(n * std::mem::size_of::<T>());
    for _ in 0..n {
        let arena = black_box(&arena);
        let layout = std::alloc::Layout::new::<T>();
        let ptr = arena.alloc_layout(black_box(layout));
        unsafe { ptr.as_ptr().write(Default::default()) };
        black_box(ptr);
    }
}
This is because the compiler is not able to inline layout.size() or layout.align() and so can't eliminate the extra code. I expect that the layout not getting inlined would be rare in the real world, but it is a consideration nonetheless. The performance gain here relies on the compiler being able to make that optimization.
> the only spot in non-test code that we would change is in try_alloc_layout_fast from this:
I think this could benefit from being factored out into two different versions of a helper function that is marked #[inline].
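A minimal sketch of that factoring might look like the following. The helper name `bump_down`, the check `meets_alignment`, and the local `round_mut_ptr_down_to` stand-in are assumptions for illustration, not the crate's actual code. The `min_align` variant here uses `wrapping_sub` instead of the `ptr.sub(size)` from the proposal so the sketch stays in safe code.

```rust
use std::alloc::Layout;

// Feature-selected minimum alignment, as proposed earlier in the thread.
#[cfg(feature = "min_align")]
const MIN_ALIGN: usize = core::mem::align_of::<usize>();
#[cfg(not(feature = "min_align"))]
const MIN_ALIGN: usize = core::mem::align_of::<u8>();

/// Local stand-in for the crate's pointer-rounding helper.
#[inline]
fn round_mut_ptr_down_to(ptr: *mut u8, align: usize) -> *mut u8 {
    debug_assert!(align.is_power_of_two());
    ptr.wrapping_sub(ptr as usize & (align - 1))
}

/// Default variant: subtract the size, then always round down.
#[cfg(not(feature = "min_align"))]
#[inline]
fn bump_down(ptr: *mut u8, layout: Layout) -> *mut u8 {
    round_mut_ptr_down_to(ptr.wrapping_sub(layout.size()), layout.align())
}

/// `min_align` variant: round the size up, and only realign the pointer
/// when the layout asks for more than the minimum alignment.
#[cfg(feature = "min_align")]
#[inline]
fn bump_down(ptr: *mut u8, layout: Layout) -> *mut u8 {
    let size = (layout.size() + MIN_ALIGN - 1) & !(MIN_ALIGN - 1);
    let ptr = ptr.wrapping_sub(size);
    if layout.align() > MIN_ALIGN {
        round_mut_ptr_down_to(ptr, layout.align())
    } else {
        ptr
    }
}

/// Check usable by tests under either configuration.
fn meets_alignment(ptr: *mut u8, layout: Layout) -> bool {
    ptr as usize % layout.align().max(MIN_ALIGN) == 0
}
```

With this shape the call site in the fast path would need no `cfg` attributes of its own, since both variants share one name and signature.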
> make the tests and quickcheck "MIN_ALIGN aware"
This sounds good to me.