lumol-org / soa-derive

Array of Struct to Struct of Array helpers in Rust

License: Apache License 2.0

Rust 99.97% HTML 0.03%
hacktoberfest rust data-oriented-programming

soa-derive's Introduction

Automatic Struct of Array generation for Rust

This crate provides a custom derive (#[derive(StructOfArray)]) that automatically generates code from a given struct T, allowing you to replace Vec<T> with a struct of arrays. For example, the following code

#[derive(StructOfArray)]
pub struct Cheese {
    pub smell: f64,
    pub color: (f64, f64, f64),
    pub with_mushrooms: bool,
    pub name: String,
}

will generate a CheeseVec struct that looks like this:

pub struct CheeseVec {
    pub smell: Vec<f64>,
    pub color: Vec<(f64, f64, f64)>,
    pub with_mushrooms: Vec<bool>,
    pub name: Vec<String>,
}

It will also generate the same functions that a Vec<Cheese> would have, and a few helper structs: CheeseSlice, CheeseSliceMut, CheeseRef and CheeseRefMut, corresponding respectively to &[Cheese], &mut [Cheese], &Cheese and &mut Cheese.

Any struct that derives StructOfArray automatically implements the StructOfArray trait. You can use <Cheese as StructOfArray>::Type instead of the explicitly named type CheeseVec.

How to use it

Add #[derive(StructOfArray)] to each struct for which you want a struct of arrays version. If you need the helper structs to derive additional traits (such as Debug or PartialEq), you can add an attribute #[soa_derive(Debug, PartialEq)] to the struct declaration.

#[derive(Debug, PartialEq, StructOfArray)]
#[soa_derive(Debug, PartialEq)]
pub struct Cheese {
    pub smell: f64,
    pub color: (f64, f64, f64),
    pub with_mushrooms: bool,
    pub name: String,
}

If you want to add an attribute to a specific generated struct (such as #[cfg_attr(test, derive(PartialEq))] on CheeseVec), you can add an attribute #[soa_attr(Vec, cfg_attr(test, derive(PartialEq)))] to the struct declaration.

#[derive(Debug, PartialEq, StructOfArray)]
#[soa_attr(Vec, cfg_attr(test, derive(PartialEq)))]
pub struct Cheese {
    pub smell: f64,
    pub color: (f64, f64, f64),
    pub with_mushrooms: bool,
    pub name: String,
}

Mappings from the first argument of soa_attr to the generated struct for Cheese:

  • Vec => CheeseVec
  • Slice => CheeseSlice
  • SliceMut => CheeseSliceMut
  • Ref => CheeseRef
  • RefMut => CheeseRefMut
  • Ptr => CheesePtr
  • PtrMut => CheesePtrMut

Usage and API

All the generated code comes with generated documentation, so you should be able to run cargo doc on your crate and see the documentation for all the generated structs and functions. Most of the time, you should be able to replace Vec<Cheese> with CheeseVec, with the exception of code that indexes directly into the vector and a few other caveats listed below.

Caveats and limitations

Vec<T> functionality relies heavily on references and the automatic deref feature, both for getting methods from [T] and for indexing. But the SoA vector generated by this crate (let's call it CheeseVec, generated from the Cheese struct) cannot implement Deref<Target=CheeseSlice>, because Deref is required to return a reference, and CheeseSlice is not a reference. The same applies to the Index and IndexMut traits, which cannot return CheeseRef/CheeseRefMut. This means that we cannot index into a CheeseVec, and that a few functions are duplicated, or require a call to as_ref()/as_mut() to change the type used.
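
To make the Deref limitation concrete, here is a minimal hand-written sketch. It does not use the derive: CheeseVec and CheeseSlice are simplified stand-ins for the generated types, and the method name as_slice and the helper slice_len_example are assumptions made for this illustration only.

```rust
// Simplified, hand-written stand-ins for the generated types (assumption:
// the real generated code has more fields and many more methods).
pub struct CheeseVec {
    pub smell: Vec<f64>,
    pub name: Vec<String>,
}

// The "slice" of a CheeseVec is a struct holding one slice per field.
pub struct CheeseSlice<'a> {
    pub smell: &'a [f64],
    pub name: &'a [String],
}

impl CheeseVec {
    // `Deref::deref` must return `&Self::Target`, i.e. a reference to a value
    // that already lives inside `self`. No `CheeseSlice` is stored in a
    // `CheeseVec`, so it has to be built on demand and returned by value.
    pub fn as_slice(&self) -> CheeseSlice<'_> {
        CheeseSlice {
            smell: &self.smell,
            name: &self.name,
        }
    }
}

impl<'a> CheeseSlice<'a> {
    pub fn len(&self) -> usize {
        self.smell.len()
    }
}

// Slice methods are reached through the explicit conversion, not auto-deref.
pub fn slice_len_example() -> usize {
    let vec = CheeseVec {
        smell: vec![1.0, 2.0],
        name: vec!["stilton".to_string(), "brie".to_string()],
    };
    vec.as_slice().len()
}
```

The explicit conversion call is the price paid for returning a struct of slices instead of a reference.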

Iteration

It is possible to iterate over the values in a CheeseVec:

let mut vec = CheeseVec::new();
vec.push(Cheese::new("stilton"));
vec.push(Cheese::new("brie"));

for cheese in vec.iter() {
    // when iterating over a CheeseVec, we load all members from memory
    // in a CheeseRef
    let typeof_cheese: CheeseRef = cheese;
    println!("this is {}, with a smell power of {}", cheese.name, cheese.smell);
}

One of the main advantages of the SoA layout is the ability to load only some fields from memory when iterating over the vector. To do so, one can manually pick the needed fields:

for name in &vec.name {
    // We get references to the names
    let typeof_name: &String = name;
    println!("got cheese {}", name);
}

In order to iterate over multiple fields at the same time, one can use the soa_zip! macro.

for (name, smell, color) in soa_zip!(vec, [name, mut smell, color]) {
    println!("this is {}, with color {:#?}", name, color);
    // smell is a mutable reference
    *smell += 1.0;
}

Nested Struct of Arrays

In order to nest a struct of arrays inside another struct of arrays, one can use the #[nested_soa] attribute.

For example, the following code

#[derive(StructOfArray)]
pub struct Point {
    x: f32,
    y: f32,
}
#[derive(StructOfArray)]
pub struct Particle {
    #[nested_soa]
    point: Point,
    mass: f32,
}

will generate structs that look like this:

pub struct PointVec {
    x: Vec<f32>,
    y: Vec<f32>,
}
pub struct ParticleVec {
    point: PointVec, // rather than Vec<Point>
    mass: Vec<f32>
}

All helper structs will also be nested; for example, PointSlice will be nested in ParticleSlice.

Documentation

Please see http://lumol.org/soa-derive/soa_derive_example/ for a small example and the documentation of all the generated code.

Benchmarks

Here are a few simple benchmark results, on my machine:

running 10 tests
test aos_big_do_work_100k   ... bench:     415,315 ns/iter (+/- 72,861)
test aos_big_do_work_10k    ... bench:      10,087 ns/iter (+/- 219)
test aos_big_push           ... bench:          50 ns/iter (+/- 10)
test aos_small_do_work_100k ... bench:      93,377 ns/iter (+/- 1,106)
test aos_small_push         ... bench:           3 ns/iter (+/- 1)
test soa_big_do_work_100k   ... bench:      93,719 ns/iter (+/- 2,793)
test soa_big_do_work_10k    ... bench:       9,253 ns/iter (+/- 103)
test soa_big_push           ... bench:          39 ns/iter (+/- 13)
test soa_small_do_work_100k ... bench:      93,301 ns/iter (+/- 1,765)
test soa_small_push         ... bench:           4 ns/iter (+/- 1)

Benchmark tests exist for the soa (struct of array) and aos (array of struct) versions of the same code, using a small (24 bytes) and a big (240 bytes) struct.

You can run the same benchmarks on your own system by cloning this repository and running cargo bench.

Licensing and contributions

This crate is distributed under either the MIT or the Apache license, at your choice. Contributions are welcome; please open an issue beforehand to discuss your changes!

Thanks to @maikklein for the initial idea: https://maikklein.github.io/soa-rust/

soa-derive's People

Contributors

adnoc, cgmossa, cherryblossom000, christopher-s-25, currypseudo, cwfitzgerald, fncontroloption, indubitablement2, inv2004, jojodeveloping, kristopherbullinger, luthaf, mangelats, mikialex, miloyip, remexre, schneiderfelipe

soa-derive's Issues

#[serde(skip)] cause struct of array containing different length of field

Sometimes we derive serde's Serialize/Deserialize for the Struct of Array, via #[soa_derive(Serialize, Deserialize)].
With the feature from #42, we can skip some of the Struct of Array's fields using #[soa_attr(Vec, serde(skip))], for example:

#[derive(StructOfArray)]
#[soa_derive(Serialize, Deserialize)]
struct Point {
    x: f32,
    y: f32,
    // Skip serialization for PointVec::meta
    #[soa_attr(Vec, serde(skip))]
    meta: bool,
} 

Serialization works, but deserialization produces an invalid PointVec because its fields have different lengths.
For example:

#[test]
fn serde_skip_test() -> Result<(), serde_json::Error> {
    let mut soa = PointVec::new();
    soa.push(Point { x: 1.0, y: 2.0, meta: true });
    soa.push(Point { x: 3.0, y: 4.0, meta: false });


    let json = serde_json::to_string(&soa)?;
    assert_eq!(json, r#"{"x":[1.0,3.0],"y":[2.0,4.0]}"#);
    let soa2: PointVec = serde_json::from_str(&json)?;
    assert_eq!(&soa2, &PointVec {
        x: vec![1.0, 3.0],
        y: vec![2.0, 4.0],
        meta: vec![] // This comes from Vec::default(), so its length differs from the other fields
    });
    Ok(())
}
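
One way to detect (and patch up) such a mismatch after deserialization is a length check over every column. The following is a minimal hand-written sketch, not code generated by soa-derive: PointVec is written out by hand, and is_consistent, fill_skipped and repair_example are hypothetical names introduced for this illustration.

```rust
// Hand-written stand-in for the generated `PointVec` (assumption).
pub struct PointVec {
    pub x: Vec<f32>,
    pub y: Vec<f32>,
    pub meta: Vec<bool>,
}

impl PointVec {
    /// Hypothetical helper: true when every column has the same length.
    pub fn is_consistent(&self) -> bool {
        self.x.len() == self.y.len() && self.y.len() == self.meta.len()
    }

    /// Hypothetical repair step: resize the skipped column to the length of
    /// `x`, padding with `bool::default()` (that is, `false`).
    pub fn fill_skipped(&mut self) {
        let len = self.x.len();
        self.meta.resize(len, bool::default());
    }
}

/// Builds the state described in the issue (empty `meta`), then repairs it.
/// Returns the consistency flag before and after the repair.
pub fn repair_example() -> (bool, bool) {
    let mut soa = PointVec {
        x: vec![1.0, 3.0],
        y: vec![2.0, 4.0],
        meta: vec![],
    };
    let before = soa.is_consistent();
    soa.fill_skipped();
    (before, soa.is_consistent())
}
```

Padding with defaults loses the original meta values, so this only restores the length invariant, not the data.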

Allow to nest structs of arrays

Basic idea

This would be using a new helper attribute to indicate that you'd like to add the fields as part of the SoA:

#[derive(StructOfArray)]
struct Point {
    x: f64,
    y: f64,
}

// Proposed added behaviour: generates a `PointVec` and a `Vec<f64>`
#[derive(StructOfArray)]
struct Raster {
    #[nested_soa]
    coords: Point,
    value: f64,
}

This would also affect pointers, references and slices. The idea is that the structure of the data is kept, but it allows using SoA all the way down.

Advantages

It's an opt-in, so it's not a breaking change in any way, shape or form.
It uses an attribute, which means that it can be easily changed (nice to have to measure if it affects performance).

Also, this would give this project an edge, because it's very hard to implement in a lot of other languages, especially in a strongly typed, low-level language like Rust (where performance really matters).

The implementation should be straightforward and rely on having the same interface as Vec. As such, the generated code will call the functions that just so happen to exist in both Vec and PointVec.

How would it work

First, we need to add a few traits similar to StructOfArray but for the pointers, references and slices.
The attribute works as a marker. We generate the types based on it. For instance, if a field has the attribute we generate <Example as ::soa_derive::StructOfArray>::Type instead of the regular Vec<Example>. Pointers, references and slices follow a similar pattern with the other traits.
And that's it. It should just work because we have the same interfaces :)

Generics programming using soa-derive

Hi, apologies for not being able to infer this from the documentation and code, I am not very experienced with these topics. Also, sorry if I get any terms wrong, either way, my question is the following:

Say I have a generic struct that is meant to include various types of SoAs. What trait bounds am I meant to use? Should T be a Cheese or a CheeseVec?

pub struct SparseSet<T: ?> {
    dense: Vec<EntID>,
    sparse: Vec<u32>,
    data: ?,
}

From my understanding of the documentation, you are meant to apply the trait bound StructOfArray, making T a CheeseVec. However, that trait does not implement the usual methods (insert, pop, etc.), which would mean I can't do generic implementations. Is that the case, or did I miss something?

If T is a SoA, how do I get the original type, for declaring function parameter types? For example, the function get makes use of the sparse array to determine if an entity exists, after which it's meant to return the associated data, in this case a Cheese. What should ? be here, i.e. how do I get Cheese from CheeseVec in a generic way?

/// Gets the data associated with the entity from the set
pub fn get(&self, ent_id: EntID) -> Result<&?, Error> {
    self.ent_exists(ent_id)
        .then(|| &self.data[self.sparse[ent_id] as usize])
        .ok_or(Error::EntityNotInSet)
}

Thank you.

Const array SoA variant

I think it would be nice to generate a type like:

pub struct FooArray<const N: usize> {
    pub f1: [f32; N],
    pub f2: [u8; N],
    ...
}

This would allow you to do stuff like core::alloc::Layout::new::<FooArray<128>>() for free.
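
As an illustration of the idea, here is a hand-written version of such a type with only the two fields named above (the elided fields are left out). It uses std::alloc::Layout, which is the same type as core::alloc::Layout re-exported by std; the function name foo_layout is an assumption for this sketch.

```rust
use std::alloc::Layout;

// Hand-written sketch of the proposed const-generic SoA type. Only the two
// fields spelled out in the proposal are included here.
pub struct FooArray<const N: usize> {
    pub f1: [f32; N],
    pub f2: [u8; N],
}

/// Computes the layout of a 128-element `FooArray` without allocating.
pub fn foo_layout() -> Layout {
    Layout::new::<FooArray<128>>()
}
```

With these two fields, the layout is 128 * 4 bytes for f1 plus 128 bytes for f2, so 640 bytes with the 4-byte alignment of f32.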

How to set #[pyclass] attribute for soa-derived struct?

Thank you for the great efforts in soa_derive! I'm still new to Rust, and I wonder how #[pyclass] can be applied to an soa-derived class?

The following will not work as #[pyclass] is applied to Element, not ElementVec.

#[pyclass]
#[derive(StructOfArray, Debug)]
pub struct Element {
   num: f64,
}

cannot move out of type `CheeseVec`, which implements the `Drop` trait

I wish to move out of the CheeseVec, which seems impossible because CheeseVec implements Drop:

use soa_derive::StructOfArray;

#[derive(StructOfArray)]
pub struct Cheese {
    pub smell: f64,
}

fn main() {
    let cheese0 = Cheese { smell: 10.0 };
    let cheese1 = Cheese { smell: -1000.0 };
    let mut cheeses = CheeseVec::with_capacity(2);
    cheeses.push(cheese0);
    cheeses.push(cheese1);
    let smell_vec: Vec<f64> = unpack_cheeses(cheeses);
}

fn unpack_cheeses(cheeses: CheeseVec) -> Vec<f64> {
    let CheeseVec { smell } = cheeses;
    smell
}
 1  error[E0509]: cannot move out of type `CheeseVec`, which implements the `Drop` trait
   --> src/main.rs:18:31
    |
 18 |     let CheeseVec { smell } = cheeses;
    |                     -----     ^^^^^^^ cannot move out of here
    |                     |
    |                     data moved here
    |                     move occurs because `smell` has type `Vec<f64>`, which does not implement the `Copy` trait
    |
 help: consider borrowing the pattern binding
    |
 18 |     let CheeseVec { ref smell } = cheeses;
    |                     +++

 For more information about this error, try `rustc --explain E0509`.

Is there any way to take ownership of the fields of a CheeseVec? Is this a feature that soa_derive could implement? Is there any workaround for the above error?
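
A possible workaround, sketched here on a hand-written stand-in rather than on the real generated type, is std::mem::take: it swaps an empty Vec into the field and hands back the original, so nothing is moved out of the Drop type by pattern. The empty Drop impl below is only an assumption to reproduce the situation, and it relies on the field being public as in the README.

```rust
// Hand-written stand-in for the generated `CheeseVec` (assumption).
pub struct CheeseVec {
    pub smell: Vec<f64>,
}

// Placeholder `Drop` impl, present only so that destructuring
// `let CheeseVec { smell } = cheeses;` would fail with E0509 here too.
impl Drop for CheeseVec {
    fn drop(&mut self) {}
}

pub fn unpack_cheeses(mut cheeses: CheeseVec) -> Vec<f64> {
    // `mem::take` replaces the field with `Vec::default()` (an empty Vec,
    // which does not allocate) and returns the original contents.
    // `cheeses` stays fully initialized, so its `Drop` impl can still run.
    std::mem::take(&mut cheeses.smell)
}
```

The same call can be repeated for each field when the struct has several columns.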

How do you go from `MyTypeRef` to `MyType` ?

Is there a method for "dereferencing" the MyTypeRef and getting the entire object by-value? I'm having trouble finding it in source or the docs.

The use-case I have is copying a MyTypeVec into another SOA vec while filtering out some of the values.

I apologize if I'm missing something obvious.
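
For reference, the conversion can be written by hand by cloning each referenced field. The sketch below uses hand-written stand-ins for the generated types; to_owned_cheese and filter_smelly are hypothetical names, not methods known to exist in the crate.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Cheese {
    pub smell: f64,
    pub name: String,
}

// Hand-written stand-ins for the generated types (assumption).
pub struct CheeseVec {
    pub smell: Vec<f64>,
    pub name: Vec<String>,
}

pub struct CheeseRef<'a> {
    pub smell: &'a f64,
    pub name: &'a String,
}

impl<'a> CheeseRef<'a> {
    /// Hypothetical helper: rebuild an owned value by cloning every field.
    pub fn to_owned_cheese(&self) -> Cheese {
        Cheese {
            smell: *self.smell,
            name: self.name.clone(),
        }
    }
}

impl CheeseVec {
    pub fn new() -> Self {
        CheeseVec { smell: Vec::new(), name: Vec::new() }
    }

    pub fn push(&mut self, cheese: Cheese) {
        self.smell.push(cheese.smell);
        self.name.push(cheese.name);
    }

    pub fn iter(&self) -> impl Iterator<Item = CheeseRef<'_>> {
        self.smell
            .iter()
            .zip(self.name.iter())
            .map(|(smell, name)| CheeseRef { smell, name })
    }
}

/// Copies into a new SoA vec only the cheeses whose smell is at least `min`.
pub fn filter_smelly(src: &CheeseVec, min: f64) -> CheeseVec {
    let mut out = CheeseVec::new();
    for cheese in src.iter().filter(|c| *c.smell >= min) {
        out.push(cheese.to_owned_cheese());
    }
    out
}
```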

Cannot find trait `SoAIter` in crate `soa_derive`

After updating the soa_derive dependency from 0.10.0 to 0.11.0, any code with #[derive(StructOfArray)] fails to compile with the following message:

error[E0405]: cannot find trait `SoAIter` in crate `soa_derive`
 --> src\main.rs:3:10
  |
3 | #[derive(StructOfArray)]
  |          ^^^^^^^^^^^^^ not found in `soa_derive`
  |
  = note: this error originates in the derive macro `StructOfArray` (in Nightly builds, run with -Z macro-backtrace for more info)

Barebones example project at https://github.com/VitalyArtemiev/testsoa.git

Steps to reproduce:

git clone https://github.com/VitalyArtemiev/testsoa.git
cd testsoa
cargo run

Tried both with stable and nightly. Sorry if this is a known problem.

Sorting methods for Slice variant

Some useful methods I miss from Array Of Structs are the sorting methods, which are sort(), sort_by() and sort_by_key() for the slice primitive type. From what I've seen, the Slice variant produced by the macro does not offer any sorting methods.

I believe it is possible to implement sorting using the permutation crate, and adding a #[sort] attribute. For example, the sort() method would be like this (Example works if #[sort] is removed/commented out):

use soa_derive::StructOfArray;

#[derive(StructOfArray)]
#[soa_derive(Debug)]
struct Foo {
    // The type of the field with the `#[sort]` attribute must implement the `Ord` trait
    #[sort]
    bar: u8,
    baz: bool,
}
// --------------------------------------------------------------------------------
// The macro-generated code
impl FooSliceMut<'_> {
    fn sort(&mut self) {
        use permutation::permutation as pmut;

        let mut permutation = pmut::sort(&self.bar);

        permutation.apply_slice_in_place(&mut self.bar);
        permutation.apply_slice_in_place(&mut self.baz);
    }

    // ... Other sorting methods ...
}
// --------------------------------------------------------------------------------
// Example usage
fn main() {
    use rand::Rng;

    let mut foo_vec = FooVec::new();
    let mut rng = rand::thread_rng();

    for i in 1..=10 {
        let num = rng.gen();
        foo_vec.push(Foo { bar: num, baz: i % 2 == 0 });
    }

    println!("Before sorting: {foo_vec:#?}");

    foo_vec.as_mut_slice().sort();

    println!("After sorting: {foo_vec:#?}");
}

I don't have any experience with making macros, but I can try implementing it. I'm open to any other ideas or suggestions about the implementation.

The alternative to pulling in a whole crate for sorting is to zip the vectors from all fields, sort them, and then split them back, which is very costly due to multiple allocations. A middle-ground solution would be to provide sorting under a feature (e.g. sort) to avoid increasing default dependencies.

Limitations:

  • Sorting methods will sort only based on the specified field (this can be solved by generating differently named sorting methods for each field or by providing a #[sort(...)] syntax for naming)
  • Sorting a Vec variant requires the as_mut_slice() method first, as described in the project's caveats
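
For comparison, the same permutation idea can be sketched with the standard library only. This is not the in-place algorithm of the permutation crate: it sorts a vector of indices by the key column and then rebuilds each column in that order, which costs one temporary allocation per column. FooVec is a hand-written stand-in, and sort_by_bar and apply_order are hypothetical names.

```rust
// Hand-written stand-in for the generated `FooVec` (assumption).
pub struct FooVec {
    pub bar: Vec<u8>,
    pub baz: Vec<bool>,
}

/// Rebuilds `column` so that the new element `i` is the old element `order[i]`.
fn apply_order<T: Clone>(column: &mut Vec<T>, order: &[usize]) {
    let reordered: Vec<T> = order.iter().map(|&i| column[i].clone()).collect();
    *column = reordered;
}

impl FooVec {
    /// Sorts every column according to the values in `bar`.
    pub fn sort_by_bar(&mut self) {
        // Sort the indices by key instead of sorting the data directly.
        let mut order: Vec<usize> = (0..self.bar.len()).collect();
        // `sort_by_key` is stable, so rows with equal keys keep their order.
        order.sort_by_key(|&i| self.bar[i]);

        // Apply the same permutation to each column to keep rows aligned.
        apply_order(&mut self.bar, &order);
        apply_order(&mut self.baz, &order);
    }
}
```

Generated code could follow the same shape, with one apply_order call per field.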

Derive std::iter::Extend

The derived vec doesn't implement std::iter::Extend. To create a struct of arrays without reallocation, we need to write the following code with the current version of this crate.

#[derive(StructOfArray)]
struct Foo {
  foo: String,
}

let vec_of_struct = vec![];
let mut struct_of_vec = FooVec::with_capacity(vec_of_struct.len());
for item in vec_of_struct {
  struct_of_vec.push(item);
}
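
A hand-written implementation of what is being asked for could look like the sketch below. FooVec is written out by hand here (it is not the generated type), and the reserve method is assumed to forward to the underlying Vec.

```rust
pub struct Foo {
    pub foo: String,
}

// Hand-written stand-in for the generated `FooVec` (assumption).
pub struct FooVec {
    pub foo: Vec<String>,
}

impl FooVec {
    pub fn new() -> Self {
        FooVec { foo: Vec::new() }
    }

    pub fn reserve(&mut self, additional: usize) {
        self.foo.reserve(additional);
    }

    pub fn push(&mut self, item: Foo) {
        self.foo.push(item.foo);
    }
}

impl Extend<Foo> for FooVec {
    fn extend<I: IntoIterator<Item = Foo>>(&mut self, iter: I) {
        let iter = iter.into_iter();
        // Reserve once using the iterator's lower size bound; for a Vec
        // source this is the exact length, so pushing does not reallocate.
        let (lower, _) = iter.size_hint();
        self.reserve(lower);
        for item in iter {
            self.push(item);
        }
    }
}
```

With this in place, the loop above reduces to a single extend call on the struct of arrays.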

Write a README and some examples

In addition to the README, potential users of this crate should be able to see the documentation for the code generated by an example.

Update versions of `syn` and `quote`

Currently I have some issues where I'm pulling in old versions of syn and quote due to using this as a dependency, and it's hurting build times. Probably we could unpin those versions entirely and let consumers of the library decide?

Setup some benchmarks

To compare SoA and AoS layouts in various cases: few members, lots of members, hot/cold data access, ...

Provide FromIterator implementation

Lovely package! Here's my use case though, set in terms of the readme example: I have a Vec<Cheese> and I attempted to do:

let cheeses: Vec<Cheese> = ...;
let cheese_board: CheeseVec = cheeses.into_iter().collect();

My example is similar to this, and I am getting this:

error[E0277]: a value of type `boar_parameters::MortalityProbabilityVec` cannot be built from an iterator over elements of type `boar_parameters::MortalityProbability`
   --> src\boar_parameters.rs:374:14
    |
374 |             .collect();
    |              ^^^^^^^ value of type `boar_parameters::MortalityProbabilityVec` cannot be built from `std::iter::Iterator<Item=boar_parameters::MortalityProbability>`
    |
    = help: the trait `std::iter::FromIterator<boar_parameters::MortalityProbability>` is not implemented for `boar_parameters::MortalityProbabilityVec`

So basically, there should be a FromIterator for this?
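
The missing piece named in the error is an impl of FromIterator<Cheese> for the SoA type. A hand-written sketch of such an impl, on a simplified stand-in for CheeseVec with two fields, might look like this (collect_board is a hypothetical helper name):

```rust
use std::iter::FromIterator;

pub struct Cheese {
    pub smell: f64,
    pub name: String,
}

// Hand-written stand-in for the generated `CheeseVec` (assumption).
#[derive(Default)]
pub struct CheeseVec {
    pub smell: Vec<f64>,
    pub name: Vec<String>,
}

impl FromIterator<Cheese> for CheeseVec {
    fn from_iter<I: IntoIterator<Item = Cheese>>(iter: I) -> Self {
        let mut out = CheeseVec::default();
        // Split every incoming struct into its per-field columns.
        for cheese in iter {
            out.smell.push(cheese.smell);
            out.name.push(cheese.name);
        }
        out
    }
}

/// With the impl above, `collect()` can build the SoA type directly.
pub fn collect_board(cheeses: Vec<Cheese>) -> CheeseVec {
    cheeses.into_iter().collect()
}
```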

Is it possible to use the crate with lifetime parameters?

Hello, I have a struct which has lifetime parameters:

#[derive(Debug, StructOfArray)]
struct Ind2<'a> {
    ts: &'a str,
    sma_short: f32,
    sma_long: f32,
    sma_raise: f32,
    sma_fall: f32,
}

Error is:

error[E0261]: use of undeclared lifetime name `'a`
 --> src/file.rs:1:17
  |
1 | #[derive(Debug, StructOfArray)]
  |                 ^^^^^^^^^^^^^ undeclared lifetime

error[E0106]: missing lifetime specifier
 --> src/file.rs:1:17
  |
1 | #[derive(Debug, StructOfArray)]
  |                 ^^^^^^^^^^^^^ expected lifetime parameter

error: aborting due to 2 previous errors

Question: Is it possible to apply the derive to a struct with lifetimes?
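
Both errors point at generated code that mentions 'a without declaring it. A hand-written equivalent shows what the generated type would need: the lifetime parameter has to be carried over to the SoA struct and its impl block. This is only a sketch with three of the fields, and it says nothing about what any given version of the crate supports; timestamps is a hypothetical helper.

```rust
pub struct Ind2<'a> {
    pub ts: &'a str,
    pub sma_short: f32,
    pub sma_long: f32,
}

// Hand-written SoA counterpart (assumption): the lifetime `'a` is declared
// on the struct because the `ts` column stores borrowed strings.
pub struct Ind2Vec<'a> {
    pub ts: Vec<&'a str>,
    pub sma_short: Vec<f32>,
    pub sma_long: Vec<f32>,
}

impl<'a> Ind2Vec<'a> {
    pub fn new() -> Self {
        Ind2Vec { ts: Vec::new(), sma_short: Vec::new(), sma_long: Vec::new() }
    }

    pub fn push(&mut self, value: Ind2<'a>) {
        self.ts.push(value.ts);
        self.sma_short.push(value.sma_short);
        self.sma_long.push(value.sma_long);
    }
}

/// Collects rows into the SoA layout and returns only the timestamp column.
pub fn timestamps<'a>(rows: Vec<Ind2<'a>>) -> Vec<&'a str> {
    let mut soa = Ind2Vec::new();
    for row in rows {
        soa.push(row);
    }
    soa.ts
}
```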

Thank you,

IntoIterator for &'a CheeseSlice

Hello,

I was trying to create a trait which works with &'a CheeseVec and &'a CheeseSlice the same way.
But it was a bit strange that I could not find IntoIterator for &'a CheeseSlice.

Are there any reasons or problems why it was not implemented?

Thank you,

performance of the slice iterator

It looks like the iterators are about 3 times slower than std .zip or izip!
izip!: 4.9 sec, CheeseSlice: 15.9 sec

Regards,

Add access methods

Add the methods get, get_mut, get_unchecked, and get_unchecked_mut of Vec to the generated SoA.

Right now it's not possible to implement the Index and IndexMut traits. This issue would add a way to access the elements that also works for Vec. It would also make accessing the underlying Vec unnecessary.

// Before
let value = vec.field[1];

// After
let value = vec.get(1)?.field; // vec could be an ExampleVec or a Vec<Example>

And we might get a reference (which we couldn't do before):

let reference = vec.get(2)?;

Also would support ranges:

let slice = vec.get(0..5)?;

And we should probably add ranges to our custom slice so it's possible to get a smaller slice from a slice.

To do so, it would make sense to make a trait similar to SliceIndex but changing the output to not require returning a reference (so we can use our struct of references).
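
A minimal sketch of that idea follows. The trait name SoaIndex and the types ExampleVec, ExampleRef and ExampleSlice are hand-written assumptions for this illustration, not the crate's API. The point is that the output is an associated type returned by value, so it can be a struct of references or a struct of slices.

```rust
use std::ops::Range;

// Hand-written stand-ins for the generated types (assumption).
pub struct ExampleVec {
    pub field: Vec<u32>,
    pub flag: Vec<bool>,
}

pub struct ExampleRef<'a> {
    pub field: &'a u32,
    pub flag: &'a bool,
}

pub struct ExampleSlice<'a> {
    pub field: &'a [u32],
    pub flag: &'a [bool],
}

/// Hypothetical analogue of std's `SliceIndex`: the output is returned by
/// value instead of as `&Output`.
pub trait SoaIndex<'a> {
    type Output;
    fn get(self, vec: &'a ExampleVec) -> Option<Self::Output>;
}

impl<'a> SoaIndex<'a> for usize {
    type Output = ExampleRef<'a>;
    fn get(self, vec: &'a ExampleVec) -> Option<ExampleRef<'a>> {
        Some(ExampleRef {
            field: vec.field.get(self)?,
            flag: vec.flag.get(self)?,
        })
    }
}

impl<'a> SoaIndex<'a> for Range<usize> {
    type Output = ExampleSlice<'a>;
    fn get(self, vec: &'a ExampleVec) -> Option<ExampleSlice<'a>> {
        Some(ExampleSlice {
            field: vec.field.get(self.clone())?,
            flag: vec.flag.get(self)?,
        })
    }
}

impl ExampleVec {
    /// One `get` for both single elements and ranges, as with `Vec::get`.
    pub fn get<'a, I: SoaIndex<'a>>(&'a self, index: I) -> Option<I::Output> {
        SoaIndex::get(index, self)
    }
}
```

Out-of-bounds indices and ranges yield None, mirroring the behavior of get on a standard slice.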

Use one length and capacity variable for whole struct

As it stands, this crate appears to make separate fields each with their own Vec. This would duplicate the length and capacity values for each field. This may not be a huge problem, but it would triple the size of the struct as it grows and can lead to the different Vecs falling out of sync.

The main alternative would be to use unsafe and raw pointers (or NonNull pointers). That said, managing the unsafe would almost certainly be more effort than keeping the Vecs in sync. Regardless, I think this could be a useful discussion to be had. Is cutting the struct to a third worth having unsafe code to vet? (probably not)

#[soa_attr] on field

Sometimes we want to add an attribute to a field of the generated struct.
For example:

#[derive(StructOfArray)]
#[soa_derive(Serialize, Deserialize)]
struct Point {
    x: f32,
    y: f32,
    // Skip serialization for PointVec::meta
    #[soa_attr(Vec, serde(skip))]
    meta: bool,
}

Use trait to link the origin struct to generated soa struct

We can create a type connection between the original struct and the generated struct with a trait like the one below:

pub trait StructOfArraySource {
  type ArrayType;
}

// this part is generated by the macro
impl StructOfArraySource for Cheese {
  type ArrayType = CheeseVec;
}

After this, we can use <Cheese as StructOfArraySource>::ArrayType instead of the hard-coded name CheeseVec.

This type connection is important in generic programming. If I have a type T whose container is SoA-derived and I need to describe that container type in terms of T, I can use <T as StructOfArraySource>::ArrayType. It seems to be the only way.
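
To show how such a link enables generic code, here is a self-contained sketch. StructOfArraySource follows the proposal above; SoaStorage and Store are hypothetical additions for this illustration, and CheeseVec is written by hand rather than generated.

```rust
/// Links an element type to its SoA container, as in the proposal above.
pub trait StructOfArraySource: Sized {
    type ArrayType: SoaStorage<Self>;
}

/// Hypothetical trait exposing the operations that generic code needs.
pub trait SoaStorage<T>: Default {
    fn push(&mut self, value: T);
    fn len(&self) -> usize;
}

pub struct Cheese {
    pub smell: f64,
}

// Hand-written stand-in for the generated `CheeseVec` (assumption).
#[derive(Default)]
pub struct CheeseVec {
    pub smell: Vec<f64>,
}

impl SoaStorage<Cheese> for CheeseVec {
    fn push(&mut self, value: Cheese) {
        self.smell.push(value.smell);
    }

    fn len(&self) -> usize {
        self.smell.len()
    }
}

// In the proposal, this impl is the part emitted by the macro.
impl StructOfArraySource for Cheese {
    type ArrayType = CheeseVec;
}

/// A generic container that only names the element type `T`.
pub struct Store<T: StructOfArraySource> {
    pub data: T::ArrayType,
}

impl<T: StructOfArraySource> Store<T> {
    pub fn new() -> Self {
        Store { data: Default::default() }
    }

    pub fn insert(&mut self, value: T) {
        self.data.push(value);
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }
}
```

Store<Cheese> then holds a CheeseVec without the generic code ever spelling out that name.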

Allow explicit name generation

Implicit name generation is something that is very dangerous, and we should almost always use hygienic, correctly scoped macros. So instead of making CheeseVec for the following:

#[derive(StructOfArray)]
struct Cheese {
   ...
}

I instead want to specify the resulting SOA name explicitly:

#[derive(StructOfArray(CheeseVec))]
struct Cheese {
   ...
}
