
sharksforarms / deku


Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization

License: Apache License 2.0

Rust 99.33% HTML 0.67%
rust rust-crate serialization deserialization parse encoder-decoder bits bytes declarative symmetric

deku's People

Contributors

abungay, agausmann, calebfletcher, constfold, dependabot-preview[bot], dependabot[bot], dullbananas, elast0ny, initerworker, inspier, interruptinuse, korrat, kraktus, myrrlyn, samuelsleight, sharksforarms, soruh, vext01, vhdirk, vidhanio, visse, wcampbell0x2a, wildcryptofox, xlambein


deku's Issues

Improve derive macro error message

Currently, when an error happens in the derive macro it just panics; we should use syn::Error::to_compile_error() instead.
For example, an invalid attribute gives this message:

error: proc-macro derive panicked
 --> src\main.rs:3:10
  |
3 | #[derive(DekuRead)]
  |          ^^^^^^^^
  |
  = help: message: called `Result::unwrap()` on an `Err` value: Error { kind: UnknownField(ErrorUnknownField { name: "id", did_you_mean: None }), locations: ["b"], span: Some(#0 bytes(79..86)) }

A better message could be:

error: unknown deku field attribute `id`
 --> src\main.rs:7:8
  |
7 | #[deku(id = "")]
  |        ^^

Rename `to_bitvec` to `to_bits`

Why is the function that converts a type to bytes (Vec<u8>) called to_bytes, while the function that converts a type to bits (BitVec<Msb0, u8>) is called to_bitvec? Why not to_bits?

deku/src/lib.rs

Lines 251 to 255 in c7e0377

/// Write struct/enum to Vec<u8>
fn to_bytes(&self) -> Result<Vec<u8>, DekuError>;
/// Write struct/enum to BitVec
fn to_bitvec(&self) -> Result<BitVec<Msb0, u8>, DekuError>;

Implement BitsSize, BitsRead and BitsWrite on the struct itself

For composability, it would be nice to do something like the following:

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct FieldB {
    #[deku(bits = "6")]
    data: u8,
}

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct DekuTest {
    #[deku(bits = "2")]
    field_a: u8,
    field_b: FieldB
}

Restrict reader and writer to certain variables, not all internals

readers/writers should have access to:
rest, struct variables, final attribute variables (bit size, input_is_le)

The provided function could be run in a function sandbox where only the needed variables are passed in, with documented names, i.e.:

let variant_read_func = if variant_reader.is_some() {
    // pseudocode: the generated sandbox fn receives only the documented variables
    fn sandbox_reader(rest, input_is_le, field_a, field_b) {
        quote! { #variant_reader; }
    }
    sandbox_reader(rest, input_is_le, field_a, field_b)
}
...

Why does `from_bytes` need a `bit_offset`?

deku/src/lib.rs

Line 235 in c7e0377

fn from_bytes(input: (&[u8], usize)) -> Result<((&[u8], usize), Self), DekuError>

Why not from_bytes(bytes: Bytes) and from_bits(bits: Bits)? Why do I need to care about which bit the bytes start from when I use a function called from_bytes?
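For illustration, a std-only sketch (hypothetical, not deku's actual internals) of why a byte-oriented reader still carries a bit offset: a previous field may have consumed a non-multiple of 8 bits, so the next value starts mid-byte.

```rust
// If an earlier field consumed, say, 3 bits, the next u8 straddles two
// adjacent bytes and must be stitched together (MSB-first).
fn read_u8_at_bit(input: &[u8], bit_offset: usize) -> u8 {
    let byte = bit_offset / 8;
    let bit = bit_offset % 8;
    if bit == 0 {
        input[byte] // byte-aligned: a plain indexed read suffices
    } else {
        (input[byte] << bit) | (input[byte + 1] >> (8 - bit))
    }
}

fn main() {
    let data = [0b1010_1100, 0b1101_0000];
    assert_eq!(read_u8_at_bit(&data, 0), 0b1010_1100);
    assert_eq!(read_u8_at_bit(&data, 3), 0b0110_0110); // starts mid-byte
    println!("ok");
}
```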

Context of enum `id_type` cannot be utilized

For example, passing the top-level endianness down to its child:

#[deku(endian = "big")]
struct Parent {
   child: Child
}

#[deku(id_type = "u16", ctx = "_endian: deku::ctx::Endian")] // will default to system endianness; no way to use the ctx endian
enum Child {
   Variant
}

Vec type read/write

pub struct MyStruct {
  ext_len: usize,
  #[deku(len_field = "ext_len")]
  extensions: Vec<Extension>,
}

Allow something like this, and update ext_len from extensions before dumping bits to the accumulator.
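The update intent here can be sketched in plain Rust (field names follow the issue; the mechanism is hypothetical): the length field is recomputed from the Vec before writing rather than trusted as stored.

```rust
// `Extension` is reduced to a unit struct for brevity.
#[derive(Debug, PartialEq)]
struct Extension;

struct MyStruct {
    ext_len: usize,
    extensions: Vec<Extension>,
}

impl MyStruct {
    // What a generated update step would do before dumping bits:
    fn update(&mut self) {
        self.ext_len = self.extensions.len();
    }
}

fn main() {
    let mut s = MyStruct { ext_len: 0, extensions: vec![Extension, Extension] };
    s.update();
    assert_eq!(s.ext_len, 2);
    println!("ok");
}
```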

Implement DekuRead, DekuWrite for String

Maybe something like this?

use deku::prelude::*;

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct Packet {
    s: String,
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test01() {
        //                      [len  | string               ]
        let data: Vec<u8> = vec![5, 104, 101, 108, 108, 111];

        let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();

        assert_eq!(
            Packet {
                s: "hello".to_string(),
            },
            value
        );
    }
}

(len would be a u64 in a real example)
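A std-only sketch of the decoding the test implies (a one-byte length prefix, matching the sample data; as noted, a real format might use a u64):

```rust
// Decode a length-prefixed string: the first byte is the length, the
// following `len` bytes are the UTF-8 contents. Returns the string and
// the unread remainder.
fn read_string(input: &[u8]) -> (String, &[u8]) {
    let len = input[0] as usize;
    let (s, rest) = input[1..].split_at(len);
    (String::from_utf8(s.to_vec()).expect("invalid utf-8"), rest)
}

fn main() {
    let data: Vec<u8> = vec![5, 104, 101, 108, 108, 111];
    let (s, rest) = read_string(&data);
    assert_eq!(s, "hello");
    assert!(rest.is_empty());
    println!("ok");
}
```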

Endian-ness composing

I believe it would be a nice feature for child structs and enums to inherit the parent's endian type.

Currently the following code produces the following error:

 `deku::DekuRead<deku::ctx::Endian>` is not implemented for B
use deku::prelude::*;

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(endian = "big")]
struct Packet {
    len: u16,
    messages: B,
}

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8")]
enum B {
    #[deku(id = "0x00")]
    one,
    #[deku(id = "0x01")]
    two,
    #[deku(id = "0x02")]
    three,
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test01() {
        let data: Vec<u8> = vec![0x04, 0x13, 0x01];

        let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();

        assert_eq!(
            Packet {
                len: 0x0413,
                messages: B::two,
            },
            value
        );
    }
}

In fact, the way to create a compiling version of this code seems a bit odd, as I only add endian to the len field:

use deku::prelude::*;

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct Packet {
    #[deku(endian = "big")]
    len: u16,
    messages: B,
}

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8")]
enum B {
    #[deku(id = "0x00")]
    one,
    #[deku(id = "0x01")]
    two,
    #[deku(id = "0x02")]
    three,
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test01() {
        let data: Vec<u8> = vec![0x04, 0x13, 0x01];

        let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();

        assert_eq!(
            Packet {
                len: 0x0413,
                messages: B::two,
            },
            value
        );
    }
}

add top-level enum attribute `id`

Currently I don't see a way to use ctx as the value that selects an enum variant (instead of reading bits/bytes for the id). The key part is that the category and length need to be read before the Messages are parsed; the category field decides which variant of the enum is parsed.

Example

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct AsterixPacket {
    #[deku(bytes = "1", endian = "big")]
    category: u8,
    #[deku(bytes = "2", endian = "big")]
    length: u16,
    #[deku(ctx = "*category")]
    messages: Vec<Message>,
}

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_ctx = "category")]
enum Message {
    #[deku(id = "48")]
    Cat48(Cat48),
}

Conditional Skip

Great idea of a library.

I have a protocol that needs conditional parsing of fields in a struct. I see the skip attribute, but would something like a conditional skip be possible?

use deku::prelude::*;
use std::convert::TryFrom;

#[derive(PartialEq, Debug, DekuRead, DekuWrite)]
pub struct DekuTest {
    pub field_a: u8,
    #[deku(skip, if=(field_a, 1))]
    pub field_b: Option<u8>,
    #[deku(skip, if=(field_b, Some(1)))]
    pub field_c: Option<u8>,
}

fn main() {
    let data: Vec<u8> = vec![0x01, 0x02];

    let value = DekuTest::from_bytes((data.as_ref(), 0)).unwrap();
    println!("{:#?}", value)
}

Differing impls for bits and bytes

If the bytes attribute is used and the index is on a byte boundary, it may be quicker to read from a slice of &[u8] instead of reading 8*n bits.

I'd like more benchmarks to be written first so this optimization can be measured.

One option could be to feature flag the bits/bytes attributes
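The proposed fast path can be sketched in plain Rust (hypothetical, not deku's code): fall through to per-bit assembly only when the request is unaligned.

```rust
// Byte-aligned requests take a slice copy; anything else falls back to
// gathering individual bits (MSB-first), which is what the fast path avoids.
fn read_bits(input: &[u8], bit_pos: usize, nbits: usize) -> Vec<u8> {
    if bit_pos % 8 == 0 && nbits % 8 == 0 {
        let start = bit_pos / 8;
        input[start..start + nbits / 8].to_vec() // fast path: slice copy
    } else {
        let mut out = vec![0u8; (nbits + 7) / 8];
        for i in 0..nbits {
            let src = bit_pos + i;
            let bit = (input[src / 8] >> (7 - src % 8)) & 1;
            out[i / 8] |= bit << (7 - i % 8);
        }
        out
    }
}

fn main() {
    let data = [0xAB, 0xCD];
    assert_eq!(read_bits(&data, 0, 8), vec![0xAB]); // aligned
    assert_eq!(read_bits(&data, 4, 8), vec![0xBC]); // unaligned bit gather
    println!("ok");
}
```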

Add `count` attribute

A fixed number of elements to be read

Example:

struct Test {
    #[deku(count = 2)]
    data: Vec<u8>
}

Possibly also rename the len attribute to count_field?

Improve examples

  • Find some good examples for the README/lib.rs landing page
    • Showcase struct, enums, vec, custom reader/writer

Enum attribute improvements

Current enum behavior

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8")]
enum Packet {
    #[deku(id = "0x00")]
    Zero,
    #[deku(id = "0x01")]
    One,
    #[deku(id = "0x02")]
    Two,
    #[deku(id = "0x03")]
    Three,
    #[deku(id = "0x04")]
    Four,
}

An attribute inherit, which would inherit the id from the discriminant value already assigned:

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8", inherit)]
enum Packet {
    Zero = 0x00,
    One =  0x01,
    Two = 0x02,
    Three = 0x03,
    Four = 0x04,
}

An attribute ordered, which would take the first element's id and increment it for each variant after that.

Maybe this would increase for every enum field that didn't have an id defined.

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
#[deku(id_type = "u8", ordered = "0x00")]
enum Packet {
    Zero,
    One,
    Two,
    Three,
    Four,
    #[deku(id = "44")]
    FourtyFour,
}
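The numbering rule both variants of the idea imply can be sketched as a hypothetical helper (not part of deku):

```rust
// Assign ids: an explicit id wins; otherwise use previous + 1, starting
// from the given base (covers the `ordered = "0x00"` example above).
fn assign_ids(base: u8, explicit: &[Option<u8>]) -> Vec<u8> {
    let mut next = base;
    explicit
        .iter()
        .map(|e| {
            let id = e.unwrap_or(next);
            next = id + 1;
            id
        })
        .collect()
}

fn main() {
    // Zero..Four take 0..=4 implicitly; FourtyFour is pinned to 44.
    let ids = assign_ids(0x00, &[None, None, None, None, None, Some(44)]);
    assert_eq!(ids, vec![0, 1, 2, 3, 4, 44]);
    println!("ok");
}
```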

Rename BitsWriter and BitsReader

I feel like another name would be better suited, possibly matching the proc-macro names if that's the convention: DekuRead / DekuWrite.

Attribute to handle length of bytes (or bits) in Vec<T> with another struct field

Consider the following code:

use deku::prelude::*;

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
pub struct Packet {
    #[deku(bytes = "1")]
    length: u8,
    // byte len of all of messages is length - 2
    messages: Vec<Message>,
}

/// In the real packet, this would be of variable length, so we can't use just the `count` attribute on messages
#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
pub struct Message {
    #[deku(bytes = "1")]
    msg: u8,
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test01() {
        let data: Vec<u8> = vec![0x04, 0x01, 0x02];

        let (_, value) = Packet::from_bytes((data.as_ref(), 0)).unwrap();

        assert_eq!(
            Packet {
                length: 0x04,
                messages: vec![Message { msg: 0x01 }, Message { msg: 0x02 }]
            },
            value
        );
    }
}

I can use the count attribute to give the number of Vec<T> elements, but I see no way of telling the max bytes that a Vec can occupy in its container as a whole. Wondering if this could be a feature, or do I need to do some weird custom write/read implementation with a read_bytes field?
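The requested semantics can be sketched std-only (a hypothetical helper, not an existing deku feature): keep reading messages until the byte budget derived from length is exhausted.

```rust
// Read elements until `byte_budget` bytes are consumed. Each Message in the
// example is a single byte, so consumption is 1 per element here.
fn read_messages(mut input: &[u8], mut byte_budget: usize) -> (Vec<u8>, &[u8]) {
    let mut messages = Vec::new();
    while byte_budget > 0 {
        let (msg, rest) = input.split_first().expect("truncated packet");
        messages.push(*msg);
        input = rest;
        byte_budget -= 1;
    }
    (messages, input)
}

fn main() {
    // From the test above: length = 0x04, so the messages occupy 4 - 2 bytes.
    let data = [0x04, 0x01, 0x02];
    let budget = data[0] as usize - 2;
    let (messages, rest) = read_messages(&data[1..], budget);
    assert_eq!(messages, vec![0x01, 0x02]);
    assert!(rest.is_empty());
    println!("ok");
}
```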

Add support for condition and context

Hey, I'm trying to write a simple binary parser with deku, and here are two problems I found.

Condition

I read your source and found I can pass whatever arguments to it. Sorry about that.

Context

Consider this binary structure:

struct Data {
    a: u8,
    // This field depends on `Header.flag`
    b: Option<u8>
}
struct Bin {
    flag: u8,
    data: Vec<Data>
}

Because of the lack of context, I can't find any way to parse it except writing a custom reading function manually. That said, adding context support is a little complicated; I still don't know what the best way is.
Overall, thanks for your great crate.

Print ident name in "Could not match enum variant" Err

In the following line:

deku-derive/src/macros/deku_read.rs:175:                return Err(DekuError::Parse(format!("Could not match enum variant id = {:?}", variant_id)));

It would be nice to print out the ident name (i.e. the name of the enum) for easier troubleshooting.

I would do it, but I can't for the life of me figure out how to print out the ident. New to proc_macros/quote

`update` attribute

Gets called when the struct is .update()'d, kind of like the len attribute, but providing a custom impl.

Example:

pub struct Ipv4 {
    ....
    #[deku(update = "calc_checksum(...)")]
    pub checksum: u16,       // Header checksum
    ....
}
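The calc_checksum(...) above is left abstract in the issue; as one concrete example of what an update expression might call, here is the standard RFC 1071 ones'-complement sum used for the IPv4 header checksum (computed with the checksum field zeroed):

```rust
// Ones'-complement sum of 16-bit big-endian words, carries folded back in,
// result inverted.
fn ipv4_checksum(header: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in header.chunks(2) {
        let hi = u32::from(chunk[0]) << 8;
        let lo = u32::from(*chunk.get(1).unwrap_or(&0));
        sum += hi | lo;
    }
    while sum > 0xFFFF {
        sum = (sum & 0xFFFF) + (sum >> 16); // fold carries
    }
    !(sum as u16)
}

fn main() {
    // Known-good IPv4 header (checksum field bytes 10..12 zeroed for computation).
    let header = [
        0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00, 0x40, 0x11,
        0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01, 0xc0, 0xa8, 0x00, 0xc7,
    ];
    assert_eq!(ipv4_checksum(&header), 0xb861);
    println!("ok");
}
```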

Conditional reading

Allows conditional field reading dependent on the return value of a lambda:

fn my_condition(input: &[u8], index: usize) -> bool {

    // TODO: somehow get access to field_a ?
    // if (field_a == 0xAB) {
    //    return true;
    // }

    return false;
}

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct DekuTest {
    field_a: u8,
    #[deku(bits = "7", read_if=my_condition)]
    field_b: Option<u32>,
}

Not sure of the best way to give the lambda access to the previously parsed fields.

assert_eq/assert attributes

It's useful when a protocol has a magic header (e.g. zlib, pyc, jpg) or a field has a limit.

struct Foo {
    #[deku(assert_eq("[0xAA, 0xBB, 0xCC]"))]
    magic: [u8; 3],
    #[deku(assert("a >= 128"))]
    a: u8,
}

`option_as_tokenstream` will eat span

Since option_as_tokenstream uses Option<String> as its input, darling discards the span while parsing (because String doesn't have a span). This makes error messages hard to read.
For example:

#[derive(DekuRead)]
struct Foo {
    #[deku(cond = "'a' == 2")]
    a: u8,
}

error:

error[E0308]: mismatched types
  --> src\main.rs:24:10
   |
24 | #[derive(DekuRead)]
   |          ^^^^^^^^ expected `char`, found `u8`
   |
   = note: this error originates in a derive macro (in Nightly builds, run with -Z macro-backtrace for more info)

Replacing it with LitStr gives:

error[E0308]: mismatched types
  --> src\main.rs:26:19
   |
26 |     #[deku(cond = "'a' == 2")]
   |                   ^^^^^^^^^^ expected `char`, found `u8`

Add `map` attribute

Allow running a function on the read value.

Examples:

struct SomeStruct {
    #[deku(map = "|f: u8| f.to_string()")]
    field_a: String,
}

or, with a named function:

fn map_string(f: u8) -> String {
    f.to_string()
}

struct SomeStruct {
    #[deku(map = "map_string")]
    field_a: String,
}

Edit:
Can do something with trait calls like so:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9a711662336e1b0059e40947250b05ff

https://doc.rust-lang.org/book/ch19-03-advanced-traits.html#fully-qualified-syntax-for-disambiguation-calling-methods-with-the-same-name

Cleanup enum attribute names

I can't think of a reason why there are both id_bits and bits; for example, there's endian but not id_endian (we use endian for enums).

This would be more consistent with fields/structs.

Also rename id_type to type and id to value

Before:

#[deku(id_type = "u8", id_bits = "5")]
enum Test {
    #[deku(id = "0x01")]
    VarA,
}

After

#[deku(type = "u8", bits = "5")]
enum Test {
    #[deku(value = "0x01")]
    VarA,
}

Add support for unused/padding bits

It would be nice if there were a way to skip over bits or bytes without creating dummy fields, in order to save on space, e.g.:

pub struct SomeStruct {
    field_01: u8,

    // This field is useless but is needed for proper read/write
    unused01: u8,

    field_02: u8,

    // This field is useless but is needed for proper read/write
    unused02: u8,
}

Maybe some skip_[bytes|bits] attribute that could be added before or after fields in structs:

#[deku(skip_bytes="1")]

Add `ctx_default` attribute

Add a ctx_default attribute to allow containers to either take a context or default to a fixed one:

#[derive(PartialEq, Debug, DekuRead, DekuWrite)]
#[deku(ctx = "a: u8, b: u8", ctx_default = "1, 2")]
pub struct TopLevelCtxStructDefault {
    #[deku(cond = "a == 1")]
    pub a: Option<u8>,
    #[deku(cond = "b == 1")]
    pub b: Option<u8>,
}

#[test]
fn test_ctx_default_struct() {
    let expected = samples::TopLevelCtxStructDefault {
        a: Some(0xff),
        b: None,
    };

    let test_data = [0xffu8];

    // Use default
    let ret_read = samples::TopLevelCtxStructDefault::try_from(test_data.as_ref()).unwrap();
    assert_eq!(expected, ret_read);
    let ret_write: Vec<u8> = ret_read.try_into().unwrap();
    assert_eq!(ret_write, test_data);

    // Use context
    let (rest, ret_read) =
        samples::TopLevelCtxStructDefault::read(test_data.bits(), (1, 2)).unwrap();
    assert!(rest.is_empty());
    assert_eq!(expected, ret_read);
    let ret_write = ret_read.write((1, 2)).unwrap();
    assert_eq!(test_data.to_vec(), ret_write.into_vec());
}
