gimli-rs / leb128

Read and write DWARF's "Little Endian Base 128" variable length integer encoding

Home Page: http://gimli-rs.github.io/leb128/leb128/index.html

License: Apache License 2.0

Rust 96.82% Shell 3.18%

leb128's People

Contributors

asutton, atouchet, dependabot-preview[bot], dependabot-support, dependabot[bot], fitzgen, hywan, jimblandy, luavixen, olivierlemasle, philipc

leb128's Issues

Return Number of Bytes consumed

With the current implementation, the number of bytes that are consumed is not returned to the user.

About my use case:
It involves a BufReader reading a typical binary file. I need to seek past some parts of the file, and I want to use seek_relative so that the buffer is not refilled unless necessary. That means tracking offsets manually, which is no problem in itself, except that I cannot tell how many bytes leb128 consumed.

It would be pretty simple to return a tuple of the value and the byte count. I am very new to Rust and not confident enough to open a PR, but I hope this gets added soon!
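
Until the crate reports a byte count itself, one possible workaround is to count bytes on the caller's side. The sketch below is not part of leb128: CountingReader and consumed_by are hypothetical names introduced here for illustration. It relies only on the reader functions accepting any io::Read, as the signature of leb128::read::unsigned quoted later on this page suggests.

```rust
use std::io::{self, Read};

/// Hypothetical wrapper (not part of leb128): forwards reads to `inner`
/// and remembers how many bytes have passed through it.
pub struct CountingReader<R> {
    inner: R,
    bytes_read: u64,
}

impl<R: Read> CountingReader<R> {
    pub fn new(inner: R) -> Self {
        Self { inner, bytes_read: 0 }
    }

    /// Total number of bytes handed out since construction.
    pub fn bytes_read(&self) -> u64 {
        self.bytes_read
    }
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.bytes_read += n as u64;
        Ok(n)
    }
}

/// Hypothetical helper: runs `decode` against a counting view of `reader`
/// and returns the decoded value together with the bytes it consumed.
pub fn consumed_by<R, T, E>(
    reader: &mut R,
    decode: impl FnOnce(&mut CountingReader<&mut R>) -> Result<T, E>,
) -> Result<(T, u64), E>
where
    R: Read,
{
    // `&mut R` implements `Read`, so the caller keeps ownership of `reader`.
    let mut counting = CountingReader::new(reader);
    let value = decode(&mut counting)?;
    Ok((value, counting.bytes_read()))
}

// Assumed usage with the crate (requires the leb128 dependency):
//     let (value, len) = consumed_by(&mut buf_reader, |r| leb128::read::unsigned(r))?;
```

Because the wrapper sits on top of the BufReader rather than underneath it, the count reflects bytes consumed from the logical stream, not bytes pulled from the file into the buffer, which is the figure needed for offset bookkeeping around seek_relative.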

Overflowing while reading can create "phantom numbers" which can possibly cause corrupt/incorrect output

I'm currently using leb128 to read LEB128-encoded numbers from a stream of data (TcpStream) and I encountered a serious bug that caused my application to generate corrupt/incorrect output and could possibly allow for a DoS attack (in my case).

To describe this bug, assume that cursor is my TcpStream and that I am attempting to read TWO (2) LEB128-encoded numbers from it.

Without an overflow, this library works fine:

let mut cursor = std::io::Cursor::new(vec![
  0b1000_0011, 0b0010_1110,              // 5891
  0b1110_0100, 0b1110_0000, 0b0000_0010, // 45156
]);

for call in 1..4 {
  println!("Call #{}: {:?}", call, leb128::read::unsigned(&mut cursor));
}
// Call #1: Ok(5891)  // Number one
// Call #2: Ok(45156) // Number two
// Call #3: Err(IoError(Custom { kind: UnexpectedEof, error: "failed to fill whole buffer" }))

However, when an overflow occurs while reading a very long LEB128 value, a phantom 3rd number appears!

let mut cursor = std::io::Cursor::new(vec![
  0b1111_1111, 0b1111_1111, 0b1111_1111, 0b1111_1111,
  0b1111_1111, 0b1111_1111, 0b1111_1111, 0b1111_1111,
  0b1111_1111, 0b1111_1111, 0b0111_1111, // Overflow!
  0b1110_0100, 0b1110_0000, 0b0000_0010, // 45156
]);

for call in 1..5 {
  println!("Call #{}: {:?}", call, leb128::read::unsigned(&mut cursor));
}

// Call #1: Err(Overflow) // Number one
// Call #2: Ok(127)       // Where did you come from??
// Call #3: Ok(45156)     // Number two
// Call #4: Err(IoError(Custom { kind: UnexpectedEof, error: "failed to fill whole buffer" }))

This happens because both leb128::read::signed and leb128::read::unsigned exit early if an overflow occurs:

pub fn unsigned<R>(r: &mut R) -> Result<u64, Error>
where
    R: io::Read,
{
    let mut result = 0;
    let mut shift = 0;

    loop {
        let mut buf = [0];
        r.read_exact(&mut buf)?;

        if shift == 63 && buf[0] != 0x00 && buf[0] != 0x01 { // <<<<<<<<<<<<<<<<
            return Err(Error::Overflow);                     // <<<<<<<<<<<<<<<<
        }                                                    // <<<<<<<<<<<<<<<<

        let low_bits = low_bits_of_byte(buf[0]) as u64;
        result |= low_bits << shift;

        if buf[0] & CONTINUATION_BIT == 0 {
            return Ok(result);
        }

        shift += 7;
    }
}

The condition that triggers return Err(Error::Overflow); can evaluate to true before the entire LEB128 value has been read. The remaining bytes of that value stay in the stream, and the next call decodes them as if they were a new number.
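
One possible way to avoid the phantom value is to keep consuming bytes until the continuation bit clears before reporting the overflow. The following is a self-contained sketch of that idea, not the crate's code: read_unsigned_draining and DecodeError are hypothetical names, and the decoder is a simplified stand-in for the function quoted above.

```rust
use std::io::{self, Read};

const CONTINUATION_BIT: u8 = 0x80;

/// Hypothetical error type standing in for the crate's own.
#[derive(Debug)]
pub enum DecodeError {
    Io(io::Error),
    Overflow,
}

impl From<io::Error> for DecodeError {
    fn from(e: io::Error) -> Self {
        DecodeError::Io(e)
    }
}

/// Decodes one unsigned LEB128 value. On overflow, the rest of the
/// oversized value is consumed first so the stream stays aligned on
/// value boundaries.
pub fn read_unsigned_draining<R: Read>(r: &mut R) -> Result<u64, DecodeError> {
    let mut result: u64 = 0;
    let mut shift: u32 = 0;

    loop {
        let mut buf = [0u8];
        r.read_exact(&mut buf)?;

        // At shift 63 only one payload bit still fits into a u64.
        if shift == 63 && buf[0] != 0x00 && buf[0] != 0x01 {
            // Skip the remaining bytes that belong to this value.
            while buf[0] & CONTINUATION_BIT != 0 {
                r.read_exact(&mut buf)?;
            }
            return Err(DecodeError::Overflow);
        }

        let low_bits = u64::from(buf[0] & !CONTINUATION_BIT);
        result |= low_bits << shift;

        if buf[0] & CONTINUATION_BIT == 0 {
            return Ok(result);
        }

        shift += 7;
    }
}
```

With the overflowing input from the example above, this sketch reports the overflow once and then yields 45156 on the next call, with no 127 in between. The trade-off is that a malicious peer can make the drain loop read an arbitrarily long run of continuation bytes, so a caller worried about DoS may also want to cap the number of bytes skipped.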

Support maximal-length encoded integers

This came up during rustwasm/walrus#30, specifically rustwasm/walrus#30 (comment).

The use case: when working with LEB128, an integer often denotes how many of the following bytes belong to a section or unit. When encoding, though, we often don't know the length of the section up front, so a common trick is to do something like:

  • For a u32 encoded as leb128, reserve 5 bytes of space
  • Encode the entire section
  • Encode the length of the section into the previously reserved 5 bytes of space

This encoding uses the maximal rather than the minimal width: "padding zero bytes" that look like 0x80 (continuation bit set, no payload bits) ensure that the encoded value occupies all 5 of the reserved bytes.

It'd be neat if this crate supported such a use case (it's sort of like Seek with writers), although I'd be fine just supporting it with slices for the time being!
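
For the slice-based case, the three steps above could be sketched roughly as follows. Neither function exists in leb128; write_u32_padded and encode_section are hypothetical names used only to illustrate the reserve-then-patch trick.

```rust
/// Hypothetical helper: writes `value` as a fixed-width, 5-byte unsigned
/// LEB128 integer. The first four bytes always carry the continuation bit,
/// so small values are padded with 0x80 bytes instead of ending early.
pub fn write_u32_padded(buf: &mut [u8; 5], mut value: u32) {
    for (i, byte) in buf.iter_mut().enumerate() {
        let low = (value & 0x7f) as u8;
        value >>= 7;
        // Only the last byte clears the continuation bit.
        *byte = if i < 4 { low | 0x80 } else { low };
    }
}

/// Hypothetical example of the trick: reserve the length field, emit the
/// section body, then patch the length into the reserved bytes.
pub fn encode_section(payload: &[u8]) -> Vec<u8> {
    // Step 1: reserve 5 bytes for the length.
    let mut out = vec![0u8; 5];

    // Step 2: encode the section (here it is simply copied).
    out.extend_from_slice(payload);

    // Step 3: write the now-known length into the reserved space.
    assert!(payload.len() <= u32::MAX as usize, "section too large");
    let mut header = [0u8; 5];
    write_u32_padded(&mut header, payload.len() as u32);
    out[..5].copy_from_slice(&header);
    out
}
```

A length of 3 comes out as 0x83 0x80 0x80 0x80 0x00. A reader that only checks for overflow near the top of its range, like the unsigned function quoted earlier on this page, would be expected to decode that as 3, since every padding byte contributes zero payload bits.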
