julianschmid / etherparse Goto Github PK

A rust library for parsing ethernet & ethernet using protocols.

License: Apache License 2.0

Rust 99.82% Shell 0.18%

etherparse's Introduction

etherparse

A zero allocation library for parsing & writing a bunch of packet based protocols (EthernetII, IPv4, IPv6, UDP, TCP ...).

Currently supported are:

Ethernet II
IEEE 802.1Q VLAN Tagging Header
IPv4
IPv6 (supporting the most common extension headers, but not all)
UDP
TCP
ICMP & ICMPv6 (not all message types are supported)

Usage

Add the following to your Cargo.toml:

[dependencies]
etherparse = "0.14"

What is etherparse?

Etherparse is intended to provide the basic network parsing functions that allow for easy analysis, transformation or generation of recorded network data.

Some key points are:

It is completely written in Rust and thoroughly tested.
Special attention has been paid to not use allocations or syscalls.
The package is still in development and can & will still change.
The current focus of development is on the most popular protocols in the internet & transport layer.

How to parse network packages?

Etherparse gives you two options for parsing network packages automatically:

Slicing the packet

Here the different components in a packet are separated without parsing all their fields. For each header a slice is generated that allows access to the fields of a header.

match SlicedPacket::from_ethernet(&packet) {
    Err(value) => println!("Err {:?}", value),
    Ok(value) => {
        println!("link: {:?}", value.link);
        println!("vlan: {:?}", value.vlan);
        println!("net: {:?}", value.net); // contains ip
        println!("transport: {:?}", value.transport);
    }
}

This is the faster option if your code is not interested in all fields of all the headers. It is a good choice if you just want filter or find packets based on a subset of the headers and/or their fields.

Depending from which point downward you want to slice a package check out the functions:

SlicedPacket::from_ethernet for parsing from an Ethernet II header downwards
SlicedPacket::from_linux_sll for parsing from a Linux Cooked Capture v1 (SLL) downwards
SlicedPacket::from_ether_type for parsing a slice starting after an Ethernet II header
SlicedPacket::from_ip for parsing from an IPv4 or IPv6 downwards

In case you want to parse cut off packets (e.g. packets returned in in ICMP message) you can use the "lax" parsing methods:

LaxSlicedPacket::from_ethernet for parsing from an Ethernet II header downwards
LaxSlicedPacket::from_ether_type for parsing a slice starting after an Ethernet II header
LaxSlicedPacket::from_ip for parsing from an IPv4 or IPv6 downwards

Deserializing all headers into structs

This option deserializes all known headers and transfers their contents to header structs.

match PacketHeaders::from_ethernet_slice(&packet) {
    Err(value) => println!("Err {:?}", value),
    Ok(value) => {
        println!("link: {:?}", value.link);
        println!("vlan: {:?}", value.vlan);
        println!("net: {:?}", value.net); // contains ip
        println!("transport: {:?}", value.transport);
    }
}

This option is slower then slicing when only few fields are accessed. But it can be the faster option or useful if you are interested in most fields anyways or if you want to re-serialize the headers with modified values.

Depending from which point downward you want to unpack a package check out the functions

PacketHeaders::from_ethernet_slice for parsing from an Ethernet II header downwards
PacketHeaders::from_ether_type for parsing a slice starting after an Ethernet II header
PacketHeaders::from_ip_slice for parsing from an IPv4 or IPv6 downwards

In case you want to parse cut off packets (e.g. packets returned in in ICMP message) you can use the "lax" parsing methods:

LaxPacketHeaders::from_ethernet for parsing from an Ethernet II header downwards
LaxPacketHeaders::from_ether_type for parsing a slice starting after an Ethernet II header
LaxPacketHeaders::from_ip for parsing from an IPv4 or IPv6 downwards

Manually slicing only one packet layer

It is also possible to only slice one packet layer:

The resulting data types allow access to both the header(s) and the payload of the layer and will automatically limit the length of payload if the layer has a length field limiting the payload (e.g. the payload of IPv6 packets will be limited by the "payload length" field in an IPv6 header).

Manually slicing & parsing only headers

It is also possible just to parse headers. Have a look at the documentation for the following [NAME]HeaderSlice.from_slice methods, if you want to just slice the header:

And for deserialization into the corresponding header structs have a look at:

How to generate fake packet data?

Packet Builder

The PacketBuilder struct provides a high level interface for quickly creating network packets. The PacketBuilder will automatically set fields which can be deduced from the content and compositions of the packet itself (e.g. checksums, lengths, ethertype, ip protocol number).

Example:

use etherparse::PacketBuilder;

let builder = PacketBuilder::
    ethernet2([1,2,3,4,5,6],     //source mac
               [7,8,9,10,11,12]) //destination mac
    .ipv4([192,168,1,1], //source ip
          [192,168,1,2], //destination ip
          20)            //time to life
    .udp(21,    //source port
         1234); //destination port

//payload of the udp packet
let payload = [1,2,3,4,5,6,7,8];

//get some memory to store the result
let mut result = Vec::<u8>::with_capacity(builder.size(payload.len()));

//serialize
//this will automatically set all length fields, checksums and identifiers (ethertype & protocol)
//before writing the packet out to "result"
builder.write(&mut result, &payload).unwrap();

There is also an example for TCP packets available.

Check out the PacketBuilder documentation for more information.

Manually serializing each header

Alternatively it is possible to manually build a packet (example). Generally each struct representing a header has a "write" method that allows it to be serialized. These write methods sometimes automatically calculate checksums and fill them in. In case this is unwanted behavior (e.g. if you want to generate a packet with an invalid checksum), it is also possible to call a "write_raw" method that will simply serialize the data without doing checksum calculations.

Read the documentations of the different methods for a more details:

References

Darpa Internet Program Protocol Specification RFC 791
Internet Protocol, Version 6 (IPv6) Specification RFC 8200
IANA Protocol Numbers
Internet Protocol Version 6 (IPv6) Parameters
Wikipedia IEEE_802.1Q
User Datagram Protocol (UDP) RFC 768
Transmission Control Protocol RFC 793
TCP Extensions for High Performance RFC 7323
The Addition of Explicit Congestion Notification (ECN) to IP RFC 3168
Robust Explicit Congestion Notification (ECN) Signaling with Nonces RFC 3540
IP Authentication Header RFC 4302
Mobility Support in IPv6 RFC 6275
Host Identity Protocol Version 2 (HIPv2) RFC 7401
Shim6: Level 3 Multihoming Shim Protocol for IPv6 RFC 5533
Computing the Internet Checksum RFC 1071
Internet Control Message Protocol RFC 792
IANA Internet Control Message Protocol (ICMP) Parameters
Requirements for Internet Hosts -- Communication Layers RFC 1122
Requirements for IP Version 4 Routers RFC 1812
Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification RFC 4443
ICMP Router Discovery Messages RFC 1256
Internet Control Message Protocol version 6 (ICMPv6) Parameters
Multicast Listener Discovery (MLD) for IPv6 RFC 2710
Neighbor Discovery for IP version 6 (IPv6) RFC 4861
LINKTYPE_LINUX_SLL on tcpdump
LINUX_SLL header definition on libpcap
Linux packet types definitions on the Linux kernel
Address Resolution Protocol (ARP) Parameters Harware Types
Arp hardware identifiers definitions on the Linux kernel

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option. The corresponding license texts can be found in the LICENSE-APACHE file and the LICENSE-MIT file.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you shall be licensed as above, without any additional terms or conditions.

etherparse's People

Contributors

Stargazers

Watchers

etherparse's Issues

Question: Why not use the parsers at libpnet?

Hi Julian, I'm currently hunting for a good ethernet frame parser, I like your crate a lot but am leaning towards libpnet as it has so much other low-level support. Is there a reason why you decided to roll your own?

IP packets can't be written using the builder pattern

Right now, packets cannot be written out unless they are fully-assembled transport-layer TCP/UDP packets. In other words, the write function is only implemented for PacketBuilderStep<UdpHeader> and PacketBuilderStep<TcpHeader>:

However, there might be use cases for etherparse where a programmer may want to write an IP packet with a custom transport protocol or handle the protocol assembly and writing themselves. I suggest implementing write for PacketBuilderStep<IpHeader> to support this use case!

Incorrect example

I'm trying to use custom udp packet builder with almost same code:
https://github.com/JulianSchmid/etherparse/blob/master/examples/write_ipv4_udp.rs

some context to show the error later

//setup the actual payload of the udp packet
let udp_payload = [1,2,3,4,5,6,7,8];

//source ip address
[192,168,1,42],
//destination ip address
[192,168,1,1]

$ cargo run --example=write_ipv4_udp
[1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16, 8, 0, 69, 0, 0, 36, 0, 0, 64, 0, 20, 17, 227, 77, 192, 168, 1, 42, 192, 168, 1, 1, 0, 0, 0, 42, 0, 16, 108, 20]

IPv4: (start from 0x45/69)

[69, 0, 0, 36, 0, 0, 64, 0, 20, 17, 227, 77, 192, 168, 1, 42, 192, 168, 1, 1, 0, 0, 0, 42, 0, 16, 108, 20]

I'm using raw sockets to send it and Wireshark to view packets

software shows that UDP packet is invalid

payload length is calculated as udp_header_len + payload.len() = 8 + 8 = 16
but as we see there is no payload at the generated packet

Support for #[derive(Serialize, Deserialize)] for all major structs?

I've run into some issues where I've wanted to write packets to disk in their parsed form, not in their on-the-wire format. What are folks thoughts about peppering all of the major structs (PacketHeaders, PacketSlice, and transitively) to auto-derive Serialize and Deserialize? Happy to write it myself - just wanted to ask first.

Replace `IpNumber` & `EtherType` with structs

Task

Rewrite EtherType & IpNumber to be single value structs. E.g.:

struct EtherType(pub u16);

Also add implementations of the From & Into traits for simply converting from u8/u16 into the struct type and from the struct type into u8/u16. Also manually implement the Debug and Display to also display the human readable name if available (e.g. IPv4(0x0800) for EtherType(0x0800)).

Why

Both EtherType and IpNumber are currently implemented as simple enums. E.g.:

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum EtherType {
    Ipv4 = 0x0800,
    Ipv6 = 0x86dd,
    // ...
}

Originally the plan was to use the enums in the headers. But this approach was discarded in early versions of the library as it would limit the values to only the ones present in the enums. Additionally adding an Unknown(u8)value to the enums would disabled the ability to simply cast the values to their underlying type and added unneeded computation when converting from and into the enum type. So the headers were switched back to the underlying types u8 & u16 to support all valid IP & Ethernet headers.

But some time ago I noticed that that the pcap crate had implemented the Linktype as struct with a simple value ( https://docs.rs/pcap/1.0.0/pcap/struct.Linktype.html ). To me this seems the more sensible approach for EtherType & IpNumber as well, as it comes with the following advantages:

Allows all possible values without any need for re-mapping values when converting the underlying types (u8/u16) from and to the struct type.
Gives some type safety when assigning values.
Allows for printing of the underlying meaning of the value (e.g. UDP) when printing with Debug or Display.

Provide `from_slice_lax` methods

As the validation of length values will become stricter with #35 there should still be a way to still parse packets in a a "laxer" way. As part of this ticket from_slice_lax methods should be added that are not as strict about the correctness of the packet contents, but instead try to parse the packets as far as possible and substitute missing informations whenever possible.

This can be use-full in setups where only the needed parts in a network packet are set. The total_length field can be for example be set to zero if the total packet length is known and no padding is present. It is then still possible to determine the length of the payload (or total_length) by using the length of the slice as a substitute value.

Null/loopback header is incorrectly interpreted as `Ethernet2Header`

Hi @JulianSchmid, as you may know I'm using etherparse in Sniffnet.

Some users complained about the impossibility to see traffic when monitoring their VPN TUN interfaces, and today I tried it myself and found out that the problem resides in the fact that etherparse doesn't correctly identify the traffic as null/loopback, but it still identifies it as ethernet.

This is how a sample packet looks like in Wireshark:

As you can see, it's tagged as Null/loopback.

However, etherparse still categorise this as ethernet. The consequences are multiple:

source and destination MAC addresses are assigned to the first bytes of the packet even if it's not actually an ethernet header
IP and transport header are not detected even if they are present

In particular I double checked and found out that etherparse says that ip and transport header are None, and as you can see from the screenshot below MAC addresses are assigned to the first bytes of the packet, even if those bytes are not supposed to be MAC address but part of the Null/loopback header and part of the IP header.

I think the solution would just require to correctly parse the loopback header and changing the offset to automatically parse the following ip and transport headers.

Atempting to compile in a no_std crate results in an error with Sum16BitWords

Hello, I'm trying to compile etherparse 0.14.2 in a no_std crate but im getting errors about add_2bytes (among others) not existing.

cargo build
   Compiling etherparse v0.14.2
error[E0599]: no method named `add_2bytes` found for struct `Sum16BitWords` in the current scope
   --> /home/josh/.cargo/registry/src/index.crates.io-6f17d22bba15001f/etherparse-0.14.2/src/net/ipv4_header.rs:648:14
    |
647 | /         checksum::Sum16BitWords::new()
648 | |             .add_2bytes([
    | |_____________-^^^^^^^^^^
    |
   ::: /home/josh/.cargo/registry/src/index.crates.io-6f17d22bba15001f/etherparse-0.14.2/src/checksum.rs:4:1
    |
4   |   pub struct Sum16BitWords {
    |   ------------------------ method `add_2bytes` not found for this struct
    |
help: there is a method `add_16bytes` with a similar name
    |
648 |             .add_16bytes([
    |              ~~~~~~~~~~~

...

rror[E0560]: struct `Sum16BitWords` has no field named `sum`
  --> /home/josh/.cargo/registry/src/index.crates.io-6f17d22bba15001f/etherparse-0.14.2/src/checksum.rs:16:25
   |
16 |         Sum16BitWords { sum: 0 }
   |                         ^^^ `Sum16BitWords` does not have this field
   |
   = note: all struct fields are already assigned

Add `from_ether_type` functions for slicing & decoding Ethernet II payloads without header

Modifying existing ip packets

Thank you very much for your work on this awesome library.

I need to alter a packet that I am able to parse (using SlicedPacket::from_ip). Specifically, I need to be able to (i) alter the source IP, (ii) alter the source port, (iii) recompute the checksum and obtain a valid packet.

I can see some references to PacketBuilder, update_checksum_ipv4 and so on but I am curious what is the recommended (and most convenient) way to do this.

Warmest regards,
Shriphani

Support `no_std`

Feature wanted：change the field in place

For example:
To reuse the packet，swap the ip or port in place

Replace `payload_len` with `total_len` in Ipv4Header

The current implementation of the Ipv4Header has the field payload_length that does not exist in the real IPv4 header. The real IPv4 header has a field total_length that also contains the length of the header itself, while payload_length only describes the length of the payload of the IPv4 packet.

This sometimes leads to confusion what was actually present on the wire & also leads to some not-so-nice corner cases that can trigger error return values when writing & decoding headers (e.g. when the payload_len + header_len is bigger then u16::MAX).

Replace payload_len with total_len so the header mirrors the data of the actual on the wire header.

[Design Discussion] Looking for feedback on ICMP API

Hello - from #10 (comment), a little over a year ago, I volunteered to add support for ICMP parsing. So, a year is a long time - my bad - but I'm starting to dig into it now.

But as I dig into it, there are some messy API questions that I thought I would circulate before I did too much coding. The main issue is that may of the messages in Icmp4 and Icmp6 are similar in function but different in implementation. For example:

Icmp v4 (ipv4.proto = 1) and Icmp v6 (ipv6.next-header = 58 !?) actually use different protocol numbers
Icmp v4 echo request/reply use different icmp_types than Icmp v6 (8 vs. 128)
Most messages support similar functions (e.g., everything has a type/code/checksum) but some messages have fairly complex payloads that would be super useful to decode, e.g., unreachables with their encapsulated ip header from the unreachable packet
Some message types are v6 only and are also quite complicated, e.g., router advertisement

Long story short: there are a few ways to encode/abstract hide this info from the caller and I wanted to get folks opinion about their preferences before I did the work. Maybe there are even some better ways that folks could suggest - I'm still new to rust and learning the idiomatic magic. I'm trying to optimize on "what is most useful to the caller" (rather than what's easiest to implement) and even that's not particularly clear to me given that a good API. An apparent tension seems to be making something easier to parse (e.g., fewer nested match() statements) vs. trying to have the compiler prevent you from doing things you're not supposed to do (e.g., create a v6 router advertisement in a v4 packet).

Proposal 1 : "Fat struct"

The idea here is that there is just one struct for all ICMP messages (v4, v6, etc.) and all of the fields are Option<> and only populated if they make sense. This is great for parsing (only one match() statement) but horrible for preventing you from doing, e.g., you could add an unreachable IP header to an echo request packet.

pub struct Icmp {
     icmp_type: u8,
     icmp_code: u8,
    checksum: u8,
    // these are only valid for EchoRequest/Reply
    identifier: Option<u16>,
    sequence_number: Option<u16>,
    // these are only valid for unreachables 
     unreachable_ip: Option<IpHeader>,
   // ...
}

Proposal 2 : "Nested enums"


pub enum TransportHeader {
    Icmp(IcmpHeader),  // new!
    Udp(udp::UdpHeader),
    Tcp(tcp::TcpHeader)
}
enum IcmpHeader {  // nested once
      Icmp4(Icmp4Header),
      Icmp6(Icmp6Header),
}

enum Icmp4Header { // nested again
    EchoRequest4(EchoRequest4Msg),
    EchoReply4(EchoReply4Msg),
   //...
}

enum Icmp6Header {
    DestUnreachable6(DestUnreachable6Msg),
    EchoRequest6(EchoRequest6Msg),
    EchoReply6(EchoRequest6Msg),
    //...
}

// unique structs for each v4 and v6 msgs + some Traits for common operations

This is cleaner from an API standpoint, but is a PITA for parsing, e.g.,

match sliced_packet.ip.transport {
     Udp(_) => ...,
     Tcp(_) => ..., 
     Icmp(msg) => match msg {  // three levels of match just to get to the structs/data!
          Icmp4(icmp4) => match icmp4 { ...},
          Icmp6(icmp6) => match icmp6 {...},
     },
}

Proposal 3 : transport_v4 vs. transport_v6

We could split the crate::transport::TransportHeader into separate v4 and v6 instances:

pub enum TransportHeader_v4 {
    Icmp(icmp::Icmpv4Header),
    Udp(udp::UdpHeader),
    Tcp(tcp::TcpHeader),
}

pub enum TransportHeader_v6 {
    Icmp(icmp::Icmpv6Header),
    Udp(udp::UdpHeader),
    Tcp(tcp::TcpHeader),
}

but this also seems horrible and has the additional damage of breaking a lot of existing code.

Long story short I'm not happy with any of these options so I wanted to float this to the community to get feedback.


## References
* v4: https://en.wikipedia.org/wiki/Internet_Control_Message_Protocol
* v6 : https://en.wikipedia.org/wiki/Internet_Control_Message_Protocol_for_IPv6

[Feature Request] Slice mutating functions

First of all, thanks for writing this crate!

It would be pretty awesome if the slice types had functions to modify attributes, and these wrote-through to the underlying slice.

Another optimization that could be made here is that, since we know what's changing, you can perform an optimized checksum recalculation (see https://tools.ietf.org/html/rfc1631 section 3.3).

Add IEEE 802.1AE (also known as MACsec) support

Add support for IEEE 802.1AE (also known as MACsec) https://en.wikipedia.org/wiki/IEEE_802.1AE .

Any examples of working with ARP packets?

I'm trying to work with ARP packets, but I can't find any examples or documentation for this. More specifically, I'm trying to get the opcode (1=request, 2=reply) of an ARP packet. Any tips?

Further linux_sll support

Followup from #99

Specifically, some points missing are:

Implementation of PacketHeaders::from_linux_sll()
Implementation of LaxSlicedPacket::from_linux_sll()
Implementation of LaxPacketHeaders::from_linux_sll()
Revising the added test code, making sure it makes sense and the relevant cases are covered

Limit the size of `Ipv4Extensions`/`Ipv6Extensions`

These structs have large buffers as it contains arrays that are allocated on the stack, this pushes the size of the IpHeader struct at over 9kB (and therefore PacketImpl and PacketBuilderStep along with it).

dbg!(std::mem::size_of::<Option<IpHeader>>());
// std::mem::size_of::<Option<IpHeader>>() = 9280

This is not a big issue in itself, but it ended up causing a stack overflow in one of our test files where we build a bunch of packets to check different scenarios in a single chunk of code. This is not helped by the fact that rustc/LLVM don't seem to be able to reuse stack space after dropping any reference to PacketBuilderStep as soon as possible, at last using rustc 1.65 without optimizations.

Given that extensions are rarely used, maybe they should be stored as Vec instead of statically sized arrays, or IpHeader enum could be defined as:

pub enum IpHeader {
    Version4(Ipv4Header, Box<Ipv4Extensions>),
    Version6(Ipv6Header, Box<Ipv6Extensions>)
}

I'd be happy to try making a PR if this is considered to be an issue.

Add support for LINUX_SLL

Hi there,

I am working in a network traffic analyzer for a project and one of the traffic dataset samples I am using is encapsulated in "LINKTYPE_LINUX_SLL". Would it be possible to add support for it to the project? As it is said in the Wireshark Wiki,, it is a pseudo-protocol used to encapsulate traffic coming from all devices (the "any" device) or to use in place of the real link layer. Adding support for it, be it the part that can contain IP packets, would be useful for other applications wanting to handle traffic from multiple interfaces at the same time.

The format is specified in this tcpdump page, but I only really need to be able to parse IPv4/Ipv6 on top of it. The header format is relatively simple, mostly composed of 16 Bytes of which:

The bytes 0 and 1 indicate packet type in network order/big endian. With the meanings:
- 0 for unicast packets that were sent by other host and were intended for the receiver
- 1 for broadcast packets that were sent by other host
- 2 for multicast packets that were sent by other host
- 3 for unicast packets that were sent by other host and also were intended for another host
The bytes 2 and 3 indicate ARPHRD_ type. In the tcpdump page, it is noted that it can be either
- ARPHRD_NETLINK (value 824), meaning the encapsulated packet is a netlink packet and the associated protocol type is a Netlink protocol type. Probably out of scope of this project.
- ARPHRD_IPGRE (value 778), meaning the associated protocol type is a GRE protocol type.
- ARPHRD_IEEE80211_RADIOTAP (value 803), in which the associated protocol type is ignored and there's a radiotap link-layer and then a 802.11 header (#83)
- ARPHRD_FRAD (value 770), in which the associated protocol type is ignored and there is a "Frame Relay LAPF frame, beginning with a ITU-T Recommendation Q.922 LAPF header starting with the address field, and without an FCS at the end of the frame"
- Another value (with probably value 1, as I have in my captures for Ethernet, but it may be irrelevant). The associated protocol type may contain the ether type or one of 5 other possibilities.
The bytes 4 and 5 indicate the length of the sender address
The bytes 6 to 13 contain the address of the sender. If it is larger, it is cut to the first 8 bytes. If it is smaller, it is padded to fill the 8 bytes.
The bytes 14 and 15, contain the protocol type. In the case that the bytes 2 and 3 fell in the "other value" category, there are six possibilities:
1. If 0x0001, the payload is a Novell 802.3 frame without an 802.2 LLC header.
2. If 0x0003, we have a weird case. I haven't been able to find a lot of information, just that this values appearing in this case could be related to some Linux kernel bug.
3. If 0x0004, the payload begins with an 802.2 LLC header
4. If 0x000C, the payload is a CAN bus frame
5. If 0x000D, the payload is a CAN FD bus frame
6. Otherwise, it contains the normal Ethernet protocol type

As I understand, the additions would mainly consist on :

Adding from_linux_sll to SlicedPacket, LaxSlicedPacket, PacketHeaders and LaxPacketheaders
Extending the PacketBuilder to be able to create LINUX_SLL packets
Adding the respective variants and creating LinuxSllSlice, LinuxSllHeaderSlice and LinuxSllHeader and their associated functions.
Tests for the added functionality

I intend to work on this in the following days to be able to support this link type in my code. My idea is to mostly focus on the parts of the protocol that would allow me to retrieve the IPv4/Ipv6 packets. Is there any work previous work I should check or specific guidelines I should follow before creating a PR?

[Feature Request] ICMP

Support for reading/writing ICMP would be helpful. There's an opportunity after that for helper methods to generate common ICMP packets, for example fragmentation needed and time exceeded.

RFC: Adding support for parsing IPv4 Record Route Option

Hi all,

See title. Record route is very cool and very useful (can provide a bunch of citations if this is a disputed point), but it looks like it will require some surgery to add it to the existing extensions parsing logic.

Specifically,

[minor?] In v4, there are options (e.g., from rfc791 - RR, NO-OP, source route, etc.) and extensions (e.g., AUTH - currently implemented), which seem to populate the same space but use different terms - I'm assuming we don't need to disambiguate, but open to opinions here.
the main read() call for IPv4 extensions (https://github.com/JulianSchmid/etherparse/blob/master/etherparse/src/internet/ipv4_exts.rs#L35) doesn't take the options/extensions length as a parameter so that needs to be computed from the call sites
The actual read() logic needs to be updated with a cursor, while( index < length) logic to parse variable length options
The write() logic needs to be updated to understand if the sum of the options specified exceeds the total header length available (e.g., 4 bits of header * 4 bytes is max header size of 64 bytes - 20 bytes v4 base header --> max options length is 44 bytes)

I'm happy to put together a PR to put all of this together the way I think it should go, but wanted to start a thread here first to see if others agree/have different ideas.

Thoughts?

How to build TCP and raw ethernet packet by using Builder?

From the document, I know that PacketBuilderStep has a write method to dump raw bytes for UDP packet, but no such method for TCP or Ethernet packet.

How to build packets other than UDP with the builder? Or maybe will be supported in the future?

Thanks.

Add FCS and non FCS functions

SlicedPacket::from_ethernet_with_fcs
SlicedPacket::from_ethernet_without_fcs
Update SlicedPacket::from_ethernet to auto detect FCS
LaxSlicedPacket::from_ethernet_with_fcs
LaxSlicedPacket::from_ethernet_without_fcs
Update LaxSlicedPacket::from_ethernet to auto detect FCS

Support for 802.11 (WLAN) packets

I don't know if this would be out of scope, but one protocol that seems to be currently unsupported by this library is 802.11 (https://wiki.wireshark.org/Wi-Fi). It would be useful to support at least some subset of it, possibly falling back to raw bytes for complex and more niche frame types.

Support partial default initialization with `Ipv4Header` & `TcpHeader`

It would be nice if one could simply write:

let header = TcpHeader{
    source_port: 123,
    ns: true,
    ..Default::default()
};

But this currently causes a compile time error as both Ipv4Header as well as TcpHeader contain private fields. In both of structs cases these private fields contain a buffer for options as well as a length for how much of the buffer is filled.

We can get around this limitation by moving the buffers into their own types and making them public.

Thoughts on how to handle Arp parsing?

Hi @JulianSchmid - thanks again for the help in getting ICMP{v4,v6} landed. Now I'm volunteering to write Arp support as well, but I'd prefer to do it in a way that you're more likely to accept (and ideally is less work for you).

Good news, there's already an Arp enum.

In theory I could add some additional parsing code at https://github.com/JulianSchmid/etherparse/blob/master/etherparse/src/packet_decoder.rs#L255 and just extend InternetSlice and InternetHeader to add the Arp variant. It would be a breaking change for a lot of people (including my code!) but probably the best way to do it.

https://github.com/JulianSchmid/etherparse/blob/master/etherparse/src/internet/internet_slice.rs#L7

Should I write this up or do you have any other concerns that would need to be addressed at the design phase?

Remove `ValueError`'s in writes by introducing more restrictive data types

Currently certain fields in some headers can be filled with invalid data that can not be represented on the wire. An example is the flow_label field in the Ipv6Header. In the struct it currently is an u32 but on the wire it is a 20-bit value. That means you can assign values to the struct that are not representable on the wire. Currently this would trigger a ValueError when writing a header, but it would be nicer if no "invalid" data is even settable in the header in the first place and the actual setting on an invalid data triggers an error.

The idea would be to introduce struct types (e.g. Ipv6FlowLabel) that implements the core::convert::TryFrom trait and disallow setting of invalid values in the first place.

"Downwards" parsing (currently) makes it difficult to work with unpopular protocols

The problem

The etherparse family of from functions parse a given u8 slice "downward" by design. This generally means that if etherparse knows the format of a payload at any point in the parsing process, it will parse it.

At first glance, this feature sounds great. One of the central tenets of the etherparse library is that it places a particular emphasis on the most popular packet-based protocols, and downwards parsing does just that. Suppose, for example, that you had a u8 slice containing a set of nested packets that all came from popular networking protocols, e.g. Ethernet -> IP -> TCP. With downward parsing, a Rust programmer could point a single from_ethernet function at this slice, and parse all three.

At the same time, the current implementation of downward parsing also inhibits the use of etherparse for unpopular or custom protocols. Suppose, now, that you wanted to use etherparse to implement an IP router that received IP packets along a set of interfaces and forwarded them along to their intended destination. You should not have to care about payload format. For each u8 slice that came in along an interface, you would naturally reach for the from_ip function to parse it into a SlicedPacket. The problem is the payload pointer. In my mind, I think that most Rust programmers in this situation would assume that the payload always pointed to the payload of the IP packet. But it does not. By design, if etherparse happened to recognize the format of the IP packet payload as another packet (e.g. a TCP packet), the payload pointer would instead point to the payload of that packet, because etherparse would have gone ahead and eagerly parsed it for you.

In summary:

If you want to use etherparse to help you implement some part of the network layer -- in this specific case, an IP router -- you might have to violate the separation of concerns that is characteristic of the OSI model and dip into the transport layer.
While the current implementation of downward parsing makes it easy to use etherparse for popular protocols, it does so at the cost of introducing what I would characterize as inconsistent and confusing behavior that makes it harder to use etherparse for unpopular protocols.

Possible solutions

As with #26, I am happy to do the work of implementing the solution to this problem. However, before I do, I want to discuss how I should go about solving it. Among others, I see at least two solutions:

Do away with "downward" parsing.
Split payload into separate pointers that each reliably point to the same payload.

I am personally in favor of the first solution, especially because even the packets of popular protocols do not always follow the particular "downwards" nesting order (Ethernet -> IP -> TCP) that etherparse assumes they do. For example, it is perfectly acceptable (and perhaps even somewhat common) to nest an IP packet inside of a TCP packet. (See Tunneling) Likewise, I think that is perfectly reasonable to ask Rust programmers to parse each payload themselves.

[Feature request] Checksum validation

In the case of reading packets from outside sources (SlicedPacket, etc) it would be useful to have an easy way to validate that the checksums on various headers are correct. A fn is_valid_checksum(&self) -> bool or similar.

Why is ip_payload_ref optional?

Hi,

Noticed today that NetSlice::ip_payload_ref() returns Option<&IpPayloadSlice<'a>> but can't really figure out the reason why because it seems that both IPv4/6 have payload. Is it some smart trick or a genuine omission?

impl<'a> NetSlice<'a> {
    /// Returns a reference to ip payload if the net slice contains
    /// an ipv4 or ipv6 slice.
    #[inline]
    pub fn ip_payload_ref(&self) -> Option<&IpPayloadSlice<'a>> {
        match self {
            NetSlice::Ipv4(s) => Some(&s.payload),
            NetSlice::Ipv6(s) => Some(&s.payload),
        }
    }
}

TryFrom for IpNumber

Just a mild convenience: TryFrom/try_from for u8 -> IpNumber, to avoid having to pull something else in!

Confusion in payload sizes

I'm working with ingesting packets from AF_PACKET and I'm running into some confusion in the sizes of the payload field.

I'm matching SlicedPacket::from_ethernet and if let-ing Some(Udp(header)), implying I get a valid UDP paylod (I think). But when I compare payload.len() and header.len() - 8 I get different values.

Specifically, my payloads are 8192 byte jumbo frames (which I see in wireshark and tcpdump). header.len() - 8 properly reports this, but payload.len() reports 8196 and 15956. I have no idea why sometimes it says 15956 (especially considering we've already destructured a valid payload) or why in the more normal looking case, it's 4 bytes bigger.

I'd appreciate any input. Thanks!

Weak naming

Please consider using the protocol data units and not just naming every single thing a packet. Reorganizing everything according to their real units would bring clarity and would also avoid further confusion and mix ups.

More specific error types

Hey, thanks for your library :) I want to use the manual slicing methods but noticed that they all return the ReadError enum, which contains variants that cannot actually occur.

For example with Ethernet2HeaderSlice.from_slice the only error variant that can actually occur is UnexpectedEndOfSlice.

Would you accept a PR changing ReadError to something like the following?

pub enum ReadError {
    IoError(Error),
    UnexpectedEndOfSlice(UnexpectedEndOfSliceError),
    DoubleVlan(DoubleVlanError),
    IpUnsupportedVersion(u8),
    Ipv4(Ipv4Error),
    Ipv6(Ipv6Error),
    IpAuthenticationHeader(IpAuthenticationHeaderError),
    Tcp(TcpError),
}

Ethernet2HeaderSlice.from_slice would then return a Result<Ethernet2HeaderSlice<'a>, UnexpectedEndOfSliceError>
Ipv4HeaderSlice.from_slice would then return a Result<Ipv4HeaderSlice<'a>, Ipv4Error>
etc.

No good way to get transport layer protocol for ipv6

given:

pub enum InternetSlice<'a> {
    Ipv4(Ipv4HeaderSlice<'a>, Ipv4ExtensionsSlice<'a>),
    Ipv6(Ipv6HeaderSlice<'a>, Ipv6ExtensionsSlice<'a>),
}

transport layer protocol is in Ipv6HeaderSlice, unless there are extension headers, in which case one must iterate through extensions to get the last. The shortest I've managed is let (_, next_proto, _) = Ipv6Extensions::from_slice(v6.next_header(), v6_ext.slice()).

Breaking change, but I'm wondering if putting the extensionsslice back inside the headerslice would make sense. This would allow a method on ipv6headerslice to have access to extensionslice and do the calculations to return the correct result.

[Feature Request] TCP Stream Reassembly

Hello,

Firstly, thank you for the excellent library, it is clean and easy to use.

One thing I have been looking for is an equivalent of wiresharks 'tcp_dissect_pdus' (see section 9.4.2) that handles TCP packet reassembly.

This may be out of scope of this library, but I thought it wouldn't hurt to ask!

Correct payload length for packets with Ethernet FCS

Currently the library just assumes that the rest of the packet is always belonging to the lower layer protocol and ignores the lengths specified in IP & UDP. But this is not always the case, there are recordings where the Ethernet Layer also contains a FCS (Frame Check Sequence) which is located at the very end of the packet.

The packet slicing and decoding functions should automatically exclude the FCS from the payload.

Implement is_fragmenting_payload() for InternetSlice

Since both variants of InternetSlice implement is_fragmenting_payload(), it can also be trivially implemented for InternetSlice.

Setup build & test CI jobs for 16 bit target

There is currently no CI build for 16 bit platforms & there is also no test build. This should probably be setup.

Support partial default initialization with `IpAuthHeader` & `Ipv6RawExtHeader`

It would be nice if one could simply write:

let header = IpAuthHeader{
    next_header: 123,
    icv: [1,2,3,4].into(),
    ..Default::default()
};

let header = Ipv6RawExtHeader{
    next_header: 123,
    payload: [1,2,3,4,5,6].into()
};

But this is currently not possible as the payload buffers are implemented via some private fields. In both of structs cases these private fields contain a buffer for options as well as a length for how much of the buffer is filled.

We can get around this limitation by moving the buffers into their own types and making them public.

Add support for IPv6 Jumbograms

Add "Jumbo Payload Option" parsing and determine the payload size based on the option.

etherparse 0.12.0 should have been etherparse 0.11.1

The only changes in etherparse in 0.12.0 are new APIs, which are backwards-compatible changes.

Cargo follows semver, with the added rule that 0.y.z+1 is compatible with 0.y.z.

If the rules are followed, that means less bumps for downstream users, which is nice.

https://doc.rust-lang.org/cargo/reference/semver.html

This guide uses the terms "major" and "minor" assuming this relates to a "1.0.0" release or later. Initial development releases starting with "0.y.z" can treat changes in "y" as a major release, and "z" as a minor release. "0.0.z" releases are always major changes. This is because Cargo uses the convention that only changes in the left-most non-zero component are considered incompatible.

The "etherparse::ReadError" enum does not implement the "std::error::Error" trait, which does not seem to allow for using the "?" operator

Hello,

Thank you for your great project.

I would note that when writing a function like this:

use std::error::Error;
use etherparse::Ipv4Header;

extern crate etherparse;

fn some_parsing_function(input_payload: &[u8]) -> Result<Option<&[u8]>, Box<dyn Error>> {
    let (ip_header, inner_payload): (Ipv4Header, &[u8]) = Ipv4Header::read_from_slice(input_payload)?;
    Ok(None)
}

Will cause the compiler to raise the following error:

error[E0277]: the trait bound `etherparse::ReadError: std::error::Error` is not satisfied
  --> src/main.rs:16:116
   |
16 |             let (ip_header, inner_payload): (Ipv4Header, &[u8]) = Ipv4Header::read_from_slice(input_payload)?;
   |                                                                                                             ^ the trait `std::error::Error` is not implemented for `etherparse::ReadError`
   |
   = note: required because of the requirements on the impl of `std::convert::From<etherparse::ReadError>` for `std::boxed::Box<dyn std::error::Error>`
   = note: required by `std::convert::From::from`

It seems to be caused by the fact that etherparse::ReadError does not implement std::error::Error.

I think that it would be handy to be able to use the ? operator in this case (maybe that there is another way?).

Regards,

Thoughts on a PacketHeaders implementation without a &[u8] reference for payload?

I post this not necessarily as a bug or feature request, but for honest discussion. I'm not really a rust expert but I'm trying to learn and am curious if others have similar issues.

I commonly couple etherparse with the rust-pcap (https://docs.rs/pcap/latest/pcap/) library and write code similar to this:

fn main() {
    let mut cap = Device::lookup().unwrap().open().unwrap();
   
    let connection_tracker = ConnectionTracker::new(....); // custom TCP connection tracking lib
   
    while let Ok(packet) = cap.next_packet() {
        if let Ok(parsed) = ParsedHeaders::from_ethernet(packet.data) {
                  connection_tracker.process(parsed);
        }
    }
}

struct ConnectionTracker {
      cached_pkts: Vec<PacketHeaders>,
      // ....
}

impl ConnectionTracker {

     fn process(&mut self, headers: PacketHeaders) {
              if interesting(&headers) {
                     // cache the packet for later inspection
                     self.cached_pkts.push_back(headers);
              }
     }
}

... but of course this code doesn't work because PacketHeaders contains a payload member which is a &[u8] slice, which means that one has to declare the specific lifetime for anything that stores PacketHeaders (https://docs.rs/etherparse/0.13.0/etherparse/struct.PacketHeaders.html), which eventually forces ConnectionTracker to be in the 'static lifetime which is not what I want.

How do others work around this? I'd love tof this to be just a dumb thing that I don't get about Rust.

I'm currently hacking around this by declaring a new struct "SimplePacket" which is member for member the same as PacketHeaders, except that 'payload' is a Vec[u8] rather than a &[u8]. I understand that the &[u8] is for performance and prevents the code from needing a memcpy() under the covers, but I find it really hard to work around and was, ultimately - in a long winded way - if other people have the same issue, what they do, and if it's worth while considering either changing PacketHeaders to remove the &[u8] or maybe declaring a new (third!) packet holding struct simillar to the SimplePacket that I did in my code.

Feedback welcome and thank you again for the great library!