
I/O interface and utilities for CCSDS binary spacecraft data in Python. Library used in flight missions at NASA, NOAA, and SWRI

Home Page: https://ccsdspy.org

License: BSD 3-Clause "New" or "Revised" License

Topics: binary, ccsds, decoding, operations, packets, python, science, space

ccsdspy's Introduction

CCSDSPy - IO Interface for Reading CCSDS Data in Python.


This community-developed package provides a Python interface for reading tightly packed bits in the Consultative Committee for Space Data Systems (CCSDS) format used by many NASA and ESA missions. The library is developed with requirements sourced from the community and extensive automated testing.

Used By

[Mission logos: GOES-R, Europa Clipper, MMS, PACE, HERMES, CSA, PUNCH, SPHEREx, ELFIN, PADRE]

Do you know of other missions that use CCSDSPy? Let us know through a GitHub issue!

Installation

To install ccsdspy:

pip install ccsdspy

Usage Example

The following example shows how simple it is to read fixed-length CCSDS packets.

import ccsdspy
from ccsdspy import PacketField, PacketArray

pkt = ccsdspy.FixedLength([
     PacketField(name='SHCOARSE', data_type='uint', bit_length=32),
     PacketField(name='SHFINE',   data_type='uint', bit_length=20),
     PacketField(name='OPMODE',   data_type='uint', bit_length=3),
     PacketField(name='SPACER',   data_type='fill', bit_length=1),
     PacketField(name='VOLTAGE',  data_type='int',  bit_length=8),
     PacketArray(
         name='SENSOR_GRID',
         data_type='uint',
         bit_length=16,
         array_shape=(32, 32),
         array_order='C'
     ),
])

result = pkt.load('mypackets.bin')

Documentation

Our documentation is hosted on Read the Docs at https://docs.ccsdspy.org.

Getting Help

For more information or to ask questions about the library or CCSDS data in general, check out the CCSDSPy Discussion Board hosted through GitHub.

Acknowledging or Citing ccsdspy

If you use ccsdspy, it would be appreciated if you let us know and mention it in your publications. The code can be cited using the DOI provided by Zenodo. The continued growth and development of this package is dependent on the community being aware of it.

Code of Conduct

When interacting with this package, please behave consistently with our Code of Conduct.


ccsdspy's Issues

Byte ordering specified in PacketField not taken into account for float datatypes

I am finding crazy values for float fields among sane int and uint values. Changing byte_order for float datatypes did not resolve it. Digging in, it turns out that the default float dtype created in https://github.com/ddasilva/ccsdspy/blob/f3d9d9f45c950df0b8d0aa610396b845c27d1832/ccsdspy/decode.py#L63 defaults to the system byte ordering, probably little-endian these days, and will not correctly parse big-endian data.

I'll open a resolving PR in a second.
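The underlying behavior is easy to demonstrate with plain numpy, independent of ccsdspy: the same four bytes decode correctly only when the dtype's byte order matches the data.

```python
import struct

import numpy as np

# Four bytes encoding 1.0 as a big-endian 32-bit float, as CCSDS data would
raw = struct.pack(">f", 1.0)

# Explicit big-endian dtype: parses correctly
big = np.frombuffer(raw, dtype=">f4")[0]

# Little-endian dtype (the typical system default): nonsense value
little = np.frombuffer(raw, dtype="<f4")[0]

print(big)     # 1.0
print(little)  # a tiny denormal-like value, nothing close to 1.0
```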

Metadata incorrect when trying to install from pip. Instead, pip installs version 0.0.13 and won't install 1.1.1.

root@137a5b971b25:/CoDICE# pip install ccsdspy==1.1.1
Collecting ccsdspy==1.1.1
  Using cached ccsdspy-1.1.1.tar.gz (4.7 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  WARNING: Generating metadata for package ccsdspy produced metadata for project name unknown. Fix your #egg=ccsdspy fragments.
Discarding https://files.pythonhosted.org/packages/0b/32/256d1c20ef299cc449b5b669c9fde531eee9eabaa251abc9ee851ea7a068/ccsdspy-1.1.1.tar.gz#sha256=9682d22101cfc88d46cc2d8a04e25dc29cc5d91a695ea2e640d4764deb9e0f82 (from https://pypi.org/simple/ccsdspy/) (requires-python:>=3.6): Requested unknown from https://files.pythonhosted.org/packages/0b/32/256d1c20ef299cc449b5b669c9fde531eee9eabaa251abc9ee851ea7a068/ccsdspy-1.1.1.tar.gz#sha256=9682d22101cfc88d46cc2d8a04e25dc29cc5d91a695ea2e640d4764deb9e0f82 has inconsistent name: filename has 'ccsdspy', but metadata has 'unknown'
ERROR: Could not find a version that satisfies the requirement ccsdspy==1.1.1 (from versions: 0.0.12_fixed, 0.0.7, 0.0.8, 0.0.9, 0.0.10, 0.0.13, 1.0.0, 1.1.0.post1, 1.1.1)
ERROR: No matching distribution found for ccsdspy==1.1.1

Any of the 1.x versions behave the same way; pip falls back to installing 0.0.13.

Add the ability to define an array variable in a PacketField

Packets may include arrays which could be returned as a numpy array. For example, an image array may consist of 128 8-bit elements. Currently the field would have to be defined as

PacketField(name='IMAGE', data_type='uint', bit_length=1024)

and the user would have to further parse this field into an array.

PacketArray(name='IMAGE', shape=128, data_type='uint', bit_length=8)

which would return a numpy array of 128 uint8 values,

and maybe even allow for 2d arrays

PacketArray(name='IMAGE', shape=(8, 8), data_type='uint', bit_length=8)

which would return an (8, 8) numpy array of uint8 values.

Given how easy it is to reshape a numpy array maybe it is not worth supporting multidimensional arrays.
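The reshape point at the end is easy to verify; going from a flat byte buffer to a 2-D array is a one-liner in numpy:

```python
import numpy as np

raw = bytes(range(128))             # stand-in for a parsed 1024-bit field
flat = np.frombuffer(raw, dtype=np.uint8)
img = flat.reshape(8, 16)           # 1-D -> 2-D is trivial

print(img.shape)   # (8, 16)
```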

Allow optional secondary header fields based on secondary header flag

This tends to not happen very often in practice, but it's part of the spec and we should support it. PacketFields should have a keyword argument secondary_header=True (defaults to False). When this keyword argument is true, they are only parsed into the packet body when the secondary header flag is 1.

The array in the return dictionary should be made a MaskedArray in this case. It only makes sense to do this for VariableLength, because the FixedLength decoder relies on fixed spacing between the packets.

Provide access to data in the primary ccsds header

The class ccsdspy.FixedLength provides load(), which returns a dictionary with the data from the packet fields but does not provide any info from the primary CCSDS header.

Values which would be very helpful would be

  • the APID
  • an array of the packet sequence counts, which would help ensure that no packets were missed

My preference would be to provide all of the CCSDS header values.

Consider providing a way to cite this package in the scientific literature

Scientific packages (e.g. sunpy, scipy) provide a way to cite their use through a DOI. There are many ways to get a DOI associated with the package:

  • Zenodo - allows you to get a DOI without writing a paper; sunpy does this for every release.
  • Journal of Open Source Software (JOSS) - allows you to get a DOI by submitting a short narrative paper along with the package. Your code is actually reviewed before being accepted.
  • Submitting a paper to an actual journal. Many journals will now accept papers about software tools.

Of course, you do not have to do just one. SunPy has actually done all of them! See this page.

Provide version number

This is required for operational code to be able to pin the required version to a stable release.

Feature Request: load from byte array for live parsing

It seems like CCSDSpy would be viable for live plotting of telemetry if I could pass in a byte string/array of chunks of mixed binary data and process a second's worth of data at a time.

Absolutely reasonable to say this is out of scope, but it feels close to being viable with the existing structure. I see that _load wants a numpy array of bytes and I wish I could just dump that array in directly rather than saving to file first. For my particular data stream, I can guarantee even splits between packets so hopefully it'd be relatively clean.

In the longer term, allowing such a thing might benefit from a little bit of overhead to handle the case of missing bytes or extra bytes (if a packet is split across a chunk boundary). Both cases would preferably (to me) still return the successful packets up to that point and then return the incomplete packets or extra bytes for handling at a higher level.
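As a sketch of why this feels close to viable: the CCSDS primary header already carries enough information to split a byte string into packets without touching a file. The helper below is illustrative only, not part of ccsdspy's API; it assumes well-formed packets and returns any trailing partial packet separately, as suggested above.

```python
def split_packets(stream: bytes):
    """Split a byte string into complete CCSDS packets plus leftover bytes.

    The primary header is 6 bytes; bytes 4-5 hold the packet data length
    minus one, so each packet spans 6 + (length_field + 1) bytes.
    """
    packets, i = [], 0
    while i + 6 <= len(stream):
        data_len = int.from_bytes(stream[i + 4:i + 6], "big") + 1
        end = i + 6 + data_len
        if end > len(stream):
            break  # partial packet at the end of the chunk
        packets.append(stream[i:end])
        i = end
    return packets, stream[i:]

# Two fake 10-byte packets (4 data bytes each) plus a partial header
pkt = b"\x00\x01\xc0\x00\x00\x03ABCD"
packets, leftover = split_packets(pkt + pkt + b"\x00\x01")
print(len(packets), leftover)   # 2 b'\x00\x01'
```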

Add post-processing steps to aide in calibration and time conversion

It would be very useful to implement post-processing steps, including time conversion and things like converting digital values to analog using known calibration curves. For instance, sometimes temperature will be sent down from a spacecraft as a 0-255 digital value, and you have some linear function y=mx+b from pre-launch engineering that converts the digital value to degrees Celsius. Another common use case is replacing integer values with string values, such as replacing 0 with "DISABLED", 1 with "ENABLED", and 2 with "STANDBY".

A 48-bit time field split between a 32-bit coarse time and a 16-bit fine time could be handled like this. The result in the dictionary returned by pkt.load() would be an array of datetimes.

     rtime42_conv = RTime42Converter(
         coarse_length=32,
         fine_length=16,
         reference_time=datetime(1970, 1, 1),
         coarse_units='s',
         fine_units='ms'
      )

     pkt = FixedLength([
         ....
         PacketField(name="packet_time", bit_length=48, data_type='uint', converter=rtime42_conv)
         ....
     ])

Calibration curves could be handled like this (having both PolyConverter and LinearConverter is redundant, but reads better).

     pkt = FixedLength([
         ....
         PacketField(
              name="temperature", bit_length=8, data_type='uint',
              converter=LinearConverter(slope=1.2, intercept=-30)
         ),
         PacketField(
              name="current", bit_length=8, data_type='uint',
              converter=PolyConverter(coeffs=[0.1, 1.2, 0.3])
         ),
         ....
     ])
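A minimal pure-Python sketch of what PolyConverter and LinearConverter could look like (names taken from the proposal above, not from an existing API):

```python
import numpy as np

class PolyConverter:
    """Apply a polynomial y = c[0]*x**n + ... + c[n] to raw values."""
    def __init__(self, coeffs):
        self.coeffs = coeffs

    def convert(self, raw):
        return np.polyval(self.coeffs, np.asarray(raw))

class LinearConverter(PolyConverter):
    """y = slope*x + intercept; redundant with PolyConverter, but reads better."""
    def __init__(self, slope, intercept):
        super().__init__([slope, intercept])

# Digital counts 0-255 converted to degrees Celsius
temps = LinearConverter(slope=1.2, intercept=-30).convert([0, 25, 255])
print(temps)   # approximately [-30., 0., 276.]
```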

Add coverage checking

It's a good idea to monitor and track test coverage to make sure that coverage is not decreasing over time. There are some free tools like Codecov that integrate with GitHub Actions and provide a coverage badge as well as coverage feedback on each merge request. See this README for an example of the badge and this merge request to see an example of Codecov reports.

Add the ability to create simulated packets

It is often very helpful to be able to create simulated packets especially in the early phases of development or for testing. The packet types FixedLength and VariableLength could be expanded to include a function (to_file?) to output binary data if given arrays of data to fill in the packet fields.

to_file(data: dict)

where the dict has the packet field names as keys and numpy arrays as the values. The data dict should mirror the data dict that is produced when parsing a file.
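A toy sketch of the serialization direction, assuming byte-aligned fields only and ignoring CCSDS headers (a real to_file would need to emit those too):

```python
import struct

def to_bytes(data, formats):
    """Pack a dict of equal-length columns into big-endian records.

    `formats` maps field name -> struct format character; insertion
    order defines the field order within each record.
    """
    fmt = ">" + "".join(formats.values())
    n_packets = len(next(iter(data.values())))
    return b"".join(
        struct.pack(fmt, *(data[name][i] for name in formats))
        for i in range(n_packets)
    )

# Two packets, each with a 4-byte uint and a 1-byte int
blob = to_bytes({"SHCOARSE": [1, 2], "VOLTAGE": [-3, 4]},
                {"SHCOARSE": "I", "VOLTAGE": "b"})
print(len(blob))   # 2 records x 5 bytes = 10
```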

Buffered fields in packets

We have packets with this structure:

   field_A  complete  data       field_C
1  1        yes       ertergdfg  3
2  2        no        aaaaaa     4
3  4        no        bbbbb      4
4  5        yes       cccccc     2

The complete field tells if the data field is complete yet or not.

So if complete==no that means the data value to be considered needs to be concatenated with the next packet data field value, as long as the complete value is still no.

So in this case, the data value we want for packet 2 is aaaaaabbbbbcccccc.

It is important to get the values from multiple packets together as specified here because we want to apply a decompression algorithm on top of this value (we were thinking of the CCSDSpy converters) and the algorithm only works on the full data stream, in this example the value aaaaaabbbbbcccccc.

In our case, the values for field_A and field_C of packets 3 or 4 do not matter and can be ignored, but I don't know if this would be true in similar use cases.

Do you know how this could be implemented in CCSDSpy? Maybe the converter could be an object instance which is persisted for the full CCSDS packet stream, so that we can buffer the data field value until the field is complete and decompression can be applied to it? From the documentation at https://docs.ccsdspy.org/en/latest/user-guide/converters.html it sounds like that is the case.
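A stateful converter along those lines is straightforward to sketch in plain Python (illustrative only; whether ccsdspy's converter interface persists instances across packets this way would need to be confirmed):

```python
class BufferingConverter:
    """Accumulate `data` values across packets until complete == 'yes'."""
    def __init__(self):
        self._buffer = []
        self.results = []

    def feed(self, complete, data):
        self._buffer.append(data)
        if complete == "yes":
            # Full stream assembled: decompression could be applied here
            self.results.append("".join(self._buffer))
            self._buffer = []

conv = BufferingConverter()
for complete, data in [("yes", "ertergdfg"), ("no", "aaaaaa"),
                       ("no", "bbbbb"), ("yes", "cccccc")]:
    conv.feed(complete, data)
print(conv.results[1])   # aaaaaabbbbbcccccc
```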

Collaboration question

Hi Daniel,

I am using a ticket as an open letter to you.

I work at JPL, currently contributing to the Europa-Clipper Science Data System.

We inherited a CCSDS parsing Python library from a student on the team, and since we were very happy with it and were making extensions to it (such as reading variable-length packets), we were planning to open-source it.

But I realized you already have a successful development which does what we need minus maybe a few features, so we changed our strategy to:

  • migrate the CCSDS parsing functions of our system to CCSDSpy
  • create pull request on your repository to add the possibly missing features

We would like to hire a student to do this work this summer (June-August). Would you support this initiative?

We don't have the details yet on the open-sourcing procedure on our side, but that might involve moving your repository from a personal account to something more official (e.g. http://github.com/NASA-PDS/). Maybe you already have an opinion on that?

Thanks,

Add converters to convert segments of bytes to strings in binary, oct, or hex representation

This ticket is to support converters that turn segments of bytes into strings in binary, oct, or hex representation. This came out of conversations with @tloubrieu-jpl and @nischayn99 about something that they had in their legacy code for parsing CCSDS packets on Europa Clipper, that would be nice to have built in.

I think the most basic way this could be done would be as follows:

from ccsdspy import FixedLength, PacketArray, converters

pkt = FixedLength([
    PacketArray(
        name='some_bytes',
        data_type='uint',
        bit_length=8,
        array_shape=100,
    ),
])
pkt.add_converted_field(
    "some_bytes", 
    "some_bytes_as_string", 
    converters.StringifyBytes(format="oct")  # or "hex", "bin"
)
result = pkt.load('MyCCSDS.tlm')

print(result['some_bytes_as_string'])

Which would print an array of strings in oct format.
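The conversion itself is a one-liner per format in plain Python, e.g.:

```python
data = bytes([8, 255, 16])   # stand-in for the parsed 'some_bytes' array

as_oct = [format(b, "03o") for b in data]
as_hex = [format(b, "02x") for b in data]
as_bin = [format(b, "08b") for b in data]

print(as_oct)   # ['010', '377', '020']
print(as_hex)   # ['08', 'ff', '10']
```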

Rename method names `from_file`, `load`, `to_file`

Currently, the method names are

  • load(filename) - to parse a binary file
  • from_file(filename) - to create a packet definition from a file (only csv right now)
  • to_file(filename) - to create a binary file with synthetic data (coming soon!)

I feel like all of these function names are unclear and require someone to read the documentation to figure out what they do, which should not be the case.

The purpose of this issue is to discuss whether it is worthwhile to change these names and what the new names might be.

Raise a warning if sequence count is non-consecutive

To let the user know if the packet sequence count in the primary header skips a number, which would imply that a packet was lost. The sequence count is kept per APID, so each APID would have to be checked individually.
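Since the sequence count is a 14-bit counter that wraps at 16384, the check has to be modular; a sketch of the per-APID gap detection:

```python
def find_gaps(seq_counts, mod=2**14):
    """Return indices where the sequence count skipped, implying packet loss."""
    gaps = []
    for i in range(1, len(seq_counts)):
        expected = (seq_counts[i - 1] + 1) % mod
        if seq_counts[i] != expected:
            gaps.append(i)
    return gaps

# Wrap-around at 16383 -> 0 is fine; the jump from 0 to 2 is a gap
print(find_gaps([16382, 16383, 0, 2]))   # [3]
```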

Odd numbers of packets in a file cannot be loaded when bit_length of array item is 12

There is some very strange behavior going on when a Packet Array field has 12 bits and the file being opened has an odd number of packets in it.

Ends up with the following value error. All but the last field seem to be parsed, but it gets stuck on the last one (not sure that the earlier ones are actually correct, though...).

Traceback (most recent call last):
  File "/Users/scott/x/x/fake_twelvebit_data.py", line 29, in <module>
    pkt.load('fake_twelve_single.ccsds')
  File "/Users/scott/x/x/.venv/lib/python3.11/site-packages/ccsdspy/packet_types.py", line 169, in load
    packet_arrays = _load(
                    ^^^^^^
  File "/Users/scott/x/x/.venv/lib/python3.11/site-packages/ccsdspy/packet_types.py", line 622, in _load
    field_arrays = _decode_fixed_length(file_bytes, fields)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/scott/x/x/.venv/lib/python3.11/site-packages/ccsdspy/decode.py", line 207, in _decode_fixed_length
    arr[i :: meta.nbytes_final] = file_bytes[
    ~~~^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not broadcast input array from shape (0,) into shape (1,)

Unfortunately this is beyond my debugging prowess, I really couldn't follow the logic of that section of code to find the (presumed) off-by-one error. I did generate a quick script to replicate the issue and a similar script with 8 bit values instead of 12 that does not replicate the issue. Both scripts can be run directly or as faux-notebooks in VSCode. (EDIT: Behavior will replicate with either 1.1.1 or main)

Example script for twelve bit array

# %%
import ccsdspy
from ccsdspy import PacketField, PacketArray
import io

# %%
pkt = ccsdspy.FixedLength([
    PacketArray(name='twelve', data_type='int', bit_length=12, array_shape=(8, 1), array_order='C')
])
# %% One packet does not work
fakepkt = io.BytesIO(b"\x00\x01\xC0\x00\x00\x0B\x00\x00\x01\x00\x20\x03\x00\x40\x05\x00\x60\x07")
print(f'Fake twelve single file has {fakepkt.getbuffer().nbytes} bytes')
with open('fake_twelve_single.ccsds', 'wb') as file:
    file.write(fakepkt.getbuffer())

# %% Two packets works fine
fakepkts_even = io.BytesIO(b"\x00\x01\xC0\x01\x00\x0B\x00\x00\x01\x00\x20\x03\x00\x40\x05\x00\x60\x07\x00\x01\xC0\x02\x00\x0B\x00\x00\x01\x00\x20\x03\x00\x40\x05\x00\x60\x07")
print(f'Fake twelve even file has {fakepkts_even.getbuffer().nbytes} bytes')
with open('fake_twelve_even.ccsds', 'wb') as file:
    file.write(fakepkts_even.getbuffer())

# %% Three packets does not work
fakepkts_odd = io.BytesIO(b"\x00\x01\xC0\x01\x00\x0B\x00\x00\x01\x00\x20\x03\x00\x40\x05\x00\x60\x07\x00\x01\xC0\x02\x00\x0B\x00\x00\x01\x00\x20\x03\x00\x40\x05\x00\x60\x07\x00\x01\xC0\x03\x00\x0B\x00\x00\x01\x00\x20\x03\x00\x40\x05\x00\x60\x07")
print(f'Fake twelve odd file has {fakepkts_odd.getbuffer().nbytes} bytes')
with open('fake_twelve_odd.ccsds', 'wb') as file:
    file.write(fakepkts_odd.getbuffer())
# %%
print(f'Attempting to load just one packet from file')
pkt.load('fake_twelve_single.ccsds')
# %%
print(f'Attempting to load two packets from file')
pkt.load('fake_twelve_even.ccsds')
# %%
print(f'Attempting to load three packets from file')
pkt.load('fake_twelve_odd.ccsds')

Equivalent working code for 8 bit array

# %%
import ccsdspy
from ccsdspy import PacketField, PacketArray
import io

# %%
pkt = ccsdspy.FixedLength([
    PacketArray(name='eight', data_type='int', bit_length=8, array_shape=(8, 1), array_order='C')
])
# %% One packet does not work
fakepkt = io.BytesIO(b"\x00\x02\xC0\x00\x00\x07\x00\x01\x02\x03\x04\x05\x06\x07")
print(f'Fake eight single file has {fakepkt.getbuffer().nbytes} bytes')
with open('fake_eight_single.ccsds', 'wb') as file:
    file.write(fakepkt.getbuffer())

# %% Two packets works fine
fakepkts_even = io.BytesIO(b"\x00\x02\xC0\x00\x00\x07\x00\x01\x02\x03\x04\x05\x06\x07\x00\x02\xC0\x01\x00\x07\x00\x01\x02\x03\x04\x05\x06\x07")
print(f'Fake eight even file has {fakepkts_even.getbuffer().nbytes} bytes')
with open('fake_eight_even.ccsds', 'wb') as file:
    file.write(fakepkts_even.getbuffer())

# %% Three packets does not work
fakepkts_odd = io.BytesIO(b"\x00\x02\xC0\x00\x00\x07\x00\x01\x02\x03\x04\x05\x06\x07\x00\x02\xC0\x01\x00\x07\x00\x01\x02\x03\x04\x05\x06\x07\x00\x02\xC0\x02\x00\x07\x00\x01\x02\x03\x04\x05\x06\x07")
print(f'Fake eight odd file has {fakepkts_odd.getbuffer().nbytes} bytes')
with open('fake_eight_odd.ccsds', 'wb') as file:
    file.write(fakepkts_odd.getbuffer())
# %%
print(f'Attempting to load just one packet from file')
pkt.load('fake_eight_single.ccsds')
# %%
print(f'Attempting to load two packets from file')
pkt.load('fake_eight_even.ccsds')
# %%
print(f'Attempting to load three packets from file')
pkt.load('fake_eight_odd.ccsds')

Add the ability to parse variable length packets

Many CCSDS packets are not fixed length. The easiest way to implement this capability would be to add a new PacketField without a length specified. If only one PacketField is defined that way, it would be possible to parse the rest of the packet and then fill the remainder of the packet values into this variable-length PacketField.

Remove dependency on astropy?

Awesome library!

My only critique would be the dependency on astropy for the Table class. Would you consider a change to remove astropy as a dependency? The only material place astropy appears to be used is in ccsdspy/interface.py#L108 to wrap the result. I think it would be a little more flexible to let the user import astropy.table.Table and wrap the result themselves if they want it wrapped. Astropy is a pretty big dependency to require just for this.

https://github.com/ddasilva/ccsdspy/blob/7eb6226aa30ca84368c08014466d09436343ce47/ccsdspy/interface.py#L108

Add the ability to define array items using `from_file`

I would suggest the following convention

name, bit_length, data_type
ARRAY, 8, uint(16)

So uint(shape).
This matches the programmatic way to define it and does not add too much complexity to the CSV file.

PacketArray(name='SENSOR_GRID', data_type='uint', bit_length=16,
            array_shape=(32, 32), array_order='C')

Integer parsing mismatch

Hi @ddasilva ,

For this packet field,

ccsdspy.PacketField(f"FGx_CH{c}_{i}", bit_length=24, data_type='int')

When the expected value is positive, we're getting the correct result, but when it is negative, we're getting erroneous values. For example, when the expected value is -88, the value we're getting is 16777128. Could you please let us know what we could do?

Thanks,
Nischay Nagendra
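For context, 16777128 is exactly 2**24 - 88: the 24-bit two's-complement pattern for -88 read back as unsigned. The expected sign extension looks like this (a generic helper for illustration, not ccsdspy code):

```python
def sign_extend(value, bits):
    """Reinterpret an unsigned `bits`-wide value as two's complement."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

print(sign_extend(16777128, 24))   # -88
```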

`PacketArray` represented as `PacketField`

Issue

I noticed when doing some testing that when I created a PacketArray and then printed it, the string representation said PacketField instead of PacketArray. I don't think this was intended, but feel free to close the issue if it was.

Relevant area of code:

if values["cls_name"] == "PacketArray":

Simple example:

import ccsdspy
from ccsdspy import PacketField, PacketArray

pkt = [
  PacketField(
       name='SHCOARSE',
       data_type='uint',
       bit_length=32
  ),
  PacketArray(
       name="data",
       data_type="uint",
       bit_length=16,
       array_shape="expand",   # makes the data field expand
  ),
  PacketField(
       name="checksum",
       data_type="uint",
       bit_length=16
  ),
]
print(pkt[1])

the output is:
PacketField(name='data', data_type='uint', bit_length=16, bit_offset=None, byte_order='big')
instead of PacketArray(name='data', data_type='uint', bit_length=16, bit_offset=None, byte_order='big')

Make the load method of FixedLength and VariableLength return the file stream argument at the location where it was before the call

That would be easy to implement in the _load function.

It is needed when the same stream is parsed multiple times; in my case, to 1. parse a decision packet (when different packet definitions can be used with the same APID), and 2. calculate a CRC (see discussion #108 (reply in thread)).

That could be optional with an argument 'return_file_where_it_was' defaulting to False to keep the code backward compatible?

Let me know your thoughts on that.
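A sketch of the requested behavior with a plain file-like object (`parse` below is a stand-in for load, just to show the tell/seek bookkeeping):

```python
import io

def parse_preserving_position(stream, parse):
    """Run `parse` on the rest of the stream, then restore the position."""
    pos = stream.tell()
    try:
        return parse(stream.read())
    finally:
        stream.seek(pos)

stream = io.BytesIO(b"headerPAYLOAD")
stream.seek(6)                       # caller's current position
result = parse_preserving_position(stream, lambda b: b.decode())
print(result, stream.tell())   # PAYLOAD 6
```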

iter_packet_bytes incrementally read from stream, blocking for wait

It would be nice to have iter_packet_bytes (or similar util function) that can just sit and listen at a socket for packets by grabbing small chunks of data at a time. It would be a reasonable assumption (in my opinion) to trust that the packets are all well-formed, but that there might be extra bytes at the end of a given chunk of data that belong to the next packet.
Originated from #83 (comment)

Support arrays whose length is determined by another field

This feature request comes from wanting to support a type of data I have been told some instruments generate, but currently isn't supported.

This change would update the VariableLength class to support variable-length fields where the length is determined by another field in the packet. For example, you could have a field data1_len which sets the length of the data1 array, followed by data2_len which sets the length of the data2 array.

An example is below:

   import ccsdspy
   from ccsdspy import PacketField, PacketArray

    pkt = ccsdspy.VariableLength([
         PacketField(
              name='SHCOARSE',
              data_type='uint',
              bit_length=32
         ),
         PacketField(
              name='data_len',
              data_type='uint',
              bit_length=8,
         ),	 
         PacketArray(
              name="data",
              data_type="uint",
              bit_length=16,
              array_shape="data_len",  # links data to data_len
         ),
         PacketField(
              name="checksum",
              data_type="uint",
              bit_length=16
         ),
    ])

Options to skip message headers between packets

For Europa-Clipper test files, our packets are separated by specific pieces of binary data.

See spreadsheet https://docs.google.com/spreadsheets/d/1YX-_zw9tdEwkYxdQ5IA8tClHnzwaNehGvE04JfFdY2A/edit#gid=0

We have 3 cases:

  1. standard packet sequence
  2. fixed length header
  3. packet start marker

I propose:

  • case 2: skip a fixed number of bits in between packets
  • case 3: specify the start marker as a callable taking n bytes of the stream as its parameter. The callable would be evaluated at every position in the stream, and a packet would start at the position after wherever the callable returns True (in our case the function works on 4 bytes and returns seq[1] == 0xF0 and seq[0] == seq[2] == seq[3] != 0x00)

I am attaching 2 test files for case 2 and 3:
case-2.bin.zip
case-3.bin.zip

Add Type Annotations

This issue is to add type annotations to the code, so that it can be used with static analysis checkers like MyPy.

Signed field not on byte boundaries uses too many bits to calculate sign

For example: the 12-bit signed integer 0x800 returns -30720, or 0x800 - 0x8000. It looks like all integer fields are fit into the nearest numpy size that holds the required bits, and sign extension is not working properly?

Example code:

# %%
import ccsdspy
from ccsdspy import PacketField, PacketArray
import io

# %%
pkt = ccsdspy.FixedLength([
    PacketField(name='uintfive', data_type='uint', bit_length=3),
    PacketField(name='negfive', data_type='int', bit_length=5),
    PacketField(name='postwelve', data_type='int', bit_length=12),
    PacketField(name='negtwelve', data_type='int', bit_length=12),
])
# %%
# 0b101, 0b11011, 0b000000001100, 0b111111110100
fakepkt = io.BytesIO(b"\x00\x01\xC0\x00\x00\x03\x5B\x00\xCF\xFA")
print(f'Fake off-byte fields file has {fakepkt.getbuffer().nbytes} bytes')
with open('fake_off_byte_fields.ccsds', 'wb') as file:
    file.write(fakepkt.getbuffer())

# %%
print(f'Attempting to load just one packet from file')
pkt.load('fake_off_byte_fields.ccsds')
# returns {'uintfive': array([2], dtype=uint8), 'negfive': array([-101], dtype=int8), 
#         'postwelve': array([12], dtype='>i2'), 'negtwelve': array([-28678], dtype='>i2')}
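For reference, correct two's-complement handling of a 12-bit field would map 0x800 to -2048, not -30720. The expected behavior, sketched without the 16-bit container getting involved (a hypothetical helper, not ccsdspy code):

```python
def to_signed(value, bits):
    """Two's-complement interpretation using only `bits` bits."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

print(to_signed(0x800, 12))   # -2048, not -30720
print(to_signed(0xFF4, 12))   # -12
```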

Support dynamic packet definitions based on header

This ticket is to support dynamic packet definitions that change their definitions per-packet based on the header. This came out of conversations with @tloubrieu-jpl and @nischayn99 about a particular packet they have on Europa Clipper which changes its definition based on a header field (which isn't the secondary header field, even).

I came up with two ways we can support this. I'm writing them here so others can discuss and weigh in.

Packet Factory

We can have a PacketFactory, with its own load() method, which allows you to provide a function reference that returns a packet definition based on the primary header. For each packet, the primary header would be parsed, then the provided function would be called, and the packet would be re-parsed with the definition provided by the function. An example of what this would look like is as follows.

def packet_factory_fun(headers):
    if headers[xxx] == yyy:
        return VariableLength([
            ....
        ])
    elif headers[xxx] == zzz:
        return VariableLength([
            ....                # different than before
        ])
    else:
        return VariableLength([
            ....                # even more different
        ])

pkt_fac = PacketFactory(packet_factory_fun)
result = pkt_fac.load('telemetry.bin')

If you wanted to parse more than the primary header and use that to decide on the rest of the packet, you could pass in an optional argument to provide that "decision" definition.

def packet_factory_fun(initial_results):
    # make decision based on result of parsing with initial_defs
    ...

initial_defs= VariableLength([
    # initial fields to parse
    ...
])

pkt_fac = PacketFactory(packet_factory_fun, initial_defs=initial_defs)
result = pkt_fac.load('telemetry.bin')

Define fields conditionally based on Expression

I don't think this is the best option because it adds a bit of complexity, but I will mention it here. We could add a keyword argument condition= that you could set to some very simple expression that would be evaluated for each packet to determine whether the field is included. For example, if the field is only included when the secondary header flag is set, you could add condition="CCSDS_SECONDARY_FLAG==1".

I don't think this is the best because adding the ability to parse these expressions is going to add a lot of complexity. We could also just replace the string expression with a callable (probably a lambda) and do something like condition=(lambda headers: headers['CCSDS_SECONDARY_FLAG'] == 1), but I also think that doing that multiple times might be messier than using a factory.

pkt = ccsdspy.FixedLength([
     PacketField(name='SHCOARSE', data_type='uint', bit_length=32, condition="CCSDS_SECONDARY_FLAG==1"),
     PacketField(name='SHFINE',   data_type='uint', bit_length=20, condition="CCSDS_SECONDARY_FLAG==1"),
     PacketField(name='OPMODE',   data_type='uint', bit_length=3),
     PacketField(name='SPACER',   data_type='fill', bit_length=1),
     PacketField(name='VOLTAGE',  data_type='int',  bit_length=8),
     PacketArray(
         name='SENSOR_GRID',
         data_type='uint',
         bit_length=16,
         array_shape=(32, 32),
         array_order='C'
     ),
])

Add support for loading variable length packets from CSV

When completing #114, I noticed that you cannot load variable length packets from a CSV. There is no way to do the reference linking or to specify an expanding field. PUNCH would like to load variable length packets from CSV. I'm happy to work on this, but I'd like guidance on how you want this structured. At the moment, the CSV loading is in the _BasePacket class; this change would make the loading different for fixed and variable length.

Return extra bytes from iterate and/or load

Related to earlier issue: #83 (comment)

Would be handy to be able to parse mostly well-formed but unfinished streams of packets and report or return the extra bytes at the end. This does not need to handle broken packets in the middle of the stream, just partial packets at the current end (where the last packet may or may not even have a length value).
