
det-lab / datareaderwriter


Example implementations of the DFDL and kaitai data-description languages

License: MIT License

C++ 42.56% Makefile 0.42% Dockerfile 1.47% CMake 10.52% Kaitai Struct 43.43% Jupyter Notebook 1.60%

datareaderwriter's Introduction

Binary Data Reader

A library for reading and writing binary physics data into a structured format.


One of the issues currently facing the physics community is the highly variable nature of experimental data formats. Analysis software is often tightly coupled to a particular binary data format, and many unique formats exist, whether because of experimental design or as a byproduct of the hardware available to researchers. As a result, a great analysis tool may be useless to you if your data is in a format it does not support. Removing the need to integrate each unique data format into every piece of analysis code is therefore of great value.

This repository seeks to demonstrate and evaluate the use of existing tools to declaratively define the structure of binary data, in an effort to streamline user interaction with raw binary data.

Interfacing with data

Hand-Writing Code

Writing code by hand to interact with your binary data files is always an option. It is a great way to get a better feel for your data's structure and for how you might want to interact with it. Some simple example code can be found in the handwritten directory.

However, it's worth being aware that this can very quickly become a tedious and titanic effort. Luckily for us, there are ways we can make computers write the code for us!

Side note: If you're working with hand-written code to interface with the animal_raw data, you may experience some weirdness with the standard output (in particular, random additional characters may be rendered along with the string "cat").
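To give a feel for what hand-written reading code involves, here is a minimal Python sketch. The record layout (a little-endian uint32 event id, a float64 energy, and an 8-byte null-padded ASCII label) and the names RECORD, Event, and read_events are assumptions made up for illustration; they are not the format or the code found in the handwritten directory.

```python
import struct
from typing import List, NamedTuple

# Hypothetical record layout (an assumption, not the repository's format):
# little-endian uint32 event id, float64 energy, 8-byte null-padded ASCII label.
RECORD = struct.Struct("<Id8s")


class Event(NamedTuple):
    event_id: int
    energy: float
    label: str


def read_events(raw: bytes) -> List[Event]:
    """Decode a buffer made of back-to-back fixed-size records."""
    events = []
    for event_id, energy, label in RECORD.iter_unpack(raw):
        # Fixed-width string fields carry padding bytes; strip them before decoding.
        text = label.rstrip(b"\x00").decode("ascii")
        events.append(Event(event_id, energy, text))
    return events


# Build two records in memory instead of reading a real data file.
sample = RECORD.pack(1, 2.5, b"cat") + RECORD.pack(2, 7.25, b"dog")
events = read_events(sample)
```

Every field width, byte order, and padding rule is written out by hand here, and the same decisions have to be repeated in every program that reads the format.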

Kaitai Struct

One code-generating option is Kaitai Struct, which uses a YAML-based format to declare the structure of a binary data format. From that declaration, Kaitai generates code (in your language of choice) for reading a raw data file, and the generated code can be included directly as a library in another program.

See the kaitai directory for more information about Kaitai Struct, as well as some example code using the Kaitai Struct Compiler.

DFDL

DFDL takes a very different approach: rather than generating code that the user must then incorporate, a DFDL processor such as Daffodil parses the data directly. After you declare your format, the processor parses the raw file and produces a new XML or JSON file. This file contains all the information in the raw file, now structured to be easily accessible. Nearly all programming languages have some type of XML or JSON parsing library, which simplifies the process of accessing the relevant data.
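As a rough illustration of that last step, the sketch below reads a small XML document with Python's standard library. The element names (events, event, id, energy, label) and the sample values are hypothetical; the real output depends on your DFDL schema and may carry namespace prefixes that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical parser output (an assumption, not actual output from this repository).
INFOSET = """\
<events>
  <event><id>1</id><energy>2.5</energy><label>cat</label></event>
  <event><id>2</id><energy>7.25</energy><label>dog</label></event>
</events>
"""

root = ET.fromstring(INFOSET)

# Turn each <event> element into a plain dictionary with typed values.
events = [
    {
        "id": int(event.findtext("id")),
        "energy": float(event.findtext("energy")),
        "label": event.findtext("label"),
    }
    for event in root.iter("event")
]
```

The analysis code never touches the binary layout here; it only needs to know the element names that the schema assigns.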

Click here for more information and a usage example on Daffodil.

Contributors

glass-ships, brettged, pibion, maramaraschino, zonca, jpivarski
