This project forked from trailofbits/graphtage

A semantic diff utility and library for tree-like files such as JSON, JSON5, XML, HTML, YAML, and CSV.

License: GNU Lesser General Public License v3.0

Graphtage

Graphtage is a command-line utility and underlying library for semantically comparing and merging tree-like structures, such as JSON, XML, HTML, YAML, plist, and CSS files. Its name is a portmanteau of “graph” and “graftage”—the latter being the horticultural practice of joining two trees together such that they grow as one.

Installation

$ pip3 install graphtage

Command Line Usage

Output Formatting

Graphtage analyzes an intermediate representation of the trees that is independent of the file types of the input files. This means, for example, that you can diff a JSON file against a YAML file. The output format can also differ from the input format(s). By default, Graphtage formats the output diff in the same file format as the first input file, but you could, for example, diff two JSON files and format the output in YAML. Several command-line arguments control these transformations; check the --help output for more information.
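To make the idea of a format-independent comparison concrete, here is a minimal sketch using only the Python standard library. It does not use Graphtage and is not its algorithm: it parses a JSON document and a plist document into plain Python objects and then compares them position by position. The function name changed_paths and the sample documents are hypothetical, chosen only for illustration.

```python
import json
import plistlib

# Two documents in different formats that describe almost the same tree.
JSON_TEXT = '{"foo": [1, 2, 3], "bar": "baz"}'
PLIST_BYTES = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>foo</key>
    <array><integer>1</integer><integer>2</integer><integer>4</integer></array>
    <key>bar</key>
    <string>baz</string>
</dict>
</plist>"""


def changed_paths(a, b, path=""):
    """Return the paths at which two parsed trees differ.

    This is a naive positional comparison (an assumption made for this
    sketch); it does not try to find an optimal matching.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        out = []
        for key in sorted(set(a) | set(b)):
            child = f"{path}/{key}"
            if key not in a or key not in b:
                out.append(child)  # key exists on one side only
            else:
                out.extend(changed_paths(a[key], b[key], child))
        return out
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        out = []
        for index, (x, y) in enumerate(zip(a, b)):
            out.extend(changed_paths(x, y, f"{path}/{index}"))
        return out
    return [] if a == b else [path or "/"]


# Once parsed, the original file formats no longer matter.
left = json.loads(JSON_TEXT)
right = plistlib.loads(PLIST_BYTES)
differences = changed_paths(left, right)
```

Because both inputs are reduced to the same kind of in-memory tree before comparison, the comparison code never needs to know which format each one came from.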

By default, Graphtage pretty-prints its output with as many line breaks and indents as possible.

{
    "foo": [
        1,
        2,
        3
    ],
    "bar": "baz"
}

Use the --join-lists or -jl option to suppress linebreaks after list items:

{
    "foo": [1, 2, 3],
    "bar": "baz"
}

Likewise, use the --join-dict-items or -jd option to suppress linebreaks after key/value pairs in a dict:

{"foo": [
    1,
    2,
    3
], "bar": "baz"}

Use --condensed or -j to apply both of these options:

{"foo": [1, 2, 3], "bar": "baz"}

The --only-edits or -e option will print out a list of edits rather than applying them to the input file in place.
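The effect of the list and dict joining options can be modelled with a small standard-library sketch. This is not Graphtage's formatter; the render function and its join_lists and join_dict_items parameters are hypothetical names that mirror the command-line options described above, and the indentation rules are inferred from the four sample outputs.

```python
import json


def render(value, join_lists=False, join_dict_items=False, indent=0):
    """Format a JSON-like value, optionally suppressing line breaks.

    Hypothetical sketch: join_lists mimics --join-lists and
    join_dict_items mimics --join-dict-items.
    """
    pad = "    " * indent
    inner = "    " * (indent + 1)
    if isinstance(value, list):
        if not value:
            return "[]"
        items = [render(v, join_lists, join_dict_items, indent + 1) for v in value]
        if join_lists:
            return "[" + ", ".join(items) + "]"
        return "[\n" + ",\n".join(inner + item for item in items) + "\n" + pad + "]"
    if isinstance(value, dict):
        if not value:
            return "{}"
        if join_dict_items:
            # Joined dict items stay on the current line, so nested values
            # are rendered at the same indentation level as the dict.
            items = [
                json.dumps(k) + ": " + render(v, join_lists, join_dict_items, indent)
                for k, v in value.items()
            ]
            return "{" + ", ".join(items) + "}"
        items = [
            json.dumps(k) + ": " + render(v, join_lists, join_dict_items, indent + 1)
            for k, v in value.items()
        ]
        return "{\n" + ",\n".join(inner + item for item in items) + "\n" + pad + "}"
    return json.dumps(value)  # scalars: numbers, strings, booleans, null


doc = {"foo": [1, 2, 3], "bar": "baz"}
condensed = render(doc, join_lists=True, join_dict_items=True)
```

Calling render(doc) with no options gives the fully expanded form, and enabling both options gives the single-line form shown for --condensed.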

Matching Options

By default, Graphtage tries to match all possible pairs of elements in a dictionary. While computationally tractable, this can sometimes be onerous for input files with huge dictionaries. The --no-key-edits or -k option will instead only attempt to match dictionary items that share the same key, drastically reducing computation. Likewise, the --no-list-edits or -l option will not consider interstitial insertions and removals when comparing two lists. The --no-list-edits-when-same-length or -ll option is a less drastic version of -l that will behave normally for lists that are of different lengths but behave like -l for lists that are of the same length.
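The trade-off behind these options can be illustrated with a rough standard-library sketch. It is a simplified model, not Graphtage's implementation: candidate_pairs counts the key pairings a matcher would have to consider with and without key edits, and the two list functions compare an edit distance that allows interstitial insertions and removals against a purely positional comparison. All function names are hypothetical, and the positional behaviour is an assumption about what ignoring interstitial edits means.

```python
from itertools import product


def candidate_pairs(left, right, key_edits=True):
    """List the dictionary key pairings a matcher would consider."""
    if key_edits:
        # Every key on the left may match any key on the right.
        return list(product(left, right))
    # Assumed model of --no-key-edits: only identical keys are paired.
    return [(key, key) for key in left if key in right]


def edit_distance(a, b):
    """Levenshtein distance: allows insertions and removals anywhere."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # remove x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # keep or replace
        prev = cur
    return prev[-1]


def positional_cost(a, b):
    """Assumed model of --no-list-edits: compare index by index only."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


left = {"a": 1, "b": 2, "c": 3}
right = {"a": 1, "b": 2, "d": 3}
all_pairs = candidate_pairs(left, right)                      # 3 * 3 pairings
same_key = candidate_pairs(left, right, key_edits=False)      # shared keys only

with_insertions = edit_distance([1, 2, 3], [0, 1, 2, 3])      # one insertion
without_insertions = positional_cost([1, 2, 3], [0, 1, 2, 3])
```

In this model the restricted strategies do less work but can report a larger, less natural set of edits, which is the trade-off the options expose.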

ANSI Color

By default, Graphtage will only use ANSI color in its output if it is run from a TTY. If, for example, you would like to have Graphtage emit colorized output from a script or pipe, use the --color or -c argument. To disable color even when running on a TTY, use --no-color.
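The usual way for a tool to decide whether to emit ANSI color is to ask the output stream whether it is attached to a terminal. The sketch below shows that pattern with the standard library. It is not taken from Graphtage's source; the names should_use_color and paint are hypothetical, and letting no_color take precedence over force_color is an assumption.

```python
import io
import sys


def should_use_color(stream, force_color=False, no_color=False):
    """Decide whether to emit ANSI escapes on the given stream.

    force_color mirrors --color and no_color mirrors --no-color;
    the precedence between them is an assumption of this sketch.
    """
    if no_color:
        return False
    if force_color:
        return True
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def paint(text, ansi_code, enabled):
    """Wrap text in an ANSI color sequence when color is enabled."""
    if not enabled:
        return text
    return f"\033[{ansi_code}m{text}\033[0m"


# A pipe or file behaves like this in-memory stream: it is not a TTY.
piped = io.StringIO()
plain = paint("+ added", 32, should_use_color(piped))
forced = paint("+ added", 32, should_use_color(piped, force_color=True))
```

For real output you would pass sys.stdout instead of the in-memory stream, so color appears in a terminal and is dropped when the output is redirected.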

HTML Output

Graphtage can optionally emit the diff in HTML with the --html option.

$ graphtage --html original.json modified.json > diff.html

Status and Logging

By default, Graphtage prints status messages and a progress bar to STDERR. To suppress this, use the --no-status option. To additionally suppress all but critical log messages, use --quiet. For fine-grained control of log messages, use the --log-level option.

Why does Graphtage exist?

Diffing tree-like structures with unordered elements is tough. Say you want to compare two JSON files. Few tools are available, and they are effectively equivalent to canonicalizing the JSON (e.g., sorting dictionary elements by key) and performing a standard diff. This is not always sufficient. For example, if a key in a dictionary is changed but its value is not, a traditional diff will conclude that the entire key/value pair was replaced by a new one, even though the only change was the key itself. See our documentation for more information.
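The key-rename problem is easy to reproduce with the standard library. The sketch below canonicalizes two JSON documents, runs a line-based diff with difflib, and then contrasts that with a naive rename detector that pairs removed and added keys whose values are equal. The detector is a hypothetical illustration of what a semantic diff can report, not Graphtage's matching algorithm.

```python
import difflib
import json


def line_diff(old, new):
    """Canonicalize both documents, then keep only removed/added lines."""
    a = json.dumps(old, sort_keys=True, indent=4).splitlines()
    b = json.dumps(new, sort_keys=True, indent=4).splitlines()
    return [line for line in difflib.ndiff(a, b)
            if line.startswith(("- ", "+ "))]


def renamed_keys(old, new):
    """Pair each removed key with an added key that has the same value.

    Naive greedy pairing, for illustration only.
    """
    removed = [k for k in old if k not in new]
    added = [k for k in new if k not in old]
    renames = []
    for r in removed:
        for a in added:
            if old[r] == new[a]:
                renames.append((r, a))
                added.remove(a)  # safe: we stop iterating right after
                break
    return renames


old = {"foo": 1, "bar": [1, 2]}
new = {"fou": 1, "bar": [1, 2]}

textual = line_diff(old, new)      # one removed line and one added line
semantic = renamed_keys(old, new)  # a single key rename
```

The line-based view reports a whole key/value line removed and another added, while the structural view can describe the same change as one edit to the key.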

Using Graphtage as a Library

See our documentation for more information.

Extending Graphtage

Graphtage is designed to be extensible: New filetypes can easily be defined, as well as new node types, edit types, formatters, and printers. See our documentation for more information.

Complete API documentation is available in the project's online documentation.

License and Acknowledgements

This research was developed by Trail of Bits with partial funding from the Defense Advanced Research Projects Agency (DARPA) under the SafeDocs program as a subcontractor to Galois. It is licensed under the GNU Lesser General Public License v3.0. Contact us if you're looking for an exception to the terms. © 2020, Trail of Bits.

