Parser Combinator

Simple Yet Powerful Parsing Library.

What is this

A library to create custom parsers. Based on the "ancient" concept of parser combinators, this library contains a vast variety of base parsers, decorators, combinators and helpers.

Why use this

Parsers made with this library can be used in many ways. Parsing is transforming text into a usable structure.

This can be used for various purposes, whether it be transforming json / csv / xml / yaml / etc. into some kind of data structure, or parsing a custom DSL or expression language into an abstract syntax tree.

Whether you wish to create your own file format, your own programming language, interpret existing file formats or languages... This library is here to help.

How to use this

For hands-on how-tos, see the guide.

Installation

Using composer: composer require stratadox/parser

Overview

There are three base parsers: any, text and pattern.

  • Any matches any single character.
  • Text matches a predefined string.
  • Pattern matches a regular expression.
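To make the idea behind the three base parsers concrete, here is a minimal, self-contained sketch in plain PHP. It does not use the library and is not how the library is implemented. It assumes a parser is simply a closure that takes the remaining input and returns either [matched value, rest of input] or null on failure. The names anyChar, literal and regex are hypothetical.

```php
<?php
// Illustrative sketch only; anyChar, literal and regex are hypothetical names,
// not part of stratadox/parser. A parser is a closure: string -> [value, rest] | null.

// "Any": matches a single character (a single byte in this sketch).
function anyChar(): Closure
{
    return fn(string $in) => $in === '' ? null : [$in[0], substr($in, 1)];
}

// "Text": matches a predefined string at the start of the input.
function literal(string $expected): Closure
{
    return fn(string $in) => str_starts_with($in, $expected)
        ? [$expected, substr($in, strlen($expected))]
        : null;
}

// "Pattern": matches a regular expression at the start of the input.
// Assumes the pattern does not contain the '~' delimiter.
function regex(string $pattern): Closure
{
    return function (string $in) use ($pattern) {
        if (preg_match('~^(?:' . $pattern . ')~', $in, $m) !== 1) {
            return null;
        }
        return [$m[0], substr($in, strlen($m[0]))];
    };
}
```

Each parser consumes a prefix of the input and hands back the remainder, which is what allows parsers to be chained.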

These can be upgraded by a fair number of add-ons ("decorators"), which can be combined as needed:

  • Repeatable applies the parser any number of times, yielding a list.
  • Map modifies successful results based on a function.
  • Full Map modifies all results based on a function.
  • Ignore requires the thing to be there, and then ignores it. (Miauw)
  • Maybe does not require it, but uses it if it's there.
  • Optional combines the above two.
  • Except "un-matches" if another parser succeeds.
  • End returns an error state if there's unparsed content.
  • All or Nothing fiddles with the parse error.
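As a rough illustration of what a decorator does, the sketch below wraps a parser in another parser. It is plain PHP, independent of the library, and assumes the same hypothetical representation as before: a parser is a closure returning [value, rest] or null. The function names digit, repeatable, map and maybe are made up for this sketch, and repeatable is assumed to accept zero matches.

```php
<?php
// Illustrative sketch only; all function names here are hypothetical.
// A parser is a closure: string -> [value, rest] | null.

// Base parser for the demo: one ASCII digit.
function digit(): Closure
{
    return fn(string $in) => ($in !== '' && ctype_digit($in[0]))
        ? [$in[0], substr($in, 1)]
        : null;
}

// Repeatable: apply the parser until it fails, collecting the results in a list.
function repeatable(Closure $parser): Closure
{
    return function (string $in) use ($parser) {
        $items = [];
        while (($result = $parser($in)) !== null) {
            [$value, $rest] = $result;
            if ($rest === $in) {
                break; // nothing consumed: stop to avoid an infinite loop
            }
            $items[] = $value;
            $in = $rest;
        }
        return [$items, $in];
    };
}

// Map: transform the value of a successful result; failures pass through.
function map(Closure $parser, callable $fn): Closure
{
    return function (string $in) use ($parser, $fn) {
        $result = $parser($in);
        return $result === null ? null : [$fn($result[0]), $result[1]];
    };
}

// Maybe: if the parser fails, succeed with null and consume nothing.
function maybe(Closure $parser): Closure
{
    return fn(string $in) => $parser($in) ?? [null, $in];
}

// Decorators stack: digits, repeated, then mapped to an integer.
$number = map(repeatable(digit()), fn(array $ds) => (int) implode('', $ds));
```

Because every decorator returns a parser of the same shape, decorators can be stacked in any order.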

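Parsers can be combined using combinators. The examples below use two of them: or, which tries an alternative parser, and andThen, which chains parsers in sequence.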

All the above can be mixed and combined at will. To make life easier, there are a number of combinator shortcuts for "everyday tasks":

  • Between matches the parser's content between start and end.
  • Between Escaped matches unescaped content between start and end.
  • Split yields one or more results, split by a delimiter.
  • Must Split yields two or more results, split by a delimiter.
  • Keep Split yields a structure like {delimiter: [left, right]}.
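The difference between Split and Must Split can be sketched as a single function with a minimum item count. This is a plain PHP illustration under the same assumptions as before (a parser is a closure returning [value, rest] or null), not the library's implementation. The names token and splitBy are hypothetical, and both parsers are assumed to consume input when they succeed.

```php
<?php
// Illustrative sketch only; token and splitBy are hypothetical names.
// A parser is a closure: string -> [value, rest] | null.

// Helper: match a regular expression at the start of the input.
// Assumes the pattern does not contain the '~' delimiter.
function token(string $pattern): Closure
{
    return function (string $in) use ($pattern) {
        return preg_match('~^(?:' . $pattern . ')~', $in, $m) === 1
            ? [$m[0], substr($in, strlen($m[0]))]
            : null;
    };
}

// Parse item (delimiter item)*, requiring at least $min items.
// $min = 1 behaves like "split", $min = 2 like "must split".
function splitBy(Closure $item, Closure $delimiter, int $min = 1): Closure
{
    return function (string $in) use ($item, $delimiter, $min) {
        $first = $item($in);
        if ($first === null) {
            return null;
        }
        [$value, $rest] = $first;
        $items = [$value];
        // Only consume a delimiter when another item follows it.
        while (($d = $delimiter($rest)) !== null
            && ($next = $item($d[1])) !== null) {
            $items[] = $next[0];
            $rest = $next[1];
        }
        return count($items) >= $min ? [$items, $rest] : null;
    };
}

$split = splitBy(token('[a-z]+'), token(','));
$mustSplit = splitBy(token('[a-z]+'), token(','), 2);
```

With these definitions, a lone item such as 'a' is accepted by $split and rejected by $mustSplit, and a trailing delimiter is left unparsed rather than swallowed.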

There are several additional helpers, which are essentially mapping shortcuts:

  • Join implodes the array result into a string.
  • Non-Empty refuses empty results.
  • At Least refuses arrays with fewer than x entries.
  • At Most refuses arrays with more than x entries.
  • First transforms an array result into its first item.
  • Item transforms an array result into its nth item.

To enable lazy parsers (and/or to provide a structure), different containers are available. The calculator example below uses two of them, Grammar and Lazy.

Example 1: CSV

For a basic "real life" example, here's a simple CSV parser:

<?php
use Stratadox\Parser\Helpers\Between;
use Stratadox\Parser\Parser;
use function Stratadox\Parser\any;
use function Stratadox\Parser\pattern;

function csvParser(
    Parser|string $sep = ',',
    Parser|string $esc = '"',
): Parser {
    $newline = pattern('\r\n|\r|\n');
    return Between::escaped('"', '"', $esc)
        ->or(any()->except($newline->or($sep)->or($esc))->repeatableString())
        ->mustSplit($sep)->maybe()
        ->split($newline)
        ->end();
}

(For associative result mapping, see the CSV example)

Example 2: Calculator AST

This next example parses basic arithmetic strings (e.g. 1 + -3 * 3 ^ 2) into an abstract syntax tree:

<?php
use Stratadox\Parser\Containers\Grammar;
use Stratadox\Parser\Containers\Lazy;
use Stratadox\Parser\Parser;
use function Stratadox\Parser\pattern;
use function Stratadox\Parser\text;

function calculationsParser(): Parser
{
    $grammar = Grammar::with($lazy = Lazy::container());

    $sign = text('+')->or('-')->maybe();
    $digits = pattern('\d+');
    $map = fn($op, $l, $r) => [
        'op' => $op,
        'arg' => [$l, $r],
    ];

    $grammar['prio 0'] = $sign->andThen($digits, '.', $digits)->join()->map(fn($x) => (float) $x)
        ->or($sign->andThen($digits)->join()->map(fn($x) => (int) $x))
        ->between(text(' ')->or("\t", "\n", "\r")->repeatable()->optional());

    $lazy['prio 1'] = $grammar['prio 0']->andThen('^', $grammar['prio 0'])->map(fn($a) => [
        'op' => '^',
        'arg' => [$a[0], $a[2]],
    ])->or($grammar['prio 0']);

    $grammar['prio 2'] = $grammar['prio 1']->keepSplit(['*', '/'], $map)->or($grammar['prio 1']);

    $grammar['prio 3'] = $grammar['prio 2']->keepSplit(['+', '-'], $map)->or($grammar['prio 2']);

    return $grammar['prio 3']->end();
}

(For a working example, see the Calculator example)

Documentation

Additional documentation is available through the guide, the reference and/or the tests.
