goffrie / plex
a parser and lexer generator as a Rust procedural macro
License: Apache License 2.0
I am working on a language using Plex, and would like to remove parentheses from function calls. From what I can tell, according to this question, doing this would require "backtracking", to allow patterns like this in the parser: Ident(i) exp_list[e] { if ($1 is not defined) { back_and_choose_another_rule(); } }
. Would it be possible to add this, or is there a better way to do this that Plex is capable of?
P.S. I'm new to using parsers, so I may have no idea what I'm actually talking about. Sorry if this doesn't make sense!
I started seeing these warnings on the newest Rust nightly:
kai@kai-thinkpad ~/d/plex (master)> cargo test
Compiling proc-macro2 v1.0.10
Compiling unicode-xid v0.2.0
Compiling bit-vec v0.4.4
Compiling syn v1.0.17
Compiling vec_map v0.6.0
Compiling lalr v0.0.2
Compiling bit-set v0.4.0
Compiling redfa v0.0.2
Compiling quote v1.0.3
Compiling plex v0.2.5 (/home/kai/dev/plex)
warning: unnecessary braces around block return value
--> examples/demo.rs:167:13
|
167 | Ident(var) Equals assign[rhs] => Expr {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: remove these braces
|
= note: `#[warn(unused_braces)]` on by default
warning: unnecessary braces around block return value
--> examples/demo.rs:200:13
|
200 | Ident(i) => Expr {
| ^^^^^^^^ help: remove these braces
warning: unnecessary braces around block return value
--> examples/demo.rs:204:13
|
204 | Integer(i) => Expr {
| ^^^^^^^^^^ help: remove these braces
Finished test [unoptimized + debuginfo] target(s) in 22.71s
Running target/debug/deps/plex-1283d3a11f2149ad
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Doc-tests plex
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Adding #![allow(unused_braces)] to src/lib.rs fixes it when compiling plex itself, but not when using the macro in other projects (or the tests above).
Thanks for plex! Would it be feasible to support multiple entry points that reuse the same grammar? As an example, I have something like:
statement: Statement {
expression[e] TokenSemicolon => ExprStatement(e),
// other variants...
}
expression: Expr {
// some variants...
}
and want to expose parser entry points for both Expr and Statement. This might be possible via a single entry point that returns an enum, but I'm wondering if there's a cleaner way.
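For what it's worth, the enum workaround could look roughly like this. This is a hypothetical, untested sketch in plex's grammar style; ParseResult, top and the marker tokens are invented names, not plex API:

```rust
// Hypothetical sketch: a single start symbol that dispatches on an
// artificial leading token pushed by the caller.
top: ParseResult {
    MarkStatement statement[s] => ParseResult::Statement(s),
    MarkExpression expression[e] => ParseResult::Expression(e),
}
```

The caller would prepend the appropriate marker token to the token stream to select an entry point; a similar trick is sometimes used to get multiple start symbols out of bison.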
Getting compile errors on stable channel:
error[E0554]: `#![feature]` may not be used on the stable release channel
--> ~/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.2.5/src/lib.rs:2:12
|
2 | #![feature(proc_macro_diagnostic)]
| ^^^^^^^^^^^^^^^^^^^^^
error[E0554]: `#![feature]` may not be used on the stable release channel
--> ~/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.2.5/src/lib.rs:3:12
|
3 | #![feature(proc_macro_span)]
| ^^^^^^^^^^^^^^^
I was able to manually remove those features and unstable functions for my own use (in a fork, for example), but in general, are there any plans or ideas for how to get rid of them properly?
Hello. I would like to create a parser that can match LPAREN (IDENTIFIER COLON TYPE (COMMA IDENTIFIER COLON TYPE)*)? RPAREN. How can I repeatedly match a specific pattern?
pub enum DeclarationContent {
Function {
name : String,
argTypes : HashMap<String, Types>, // <- I would like to find multiple of these
returnTypes : Types,
body : FunctionBody
}
}
pub struct FunctionBody {
expressions : Vec<Expression> // and multiple of these.
}
pub enum Types {
Void,
Integer,
FloatingPoint,
String,
Character,
Boolean,
Array {
length : i32,
types : Box<Types>
},
UnsizedArray {
types : Box<Types>
},
Tuple {
types : Box<Vec<Types>>
},
Dictionary {
keyTypes : Box<Types>,
valueTypes : Box<Types>
},
Function {
argsTypes : Box<Vec<Types>>,
returnTypes : Box<Types>
},
Other {
name : String
}
}
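For reference, LALR generators like plex usually express repetition with left-recursive productions rather than a * operator. Below is a hedged, untested sketch of the LPAREN (IDENTIFIER COLON TYPE (COMMA IDENTIFIER COLON TYPE)*)? RPAREN shape in plex's grammar style; all token and rule names here are assumptions:

```rust
// Hypothetical plex-style productions (token names are made up):
param: (String, Types) {
    Identifier(name) Colon type_name[t] => (name, t),
}
param_list: Vec<(String, Types)> {
    param[p] => vec![p],
    param_list[mut ps] Comma param[p] => { ps.push(p); ps },
}
arg_types: Vec<(String, Types)> {
    LParen RParen => vec![],
    LParen param_list[ps] RParen => ps,
}
```

The resulting Vec could then be collected into the HashMap<String, Types> field with ps.into_iter().collect().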
Like this, for example:
plex::parser! {
fn parse_(Token, Span);
// combine two spans
(a, b) {
Span {
lo: a.lo,
hi: b.hi,
}
}
program: Program {
statements[s] => Program { stmts: s }
}
statements: Vec<Expr> {
=> vec![],
statements[mut st] outer[e] SemiColon => {
st.push(e);
st
}
}
outer: Expr {
#[priority]
Token SemiColon => ...,
inner[a] => a,
}
inner: Expr {
Token => ...,
}
}
I am trying to compile a project which uses plex, and I am getting this error when trying to compile plex:
Compiling plex v0.0.3 (https://github.com/goffrie/plex#802851ac)
error[E0423]: expected function, found struct variant `base::SyntaxExtension::NormalTT`
--> /Users/addisonbean/.cargo/git/checkouts/plex-99281a3d5255642b/802851a/src/lib.rs:18:9
|
18 | base::SyntaxExtension::NormalTT(Box::new(parser::expand_parser), None, true));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ did you mean `base::SyntaxExtension::NormalTT { /* fields */ }`?
error[E0423]: expected function, found struct variant `base::SyntaxExtension::NormalTT`
--> /Users/addisonbean/.cargo/git/checkouts/plex-99281a3d5255642b/802851a/src/lib.rs:20:9
|
20 | base::SyntaxExtension::NormalTT(Box::new(lexer::expand_lexer), None, true));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ did you mean `base::SyntaxExtension::NormalTT { /* fields */ }`?
error: aborting due to 2 previous errors
error: Could not compile `plex`.
This is the output of rustc --version: rustc 1.21.0-nightly (4fdb4bedf 2017-08-21)
Title says it all.
As goffrie mentioned in a previous issue regarding optional grammar elements, it should be possible to write optional productions using the following syntax:
optionalSemicolon: () {
=> (),
Semicolon => (),
}
...
manyStatements: ... {
=> ...
manyStatements[sts] statement[st] optionalSemicolon => ...
}
Trying a version of the code shown above results in a shift-reduce conflict in the parser.
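Depending on the rest of the grammar, one way to sidestep the conflict may be to inline the optional semicolon into the recursive rule, so the parser never has to reduce an empty optionalSemicolon production. An untested sketch in plex's grammar style (the Statement type name is assumed):

```rust
manyStatements: Vec<Statement> {
    => vec![],
    manyStatements[mut sts] statement[st] => { sts.push(st); sts },
    manyStatements[mut sts] statement[st] Semicolon => { sts.push(st); sts },
}
```

Whether this actually avoids the conflict depends on what else can follow a statement in the surrounding grammar.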
lexer!{
pub fn take_token(tok: 'a) -> Token<'a>;
}
This doesn't work, though I think it should.
As the grammar gets large, compile times grow a lot. A grammar similar to C, but still a lot simpler, takes about 2 or 3 minutes to compile, and it gets recompiled on any change to the project. I had to put it in a separate crate to avoid that. Expressions with many precedence levels seem to be the main cause, but even without them it takes far too long.
Not sure if it's related, but doc tests also seem to take ages when using plex with a large grammar.
Hi,
If I add a lifetime to my token enum like in the README, I have a hard time compiling the parser. Essentially I get the following error.
error[E0106]: missing lifetime specifier
--> src/parser/mod.rs:53:19
53 | fn parse_(Token, Span);
| ^^^^^ help: consider giving it an explicit bounded or 'static lifetime: `Token + 'static`
How do I add a lifetime here? The syntax of this macro is a little confusing...
Automated testing is great! It would probably be a good idea to write some.
There is no need to return text from the lexer manually: the lexer already returns a tuple with all the info needed to advance remaining and figure out the span after every token. That way the lexer looks much cleaner.
I can submit a PR if you're interested.
The documentation link on crates.io ( http://goffrie.github.io/plex ) is broken
Doesn't work with rustc 1.11.0-nightly (696b703b5 2016-07-03)
The provided example compiles with nightly 70f130954 2019-04-16 after adding the missing "crate::" prefixes.
The last version published on crates.io is 0.2.3. The fixed example does not compile against this version.
Would it be possible to update the version on crates.io?
I needed to pass an interner to the lexer (to avoid calling .to_owned() for idents), but there doesn't seem to be a way to add it to the signature without making the proc macro panic.
A certain production rule in my parser definition causes the whole parser definition to be flagged by the rustc warning "function 'reduce_11' is never used". Build the code from this point in my project's git history to reproduce: 8fc9de55efef7ab242d38f29b0020880dbd8f8de. The parser source is at src/dcparser.rs.
I have a set of rules under one production in my CFG that both have essentially duplicated blocks of code and could be handled in one. Is there a way to do this? The following example is what my code looks like:
numeric_with_modulus: DCNumericType {
numeric_type_token[mut nt] Percent number[n] => {
// do something
}
numeric_with_explicit_cast[mut nt] Percent number[n] => {
// do something very similar
}
}
Would be nice to have something like this:
numeric_with_modulus: DCNumericType {
numeric_type_token[mut nt] Percent number[n] |
numeric_with_explicit_cast[mut nt] Percent number[n] => {
// do the similar thing :p
}
}
Of course, both rules would have to have the same parameters, or the same 'function' signature if that's the right term :p since each block is a bit like a function, with the non-terminals / terminals as the parameters.
It would be nice if you would mention in the README.md that plex requires nightly Rust.
Suppose I have this lexical syntax for a token:
Identifier ::= <Initial> <Subsequent>*
Initial ::= a..z
Subsequent ::= Initial | ...
Here the Initial part is used in both Identifier and Subsequent. As far as I understand, currently the only way to do this in plex's lexer! is by duplicating the Initial part, something like:
lexer! {
...
r"[a-z]([a-z]|[<subsequent>])*" => ...
}
In the simplified example above the repetition of [a-z] is not too bad, but in the actual use case the syntax is much more complex and the repetition is a real problem. Ideally I should be able to give this regex [a-z] a name and use it in other regex patterns. Maybe something like:
lexer! {
...
let initial = "[a-z]"
let subsequent = "..."
r"$initial($initial|[$subsequent])*" => ...
}
Is this currently possible with plex? If not I think this would be a useful addition to it.
Would it be possible to add support for an optional token in the parser?
I'm trying to implement a parser that allows me to parse a language with an optional semicolon. If this is already implemented it probably should be documented.
Plex won't build on 1.30.0-nightly (33b923fd4 2018-08-18) without adding #![feature(proc_macro_diagnostic)] to the crate attributes.
Not sure if this crate is still being updated, but it would be great to see it ported to syn 1.0 (and also quote 1.0).
Would be nice.
In some languages identifiers and keywords are case-insensitive; examples include Ada, VHDL and SQL. Literals in other languages like C and C++ can also be partly case-insensitive.
For these cases it would be helpful to be able to mark regexes as case-insensitive. In flex, case-insensitive regexes are written with the (?i: prefix.
When running cargo build, I get these errors:
Compiling plex v0.0.3 (file:///Users/addisonbean/git/other/plex)
error[E0560]: struct `syntax::ast::Lifetime` has no field named `name`
--> src/lexer.rs:122:21
|
122 | name: ident.name
| ^^^^^ `syntax::ast::Lifetime` does not have this field
error[E0308]: mismatched types
--> src/lexer.rs:131:31
|
131 | cx.lifetime(DUMMY_SP, Symbol::gensym("text"))
| ^^^^^^^^^^^^^^^^^^^^^^ expected struct `syntax::ast::Ident`, found struct `syntax::ast::Symbol`
|
= note: expected type `syntax::ast::Ident`
found type `syntax::ast::Symbol`
= help: here are some functions which might fulfill your needs:
- .to_ident()
error: no associated item named `RESTRICTION_STMT_EXPR` found for type `syntax::parse::parser::Restrictions` in the current scope
--> src/lexer.rs:159:47
|
159 | let expr = try!(parser.parse_expr_res(parser::Restrictions::RESTRICTION_STMT_EXPR, None));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: no associated item named `RESTRICTION_STMT_EXPR` found for type `syntax::parse::parser::Restrictions` in the current scope
--> src/parser.rs:620:51
|
620 | let expr = try!(parser.parse_expr_res(parser::Restrictions::RESTRICTION_STMT_EXPR, None));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to previous error(s)
error: Could not compile `plex`.
To learn more, run the command again with --verbose.
I am on macOS 10.12.4, using Rust 1.19.0-nightly.
Hello! I'm currently taking a course on lexical analyzers, where we focus on flex. flex has a feature called start conditions, which lets you put the lexer into a different state. This is useful for things that can't easily be described by a single regex, like multi-line comments.
Is this or something similar currently supported by plex?
I would think that it would be a bit of work, but not too difficult, to implement. If it's not supported right now, is it something you'd find interesting to have? I could look into implementing it and learn some things about macros along the way 🙃
Hi!
I'm not very familiar with Rust, but I'm very excited by your library. My only problem is that I've decided to stay on the latest stable Rust.
Would it be very difficult to support stable Rust, at least for the lexer part?
Hey, I wanted to ask whether it's possible to add a new, optional Program to the parser, so that you can open a new context, for example the way PHP does inside HTML.
This isn't a feature request, more feeling out: would you be interested in a PR that added precedence operators to automatically solve shift–reduce conflicts (per bison's %left / %right / %nonassoc)? If so, I might give it a shot. (No guarantees I'm actually capable of it!)
Error recovery is important for a practical parser; is there any plan for plex to support it?
For example, when we fail to parse a statement, we could skip the remaining tokens until we reach a semicolon:
program {
=> vec![],
program[mut z1] statement[z2] => { z1.push(z2); z1 },
}
statement {
expression[z1] Semicolon => Statement::Expression(z1),
.* Semicolon => Statement::Error,
}
Dependencies all build fine, but plex itself fails with a bunch of rustc errors.
rustc --verbose output, running rustc 1.21.0-nightly on Linux 4.11.7-1-ARCH:
Compiling plex v0.0.3
Running `rustc --crate-name plex /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/lib.rs --crate-type dylib --emit=dep-info,link -C prefer-dynamic -C debuginfo=2 -C metadata=490a9b298919bdee -C extra-filename=-490a9b298919bdee --out-dir /home/inori/github/curie/target/debug/deps -L dependency=/home/inori/github/curie/target/debug/deps --extern redfa=/home/inori/github/curie/target/debug/deps/libredfa-883b8c377d7f163c.rlib --extern lalr=/home/inori/github/curie/target/debug/deps/liblalr-37a0030533b01d07.rlib --cap-lints allow`
error[E0425]: cannot find function `expr_is_simple_block` in module `classify`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/lexer.rs:153:23
|
153 | classify::expr_is_simple_block(&*expr)
| ^^^^^^^^^^^^^^^^^^^^ not found in `classify`
error[E0425]: cannot find function `mk_sp` in module `codemap`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:519:33
|
519 | cx.item_fn(codemap::mk_sp(lo, hi), range_fn_id, vec![
| ^^^^^ not found in `codemap`
error[E0425]: cannot find function `mk_sp` in module `codemap`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:585:44
|
585 | Binding::Enum(codemap::mk_sp(lo, parser.prev_span.hi), pats)
| ^^^^^ not found in `codemap`
error[E0425]: cannot find function `expr_is_simple_block` in module `classify`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:600:27
|
600 | classify::expr_is_simple_block(&*expr)
| ^^^^^^^^^^^^^^^^^^^^ not found in `classify`
error[E0425]: cannot find function `mk_sp` in module `codemap`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:611:31
|
611 | let sp = codemap::mk_sp(lo, parser.prev_span.hi);
| ^^^^^ not found in `codemap`
error[E0599]: no method named `eat_lifetime` found for type `syntax::parse::parser::Parser<'_>` in the current scope
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/lexer.rs:116:34
|
116 | if let Some(lt) = parser.eat_lifetime() {
| ^^^^^^^^^^^^
error[E0308]: mismatched types
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/lexer.rs:122:31
|
122 | cx.lifetime(DUMMY_SP, Symbol::gensym("text"))
| ^^^^^^^^^^^^^^^^^^^^^^ expected struct `syntax::ast::Ident`, found struct `syntax::ast::Symbol`
|
= note: expected type `syntax::ast::Ident`
found type `syntax::ast::Symbol`
= help: here are some functions which might fulfill your needs:
- .to_ident()
error[E0599]: no associated item named `RESTRICTION_STMT_EXPR` found for type `syntax::parse::parser::Restrictions` in the current scope
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/lexer.rs:150:47
|
150 | let expr = try!(parser.parse_expr_res(parser::Restrictions::RESTRICTION_STMT_EXPR, None));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0063]: missing field `span` in initializer of `syntax::ast::WhereClause`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/lexer.rs:237:27
|
237 | where_clause: ast::WhereClause {
| ^^^^^^^^^^^^^^^^ missing `span`
error[E0063]: missing field `span` in initializer of `syntax::ast::WhereClause`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:172:23
|
172 | where_clause: ast::WhereClause {
| ^^^^^^^^^^^^^^^^ missing `span`
error[E0609]: no field `value` on type `syntax::ast::Attribute`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:549:28
|
549 | match attr.value.node {
| ^^^^^ did you mean `style`?
error[E0609]: no field `value` on type `syntax::ast::Attribute`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:550:65
|
550 | ast::MetaItemKind::List(ref tokens) if attr.value.name == "no_reduce" => {
| ^^^^^ did you mean `style`?
error[E0609]: no field `value` on type `syntax::ast::Attribute`
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:561:53
|
561 | ast::MetaItemKind::Word if attr.value.name == "overriding" => {
| ^^^^^ did you mean `style`?
error[E0599]: no associated item named `RESTRICTION_STMT_EXPR` found for type `syntax::parse::parser::Restrictions` in the current scope
--> /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/parser.rs:597:51
|
597 | let expr = try!(parser.parse_expr_res(parser::Restrictions::RESTRICTION_STMT_EXPR, None));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to 14 previous errors
error: Could not compile `plex`.
Caused by:
process didn't exit successfully: `rustc --crate-name plex /home/inori/.cargo/registry/src/github.com-1ecc6299db9ec823/plex-0.0.3/src/lib.rs --crate-type dylib --emit=dep-info,link -C prefer-dynamic -C debuginfo=2 -C metadata=490a9b298919bdee -C extra-filename=-490a9b298919bdee --out-dir /home/inori/github/curie/target/debug/deps -L dependency=/home/inori/github/curie/target/debug/deps --extern redfa=/home/inori/github/curie/target/debug/deps/libredfa-883b8c377d7f163c.rlib --extern lalr=/home/inori/github/curie/target/debug/deps/liblalr-37a0030533b01d07.rlib --cap-lints allow` (exit code: 101)
The proc_macro feature was recently split up into several smaller features (and a small amount was stabilized). This causes "use of unstable feature x" errors when you build plex.