Comments (7)
This came up again in the Google group. One difficulty is figuring out how best to implement it in all target languages.
I'm not sure about C, C++ and C#.
from bnfc.
I have two open questions before I can try to work on this:
- What is the most idiomatic way to implement this in the different target languages? I have an idea for Haskell, OCaml and Java but not for the C-like languages (C, C++ and C#).
- What syntax to use? If I understood the example in #150 correctly, EBNF uses [...], but this already means list in LBNF. The comment above suggests {...}, but for me it would be more natural to use a regex-like syntax like Foo?, meaning 0 or 1 occurrence of Foo.
from bnfc.
Maybe that's too simple, but for C/C++ you could just use null pointers. At least that's the approach IcarusVerilog's vhdlpp uses. Vhdlpp is a VHDL to Verilog transpiler which I'm currently using to parse VHDL...
An object for, say,
fooLabel . foo_baz ::= <"bla"> "foobar";
would look like
class foo_baz {/*...*/};
class fooLabel : foo_baz {
symbol_ptr *m1; //might be NULL!
symbol_ptr *m2;
};
This approach, however, has one obvious drawback: another layer of indirection must be introduced in order to point to terminal symbols too.
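To make the null-pointer idea and its drawback concrete, here is a minimal sketch. It is not what vhdlpp or BNFC actually generate: the names Terminal, FooBaz, FooLabel and render are hypothetical, and std::unique_ptr stands in for the raw symbol_ptr pointers of the snippet above.

```cpp
#include <memory>
#include <string>

// Hypothetical wrapper for a terminal symbol. This is the extra layer of
// indirection: a terminal needs its own object so that a pointer can refer to it.
struct Terminal {
    std::string text;
};

struct FooBaz {
    virtual ~FooBaz() = default;
};

// Hypothetical node for: fooLabel . foo_baz ::= <"bla"> "foobar";
struct FooLabel : FooBaz {
    std::unique_ptr<Terminal> bla;     // null when the optional "bla" is absent
    std::unique_ptr<Terminal> foobar;  // mandatory, expected to be set by the parser
};

// Reconstructs the matched text. Every consumer of the AST has to repeat
// the null check on the optional member.
std::string render(const FooLabel& node) {
    std::string out;
    if (node.bla) {
        out += node.bla->text + " ";
    }
    out += node.foobar->text;
    return out;
}
```

The point of the sketch is the `if (node.bla)` test: with this representation, optionality is pushed onto every piece of code that walks the tree.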
But I think there is another, much more elegant solution to the problem!
Our current goal was to directly map rules containing optional parts onto an appropriate ADT or class hierarchy. But what if we used a transformation on the AST of the LBNF in order to expand, or better "desugar away", these optionals?
Simply put, the following LBNF
AllSuffix . Suffix ::= <"foobar"> "all";
NameSuffix . Suffix ::= Simple_name;
should be expanded (on AST level) to the desugared LBNF
AllSuffix . Suffix ::= "all";
AllSuffix_foo . Suffix ::= "foobar" "all";
NameSuffix . Suffix ::= Simple_name;
before any further processing.
This has the charming side effect that none of the existing backends have to be touched.
A general algorithm for this expansion is easy to see if you consider a statement containing two or more optionals. For example, an AllSuffix could additionally contain a begin delimiter and an end label.
AllSuffix . Suffix ::= <"?"> <"foobar"> "all" <"?">;
This desugars to every possible combination
AllSuffix000 . Suffix ::= "all" ;
AllSuffix001 . Suffix ::= "all" "?";
AllSuffix010 . Suffix ::= "foobar" "all" ;
AllSuffix011 . Suffix ::= "foobar" "all" "?";
AllSuffix100 . Suffix ::= "?" "all" ;
AllSuffix101 . Suffix ::= "?" "all" "?";
AllSuffix110 . Suffix ::= "?" "foobar" "all" ;
AllSuffix111 . Suffix ::= "?" "foobar" "all" "?";
The binary numbers are not really intended to go into an implementation and only serve as an analogy: each optional can be either present (1) or absent (0).
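The expansion of flat (non-nested) optionals can be sketched as a loop over all bit masks. This is only an illustration under assumptions: Item, Rule and desugar are hypothetical names, not part of BNFC, and the binary label suffix is kept purely to mirror the example above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One item on the right-hand side of a rule (hypothetical representation).
struct Item {
    std::string text;  // the symbol as written, e.g. "\"all\"" or "Simple_name"
    bool optional;     // true if the item was written as <...>
};

// A desugared rule without any optionals.
struct Rule {
    std::string label;
    std::string category;
    std::vector<std::string> rhs;
};

// Expands a rule with n optional items into 2^n plain rules.
// The leftmost optional corresponds to the most significant bit of the mask,
// and the mask is appended to the label in binary.
std::vector<Rule> desugar(const std::string& label,
                          const std::string& category,
                          const std::vector<Item>& items) {
    std::size_t n = 0;
    for (const Item& it : items) {
        if (it.optional) ++n;
    }

    std::vector<Rule> out;
    for (std::size_t mask = 0; mask < (std::size_t{1} << n); ++mask) {
        Rule r{label, category, {}};
        std::string bits;
        std::size_t seen = 0;  // number of optionals visited so far
        for (const Item& it : items) {
            if (!it.optional) {
                r.rhs.push_back(it.text);
                continue;
            }
            const bool present = ((mask >> (n - 1 - seen)) & 1u) != 0;
            ++seen;
            bits += present ? '1' : '0';
            if (present) r.rhs.push_back(it.text);
        }
        r.label += bits;
        out.push_back(r);
    }
    return out;
}
```

Calling desugar("AllSuffix", "Suffix", ...) with the four items of the example yields eight rules, from AllSuffix000 to AllSuffix111, in the order listed above. A rule without optionals comes back unchanged as a single rule.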
from bnfc.
The new approach to optionals could also fairly easily be extended to support nested optionals. A snippet like
Foobar . FooFoo ::= < <Label> "-" <Label> ":" > "foobar";
is quite relevant for practical use, because VHDL-2008's grammar makes heavy use of such constructs in EBNF.
This piece of grammar
block_header ::=
[ generic_clause [ generic_map_aspect ; ] ]
[ port_clause [ port_map_aspect ; ] ]
for instance, was copied verbatim from the language's official reference manual.
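Nested optionals can be handled by making the expansion recursive: an optional group contributes either nothing or any expansion of its contents. The following is a rough sketch only; Node, sym, opt and expand are hypothetical helpers, and label generation is left out.

```cpp
#include <string>
#include <utility>
#include <vector>

using Alt = std::vector<std::string>;  // one desugared right-hand side

// Hypothetical grammar item: either a plain symbol or an optional group <...>.
struct Node {
    std::string symbol;       // used when optional == false
    std::vector<Node> group;  // used when optional == true
    bool optional;
};

Node sym(const std::string& s) { return Node{s, {}, false}; }
Node opt(std::vector<Node> g) { return Node{"", std::move(g), true}; }

// Returns every right-hand side that the sequence can desugar to.
std::vector<Alt> expand(const std::vector<Node>& seq) {
    std::vector<Alt> acc{Alt{}};  // start with a single empty alternative
    for (const Node& n : seq) {
        std::vector<Alt> choices;
        if (n.optional) {
            choices.push_back(Alt{});  // the group is absent
            for (Alt& a : expand(n.group)) choices.push_back(std::move(a));
        } else {
            choices.push_back(Alt{n.symbol});
        }
        // Append every choice for this node to every alternative built so far.
        std::vector<Alt> next;
        for (const Alt& prefix : acc) {
            for (const Alt& c : choices) {
                Alt joined = prefix;
                joined.insert(joined.end(), c.begin(), c.end());
                next.push_back(std::move(joined));
            }
        }
        acc = std::move(next);
    }
    return acc;
}
```

For the FooFoo snippet above, with its outer group holding two inner optional Label items, this produces five alternatives rather than 2^3 = 8, because the inner optionals only matter when the outer group is present.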
from bnfc.
Interesting idea @forflo. I realize now that what you want to do and what was suggested in this issue are actually two different features.
There is actually already a mechanism in BNFC similar to what you're suggesting: the rules macro. We could extend this macro to allow nested disjunctions; then you could write your example as
rules FooFoo = ( ( Label |) "-" ( Label |) ":" |) "foobar" ;
This has the advantage that it doesn't introduce new symbols in the language.
from bnfc.
Re "BNFC in the rules macro": exactly. I like your proposed syntax.
There are a number of drawbacks even with this solution.
- A rules macro containing n optional symbols would generate 2^n different rules whose labels also have to be distinct. Hence, we need a way to produce label prefixes that are meaningful.
- For each rule with more than one label, the C/C++ backend produces a subclass for each distinct label of the rule. Hence, the AST structure will be kind of bloated.
The second point is not really a drawback of the current approach since it is present already. If you manually expand optional grammar symbols, you have to face the exact same issue.
Let me clarify point one by means of a previous example.
AllSuffix . Suffix ::= <"?"> <"foobar"> "all" <"?">;
translates to your proposed syntax as follows
rules Suffix = ( "?" |) ( "foobar" |) "all" ( "?" |) ;
which should expand to
Suffix_opt000 . Suffix ::= "all" ;
Suffix_opt001 . Suffix ::= "all" "?";
Suffix_opt010 . Suffix ::= "foobar" "all" ;
Suffix_opt011 . Suffix ::= "foobar" "all" "?";
Suffix_opt100 . Suffix ::= "?" "all" ;
Suffix_opt101 . Suffix ::= "?" "all" "?";
Suffix_opt110 . Suffix ::= "?" "foobar" "all" ;
Suffix_opt111 . Suffix ::= "?" "foobar" "all" "?";
As an interpreter writer, you'd have to do a lot of pattern matching on the different constructors of data Suffix, but I think that's more elegant than having to handle Maybe monads or NULL pointers, since, in a way, the parser makes these decisions for you and only produces a slightly "richer" AST.
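For the C++ side, the same idea can be pictured as one subclass per desugared label. The sketch below is hand-written and only loosely modelled on the subclass-per-label scheme mentioned above; it is not actual BNFC output, the show method is a hypothetical stand-in for a visitor, and it uses the smaller AllSuffix / AllSuffix_foo / NameSuffix example to stay short.

```cpp
#include <string>
#include <utility>

// Hypothetical class hierarchy for the desugared rules
//   AllSuffix     . Suffix ::= "all";
//   AllSuffix_foo . Suffix ::= "foobar" "all";
//   NameSuffix    . Suffix ::= Simple_name;
struct Suffix {
    virtual ~Suffix() = default;
    virtual std::string show() const = 0;
};

struct AllSuffix : Suffix {
    std::string show() const override { return "all"; }
};

struct AllSuffix_foo : Suffix {
    // The optional "foobar" is known to be present here: no null check needed.
    std::string show() const override { return "foobar all"; }
};

struct NameSuffix : Suffix {
    std::string name;
    explicit NameSuffix(std::string n) : name(std::move(n)) {}
    std::string show() const override { return name; }
};
```

Each class knows statically which optionals were present, so the decision the parser already made is encoded in the type instead of in a nullable field.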
I really do think that this is the most feasible and general solution to our problem, since we (1) don't have to touch the backends and (2) don't have to introduce new symbols to LBNF. We'd only take the next logical step: from macros helping with lists and operator precedence to macros helping with optionals.
Moreover, I don't think there really is a general, non-ugly method of mapping optionality onto type systems of all target languages. Also, I don't think that we need this kind of functionality, because there is at least one major language out there whose grammar does not serve well as the basis for later processing -- of course I'm talking about VHDL...
from bnfc.
I was planning on using BNFC and this feature would really make my life easier. Is this still planned?
From experience with tavor, its ?(category) syntax works really well and is easy to read (IMO), so I think a regex-like syntax would be best for this, i.e. (category)?.
Taking further inspiration from it, its repeated and permutation groups are really nice and work like this:
- +n(category) specifies that category should repeat exactly n times in sequence, where n is a natural number
- +n,m(category) is similar to the above, except it specifies that category should repeat between n and m times (inclusive)
- @(cat1 | cat2 | cat3) specifies that cat1, cat2 and cat3 all have to occur in sequence, but in any order
These probably aren't in the scope of this particular issue, but I think they are nice potential additions to consider.
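If such groups were ever added, they could in principle be desugared in the same spirit as the optionals, by enumerating plain alternatives. This is a speculative sketch, not tavor's or BNFC's implementation; repeatBetween and permutations are hypothetical helper names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

using Alt = std::vector<std::string>;  // one desugared right-hand side

// +n,m(category): one alternative per repetition count in [n, m].
// +n(category) is the special case n == m.
std::vector<Alt> repeatBetween(const std::string& category, unsigned n, unsigned m) {
    std::vector<Alt> out;
    for (unsigned k = n; k <= m; ++k) {
        out.push_back(Alt(k, category));  // k copies of the category
    }
    return out;
}

// @(cat1 | cat2 | ...): one alternative per ordering of the categories.
// Assumes the categories are pairwise distinct.
std::vector<Alt> permutations(Alt categories) {
    std::sort(categories.begin(), categories.end());  // start from the first ordering
    std::vector<Alt> out;
    do {
        out.push_back(categories);
    } while (std::next_permutation(categories.begin(), categories.end()));
    return out;
}
```

Note that a permutation group of k categories expands to k! alternatives, so the label-naming and AST-bloat concerns raised for optionals would apply here even more strongly.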
from bnfc.