Comments (10)
Before I read your message in its entirety, have you tried the `MacroMode.PriorityOverride` option to override `static_if`?
from ecsharp.
Whew! I just committed support for preserving newlines and comments in EC#, a feature that was much harder than I thought it would be, particularly on the printer side. And I fixed a bug in LLLPG that broke the parser and had me baffled for a while. Hopefully trivia preservation will be much easier in LES, but sadly there are two different printers to deal with (I thought I could deal with all three printers and parsers in one day, hahahaha.. er.. nope, not by a long shot).
The problem I have with "modularizing" StdMacros is in deciding where to draw the lines. It is not obvious to me what counts as "high" or "low" level. I mean, for most of the macros in StdMacros you can point to one or more programming languages where the feature implemented by that macro is built into the language, and one or more other languages where it is an "add-on". But I guess I can agree with your examples: `static_if` feels like a low-level feature while `alt class` feels like, and operates as, an add-on, even though ADTs are built into many languages as a low-level feature.
Hopefully for now you can solve the problem with `MacroMode.PriorityOverride` - but actually you should use `PriorityInternalOverride`, as `PriorityOverride` is meant for end-users.
Doesn't implementing `IsQuickBindLhs` correctly [in a macro] require an enhanced macro system like Nemerle has, with a second macro-processor pass that provides access to more type & member information? I haven't thought about that feature for a long time. (I'd love to have Nemerle-like macro hygiene, btw, but it involves concepts like "imports" and "colored identifiers" or "private symbols" or whatever the kids are calling them these days, and it seems to require closer integration with a compiler, which LeMP is not currently designed to provide.)
Say... is there a reason you're not using macros (`quote {}`) to write RequiredMacros.cs?
Thanks for your replies! I'll try to respond to each of them in some arbitrary order below.
- Actually, I must confess that I didn't even know `MacroMode.PriorityInternalOverride` existed until now. It seems like that `enum` value could solve most (if not all) of my macro-related problems, so I'll definitely give it a try.
- But I disagree that implementing `IsQuickBindLhs` requires an enhanced macro system. All it takes is a few builtins, which are evaluated at compile-time. Admittedly, this does require compiler support for the builtins, but – strictly speaking – there's no need for an additional macro processing phase. Like `foreach` statements, the `::` operator can be lowered to `static if (#isQuickBindLhs(<lhs>)) { <quickbind> } else { <scope resolution> }`. LeMP could implement the hypothetical `#isQuickBindLhs` node as a macro that evaluates to a Boolean value, and process the `static if` at macro expansion time. `ecsc` can map both `#isQuickBindLhs` and `static if` to builtins, which are only evaluated during the IRGen phase.
- Also, `ecsc` defines builtins that can be used to build hygienic macros, and `ecsc`'s macro library actually uses those builtins to make sure that local variable definitions do not interfere with the outer scope's locals (and vice versa). Again, these builtins are `ecsc`-specific, but perhaps LeMP can use a few heuristics to obtain similar results.
- I did consider using `quote { }` in RequiredMacros.cs, but I sort of decided to hold off on that for now. Keeping the `*.out.cs` files in sync with the `*.ecs` files is a bit of a pain, so my plan is to just wait a little longer until `ecsc` can compile its own macro library, and then start using `quote { }` in the macro library.
- Oh, and trivia preservation sounds great, by the way. Flame actually has some (theoretical) support for comment preservation and printing, but none of the current front-ends support it. I might just look into embedding comments in the IR that `ecsc` generates – once the next version of Loyc lands, that is. I've always wanted to include comments from the source programming language into, for example, generated C++ code, and this is the missing piece I needed to do just that. Awesome!
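To make the `::` lowering in the second bullet concrete, here is an EC#-style sketch (everything here is hypothetical: `#isQuickBindLhs` is the proposed builtin, and the operands are invented for illustration):

```csharp
// Hypothetical lowering of `Foo.Bar::b`. The #isQuickBindLhs builtin
// decides at expansion/IRGen time which branch survives:
static if (#isQuickBindLhs(Foo.Bar)) {
    var b = Foo.Bar;  // quick-bind interpretation: bind the lhs to a new variable
} else {
    /* scope-resolution interpretation of Foo.Bar::b */
}
```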
Update: I have defined overrides for the `static_if` and `@[#static] #if` macros in RedefinedMacros.cs, and the override mechanism seems to work beautifully. Thanks for helping me out!
(I'll close this issue now, given that my problem has been solved.)
I guess I have waited long enough to comment...
It's pretty interesting how you've stretched out the lexical macro system to execute what would otherwise be conventional compiler passes, but something about this approach makes me uncomfortable. Let's see...
- I have a vague uneasiness that macros don't compose or chain very well (e.g. if a macro emits a call to `#foo()` thinking that there is a macro called `#foo`, often it is not guaranteed that the intended version of `#foo` is actually called... `#foo`'s namespace might somehow not be imported at the call site, or maybe there's a name collision because there are two macros called `#foo`).
- Your `foreach` macro builds upon a series of compiler builtins. In this case it works, and potentially it could allow third parties to add their own new constructs with "smart" behavior - as you mentioned, `#useSequenceExpressions` could use these builtins to act smarter. But I feel like there must be limits to how much one can accomplish this way. And even if those limits were overcome by making your compile-time builtins Turing-complete (and whatever else they would need), programming this way has a couple of major downsides:
  - The macro author has to write code in a different (and more clunky) way than if he were writing an ordinary compiler pass or a macro that has access to semantic information.
  - Some of the macro's logic is encoded in Loyc trees. Creating those trees wouldn't be necessary in a compiler pass, and processing those trees is an act of interpretation, so it necessarily takes more CPU time than if compiled code (like a compiler pass) did the same task.
So... I think ultimately we should move away from using lexical macros to do semantic tasks, though I'm not prepared at the moment to suggest what to do instead.
Symbols and hygiene
I'd like to share an idea I originally had about how a Loyc compiler would work. As you know, `LNode.Name` is a `Symbol`, and while currently `Symbol` is basically just a string (it has an `Id` too, which maybe I should delete), it is deliberately not `sealed`, and in the back of my mind my plan was always to use `Symbol` for resolved references in source code.
Imagine if your compiler built its symbol tables as usual, but the symbols in the symbol tables were actually `Symbol`s, i.e. derived from `class Symbol`. Now, if a method contains this code:
```csharp
var x = 23;
Foo(x);
```
The EC# parser, of course, produces a Loyc tree for `Foo(x)` which refers to a pair of ordinary global symbols. My idea was that a "symbol resolution" pass would scan over the source code, replacing global symbols with resolved symbols. The output would still be a Loyc tree and it would still print out as `Foo(x)`, but the symbol with `Name = "x"` would actually "be" the local variable `x` and include type information so that you could look up the type of `x` directly from the `LNode`, something like `((LocalVariable)x.Name).VarType`, if that makes sense. Similarly, the `Symbol` with `Name = "Foo"` would actually represent the method `Foo(int)`.
Hygiene could also be achieved by a macro simply producing `Symbol`s that are in a "local" `SymbolPool` (although I wonder if symbol pools are even a useful concept - non-global `Symbol`s could be created without a pool; that's how it works in JavaScript). This doesn't work in the usual LeMP workflow (that I use), since `EcsNodePrinter` just prints the `Symbol.Name` without regard for whether it's a global symbol or not, so two different `Symbol`s called `x` end up with the same name in the output. But inside a compiler that problem need not occur, and when outputting plain C# I could fix the problem by adding a unique-naming pass, before printing, whose job would be to ensure that all non-global symbols get a unique name string.
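The unique-naming pass could be very small. Here is a minimal sketch in plain C#; `Sym` is a stand-in for a resolved symbol (the real design would derive from Loyc's `Symbol`), and it assumes global symbols are registered before locals are renamed:

```csharp
using System.Collections.Generic;

// Stand-in for a resolved symbol; illustrative only.
class Sym {
    public string Name;
    public bool IsGlobal;
    public Sym(string name, bool isGlobal) { Name = name; IsGlobal = isGlobal; }
}

static class Renamer {
    // Gives every non-global symbol a name string that is unique
    // among all symbols seen so far.
    public static void AssignUniqueNames(IEnumerable<Sym> symbols) {
        var used = new HashSet<string>();
        foreach (var sym in symbols) {
            if (sym.IsGlobal) { used.Add(sym.Name); continue; }
            // Non-global symbols get a numeric suffix if their name is taken.
            string candidate = sym.Name;
            for (int i = 1; !used.Add(candidate); i++)
                candidate = sym.Name + "_" + i;
            sym.Name = candidate;
        }
    }
}
```

For example, two distinct local symbols both named `x` would come out as `x` and `x_1`.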
So... does such a design sound like a good idea to you?
Looks like the discussion is sort of diverging into two separate subjects. I've split my response into two sections accordingly.
Semantic macros
Truth be told, semantic macros as a replacement for a lexical macro-based design are a tough sell for me, so I'll argue against them now. (Sorry!)
Let's start off with a point of agreement. I think you're absolutely right that writing a `#foreach` macro is pretty clunky. It's just that the alternative so far – baking it into the compiler – is so much worse. Perhaps it's worth explaining my two reasons for writing `#foreach` as a macro:
- `foreach` is inherently platform-dependent. For example, the .NET runtime has `interface IEnumerable` and `interface IEnumerable<T>`, with `foreach` being little more than syntactic sugar for a loop that calls methods on these interfaces. But `ecsc` is not as biased toward the .NET runtime as `csc` and `mcs`, and can (at least theoretically) generate code for other platforms. By implementing `foreach` as a macro, a different platform-specific definition can be written for `foreach`. This wouldn't have been possible if I had baked `foreach` into the `ecsc` source code.
- Writing `foreach` as a lexical macro is, for all its clunkiness, still significantly less clunky than writing it as a "normal" expression/statement that the compiler can analyze. `ecsc`'s `lock` implementation is significantly more complex than the `foreach` macro, and that complexity buys `ecsc` very little. Admittedly, error diagnostics for `lock` are much clearer than those for the `foreach` macro, but the `foreach` macro can be platform-specific, whereas the `lock` implementation is forever dependent on the existence of a type named `System.Threading.Monitor`, regardless of the platform for which code is generated.
These two points, especially the first, favor neither lexical nor semantic macros. Both can be used to create platform-specific definitions for (essentially platform-agnostic) constructs such as `lock`, `using` and `foreach`. I don't think a semantic macro will be more concise than a lexical macro with builtins (especially if that lexical macro uses `quote { }`), but I suppose that a semantic macro might be able to provide better diagnostics than the current lexical macros with builtins.
But I'm not convinced by your other criticisms. Specifically:
- On chaining macros. RequiredMacros.cs defines all dependencies for the `#foreach` macro in the same namespace, in the same file, in the same assembly. It's all pretty cohesive. There's just no way that `#foreach` will get imported without its dependencies. And I don't think it's reasonable to expect `#foreach` (or any other macro, for that matter) to continue to work fine when someone writes a conflicting definition (with an equal priority) for its dependencies. If that happens, then I expect LeMP to either report an error, or just silently pick one of those macro definitions and have `ecsc` diagnose any errors that this may cause. Besides, the macros in `LeMP.StdMacros` are just as vulnerable to this sort of thing as `#foreach`. I could easily redefine `if` as a macro that always picks the then-branch, and I'm sure that'd break no small number of macros in `LeMP.StdMacros`.
- The performance argument. You're right that encoding the macro's logic in Loyc trees results in additional processing time. Interpreting these trees is probably slower than handling the macro's logic from within the compiler itself. But I believe semantic macros would be slower still, because they necessitate an additional intermediate representation: an IR that is annotated with type and symbol information. Embedding that type of information in classes derived from `Symbol` doesn't seem like a bad design, but that IR would still need to be constructed and then lowered to Flame's IR (`ecsc` translates Loyc trees directly into Flame IR at the moment). The construction/deconstruction passes would quite possibly impact performance far more negatively than a handful of builtin nodes ever could.
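To illustrate the hostile-redefinition hazard mentioned in the first bullet, such an `if`-hijacking macro might look roughly like this in LeMP's usual lexical-macro shape (the attribute arguments are from memory and may be off; treat this as a sketch, not the exact API):

```csharp
// Rewrites every `if` statement (parsed as #if(cond, then, else))
// to just its then-branch, silently breaking any macro that
// relies on `if` behaving normally.
[LexicalMacro("if (cond) {...}", "Evil: always takes the then-branch", "#if")]
public static LNode AlwaysThen(LNode node, IMacroContext context)
{
    if (node.ArgCount >= 2)
        return node.Args[1]; // keep the then-branch; drop cond and any else
    return null; // not a shape we recognize; let other macros handle it
}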
Though my main argument against semantic macros is their complexity. They add yet another major pass to the compilation process. That means that anybody who seeks to fully understand how EC# works must also learn how semantic macros work, which constructs they can affect, the order in which they are expanded, what the structure and rules of the IR they operate on are, etc.
Also, semantic macros complicate macro development. Right now, an EC# macro is always a lexical macro. But if semantic macros are introduced as well, then programmers will suddenly have to ask themselves which type of macro they should create. If semantic macros are strictly more powerful than lexical macros, then they might have to re-write entire lexical macros as semantic macros when they realize they need a feature that only semantic macros offer.
Plus, semantic macros are arguably harder for the compiler to verify than lexical macros. Right now, the (Flame) IR generated by `ecsc` is correct by construction. If the input `LNode`s contain errors that must be diagnosed at compile-time, then `ecsc` will do just that, and will subsequently terminate the compilation process; `ecsc` either reports an error or produces a correct IR tree. But a semantic macro could sneakily insert a semantically invalid construct, without the compiler taking notice. This would result in very hard-to-detect bugs.
As a final point in favor of lexical macros with builtins: this type of thing has precedent in other languages. The D programming language, for example, defines `std.traits`, a module that defines function templates which answer questions about the source code. That is eerily similar to `ecsc`'s builtins. For example, `isArray` is the D equivalent of `#builtin_is_array_type`. If I recall correctly, C++ has similar metaprogramming libraries.
Going forward, I think it would be better to wrap `ecsc`'s builtins into a standardized macro library, just like D has done. (Apparently, Phobos uses `dmd` builtins under the hood as well: `isNested` seems like a fair example.)
Symbols and hygiene
I very much like the idea of using separate `SymbolPool`s to guarantee that symbols don't overlap. It certainly is a lot cleaner than the builtin-based design. I don't have a lot of spare time right now, but I think it's definitely worth implementing. I'll try to implement it in `ecsc` and its macro library once I get the opportunity to do so.
Or do you think that it makes more sense to just use LeMP's (future) renaming pass in `ecsc`? That would have the extra advantage of producing less confusing output when `ecsc` is instructed to print macro-expanded source code (via the `-E` switch).
Semantic macros
Let me start here because I suspect you misunderstand what I was suggesting:
> If the input LNodes contain errors that must be diagnosed at compile-time, then ecsc will do just that, and will subsequently terminate the compilation process; ecsc either reports an error or produces a correct IR tree. But a semantic macro could sneakily insert a semantically invalid construct, without the compiler taking notice. This would result in very hard-to-detect bugs.
I'm suggesting the same kind of semantic macros as Nemerle has. The macros would run after the list of types and methods has been built, but before method bodies have been converted to IR. You can't sneak anything invalid in if macros run before semantic analysis. (I think in Nemerle you might be able to do semantic analysis on local variables too, and create new class members.... I don't know how that works and I haven't easily found details online; if we do this we should install Nemerle and do some experiments to understand the details.)
And the second-stage macros should operate on Loyc trees so that users can easily switch which kind of macro they are writing, and so that the same MacroProcessor can still be used.
> As a final point in favor of lexical macros with builtins: this type of thing has precedent in other languages. The D programming language, for example, defines std.traits, a module that defines function templates which answer questions about the source code.
Isn't that quite different? I didn't try any serious metaprogramming in D but IIRC, you can write code that calls those trait (builtin) functions directly and immediately acts on the results. That's more powerful and general than generating a syntax tree that eventually calls trait functions, as a lexical macro must do.
> They add yet another major pass to the compilation process.
Since some of the compiler's own work could be implemented with them, it's not necessarily an additional pass beyond what you'd do anyway, is it? How many passes do you use already?
> On chaining macros. RequiredMacros.cs defines all dependencies for the #foreach macro in the same namespace, in the same file, in the same assembly. It's all pretty cohesive. There's just no way that #foreach will get imported without its dependencies.
I'll tell you a secret... I originally planned to let people write fully qualified macro names like `Namespace.MacroName(...)`. I don't remember actually implementing that, but I'm seeing some code to support it in `MacroProcessorTask.GetApplicableMacros`, so, yeah, maybe you can already invoke a macro without its namespace being imported.
Symbols and hygiene
> Or do you think that it makes more sense to just use LeMP's (future) renaming pass in ecsc?
I'm confused because this is not an either/or question. If multiple symbol pools are used then a renaming pass is required to avoid name collisions in the text output.
Another thought: perhaps one should be able to write `quote(pool) {...}` to use a specified symbol pool for all identifiers in a quotation.
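A hypothetical usage sketch of that proposed syntax (`SymbolPool` is the real Loyc class; the `quote(pool)` form itself is not implemented, and the names are invented):

```csharp
// Each identifier written inside the quotation (`tmp` here) would become
// a Symbol allocated from `pool` rather than a global symbol, so it
// cannot collide with a user identifier of the same name.
var pool = new SymbolPool();
LNode output = quote(pool) {
    var tmp = $(node[0]);
    Use(tmp);
};
```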
One more thought. For hygienic namespace lookup, to accomplish the same thing that Nemerle does, macros could instruct LeMP to look up macros from specific namespaces by wrapping their output in a certain command, say, `#macroNamespaceContext()`.
As an example, let's say `macro1` is in `NamespaceA` and it wants to return a call to `macro2` in `NamespaceB` - without using a fully-qualified name - and there is also a `macro2` in `NamespaceC`. `macro1` is called from user code like this:
```csharp
#importMacros(NamespaceA);
#importMacros(NamespaceC);
class Foo { ... }
macro1(otherMacro(Foo));
```
The user is also trying to call `otherMacro` from `NamespaceC`.
Let's say the macro originally returns something nonhygienic like this:
```csharp
return quote { macro2($(node[0])) };
```
That is, `macro1` is a useless macro that forwards the call to `macro2`. But the point is that it wants `macro2` from `NamespaceB`, while `otherMacro` should still be resolved from `NamespaceC`. The returned syntax tree in this case is:
```csharp
macro2(otherMacro(Foo));
```
Whereas the hygienic return would look more like this:
```csharp
SymbolPool pool = new SymbolPool();
return F.Call("#macroNamespaceContext",
    F.Literal(pool),
    F.Literal(new DList<Symbol> {(Symbol)"NamespaceC"}),
    F.Call(pool["macro2"], node[0]));
```
So the first argument is a `Literal` containing a `SymbolPool`, the second argument is a `Literal` containing an `IReadOnlyCollection<Symbol>` of namespaces*, and the third argument is the actual output from the macro. Finally, the macro processor would use each `Symbol`'s `SymbolPool` to decide in which list of namespaces to search for that `Symbol`.
(* The macro processor currently doesn't support qualified names stored as `LNode` - the qualified name `Foo.Bar` is just a `Symbol` with a literal dot in its name - and maybe it should stay that way, since `Symbol`s are dramatically faster than `LNode`s as dictionary keys or for comparisons.)
Obtaining hygiene is a bit clunky this way, but a helper function and/or macro could help.
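For example, the boilerplate above could be folded into a small helper (illustrative only; it just mirrors the construction shown earlier, so treat the name and signature as a sketch):

```csharp
// Wraps macro output in #macroNamespaceContext so that symbols drawn
// from `pool` are resolved in `namespaces` by the macro processor.
static LNode HygienicOutput(LNodeFactory F, SymbolPool pool,
                            IReadOnlyCollection<Symbol> namespaces,
                            LNode output)
{
    return F.Call((Symbol)"#macroNamespaceContext",
        F.Literal(pool),
        F.Literal(namespaces),
        output);
}
```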
My reply turned out to be significantly longer than I hoped it would be. Sorry about that.
Semantic macros
> The macros would run after the list of types and methods has been built, but before method bodies have been converted to IR. You can't sneak anything invalid in if macros run before semantic analysis.
Right, but isn't that a little contradictory? I mean, suppose that a type `Foo` has been analyzed, and we now encounter an expression `Foo.Bar`. `Foo` in `Foo.Bar` could be a type, a namespace or a value, and there's really only one way to ascertain what `Foo` is in the context wherein it appears: by analyzing the expression `Foo.Bar`. And semantic macros need to be able to answer questions like: "is `Foo` a type?" You probably also want semantic macros to be able to discover what `Foo`'s type is, if it is a value rather than a type, which requires all preceding statements and expressions to have been analyzed.
So, really, semantic macros would have to be evaluated in the middle of the semantic analysis process.
Which brings up the problem of what exactly a semantic macro returns. For example, a `#foreach` macro has to know what the type of its collection expression is, so it'll want to evaluate its collection argument node first, check what that collection's type is, build a loop with one or more induction variables, and only then evaluate the loop body node in the context of that loop.
At the moment, that logic is captured by the `#foreach` lexical macro as a Loyc tree, and the compiler then analyzes the resulting tree.
Now suppose that a semantic macro has to do the same job. There are basically two options here:
- It could produce an IR tree, but then the macro is given the opportunity to sneak in illegal IR.
- It could return a Loyc tree which is then fed to the remainder of the semantic analysis process. But then the collection expression is evaluated twice: once by the semantic macro, to figure out which construction it should use, and once more by the semantic analysis, as the semantic macro will have to embed the collection expression in the Loyc tree it produces. This isn't just bad for performance; it also implies that any (warning/error) diagnostic related to the collection expression is now printed twice.
Alternatively, the semantic macro could produce some kind of wishy-washy Loyc tree that contains both unanalyzed Loyc trees and IR trees. But that's just option one in disguise, as the macro could easily insert invalid IR trees in that mixed Loyc/IR tree.
I also don't feel comfortable with imposing Flame IR on the macro writer, because that implies that any breaking change to Flame is a breaking change to EC#. But if Flame IR is not used in semantic macros, then they'd require (at the very minimum) an entire type system to insulate the semantic macros from Flame's type system. That'd be a lot of work, would definitely hurt performance, make it harder for (other) people to write an EC# compiler, and would add little value to the language.
Anyway, I just like the (apparent) simplicity of lexical macros. They're fairly easy to write and understand, and I don't feel the same way about semantic macros. Maybe I'm wrong in thinking that, but the only way to know for sure what EC# semantic macros would be like is to build a prototype, and then take things from there. That'd be a massive undertaking, though, and I'd rather just keep on extending `ecsc` up to the point where it starts becoming a reasonable alternative to `csc` and `mcs` for some projects. Can we put semantic macros on hold until we get to that point? There's still a lot of work to be done before `ecsc` is mature enough to compile itself.
> Isn't that quite different? I didn't try any serious metaprogramming in D but IIRC, you can write code that calls those trait (builtin) functions directly and immediately acts on the results. That's more powerful and general than generating a syntax tree that eventually calls trait functions, as a lexical macro must do.
I don't completely understand what you mean by that. Can you give me an example of "code that calls those trait (builtin) functions directly and immediately acts on the results?"
> Since some of the compiler's own work could be implemented with them, it's not necessarily an additional pass beyond what you'd do anyway, is it? How many passes do you use already?
I was referring to the passes specified by the language itself: preprocessing, lexing, parsing, lexical macro expansion and semantic analysis. In hindsight, I was wrong to say that semantic macros necessitate another pass. They probably just extend the semantic analysis pass, but in an awkward way.
> I'll tell you a secret... I originally planned to let people write fully qualified macro names like Namespace.MacroName(...), I don't remember actually implementing that, but I'm seeing some code to support it in MacroProcessorTask.GetApplicableMacros so, yeah, maybe you can already invoke a macro without its namespace being imported.
Well, I sure didn't see that one coming. I'm glad you're considering hygienic macro imports, though. I wouldn't mind switching to those once they become available.
Symbols and hygiene
> I'm confused because this is not an either/or question. If multiple symbol pools are used then a renaming pass is required to avoid name collisions in the text output.
What I meant was that storing `Symbol` instances in `ecsc`'s symbol tables would solve a problem that has already been solved, if LeMP is going to rename all locals anyway.
> Another thought, perhaps one should be able to write quote(pool) {...} to use a specified symbol pool for all identifiers in a quotation.
Sounds like a great idea.
> Obtaining hygiene is a bit clunky this way, but a helper function and/or macro could help.
Yeah, that syntax is sort of clunky. But I suppose that's okay if we can find an appropriate macro to build on top of it. I think a variant of `quote (pool) { ... }` that implicitly creates a pool and then imports a set of namespaces could work nicely. Here's an example of what I imagine that might look like (I know `#quoteWithMacroImports` is a bit silly; I haven't put much thought into what to call this macro):
```csharp
return #quoteWithMacroImports (NamespaceC, OtherMacros)
{
    macro2($(node[0]))
};
```