benjamn / ast-types
Esprima-compatible implementation of the Mozilla JS Parser API
License: MIT License
Every PathVisitor instance has a class visitor.Context such that Object.getPrototypeOf(visitor.Context.prototype) === visitor, and the value of this within visitor methods is an instance of this Context class.
This means that any properties accidentally added to this during the visitor method will be thrown away (or become inaccessible) from other visitor methods, probably indicating a bug.
It should be easy to warn about this by checking Object.keys(context) after calling the visitor method in context.invokeVisitorMethod.
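A minimal sketch of such a check, in plain JavaScript. The helper name invokeAndReportStrayKeys and the stand-in visitor/context objects are hypothetical; the real context.invokeVisitorMethod is not reproduced here.

```javascript
// Hypothetical helper: call a visitor method with `context` as `this`, then
// report any own enumerable properties the method added to the context.
function invokeAndReportStrayKeys(context, method, path) {
  const before = new Set(Object.keys(context));
  const result = method.call(context, path);
  const stray = Object.keys(context).filter((key) => !before.has(key));
  if (stray.length > 0) {
    console.warn("Visitor method added properties to this: " + stray.join(", "));
  }
  return { result, stray };
}

// Stand-ins: `visitor` plays the PathVisitor, `context` inherits from it.
const visitor = { seen: 0 };
const context = Object.create(visitor);
const outcome = invokeAndReportStrayKeys(context, function (path) {
  this.counter = 1; // accidental write: lands on the context, not the visitor
  return false;
}, null);
```

Because the write lands on the context object rather than on the visitor, the stray key shows up in Object.keys(context) and never reaches the visitor itself.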
That is, make sure every Syntax.* type used by Esprima is represented in our type system. It is not a big problem if Esprima doesn't know about all of our types, only if we don't know about all of Esprima's types.
Besides master, I would also like to ensure coverage of harmony and fb-harmony.
Using exportDeclaration, it doesn't seem possible to build the current export default syntax:
export default foo;
I'm not sure if I'm missing something or not; currently I'm just manually creating:
{
  type: 'ExportDeclaration',
  default: true,
  declaration: {
    type: 'Identifier',
    name: 'foo'
  }
}
I'd be happy to do a PR to fix this; I just want to make sure I'm not missing something in the def for moduleDeclaration that allows building that object.
The alternative is to stop traversing, which can be a cryptic failure mode.
Hi, first of all, great job!
I'm writing a continuation-passing style transformation and I have a problem that is breaking my mind.
I've done almost everything: intercepting the callback marker, replacing it with a function call, and nesting all the sibling nodes.
But I can't figure out how to remove the sibling nodes from the parent after copying them into the callback function, so duplicates persist.
I'm visiting Expressions and calling path.get('elements') or path.parent.get('elements') or path.parentPath.get('elements') and so on, and none of that did the trick.
I extract the siblings from 'path.parent.parentPath.value' starting from the index of the current node, but how can I remove the siblings?
It seems an assignment like
path.parent.parentPath.value = _.initial(path.parent.parentPath.value, current_index)
// (initial returns the first part up to current_index)
is not working. Do I have to use the .replace() function?
path.parent.parentPath.value.replace is not defined,
and path.parent.parentPath.replace is not defined either.
Any suggestions?
Thank you very much for your work!
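One possible explanation, sketched in plain JavaScript without ast-types: assigning a new array to a wrapper's value property does not change the array the parent node still holds, while mutating that array in place does. The listPath object below is a hypothetical stand-in, and whether a real path's value behaves the same way is an assumption.

```javascript
// A parent node holding a list of siblings.
const block = { type: "BlockStatement", body: ["a", "b", "c", "d"] };

// Hypothetical stand-in for a path object that wraps the parent's array.
const listPath = { value: block.body };
const currentIndex = 2;

// Reassigning the wrapper's `value` only changes the wrapper...
listPath.value = listPath.value.slice(0, currentIndex);
const lengthAfterReassign = block.body.length; // the node's array is untouched

// ...whereas splicing the node's own array in place removes the siblings.
const removed = block.body.splice(currentIndex);
```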
I still think there's some value in having a distinction between the Path type and its derived NodePath type, but we can accomplish that by having two separate source files within the ast-types package, rather than having a separate NPM package.
I want to automate building AST subtrees, something like:
var types = require('ast-types');
var nodeParams = { /* my params here */};
types.builders[nodeParams.type].apply(this, ???);
Is there a way to find out the proper order of params I should pass to a node builder of a specific type?
There are .buildParams defined on a type, but for now it seems impossible to get access to that list from an external module.
It would be nice to have something like types.namedTypes[type].fields, I suppose?
Or to make getFieldNames(node) return properly ordered fields?
Or just to make any builder accept an object with params, not only a list of arguments.
@benjamn I can probably do a PR, if you wish.
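To illustrate the last option, here is a minimal sketch of an adapter that maps an object of named params onto a positional builder. Everything here is hypothetical: buildFromObject is not an ast-types function, binaryExpression is a stand-in builder, and the param order is supplied by hand because the question above is precisely that it is not exposed.

```javascript
// Hypothetical adapter: call a positional builder with values looked up by name.
function buildFromObject(builder, paramOrder, params) {
  const args = paramOrder.map((name) => params[name]);
  return builder.apply(null, args);
}

// Stand-in for a positional node builder.
function binaryExpression(operator, left, right) {
  return { type: "BinaryExpression", operator, left, right };
}

// Keys can be given in any order; paramOrder decides the argument positions.
const node = buildFromObject(
  binaryExpression,
  ["operator", "left", "right"],
  {
    left: { type: "Identifier", name: "a" },
    right: { type: "Identifier", name: "b" },
    operator: "+",
  }
);
```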
This is first of all a question: shouldn't the new nodes added by replace be traversed?
Example:
types.traverse(ast, function() {
  if ( ... ) {
    this.replace(functionExpressionNode);
  }
});
functionExpressionNode doesn't get traversed.
When multiple passes are needed to apply different transformations, for instance when transpiling from ES6, the speed of the visitor algorithm is especially important. I saw an issue, resugar/resugar#64, that suggested running transformations only on files that actually had some syntax to transform. I think that's good, but it could be done much better if it were integrated here.
Basically the algorithm is like this:
[...] hasTypes property
[...] hasTypes with all the seen children's hasTypes
[...] hasTypes or it hasn't been initialised.
Now, to make that fast we can limit the hasTypes property to be a Number and use JavaScript's bitwise operators (hence limited to 32 bits), which is probably just enough. Since there are more than 32 types we can't assign each one a fixed bit; instead we can use a Bloom filter and get a pretty good approximation. So the logic becomes:
[...] k bits set; 2 seems a good k for the number of types (around 50?).
[...] hasTypes property, for each child in the node it's just: node.hasTypes | child.type.hash
[...] (node.hasTypes & type.hash) === type.hash (Note: some false positives can occur)
There are edge cases, like a Visitor implementing the generic visit method; in that case the traversal must always descend. There are also probably some cases that would require invalidation of the hasTypes (from the node to the root); I'm not familiar enough with the visiting/transformation algorithm to be sure.
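A rough sketch of the bit-mask idea, assuming nodes are plain objects that may carry an extra hasTypes property (the proposal itself is unsure whether that is allowed). The hash function, FILTER_BITS, K, and all function names are hypothetical choices for illustration, not part of ast-types.

```javascript
const FILTER_BITS = 32; // one JavaScript 32-bit integer
const K = 2;            // bits set per type name

// Seeded FNV-1a style string hash, used to derive K bit positions per type.
function hashString(str, seed) {
  let h = (2166136261 ^ seed) >>> 0;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 16777619) >>> 0;
  }
  return h;
}

// Bloom-filter mask for a type name: up to K bits set.
function typeMask(typeName) {
  let mask = 0;
  for (let seed = 0; seed < K; seed++) {
    mask |= 1 << (hashString(typeName, seed) % FILTER_BITS);
  }
  return mask;
}

// Bottom-up pass: a node's filter is its own mask OR-ed with its children's.
function computeHasTypes(node) {
  let bits = typeMask(node.type);
  for (const key of Object.keys(node)) {
    const value = node[key];
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child.type === "string") {
        bits |= computeHasTypes(child);
      }
    }
  }
  node.hasTypes = bits;
  return bits;
}

// False means "definitely absent"; true means "possibly present".
function mightContain(node, typeName) {
  const mask = typeMask(typeName);
  return (node.hasTypes & mask) === mask;
}

const ast = {
  type: "Program",
  body: [{ type: "ExpressionStatement", expression: { type: "Identifier", name: "x" } }],
};
computeHasTypes(ast);
```

A traversal could skip any subtree for which mightContain returns false for every type the visitor handles; a true result still requires descending, since it may be a false positive.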
What is not clear to me from looking at the current visitor design (using the auxiliary path object) is whether the library assumes that nodes are immutable, in which case we can't assign the hasTypes hash to them. The paths seem to be cached and could be a candidate to hold it, but I've failed to see whether those caches would survive different traversals.
Currently the traversal will create a 1:1 mapping between actual syntax Nodes and NodePaths, the paths are cached for later visits but they don't survive separate traversals, like when chaining different ES6 transformations.
I've run a very contrived benchmark that always reuses a single NodePath, just changing its value property for each visit. The results are quite good, with over a 2x speedup when visiting underscore.js, so I think this route has potential.
I see two options here. The first one should be low-hanging fruit: expose to the caller the root NodePath used in a traversal (which should have all the children already cached) and allow it to be passed along for further calls.
// First transformation returns the root nodepath
var rootPath = types.visit(ast, firstVisitor)
// Second transformation uses it instead of the AST (so the cache is used)
rootPath = types.visit(rootPath, secondVisitor)
The second option is a bit harder to pull off but should improve the speed even on single passes. When visiting the tree we don't need to keep the whole structure, just the stack of nodes that took us to a given node. So a single NodePath can be used for the traversal; we just update its stack before passing it on to the visitor.
var path = new NodePath(ast);
assert( path.stack === [Program] === [ast] )
path.stack.push('body'); // The field name
path.stack.push(path.value.body) // The field value
ast.body.forEach(function (item, idx) {
  path.stack.push(idx); // field name
  path.stack.push(item); // field value
  assert( path.stack === [Program, 'body', [Stmt, Stmt], 0, Stmt] )
  visit(path);
  path.stack.pop();
  path.stack.pop();
  assert( path.stack === [Program, 'body', [Stmt, Stmt]] )
});
The limitation here is that a visitor cannot directly hold a reference to the passed NodePath, since it will be mutated by the traversal. I don't see why it would want to keep it, but in any case it could be documented that it first needs to call path.clone() or path.freeze() to get a copy of the path for future use. Obviously it's still possible to keep references to path.node.
The current methods in NodePath, like .each, will also need to create a new NodePath instance, but they are only used in transformations, and even in those cases the copy can be optimized:
// `value` is a getter so that path.value keeps working as a property
Object.defineProperty(NodePath.prototype, "value", {
  get: function () {
    return this.stack[ this.stack.length - 1 ];
  }
});
NodePath.prototype.each = function (cb) {
  var list = this.value.slice(); // Copy the collection to freeze iteration semantics
  var iterPath = this.clone();
  list.forEach(function (item, idx) {
    iterPath.stack.push(idx);
    iterPath.stack.push(item);
    cb(iterPath);
    iterPath.stack.pop();
    iterPath.stack.pop();
  });
};
Note that to handle Scope it can either be included as a virtual item in the stack ([Statement, Scope, Function]) or act as a wrapper for a node ([Statement, Scoped(Function)]), with the unwrapping handled in the NodePath logic.
So by having the limitation of a mutable NodePath reference, which isn't a bad thing IMHO, we can eliminate the creation of thousands of objects from the visitor logic, improving speed and memory usage.
Edit: I forgot to mention that creation of NodePath instances can probably be controlled from the traversal, so it's even possible to re-use instances like it's done for Context.
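The single-path idea can be tried out in isolation with a small self-contained sketch. StackPath and walk below are hypothetical names and do not reflect the actual NodePath API; Scope handling is left out.

```javascript
// Hypothetical path object: one instance is reused for the whole traversal.
class StackPath {
  constructor(root) {
    this.stack = [root];
  }
  get value() {
    return this.stack[this.stack.length - 1];
  }
  // Snapshot for visitors that need to keep a path after the callback returns.
  clone() {
    const copy = new StackPath(null);
    copy.stack = this.stack.slice();
    return copy;
  }
}

// Depth-first walk that pushes (field name, field value) pairs and pops them again.
function walk(path, visit) {
  const node = path.value;
  visit(path);
  for (const key of Object.keys(node)) {
    const child = node[key];
    if (Array.isArray(child)) {
      path.stack.push(key, child);
      child.forEach((item, idx) => {
        if (item && typeof item.type === "string") {
          path.stack.push(idx, item);
          walk(path, visit);
          path.stack.length -= 2;
        }
      });
      path.stack.length -= 2;
    } else if (child && typeof child.type === "string") {
      path.stack.push(key, child);
      walk(path, visit);
      path.stack.length -= 2;
    }
  }
}

const ast = {
  type: "Program",
  body: [{ type: "ExpressionStatement", expression: { type: "Identifier", name: "x" } }],
};
const rootPath = new StackPath(ast);
const visitedTypes = [];
let snapshot = null;
walk(rootPath, (path) => {
  visitedTypes.push(path.value.type);
  if (path.value.type === "Identifier") snapshot = path.clone();
});
```

Only the clone taken inside the callback stays valid afterwards; the shared path is back at the root once the walk finishes.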
@benjamn, we need to expand the definition of ImportDeclaration to include support for a new kind of import. Today we have named and default, as in this block:
def("ImportDeclaration")
  .bases("Declaration")
  .build("specifiers", "kind", "source")
  .field("specifiers", [def("ImportSpecifier")])
  .field("kind", or("named", "default"))
  .field("source", ModuleSpecifier);
I would suggest batch or binding, but I'm pretty bad at naming things.
Once we decide on the name, I can finish the work on esprima to get the new syntax in.
The indentation of a node is dependent on the indentation of its ancestors, so this is a natural piece of information for NodePath to provide.
// core.js
def("Function")
  .bases("Node")
  .field("id", or(def("Identifier"), null), defaults["null"])
  .field("params", [def("Pattern")])
  .field("body", or(def("BlockStatement"), def("Expression")));
What would an ES5 function without a block statement look like? Is this just here to support arrow functions in ES6? If so, wouldn't it be better to override it in es6.js?
Executing the file in a browser should define a global variable called AstTypes.
This will require shimming Object.defineProperty, among other things.
Currently, this project is geared towards ECMAScript, but since the interface is so nice for building ASTs, I'd like to use it in other parsing projects.
Would it be possible to make the project more generalized so that it doesn't load all the types in def/* unless requested?
If I have the following file:
function bar(i) {
  if(i > 0) {
    return i + bar(i - 1);
  }
  return 0;
}
function foo() {
  var x = 5;
  var y = bar(x);
  return function() { return y; };
}
And I parse it with ast-types, the bar and foo function nodes don't have source location (loc is null). Weirdly, the anonymous function returned in foo does. What is the status of source location support? Is it still buggy?
Specifically, the Path type should support a .replace(...) method that can take any number of replacements (zero means to remove the current node).
Currently, this assertion succeeds:
require("ast-types").builtInTypes.string.arrayOf().assert(["a",,"c"])
It probably should not succeed, because the second element of the array is missing/undefined.
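One plausible cause, stated as an assumption since the ast-types implementation is not shown here, is that the element check iterates with Array.prototype.every, which skips holes in sparse arrays. The sketch below shows that behavior in plain JavaScript alongside an index-based loop that does see the hole; everyIncludingHoles is a hypothetical helper.

```javascript
const sparse = ["a", , "c"]; // index 1 is a hole, not a stored undefined
const isString = (value) => typeof value === "string";

// Array.prototype.every never calls the predicate for missing indices.
const viaEvery = sparse.every(isString);

// Hypothetical replacement: an index-based loop reads the hole as undefined.
function everyIncludingHoles(array, predicate) {
  for (let i = 0; i < array.length; i++) {
    if (!predicate(array[i], i)) return false;
  }
  return true;
}
const viaLoop = everyIncludingHoles(sparse, isString);
```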
I am trying to use regenerator with jstransform and I am getting the following error when transpiling one of my files:
Message:
regenerator error while transforming <filename>:
did not recognize object of type "ObjectTypeAnnotation"
Lack of line information makes this hard to debug.
Also, I'm trying to use jstransform with harmony and stripTypes enabled. This does not seem to play so well with regenerator.
I can't reproduce this in an isolated case, but there is an off-by-one error for a combination of path.insertBefore(...nodes) and path.replace() (deletion). I worked around it by moving from:
// path.insertBefore has been called multiple times before this.
path.insertBefore(...nodes);
path.replace();
to
// path.insertBefore has been called multiple times before this.
path.replace(...nodes);
I'll keep trying to reproduce it in isolated code, but I wanted to report the issue in case you have any ideas.
The examples currently use types.traverse, which is still available, but deprecated.
It would be very useful if you could evaluate n.LogicalExpression.getFieldType("operator") to retrieve (in this case) the LogicalOperator type.
Before writing any code, I want to make sure it makes sense to add a method to the Scope class to check whether the scope is in strict mode.
specifically .traverse()
Hey @benjamn, what license is ast-types under?
The idea is to be able to treat different node types as if they were the same type. You already have a type which could be considered virtual: Function, which defines the shared properties of FunctionExpressions and FunctionDeclarations. However, FunctionExpression and FunctionDeclaration have to explicitly "inherit" from Function.
Let's consider another example: Identifier and JSXIdentifier. For static analysis and code mods, I might want to treat both of these types as identical.
I would like to be able to write logic to process nodes independently of the concrete node type. Virtual types could be created dynamically and concrete node types could be associated with virtual types dynamically.
I'm not sure if what I'm saying makes sense at all; I'm not especially good at expressing my ideas. Or maybe something like this is already possible and I just don't know how. Happy to talk about this via a different channel.
This is to distinguish files that contain lots of def declarations, like lib/core.js, from files that implement the type system, like lib/types.js.
I'm trying to call regenerator from Atom using gulp. Atom has a Content Security Policy (CSP) that basically disallows eval().
regenerator depends on recast, which depends on ast-types. traverse.js in ast-types currently contains the following code:
var deprecate = require("depd")('require("ast-types").traverse');
var deprecatedWrapper = deprecate.function(
  traverseWithFullPathInfo,
  'Please use require("ast-types").visit instead of .traverse for ' +
  'syntax tree manipulation'
);
deprecate.function is wrapFunction in depd, which contains the following code:
var deprecatedfn = eval('(function (' + args + ') {\n'
  + '"use strict"\n'
  + 'log.call(deprecate, message, site)\n'
  + 'return fn.apply(this, arguments)\n'
  + '})')
This is a long way of saying that I cannot use regenerator in Atom because of this transitive dependency on eval(). Atom has a package named loophole to work around these sorts of issues on a one-off basis, though that would require a change to depd, whereas I think the better thing to do would be to eliminate the deprecated code here in ast-types.
Thoughts? Is it hard to eliminate this deprecated code? Or would it be OK to stop flagging it as deprecated, at least for one release :)
Just as we currently have .namedTypes.MemberExpression and .builders.memberExpression, perhaps we could also have an auto-generated .patterns namespace containing functions like .patterns.MemberExpression for building arbitrary pattern-matching types, e.g.
var types = require("ast-types");
var p = types.patterns;
var thisFooPattern = p.MemberExpression(
  p.ThisExpression(),
  p.Identifier("foo")
);
types.traverse(ast, function(node) {
  if (thisFooPattern.check(node)) {
    console.log("found this.foo:", node);
  }
});
Not sure if this is an issue or not, but removing nodes via replace can leave their parent nodes around even if they have no children. This can result in weird appendages being printed out from a post-transform AST.
For example:
var y = 1,
    x = 2;
Let's say you remove the two VariableDeclarators. This does not clean up the now-redundant VariableDeclaration, so printing results in:
var undefined;
There are similar issues with ExpressionStatements having no expression value, resulting in
;
being printed out.
Do you feel this is something ast-types should handle?
I can take a look at it if you think it's an issue with ast-types.
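Until that is decided, one workaround is a cleanup step on the caller's side. The sketch below works on plain node objects; isEmptyShell and pruneEmptyStatements are hypothetical helpers, not ast-types functions, and they cover only the two cases mentioned above.

```javascript
// Hypothetical predicate: true for statements left with nothing to print.
function isEmptyShell(statement) {
  if (statement.type === "VariableDeclaration") {
    return statement.declarations.length === 0;
  }
  if (statement.type === "ExpressionStatement") {
    return statement.expression == null;
  }
  return false;
}

// Return a new statement list without the empty shells.
function pruneEmptyStatements(body) {
  return body.filter((statement) => !isEmptyShell(statement));
}

const body = [
  { type: "VariableDeclaration", kind: "var", declarations: [] },
  { type: "ExpressionStatement", expression: null },
  { type: "ReturnStatement", argument: null },
];
const pruned = pruneEmptyStatements(body);
```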
For now it's not clear what the behaviour of Scope's methods is; it would be nice to have some detailed docs at some point.
In the definition of VariableDeclaration you have kind as the first param and declarations as the second, whereas the spec describes declarations first and then kind. Esprima also prints kind as the last param, though that doesn't matter as long as it is an object.
Is there some reason to keep exactly that order of params in the builder?
Is there a way to break traversing?
For example, it can be done via throwing an exception:
var visitor = {};
visitor['visit' + nodeType] = function (path) {
  this.traverse(path);
  if (!test(path.node)) throw 1;
};
try {
  visit(ast, visitor);
} catch (e) {}
But is there a more elegant way to do so?
Like this.break(), or this.stop() within the callback, similar to this.traverse(path)?
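If an exception remains the mechanism, a dedicated sentinel class at least avoids swallowing unrelated errors, which the empty catch above would do. This is a self-contained sketch with a hypothetical generic walker standing in for the real visitor; StopTraversal and findFirst are made-up names.

```javascript
// Hypothetical sentinel used only to abort a traversal early.
class StopTraversal extends Error {}

// Minimal generic walker standing in for a real visitor.
function walk(node, callback) {
  callback(node);
  for (const key of Object.keys(node)) {
    const value = node[key];
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child.type === "string") walk(child, callback);
    }
  }
}

// Return the first node matching `test`, stopping as soon as it is found.
function findFirst(ast, test) {
  let found = null;
  try {
    walk(ast, (node) => {
      if (test(node)) {
        found = node;
        throw new StopTraversal();
      }
    });
  } catch (error) {
    if (!(error instanceof StopTraversal)) throw error; // real errors still propagate
  }
  return found;
}

const ast = {
  type: "Program",
  body: [
    { type: "Identifier", name: "first" },
    { type: "Identifier", name: "second" },
  ],
};
const firstIdentifier = findFirst(ast, (node) => node.type === "Identifier");
```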
The problem I had before was due to transformations not keeping the source locations in the final nodes. There are a few key places that need to copy along the source locs.
What's the right pattern to do this with builders? Do I need to assign the node to a variable and then do something like newNode.loc = oldNode.loc? Or can I pass along the source locations to the builders somehow?
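The manual approach can at least be wrapped in a small helper. withLocFrom below is a hypothetical function, not a builder feature, and the nodes are plain objects standing in for builder output.

```javascript
// Hypothetical helper: carry the source location over from the node being replaced.
function withLocFrom(newNode, oldNode) {
  if (oldNode && oldNode.loc) {
    newNode.loc = oldNode.loc;
  }
  return newNode;
}

const oldNode = {
  type: "Identifier",
  name: "a",
  loc: { start: { line: 1, column: 0 }, end: { line: 1, column: 1 } },
};
// In real code the first argument would come from a builder call.
const newNode = withLocFrom({ type: "Identifier", name: "b" }, oldNode);
```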
I wanted to give you a heads up about facebookarchive/esprima#85 (I know you're CC'ed but just in case) and figure out what we can do to make that transition smooth. The change should be easy to make (I literally just did a find+replace). I don't know how you want to version this lib - I'm sure we'll just bump major for esprima.
That is, change lines like this:
assert.ok(n.Expression.check(node));
to this:
n.Expression.assert(node);
All too easy to bang out
types.visit(ast, {
  visitFunctino: function(path) { ... }
});
and somehow never visit any Functino nodes.
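A check along these lines could catch the typo up front. This is a sketch only: findUnknownVisitorMethods is a hypothetical helper, and the list of known type names is passed in by hand rather than read from the library.

```javascript
// Hypothetical check: every visitXxx key must name a known node type.
function findUnknownVisitorMethods(visitor, knownTypeNames) {
  const known = new Set(knownTypeNames);
  return Object.keys(visitor).filter((key) => {
    return /^visit[A-Z]/.test(key) && !known.has(key.slice("visit".length));
  });
}

const unknown = findUnknownVisitorMethods(
  {
    visitFunctino() {},   // typo: flagged
    visitIdentifier() {}, // known type: accepted
    helper() {},          // not a visit method: ignored
  },
  ["Function", "Identifier"]
);
```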
We already use the def syntax to generate AST types and builders automatically, so it would be awesome if we also generated documentation at https://benjamn.github.io/ast-types/documentation, or something like that.
If a Path object is an array type, it should probably act more like the built-in Array class.
Here's what I'm thinking:
[...] each to forEach
[...] length property
[...] Array methods, such as: concat, every, find, reduce, reverse, slice, some, sort, splice.
I see 0.5.0 is the latest in npm but I don't see its changes in the repo.
I'm writing a compiler to JavaScript using Jison, and I would like all Nodes to have a method at(...) so I could easily add a Jison location to any existing expression or statement by transforming it from the Jison format to SourceLocation.
So I would just write something like b.literal('123').at(@$) and use that argument to set the loc property of whichever Node at was called on.
Is there a way to achieve that?
When I use the jsx transpiler from react-tools, I tend to have an entire directory of code that needs to be transpiled before it can be run. In those cases, my package.json includes this stanza:
"scripts": {
  "prepublish": "node_modules/.bin/jsx --harmony --strip-types src/ lib/"
}
Because the CLI of bin/regenerator does not take input and output directories as parameters, it is hard to use in this way. Ideally, I think I would do:
"main": "./lib/main",
"devDependencies": {
  "react-tools": "~0.11.2",
  "regenerator": "^0.7.1"
},
"scripts": {
  "prepublish": "node_modules/.bin/regenerator --include-runtime src/ temp_src/ && node_modules/.bin/jsx --harmony --strip-types temp_src/ lib/"
}
Alternatively, I could create yet another Node package to do this and add that to devDependencies, but I would like to avoid that, if possible.
I'm writing some code that is trying to pick up variable appearances (to eventually determine global variables in the code). Unfortunately, I can't rely on node.type === "Identifier" for node.name to be noted as a used reference, because Identifier is also used in other places (such as for label names). If however, I could do something like:
if (aNode.type === "Identifier" && getFieldType(aNode.parent, "whatever this field was").expects(Expression))
Then I could know that this is an identifier behaving like an expression, and thus an actual candidate for a reference to a variable. If there is some other much more trivial way of figuring this out that I'm missing, I'm also happy to hear that.