benjamn / ast-types

Esprima-compatible implementation of the Mozilla JS Parser API

License: MIT License

ast-types's Introduction

AST Types

This module provides an efficient, modular, Esprima-compatible implementation of the abstract syntax tree type hierarchy pioneered by the Mozilla Parser API.

Installation

From NPM:

npm install ast-types

From GitHub:

cd path/to/node_modules
git clone https://github.com/benjamn/ast-types.git
cd ast-types
npm install .

Basic Usage

import assert from "assert";
import {
  namedTypes as n,
  builders as b,
} from "ast-types";

var fooId = b.identifier("foo");
var ifFoo = b.ifStatement(fooId, b.blockStatement([
  b.expressionStatement(b.callExpression(fooId, []))
]));

assert.ok(n.IfStatement.check(ifFoo));
assert.ok(n.Statement.check(ifFoo));
assert.ok(n.Node.check(ifFoo));

assert.ok(n.BlockStatement.check(ifFoo.consequent));
assert.strictEqual(
  ifFoo.consequent.body[0].expression.arguments.length,
  0,
);

assert.strictEqual(ifFoo.test, fooId);
assert.ok(n.Expression.check(ifFoo.test));
assert.ok(n.Identifier.check(ifFoo.test));
assert.ok(!n.Statement.check(ifFoo.test));

AST Traversal

Because it understands the AST type system so thoroughly, this library is able to provide excellent node iteration and traversal mechanisms.

If you want complete control over the traversal, and all you need is a way of enumerating the known fields of your AST nodes and getting their values, you may be interested in the primitives getFieldNames and getFieldValue:

import {
  getFieldNames,
  getFieldValue,
} from "ast-types";

const partialFunExpr = { type: "FunctionExpression" };

// Even though partialFunExpr doesn't actually contain all the fields that
// are expected for a FunctionExpression, types.getFieldNames knows:
console.log(getFieldNames(partialFunExpr));
// [ 'type', 'id', 'params', 'body', 'generator', 'expression',
//   'defaults', 'rest', 'async' ]

// For fields that have default values, types.getFieldValue will return
// the default if the field is not actually defined.
console.log(getFieldValue(partialFunExpr, "generator"));
// false

Two more low-level helper functions, eachField and someField, are defined in terms of getFieldNames and getFieldValue:

// Iterate over all defined fields of an object, including those missing
// or undefined, passing each field name and effective value (as returned
// by getFieldValue) to the callback. If the object has no corresponding
// Def, the callback will never be called.
export function eachField(object, callback, context) {
  getFieldNames(object).forEach(function(name) {
    callback.call(this, name, getFieldValue(object, name));
  }, context);
}

// Similar to eachField, except that iteration stops as soon as the
// callback returns a truthy value. Like Array.prototype.some, the final
// result is either true or false to indicate whether the callback
// returned a truthy value for any field or not.
export function someField(object, callback, context) {
  return getFieldNames(object).some(function(name) {
    return callback.call(this, name, getFieldValue(object, name));
  }, context);
}

So here's how you might make a copy of an AST node:

import { eachField } from "ast-types";
const copy = {};
eachField(node, function(name, value) {
  // Note that undefined fields will be visited too, according to
  // the rules associated with node.type, and default field values
  // will be substituted if appropriate.
  copy[name] = value;
});

But that's not all! You can also easily visit entire syntax trees using the powerful types.visit abstraction.

Here's a trivial example of how you might assert that arguments.callee is never used in ast:

import assert from "assert";
import {
  visit,
  namedTypes as n,
} from "ast-types";

visit(ast, {
  // This method will be called for any node with .type "MemberExpression":
  visitMemberExpression(path) {
    // Visitor methods receive a single argument, a NodePath object
    // wrapping the node of interest.
    var node = path.node;

    if (
      n.Identifier.check(node.object) &&
      node.object.name === "arguments" &&
      n.Identifier.check(node.property)
    ) {
      assert.notStrictEqual(node.property.name, "callee");
    }

    // It's your responsibility to call this.traverse with some
    // NodePath object (usually the one passed into the visitor
    // method) before the visitor method returns, or return false to
    // indicate that the traversal need not continue any further down
    // this subtree.
    this.traverse(path);
  }
});

Here's a slightly more involved example of transforming ...rest parameters into browser-runnable ES5 JavaScript:

import assert from "assert";
import {
  builders as b,
  namedTypes as n,
  visit,
} from "ast-types";

// Reuse the same AST structure for Array.prototype.slice.call.
var sliceExpr = b.memberExpression(
  b.memberExpression(
    b.memberExpression(
      b.identifier("Array"),
      b.identifier("prototype"),
      false
    ),
    b.identifier("slice"),
    false
  ),
  b.identifier("call"),
  false
);

visit(ast, {
  // This method will be called for any node whose type is a subtype of
  // Function (e.g., FunctionDeclaration, FunctionExpression, and
  // ArrowFunctionExpression). Note that types.visit precomputes a
  // lookup table from every known type to the appropriate visitor
  // method to call for nodes of that type, so the dispatch takes
  // constant time.
  visitFunction(path) {
    // Visitor methods receive a single argument, a NodePath object
    // wrapping the node of interest.
    const node = path.node;

    // It's your responsibility to call this.traverse with some
    // NodePath object (usually the one passed into the visitor
    // method) before the visitor method returns, or return false to
    // indicate that the traversal need not continue any further down
    // this subtree. An assertion will fail if you forget, which is
    // awesome, because it means you will never again make the
    // disastrous mistake of forgetting to traverse a subtree. Also
    // cool: because you can call this method at any point in the
    // visitor method, it's up to you whether your traversal is
    // pre-order, post-order, or both!
    this.traverse(path);

    // This traversal is only concerned with Function nodes that have
    // rest parameters.
    if (!node.rest) {
      return;
    }

    // For the purposes of this example, we won't worry about functions
    // with Expression bodies.
    n.BlockStatement.assert(node.body);

    // Use types.builders to build a variable declaration of the form
    //
    //   var rest = Array.prototype.slice.call(arguments, n);
    //
    // where `rest` is the name of the rest parameter, and `n` is a
    // numeric literal specifying the number of named parameters the
    // function takes.
    const restVarDecl = b.variableDeclaration("var", [
      b.variableDeclarator(
        node.rest,
        b.callExpression(sliceExpr, [
          b.identifier("arguments"),
          b.literal(node.params.length)
        ])
      )
    ]);

    // Similar to doing node.body.body.unshift(restVarDecl), except
    // that the other NodePath objects wrapping body statements will
    // have their indexes updated to accommodate the new statement.
    path.get("body", "body").unshift(restVarDecl);

    // Nullify node.rest now that we have simulated the behavior of
    // the rest parameter using ordinary JavaScript.
    path.get("rest").replace(null);

    // There's nothing wrong with doing node.rest = null, but I wanted
    // to point out that the above statement has the same effect.
    assert.strictEqual(node.rest, null);
  }
});

Here's how you might use types.visit to implement a function that determines if a given function node refers to this:

import { namedTypes as n, visit } from "ast-types";

function usesThis(funcNode) {
  n.Function.assert(funcNode);
  var result = false;

  visit(funcNode, {
    visitThisExpression(path) {
      result = true;

      // The quickest way to terminate the traversal is to call
      // this.abort(), which throws a special exception (instanceof
      // this.AbortRequest) that will be caught in the top-level
      // types.visit method, so you don't have to worry about
      // catching the exception yourself.
      this.abort();
    },

    visitFunction(path) {
      // Traverse the original function itself, but skip nested functions:
      // ThisExpression nodes in nested scopes don't count as `this`
      // references for the original function node.
      if (path.node !== funcNode) {
        return false;
      }
      this.traverse(path);
    },

    visitCallExpression(path) {
      const node = path.node;

      // If the function contains CallExpression nodes involving
      // super, those expressions will implicitly depend on the
      // value of `this`, even though they do not explicitly contain
      // any ThisExpression nodes.
      if (this.isSuperCallExpression(node)) {
        result = true;
        this.abort(); // Throws AbortRequest exception.
      }

      this.traverse(path);
    },

    // Yes, you can define arbitrary helper methods.
    isSuperCallExpression(callExpr) {
      n.CallExpression.assert(callExpr);
      return this.isSuperIdentifier(callExpr.callee)
          || this.isSuperMemberExpression(callExpr.callee);
    },

    // And even helper helper methods! Both receive the callee of a
    // CallExpression, not the CallExpression itself.
    isSuperIdentifier(node) {
      return n.Identifier.check(node)
          && node.name === "super";
    },

    isSuperMemberExpression(node) {
      return n.MemberExpression.check(node)
          && n.Identifier.check(node.object)
          && node.object.name === "super";
    }
  });

  return result;
}

As you might guess, when an AbortRequest is thrown from a subtree, the exception will propagate from the corresponding calls to this.traverse in the ancestor visitor methods. If you decide you want to cancel the request, simply catch the exception and call its .cancel() method. The rest of the subtree beneath the try-catch block will be abandoned, but the remaining siblings of the ancestor node will still be visited.

NodePath

The NodePath object passed to visitor methods is a wrapper around an AST node, and it serves to provide access to the chain of ancestor objects (all the way back to the root of the AST) and scope information.

In general, path.node refers to the wrapped node, path.parent.node refers to the nearest Node ancestor, path.parent.parent.node to the grandparent, and so on.

Note that path.node may not be a direct property value of path.parent.node; for instance, it might be the case that path.node is an element of an array that is a direct child of the parent node:

path.node === path.parent.node.elements[3]

in which case you should know that path.parentPath provides finer-grained access to the complete path of objects (not just the Node ones) from the root of the AST:

// In reality, path.parent is the grandparent of path:
path.parentPath.parentPath === path.parent

// The path.parentPath object wraps the elements array (note that we use
// .value because the elements array is not a Node):
path.parentPath.value === path.parent.node.elements

// The path.node object is the fourth element in that array:
path.parentPath.value[3] === path.node

// Unlike path.node and path.value, which are synonyms because path.node
// is a Node object, path.parentPath.node is distinct from
// path.parentPath.value, because the elements array is not a
// Node. Instead, path.parentPath.node refers to the closest ancestor
// Node, which happens to be the same as path.parent.node:
path.parentPath.node === path.parent.node

// The path is named for its index in the elements array:
path.name === 3

// Likewise, path.parentPath is named for the property by which
// path.parent.node refers to it:
path.parentPath.name === "elements"

// Putting it all together, we can follow the chain of object references
// from path.parent.node all the way to path.node by accessing each
// property by name:
path.parent.node[path.parentPath.name][path.name] === path.node
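
The relationships above can be reproduced with a small standalone model. This is only a sketch: MiniPath and isNode are hypothetical names, the class is not the library's NodePath (it does no caching and has no scope support), and it assumes that any non-array object with a string type field counts as a Node.

```javascript
// Hypothetical check: treat any non-array object with a string `type` as a Node.
function isNode(value) {
  return value !== null && typeof value === "object"
    && !Array.isArray(value) && typeof value.type === "string";
}

// Hypothetical MiniPath: models only value, name, parentPath, node and parent.
class MiniPath {
  constructor(value, parentPath = null, name = null) {
    this.value = value;
    this.parentPath = parentPath;
    this.name = name;
  }

  // Follow one or more property names, wrapping each step in a new MiniPath.
  get(...names) {
    let path = this;
    for (const name of names) {
      path = new MiniPath(path.value[name], path, name);
    }
    return path;
  }

  // Closest Node at or above this path.
  get node() {
    let path = this;
    while (path && !isNode(path.value)) path = path.parentPath;
    return path ? path.value : null;
  }

  // Closest ancestor path whose value is a Node (simplified for Node paths).
  get parent() {
    let path = this.parentPath;
    while (path && !isNode(path.value)) path = path.parentPath;
    return path;
  }
}

const arrayExpr = {
  type: "ArrayExpression",
  elements: [0, 1, 2, 3].map((value) => ({ type: "Literal", value })),
};
const path = new MiniPath(arrayExpr).get("elements", 3);

console.log(path.parentPath.parentPath === path.parent); // true
console.log(path.parent.node[path.parentPath.name][path.name] === path.node); // true
```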

These NodePath objects are created during the traversal without modifying the AST nodes themselves, so it's not a problem if the same node appears more than once in the AST (like Array.prototype.slice.call in the example above), because it will be visited with a distinct NodePath each time it appears.

Child NodePath objects are created lazily, by calling the .get method of a parent NodePath object:

// If a NodePath object for the elements array has never been created
// before, it will be created here and cached in the future:
path.get("elements").get(3).value === path.value.elements[3]

// Alternatively, you can pass multiple property names to .get instead of
// chaining multiple .get calls:
path.get("elements", 0).value === path.value.elements[0]

NodePath objects support a number of useful methods:

// Replace one node with another node:
var fifth = path.get("elements", 4);
fifth.replace(newNode);

// Now do some stuff that might rearrange the list, and this replacement
// remains safe:
fifth.replace(newerNode);

// Replace the third element in an array with two new nodes:
path.get("elements", 2).replace(
  b.identifier("foo"),
  b.thisExpression()
);

// Remove a node, along with its parent if removal would leave a redundant
// AST node behind. For example, given `var t = 1, y = 2;`, removing both the
// `t` and `y` declarators would otherwise print as `var undefined`.
path.prune(); // Returns the closest parent NodePath.

// Remove a node from a list of nodes:
path.get("elements", 3).replace();

// Add three new nodes to the beginning of a list of nodes:
path.get("elements").unshift(a, b, c);

// Remove and return the first node in a list of nodes:
path.get("elements").shift();

// Push two new nodes onto the end of a list of nodes:
path.get("elements").push(d, e);

// Remove and return the last node in a list of nodes:
path.get("elements").pop();

// Insert a new node before/after the seventh node in a list of nodes:
var seventh = path.get("elements", 6);
seventh.insertBefore(newNode);
seventh.insertAfter(newNode);

// Insert a new element at index 5 in a list of nodes:
path.get("elements").insertAt(5, newNode);

Scope

The object exposed as path.scope during AST traversals provides information about variable and function declarations in the scope that contains path.node. See scope.ts for its public interface, which currently includes .isGlobal, .getGlobalScope(), .depth, .declares(name), .lookup(name), and .getBindings().

Custom AST Node Types

The ast-types module was designed to be extended. To that end, it provides a readable, declarative syntax for specifying new AST node types, based primarily upon the require("ast-types").Type.def function:

import assert from "assert";
import {
  Type,
  builtInTypes,
  builders as b,
  finalize,
} from "ast-types";

const { def } = Type;
const { string } = builtInTypes;

// Suppose you need a named File type to wrap your Programs.
def("File")
  .bases("Node")
  .build("name", "program")
  .field("name", string)
  .field("program", def("Program"));

// Prevent further modifications to the File type (and any other
// types newly introduced by def(...)).
finalize();

// The b.file builder function is now available. It expects two
// arguments, as named by .build("name", "program") above.
const main = b.file("main.js", b.program([
  // Pointless program contents included for extra color.
  b.functionDeclaration(b.identifier("succ"), [
    b.identifier("x")
  ], b.blockStatement([
    b.returnStatement(
      b.binaryExpression(
        "+", b.identifier("x"), b.literal(1)
      )
    )
  ]))
]));

assert.strictEqual(main.name, "main.js");
assert.strictEqual(main.program.body[0].params[0].name, "x");
// etc.

// If you pass the wrong type of arguments, or fail to pass enough
// arguments, an AssertionError will be thrown.

b.file(b.blockStatement([]));
// ==> AssertionError: {"body":[],"type":"BlockStatement","loc":null} does not match type string

b.file("lib/types.js", b.thisExpression());
// ==> AssertionError: {"type":"ThisExpression","loc":null} does not match type Program

The def syntax is used to define all the default AST node types found in babel-core.ts, babel.ts, core.ts, es-proposals.ts, es6.ts, es7.ts, es2020.ts, esprima.ts, flow.ts, jsx.ts, type-annotations.ts, and typescript.ts, so you have no shortage of examples to learn from.

ast-types's Issues

getFieldType?

I'm writing some code that is trying to pick up variable appearances (to eventually determine global variables in the code). Unfortunately, I can't rely on node.type === "Identifier" for node.name to be noted as a used reference, because Identifier is also used in other places (such as for label names). If however, I could do something like:

if (aNode.type === "Identifier" && getFieldType(aNode.parent, "whatever this field was").expects(Expression))

Then I could know that this is an identifier behaving like an expression, and thus an actual candidate for a reference to a variable. If there is some other much more trivial way of figuring this out that I'm missing, I'm also happy to hear that.
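
In the absence of getFieldType, one rough way to approximate this is to inspect the parent node and the field name directly. The sketch below works on plain ESTree-shaped objects; isReferenceIdentifier is a hypothetical helper, not part of ast-types, and it is deliberately incomplete (for example, binding positions such as function names and parameters fall through to the default case).

```javascript
// Hypothetical helper, not part of ast-types: decide whether an Identifier
// found under parent[fieldName] is read as a variable reference.
function isReferenceIdentifier(node, parent, fieldName) {
  if (!node || node.type !== "Identifier") return false;
  if (!parent) return true;

  switch (parent.type) {
    case "LabeledStatement":
    case "BreakStatement":
    case "ContinueStatement":
      // `label` fields hold label names, not variables.
      return fieldName !== "label";
    case "MemberExpression":
      // In `obj.prop`, `prop` is a property name unless computed (`obj[prop]`).
      return fieldName === "object" || parent.computed === true;
    case "Property":
      // In `{ key: value }`, the key is a name unless computed.
      return fieldName === "value" || parent.computed === true;
    default:
      // Incomplete by design: everything else is treated as a reference.
      return true;
  }
}

const id = { type: "Identifier", name: "x" };
console.log(isReferenceIdentifier(id, { type: "LabeledStatement" }, "label")); // false
console.log(isReferenceIdentifier(id, { type: "ReturnStatement" }, "argument")); // true
```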

`import * as foo from "foo"`

@benjamn, we need to expand the definition of ImportDeclaration to support a new kind of import. Today we only have named and default, as in this block:

def("ImportDeclaration")
    .bases("Declaration")
    .build("specifiers", "kind", "source")
    .field("specifiers", [def("ImportSpecifier")])
    .field("kind", or("named", "default"))
    .field("source", ModuleSpecifier);

I would suggest batch or binding, but I'm pretty bad at naming things.

Once we decide on the name, I can finish the work on esprima to get the new syntax in.

When traversing an object of unknown type, use Object.keys to guess its fields

The alternative is to stop traversing, which can be a cryptic failure mode.

Find out params order for a particular builder

I want to automate building AST subtrees, something like:

var types = require('ast-types');
var nodeParams = { /* my params here */};
types.builders[nodeParams.type].apply(this, ???);

Is there a way to find out what is the proper order of params I should pass to a node builder of a specific type?

There are .buildParams defined on a type, but for now it seems impossible to get access to that list from external module.
It would be nice to have something like types.namedTypes[type].fields I suppose?
Or to make getFieldNames(node) return properly ordered fields?
Or just to make any builder accept object with params, not only list of arguments.

@benjamn I can probably do a PR, if you wish.
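
To make the request concrete, here is a standalone sketch of what calling a builder from an object of named fields could look like if the parameter order were accessible. The buildParams table, the builders object, and buildFromObject are all hypothetical stand-ins written for this illustration; they do not reproduce the real ast-types builders or their parameter lists.

```javascript
// Hypothetical stand-ins for illustration only.
const buildParams = {
  Identifier: ["name"],
  MemberExpression: ["object", "property", "computed"],
};

const builders = {
  Identifier: (name) => ({ type: "Identifier", name }),
  MemberExpression: (object, property, computed = false) =>
    ({ type: "MemberExpression", object, property, computed }),
};

// Build a node from an object of named fields by looking up the builder's
// positional parameter order.
function buildFromObject(fields) {
  const order = buildParams[fields.type];
  if (!order) {
    throw new Error("No builder parameters known for " + fields.type);
  }
  // Missing fields become undefined, so default parameter values apply.
  const args = order.map((name) => fields[name]);
  return builders[fields.type](...args);
}
```

With such a table, the field order in nodeParams no longer matters, since buildFromObject reorders the values before applying the builder.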

Implement a flexible pattern-matching syntax

Just as we currently have .namedTypes.MemberExpression and .builders.memberExpression, perhaps we could also have an auto-generated .patterns namespace containing functions like .patterns.MemberExpression for building arbitrary pattern-matching types, e.g.

var types = require("ast-types");
var p = types.patterns;

var thisFooPattern = p.MemberExpression(
  p.ThisExpression(),
  p.Identifier("foo")
);

types.traverse(ast, function(node) {
  if (thisFooPattern.check(node)) {
    console.log("found this.foo:", node);
  }
});
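
As a rough illustration of the idea, the standalone sketch below builds checkable patterns over plain ESTree-shaped objects. The pattern function is hypothetical and is not part of ast-types; it only compares the node type and the listed fields, which are either nested patterns or exact values.

```javascript
// Hypothetical pattern builder, not part of ast-types.
function pattern(type, fields = {}) {
  return {
    check(node) {
      if (!node || node.type !== type) return false;
      return Object.keys(fields).every((key) => {
        const expected = fields[key];
        // Nested patterns are checked recursively; other values must match exactly.
        return expected && typeof expected.check === "function"
          ? expected.check(node[key])
          : node[key] === expected;
      });
    },
  };
}

const thisFoo = pattern("MemberExpression", {
  object: pattern("ThisExpression"),
  property: pattern("Identifier", { name: "foo" }),
});

const candidate = {
  type: "MemberExpression",
  object: { type: "ThisExpression" },
  property: { type: "Identifier", name: "foo" },
  computed: false,
};

console.log(thisFoo.check(candidate)); // true
```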

Docs for scopes

For now it's not clear how Scope's methods behave; it'd be nice to have some detailed docs sometime.

Implement a way to inspect field types of compound types

It would be very useful if you could evaluate n.LogicalExpression.getFieldType("operator") to retrieve (in this case) the LogicalOperator type.

XJS -> JSX

I wanted to give you a heads up about facebookarchive/esprima#85 (I know you're CC'ed but just in case) and figure out what we can do to make that transition smooth. The change should be easy to make (I literally just did a find+replace). I don't know how you want to version this lib - I'm sure we'll just bump major for esprima.

question: how to remove sibling nodes from parent?

Hi, great job first of all!

I'm writing a continuation-passing style transformation and I have a problem that is breaking my mind.
I've done almost everything: intercepting the callback marker, replacing it with a function call, and nesting all the sibling nodes.
But I can't figure out how to remove the sibling nodes from the parent after copying them into the callback function, so duplicates persist.

I'm visiting Expressions and calling path.get('elements') or path.parent.get('elements') or path.parentPath.get('elements') and so on, and none of them did the trick.

I extract siblings from 'path.parent.parentPath.value' starting from the index of the current node, but how can I remove the siblings?

It seems an assignment like

path.parent.parentPath.value = _.initial(path.parent.parentPath.value, current_index)
// (initial returns the first part till the current_index)

is not working. Do I have to use the .replace() function?

path.parent.parentPath.value.replace is not defined,
path.parent.parentPath.replace is not defined too.

Any suggestions?

Thank you very much for your work!

Proposal: Make the project generalized

Currently, this project is geared towards ES, but since the interface is so nice for building asts, I'd like to use it in other parsing projects.
Would it be possible to make the project more generalized so that it doesn't load all the types in def/* unless requested?

Export declaration builder syntax seems to be outdated

Using exportDeclaration, it doesn't seem possible to build the current export default syntax:

export default foo;

I'm not sure if I'm missing something or not; currently I'm just manually creating:

{
  type: 'ExportDeclaration',
  default: true,
  declaration: {
    type: 'Identifier',
    name: 'foo'
  }
}

I'd be happy to do a PR to fix this, just want to make sure I'm not missing something in the def for moduleDeclaration that allows building that object

Implement NodePath.prototype.getIndent

The indentation of a node is dependent on the indentation of its ancestors, so this is a natural piece of information for NodePath to provide.

Idea: "Virtual" type definitions

The idea is to be able to treat different node types as if they were the same type. You already have a type which could be considered virtual: Function, which defines the shared properties of FunctionExpressions and FunctionDeclarations. However, FunctionExpression and FunctionDeclaration have to explicitly "inherit" from Function.

Let's consider another example: Identifier and JSXIdentifier. For static analysis and code mods, I might want to treat both of these types as identical.

I would like to be able to write logic to process nodes independently of the concrete node type. Virtual types could be created dynamically and concrete node types could be associated with virtual types dynamically.

I'm not sure if what I'm saying makes sense at all; I'm not especially good at expressing my ideas. Or maybe something like this is already possible to do and I just don't know how. Happy to talk about this via a different channel.

Optimization: in types.visit, traverse tree at first without NodePath information, and only construct a path if any matching nodes are found

Add a convenience function for checking equivalence of nodes

Version 0.5.0 not in repo?

I see 0.5.0 is the latest in npm but I don't see its changes in the repo.

Implement require("ast-types").astNodesAreEquivalent for reliably testing deep equivalence of nodes

This logic gets rewritten in most code bases that operate on ASTs, and would be very useful for testing transforms.

cc @thedekel @jeffmo
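
For reference, the kind of logic that tends to get rewritten looks roughly like the standalone sketch below. It is not the library's implementation: nodesAreEquivalent and IGNORED_FIELDS are hypothetical names, and the choice of which metadata fields to ignore is an assumption.

```javascript
// Hypothetical standalone sketch, not the library's implementation.
// Fields listed here are treated as metadata and ignored in the comparison.
const IGNORED_FIELDS = new Set(["loc", "range", "start", "end"]);

function nodesAreEquivalent(a, b) {
  if (a === b) return true;
  if (Array.isArray(a)) {
    return Array.isArray(b)
      && a.length === b.length
      && a.every((item, i) => nodesAreEquivalent(item, b[i]));
  }
  if (a === null || b === null || typeof a !== "object" || typeof b !== "object") {
    return false;
  }
  if (Array.isArray(b)) return false;

  // Compare the union of keys so a field present on only one side is noticed.
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const key of keys) {
    if (IGNORED_FIELDS.has(key)) continue;
    if (!nodesAreEquivalent(a[key], b[key])) return false;
  }
  return true;
}

const left = { type: "Identifier", name: "foo", loc: { start: { line: 1, column: 0 } } };
const right = { type: "Identifier", name: "foo", loc: null };
console.log(nodesAreEquivalent(left, right)); // true
```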

passing along source location when building nodes

The problem I had before was due to transformations not keeping the source locations in the final nodes. There are a few key places that need to copy along the source locs.

What's the right pattern to do this with builders? Do I need to assign the node to a variable and then do something like newNode.loc = oldNode.loc? Or can I pass along the source locations to the builders somehow?
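
The assignment pattern mentioned above can be wrapped in a tiny helper so that it composes with builder calls. This is only a sketch on plain objects: withLoc is a hypothetical helper, not something ast-types provides.

```javascript
// Hypothetical helper, not part of ast-types: copy the source location from
// the node being replaced onto its replacement, then return the replacement.
function withLoc(newNode, oldNode) {
  if (oldNode && oldNode.loc != null) {
    newNode.loc = oldNode.loc;
  }
  return newNode;
}

const oldNode = {
  type: "Identifier",
  name: "a",
  loc: { start: { line: 3, column: 4 }, end: { line: 3, column: 5 } },
};
const newNode = withLoc({ type: "Identifier", name: "b", loc: null }, oldNode);

console.log(newNode.loc === oldNode.loc); // true
```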

VariableDeclaration builder params order

In the definition of VariableDeclaration, kind is the first param and declarations is the second, whereas the spec describes declarations first and then kind. Esprima also prints kind as the last property, though that doesn't matter since it is an object.
Is there some reason to keep exactly that order of params in builder?

`scope.isInStrictMode`

Before writing any code, I want to make sure it makes sense to add a method to the Scope class that checks whether the scope is in strict mode.

Update README.md to recommend types.visit instead of types.traverse.

The examples currently use types.traverse, which is still available, but deprecated.

Update README.md to demonstrate Type.prototype.assert

That is, change lines like this:

assert.ok(n.Expression.check(node));

to this:

n.Expression.assert(node);

Function can have an expression as a body?

// core.js
def("Function")
    .bases("Node")
    .field("id", or(def("Identifier"), null), defaults["null"])
    .field("params", [def("Pattern")])
    .field("body", or(def("BlockStatement"), def("Expression")));

What would an ES5 function without a block statement look like? Is this just here to support arrow functions in ES6? If so, wouldn't it be better to override it in es6.js?

Make lib/traverse.js a better tool for AST modification

Specifically, the Path type should support a .replace(...) method that can take any number of replacements (zero means to remove the current node).
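
On a plain array, the intended semantics can be sketched with splice. The replaceAt helper below is hypothetical and operates on an ordinary array rather than the library's Path type.

```javascript
// Hypothetical helper operating on a plain array, not the library's Path type.
function replaceAt(list, index, ...replacements) {
  // splice removes one element at `index` and inserts the replacements there;
  // with no replacements this simply removes the element.
  list.splice(index, 1, ...replacements);
  return list;
}

console.log(replaceAt(["a", "b", "c"], 1));           // [ 'a', 'c' ]
console.log(replaceAt(["a", "b", "c"], 1, "x", "y")); // [ 'a', 'x', 'y', 'c' ]
```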

Add a test of completeness against various versions of Esprima

That is, make sure every Syntax.* type used by Esprima is represented in our type system. Not a big problem if Esprima doesn't know about all of our types—only if we don't know about all of Esprima's types.

Besides master, I would also like to ensure coverage of harmony and fb-harmony.

show line where deprecated function is used

specifically .traverse()

https://github.com/dougwilson/nodejs-depd

Traversing comment nodes.

Is that doable via:

{
  visitBlock:....
}

this makes me think no

Performance: Reuse NodePath instances when visiting

Currently the traversal creates a 1:1 mapping between actual syntax Nodes and NodePaths. The paths are cached for later visits, but they don't survive separate traversals, such as when chaining different ES6 transformations.

I've run a very contrived benchmark that always reuses a single NodePath, just changing its value property for each visit. The results are quite good, with more than a 2x speedup when visiting underscore.js, so I think this route has potential.

I see two options here. The first should be low-hanging fruit: expose to the caller the root NodePath used in a traversal (which should have all the children already cached) and allow it to be passed along to further calls.

// First transformation returns the root nodepath
var rootPath = types.visit(ast, firstVisitor)
// Second transformation uses it instead of the AST (so the cache is used)
rootPath = types.visit(rootPath, secondVisitor)

The second option is a bit harder to pull off but should improve the speed even on single passes. When visiting the tree we don't need to keep the whole structure, just the stack of nodes that took us to a given node. So a single NodePath can be used for the traversal; we just update its stack before passing it on to the visitor.

var path = new NodePath(ast);
assert( path.stack === [Program] === [ast] )

path.stack.push('body');  // The field name
path.stack.push(path.value.body)  // The field value
ast.body.forEach(function (item, idx) {
  path.stack.push(idx);  // field name
  path.stack.push(item);  // field value
  assert( path.stack === [Program, 'body', [Stmt, Stmt], 0, Stmt] )
  visit(path);
  path.stack.pop();
  path.stack.pop();
  assert( path.stack === [Program, 'body'] )
});

The limitation here is that a visitor cannot directly hold a reference to the passed NodePath, since it will be mutated by the traversal. I don't see why it would want to keep it, but in any case it could be documented that the visitor first needs to call path.clone() or path.freeze() to get a copy of the path for future use. Obviously it's still possible to keep references to path.node.

The current methods in NodePath, like .each, will also need to create a new NodePath instance, but they are only used in transformations, and even in those cases the copy can be optimized:

NodePath.value = function () {
  return this.stack[ this.stack.length - 1 ];
};

NodePath.each = function (cb) {
   var list = this.value.slice();  // Freeze collection for semantics
   var iterPath = this.clone();
   list.forEach(function (item, idx) {
     iterPath.stack.push(idx);
     iterPath.stack.push(item);
     cb(iterPath);
     iterPath.stack.pop();
     iterPath.stack.pop();
   });
}

Note that to handle Scope it can either be included as a virtual item in the stack ([Statement, Scope, Function]) or act as a wrapper for a node ([Statement, Scoped(Function)]) and then handle the unwrapping on the NodePath logic.

So by having the limitation of a mutable NodePath reference, which isn't a bad thing IMHO, we can eliminate the creation of thousands of objects from the visitor logic, improving speed and memory usage.

Edit: I forgot to mention that creation of NodePath instances can probably be controlled from the traversal, so it's even possible to re-use instances like it's done for Context.

Move Type.def files into their own directory

This is to distinguish files that contain lots of def declarations, like lib/core.js, from files that implement the type system, like lib/types.js.

Warn if new properties are added to PathVisitor Context instances by visit methods

Every PathVisitor instance has a class visitor.Context such that Object.getPrototypeOf(visitor.Context.prototype) === visitor, and the value of this within visitor methods is an instance of this Context class.

This means that any properties accidentally added to this during the visitor method will be thrown away (or become inaccessible) from other visitor methods, probably indicating a bug.

It should be easy to warn about this by checking Object.keys(context) after calling the visitor method in context.invokeVisitorMethod.
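
The check can be illustrated without the library. In the sketch below, visitor and invokeAndReport are hypothetical stand-ins: the context object simply inherits from the visitor via Object.create, which mirrors the prototype relationship described above but is not the actual PathVisitor code.

```javascript
// Hypothetical stand-ins: `visitor` plays the role of a PathVisitor and
// `context` the per-call Context object that inherits from it.
const visitor = {
  visitThing() {
    this.counter = 1; // accidentally stored on the context, not the visitor
  },
};

function invokeAndReport(visitorObject, methodName) {
  const context = Object.create(visitorObject);
  context[methodName]();
  // Own enumerable keys on the context are properties the method added.
  return Object.keys(context);
}

const added = invokeAndReport(visitor, "visitThing");
console.log(added);                // [ 'counter' ]
console.log("counter" in visitor); // false
```

Because the assignment lands on the short-lived context rather than on the visitor, a later visitor method would not see counter, which is the situation the warning is meant to catch.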

Make bin/regenerator more scripts/prepublish friendly.

When I use the jsx transpiler from react-tools, I tend to have an entire directory of code that needs to be transpiled before it can be run. In those cases, my package.json includes this stanza:

  "scripts": {
    "prepublish": "node_modules/.bin/jsx --harmony --strip-types src/ lib/"
  }

Because the CLI of bin/regenerator does not take input and output directories as parameters, it is hard to use in this way. Ideally, I think I would do:

  "main": "./lib/main",
  "devDependencies": {
    "react-tools": "~0.11.2",
    "regenerator": "^0.7.1"
  },
  "scripts": {
    "prepublish": "node_modules/.bin/regenerator --include-runtime src/ temp_src/ && node_modules/.bin/jsx --harmony --strip-types temp_src/ lib/"
  }

Alternatively, I could create yet another Node package to do this and add that to devDependencies, but I would like to avoid that, if possible.

Including tests in NPM distribution

Decide what to do about array holes

Currently, this assertion succeeds:

require("ast-types").builtInTypes.string.arrayOf().assert(["a",,"c"])

It probably should not succeed, because the second element of the array is missing/undefined.
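As an illustration of how holes differ from ordinary elements, here is a sketch of a stricter element check. isDenseArrayOf is a hypothetical stand-in and not the ast-types arrayOf implementation:

```javascript
// Hypothetical stand-in for a stricter arrayOf check; not the ast-types code.
function isDenseArrayOf(checkElement, value) {
  if (!Array.isArray(value)) return false;
  for (let i = 0; i < value.length; i++) {
    // `i in value` is false for a hole, even though value[i] reads as undefined.
    if (!(i in value) || !checkElement(value[i])) return false;
  }
  return true;
}

const isString = (v) => typeof v === "string";

const dense = isDenseArrayOf(isString, ["a", "b", "c"]);
const holey = isDenseArrayOf(isString, ["a", , "c"]);

// Array.prototype.every never calls its callback for holes, so a check
// built on it accepts the sparse array.
const everySkipsHoles = ["a", , "c"].every(isString);
```

The explicit `i in value` test matters most for element types that accept undefined, where reading the hole would otherwise pass the element check.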

Generate a browser-friendly .js file containing all dependencies

Executing the file in a browser should define a global variable called AstTypes.

This will require shimming Object.defineProperty, among other things.

Support ForOfStatement type in defs/es6.js

Automatically generate HTML namedTypes/builders documentation from the declarations in def/*.js

We already use the def syntax to generate AST types and builders automatically, so it would be awesome if we also generated documentation at https://benjamn.github.io/ast-types/documentation, or something like that.

Redundant nodes are not removed.

Not sure if this is an issue or not, but removing nodes via replace can leave their parent nodes around even if they have no children. This can result in weird appendages being printed out from a post-transform AST.

i.e.

var y = 1,
      x = 2;

Let's say you remove the two VariableDeclarators: this does not clean up the now-redundant VariableDeclaration, so printing results in:

var undefined;

There are similar issues with ExpressionStatements having no expression value resulting in

Being printed out.

Do you feel this is something ast-types should handle?

I can take a look at it if you think it's an issue with ast-types.
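One possible cleanup pass, sketched over plain ESTree-shaped objects. pruneEmptyStatements is a hypothetical helper, and whether such pruning belongs in ast-types or in the caller is exactly the open question above:

```javascript
// Hypothetical post-transform cleanup; operates on a statement list and
// drops statements that were left empty by earlier removals.
function pruneEmptyStatements(body) {
  return body.filter((stmt) => {
    if (stmt.type === "VariableDeclaration") {
      return stmt.declarations.length > 0; // no declarators left
    }
    if (stmt.type === "ExpressionStatement") {
      return stmt.expression != null; // expression was removed
    }
    return true;
  });
}

const body = [
  { type: "VariableDeclaration", kind: "var", declarations: [] },
  { type: "ExpressionStatement", expression: null },
  { type: "ReturnStatement", argument: null },
];
const pruned = pruneEmptyStatements(body);
```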

A way to break visitor

Is there a way to break traversing?
For example, it can be done via throwing an exception:

var visitor = {};
visitor['visit' + nodeType] = function (path) {
    this.traverse(path);
    if (!test(path.node)) throw 1;
};
try {
 visit(ast, visitor);
} catch (e) {}

But is there a more elegant way to do so?
Like this.break(), or this.stop() within the callback, similar to this.traverse(path)?
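For reference, the exception approach can be made a little safer by throwing a dedicated sentinel and rethrowing everything else. The sketch below uses a minimal stand-in walker (walk), not the ast-types visit function; StopTraversal and walkUntil are hypothetical names:

```javascript
// Sentinel class so that genuine errors are not swallowed.
class StopTraversal extends Error {}

// Minimal stand-in for a pre-order tree walk; NOT the ast-types visit().
function walk(node, callback) {
  if (!node || typeof node.type !== "string") return;
  callback(node);
  for (const key of Object.keys(node)) {
    const child = node[key];
    if (Array.isArray(child)) child.forEach((c) => walk(c, callback));
    else if (child && typeof child === "object") walk(child, callback);
  }
}

// Walks the tree and stops as soon as shouldStop(node) returns true.
function walkUntil(ast, shouldStop) {
  const seen = [];
  try {
    walk(ast, (node) => {
      seen.push(node.type);
      if (shouldStop(node)) throw new StopTraversal();
    });
  } catch (e) {
    if (!(e instanceof StopTraversal)) throw e; // rethrow genuine errors
  }
  return seen;
}

const ast = {
  type: "Program",
  body: [
    { type: "ExpressionStatement", expression: { type: "Identifier", name: "a" } },
    { type: "EmptyStatement" },
  ],
};
const visited = walkUntil(ast, (node) => node.type === "Identifier");
```

Compared with `throw 1` and an empty catch, this keeps unrelated exceptions from being silently discarded.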

off by one error when updating indices after mutation

I can't repro this in an isolated case, but there is an off-by-one error for a combination of a
path.insertBefore(...nodes) and path.replace() (deletion). I worked around it by moving from:

// path.insertBefore has been called multiple times before this.
path.insertBefore(...nodes);
path.replace();

to:

// path.insertBefore has been called multiple times before this.
path.replace(...nodes);

I'll keep trying to repro in isolated code, but I wanted to report the issue in case you have any ideas.

source location missing?

If I have the following file:

function bar(i) {
    if(i > 0) {
        return i + bar(i - 1);
    }
    return 0;
}

function foo() {
    var x = 5;
    var y = bar(x);

    return function() { return y; };
}

And I parse it with ast-types, the bar and foo function nodes don't have source location (loc is null). Weirdly, the anon function returned in foo does. What is the status of source location support? Is it still buggy?

Performance: Maintain a summary of child node types

When performing multiple passes to apply different transformations, for instance when transpiling from ES6, the speed of the visitor algorithm is especially important. I saw an issue, resugar/resugar#64, that suggested running transformations only on files that actually had some syntax to transform. I think that's good, but it could be done much better if it were integrated here.

Basically the algorithm is like this:

Each type is associated with a hash (more on this below)
Each node is given a new hasTypes property
After the visitor processes a node's children, it'll update the current hasTypes with all the seen children's hasTypes
The traversal will only descend on the tree if the visitor has a handler for one of the types in hasTypes or it hasn't been initialised.

Now, to make that fast we can limit the hasTypes property to be a Number and play with JavaScript's bitwise operators (hence limited to 32 bits), which is probably just enough. Since there are more than 32 types we can't fix them to a given bit; instead we can use a Bloom filter and get a pretty good approximation. So the logic becomes:

When defining a new type its hash is computed as a 32bit number with just k bits set, 2 seems a good k for the number of types (around 50?).
To update the hasTypes property, for each child in the node it's just: node.hasTypes | child.type.hash
To check if it should descend into a node, for each supported type in the visitor: (node.hasTypes & type.hash) === type.hash (Note: some false positives can occur)
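The steps above can be sketched as follows, under several assumptions: the hash function is an arbitrary illustrative choice, hasTypes is stored directly on plain node objects, and typeHash, summarize, and mayContain are hypothetical names that do not exist in ast-types:

```javascript
// Hash a type name to a 32-bit mask with up to k = 2 bits set.
// The two polynomial hashes are an arbitrary choice for illustration.
function typeHash(name) {
  let h1 = 0;
  let h2 = 0;
  for (let i = 0; i < name.length; i++) {
    h1 = (h1 * 31 + name.charCodeAt(i)) >>> 0;
    h2 = (h2 * 131 + name.charCodeAt(i)) >>> 0;
  }
  return ((1 << (h1 % 32)) | (1 << (h2 % 32))) >>> 0;
}

// Compute hasTypes bottom-up: a node's summary is the union of its
// children's type hashes and their own summaries.
function summarize(node) {
  let mask = 0;
  for (const key of Object.keys(node)) {
    const value = node[key];
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child && typeof child.type === "string") {
        mask |= typeHash(child.type) | summarize(child);
      }
    }
  }
  node.hasTypes = mask >>> 0;
  return node.hasTypes;
}

// True if the subtree below `node` may contain the type. False positives
// are possible; false negatives are not.
function mayContain(node, typeName) {
  const hash = typeHash(typeName);
  // Parentheses matter: === binds tighter than & in JavaScript.
  return ((node.hasTypes & hash) >>> 0) === hash;
}

const identifier = { type: "Identifier", name: "a" };
const program = {
  type: "Program",
  body: [{ type: "ExpressionStatement", expression: identifier }],
};
summarize(program);
```

A traversal would call mayContain once per visitor-handled type and skip the subtree only when every call returns false. Any mutation would need to invalidate hasTypes from the changed node up to the root.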

There are edge cases, like a Visitor implementing the generic visit method; in that case the traversal must always descend. There are also probably some cases that would require invalidation of the hasTypes (from the node to the root), but I'm not familiar enough with the visiting/transformation algorithm to be sure.

What I don't have clear in my head right now by looking at the current visitor design (using the auxiliary path object) is if the library assumes that nodes are immutable and so we can't assign the hasTypes hash to them. The paths seem to be cached, and could be a candidate to hold it, but I've failed to see if those caches would survive different traversals.

Integrate ast-path into ast-types as lib/path.js

I still think there's some value in having a distinction between the Path type and its derived NodePath type, but we can accomplish that by having two separate source files within the ast-types package, rather than having a separate NPM package.

New replaced node should be traversed

This is first of all a question: shouldn't the new nodes added by replace be traversed?

Ex:

types.traverse(ast, function() {
  if ( ... ) {
    this.replace(functionExpressionNode);
  }
});

functionExpressionNode doesn't get traversed.

Make `Path` more consistent with `Array` for array types.

If a Path object is an array type, it should probably act more like the built-in Array class.
Here's what I'm thinking:

Rename each to forEach
Add a length property
Add some/all of the Array methods, such as: concat, every, find, reduce, reverse, slice, some, sort, splice.

Report line and column number with errors.

I am trying to use regenerator with jstransform and I am getting the following error when transpiling one of my files:

Message:
    regenerator error while transforming <filename>:
did not recognize object of type "ObjectTypeAnnotation"

Lack of line information makes this hard to debug.

Also, I'm trying to use jstransform with harmony and stripTypes enabled. This does not seem to play so well with regenerator.

Is there any ability to extend "prototypes" of nodes?

I'm writing a compiler to JavaScript using Jison, and I would like all the Nodes to have a method at(...) so I could easily add a Jison location to any existing expression or statement by transforming it from the Jison format to a SourceLocation.

So I would just write something like b.literal('123').at(@$) and use that argument to set the loc property of whichever Node it was called on.

Is there a way to achieve that?
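As a stopgap that does not require extending node prototypes, a free function can do the conversion. This sketch assumes Jison's Bison-style location fields (first_line, first_column, last_line, last_column) and the start/end shape of a Parser API SourceLocation. The function at is a hypothetical helper, and the node is a plain object standing in for the result of b.literal('123'):

```javascript
// Hypothetical helper: copies a Jison location onto a node's loc property
// in SourceLocation form and returns the node for chaining.
function at(node, jisonLoc) {
  node.loc = {
    start: { line: jisonLoc.first_line, column: jisonLoc.first_column },
    end: { line: jisonLoc.last_line, column: jisonLoc.last_column },
  };
  return node;
}

// Plain object standing in for a builder result such as b.literal('123').
const literal = at(
  { type: "Literal", value: "123" },
  { first_line: 1, first_column: 4, last_line: 1, last_column: 9 }
);
```

In a Jison grammar action this would read at(b.literal('123'), @$) instead of b.literal('123').at(@$).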

Eliminate use of deprecated methods because they pull in eval().

I'm trying to call regenerator from Atom using gulp. Atom has a Content Security Policy (CSP) that basically disallows eval().

regenerator depends on recast, which depends on ast-types. traverse.js in ast-types currently contains the following code:

var deprecate = require("depd")('require("ast-types").traverse');

var deprecatedWrapper = deprecate.function(
    traverseWithFullPathInfo,
    'Please use require("ast-types").visit instead of .traverse for ' +
        'syntax tree manipulation'
);

deprecate.function is wrapFunction in depd, which contains the following code:

  var deprecatedfn = eval('(function (' + args + ') {\n'
    + '"use strict"\n'
    + 'log.call(deprecate, message, site)\n'
    + 'return fn.apply(this, arguments)\n'
    + '})')

This is a long way of saying that I cannot use regenerator in Atom because of this transitive dependency on eval(). Atom has a package named loophole to work around these sorts of issues on a one-off basis, though that would require a change to depd whereas I think the better thing to do would be to eliminate the deprecated code here in ast-types.

Thoughts? Is it hard to eliminate this deprecated code? Or would it be OK to stop flagging it as deprecated, at least for one release :)

Warn about bogus/typo'd visitor methods

All too easy to bang out

types.visit(ast, {
  visitFunctino: function(path) { ... }
});

and somehow never visit any Functino nodes.
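A sketch of what such a warning could check, assuming a small hard-coded set of type names. A real implementation would consult the full set of names registered with the type system; knownTypeNames and findBogusVisitorMethods are hypothetical:

```javascript
// Hypothetical subset of known type names, for illustration only.
const knownTypeNames = new Set(["Function", "Identifier", "IfStatement"]);

// Returns the names of visitXxx methods whose Xxx is not a known type.
function findBogusVisitorMethods(visitor, typeNames = knownTypeNames) {
  return Object.keys(visitor).filter((key) => {
    if (!/^visit[A-Z]/.test(key)) return false; // not a visitor method name
    if (typeof visitor[key] !== "function") return false;
    return !typeNames.has(key.slice("visit".length));
  });
}

const bogus = findBogusVisitorMethods({
  visitFunctino(path) {
    return false;
  },
  visitIdentifier(path) {
    return false;
  },
});
```

Here bogus contains only "visitFunctino", which could then be reported before the traversal starts.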

License?

Hey @benjamn, what license is ast-types under?