
acorn's Introduction

Acorn

A tiny, fast JavaScript parser, written completely in JavaScript.

Community

Acorn is open source software released under an MIT license.

You are welcome to report bugs or create pull requests on GitHub.

Packages

This repository holds three packages:

To build the content of the repository, run npm install.

git clone https://github.com/acornjs/acorn.git
cd acorn
npm install

Plugin developments

Acorn is designed to support plugins which can, within reasonable bounds, redefine the way the parser works. Plugins can add new token types and new tokenizer contexts (if necessary), and extend methods in the parser object. This is not a clean, elegant API—using it requires an understanding of Acorn's internals, and plugins are likely to break whenever those internals are significantly changed. But still, it is possible, in this way, to create parsers for JavaScript dialects without forking all of Acorn. And in principle it is even possible to combine such plugins, so that if you have, for example, a plugin for parsing types and a plugin for parsing JSX-style XML literals, you could load them both and parse code with both JSX tags and types.

A plugin is a function from a parser class to an extended parser class. Plugins can be used by simply applying them to the Parser class (or a version of that already extended by another plugin). But because that gets a little awkward, syntactically, when you are using multiple plugins, the static method Parser.extend can be called with any number of plugin values as arguments to create a Parser class extended by all those plugins. You'll usually want to create such an extended class only once, and then repeatedly call parse on it, to avoid needlessly confusing the JavaScript engine's optimizer.

const {Parser} = require("acorn")

const MyParser = Parser.extend(
  require("acorn-jsx")(),
  require("acorn-bigint")
)
console.log(MyParser.parse("// Some bigint + JSX code"))

Plugins override methods in their new parser class to implement additional functionality. It is recommended for a plugin package to export its plugin function as its default value or, if it takes configuration parameters, to export a constructor function that creates the plugin function.

This is what a trivial plugin, which adds a bit of code to the readToken method, might look like:

module.exports = function noisyReadToken(Parser) {
  return class extends Parser {
    readToken(code) {
      console.log("Reading a token!")
      super.readToken(code)
    }
  }
}
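
To make the "function from a parser class to an extended parser class" idea concrete, here is a minimal sketch that does not load Acorn at all. The Parser class below is a stand-in written for this example, and noisy and counting are hypothetical plugins; the extend method only illustrates the composition described above and is not claimed to be Acorn's actual implementation.

```javascript
// Stand-in for a parser class, only for illustration (not Acorn's Parser).
class Parser {
  // Apply each plugin in turn to the class produced by the previous one.
  static extend(...plugins) {
    let cls = this
    for (const plugin of plugins) cls = plugin(cls)
    return cls
  }

  readToken(code) {
    return `token:${code}`
  }
}

// Records which plugin methods ran, in order.
const log = []

// Hypothetical plugin: wraps readToken and returns the inherited result.
function noisy(P) {
  return class extends P {
    readToken(code) {
      log.push("noisy")
      return super.readToken(code)
    }
  }
}

// Second hypothetical plugin with the same shape.
function counting(P) {
  return class extends P {
    readToken(code) {
      log.push("counting")
      return super.readToken(code)
    }
  }
}

// Both spellings build the same chain of subclasses.
const viaExtend = Parser.extend(noisy, counting)
const byHand = counting(noisy(Parser))

const extendResult = new viaExtend().readToken(42)
const byHandResult = new byHand().readToken(42)
```

The last plugin applied ends up outermost, so its override runs first and reaches the earlier ones through super. Returning the result of the super call keeps the wrapper transparent to callers that use the return value.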

acorn's People

Contributors

abraidwood, adams85, adrianheine, andarist, aparajita, danez, davidbonnet, dnalborczyk, forbeslindesay, jdalton, jlhwung, jmm, kaicataldo, laosb, longtengdao, marijnh, mathiasbynens, mysticatea, not-an-aardvark, nzakas, ota-meshi, rich-harris, rreverser, sebmck, sosukesuzuki, susiwen8, timothygu, tyrealhu, wjx, xiemaisi

acorn's Issues

Feature Request: add new visitors to a recursive walker

Could a slight modification to acorn.walk.make be made that allows for the creation of new walker visitors?

I have some code in a known structure I'm trying to parse and wanted to handle these parts of the AST specially.

In essence, I would like to be able to write

acorn.walk.recursive(ast, state, {
  ObjectExpression: function(n, st, c) {
    if (..some rule..) c(n.properties[0], st, 'SpecialPropertyIdent')
  },
  SpecialPropertyIdent: function(n, st, c) {
    ...
  }
});

Right now, I have to make a walker and then attach my new visitors onto it later, which seems a little inconvenient.
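
A minimal sketch of the requested behaviour, using a tiny stand-in walker rather than acorn.walk itself: the recursive function and the hand-written AST below are assumptions made for this example. The point is only that the continuation c can dispatch on an override name that is not a real node type.

```javascript
// Minimal stand-in for a recursive walker; not acorn.walk.
function recursive(node, state, visitors, override) {
  function c(node, st, override) {
    // Dispatch on the override name when given, else on the node's own type.
    const visit = visitors[override || node.type]
    if (visit) visit(node, st, c)
  }
  c(node, state, override)
}

// Hand-written AST fragment, only for the demonstration.
const ast = {
  type: "ObjectExpression",
  properties: [
    {type: "Property", key: {type: "Identifier", name: "id"}},
    {type: "Property", key: {type: "Identifier", name: "other"}}
  ]
}

const seen = []
recursive(ast, seen, {
  ObjectExpression(n, st, c) {
    // Route the first property through a custom, non-AST visitor name.
    if (n.properties.length > 0) c(n.properties[0], st, "SpecialPropertyIdent")
  },
  SpecialPropertyIdent(n, st) {
    st.push(n.key.name)
  }
})
```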

Issue with range information and parenthesis

I am in the process of porting the PaperScript parser of Paper.js over to using Acorn.js, using the range information in the AST to directly modify the original source, instead of modifying the AST and converting it back to code. I decided to do so for two reasons: to preserve line numbers in syntax errors when evaluating the resulting code, and to keep total code size down, since I can then skip the inclusion of Escodegen.

Things are working pretty well already, but I have discovered that the range information is a little off in certain cases:

In the following statement, I am getting a wrong start offset in the range, excluding the inner opening parenthesis:

if ((1) === 1) {
}

If I log the substring of the whole BinaryExpression to the console, I get:

1) === 1

Instead of

(1) === 1

This seems to be a bug, right?
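
To reproduce the comparison without a parser, the sketch below slices the source by hand. The start and end offsets are counted manually from the example string and are assumptions, not output captured from Acorn; textOf is a helper introduced here.

```javascript
const source = "if ((1) === 1) {\n}"

// Return the source text covered by a node's start/end offsets.
function textOf(node) {
  return source.slice(node.start, node.end)
}

// Offsets matching the reported behaviour: start skips the inner "(".
const reported = {type: "BinaryExpression", start: 5, end: 13}
// Offsets that would include the parenthesized left operand.
const expected = {type: "BinaryExpression", start: 4, end: 13}

console.log(textOf(reported)) // 1) === 1
console.log(textOf(expected)) // (1) === 1
```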

strict mode?

Line 17: "strict mode"

Isn't it supposed to be "use strict" ?

:-)

parser fails on "if(1)/x/"

The regex is not parsed correctly in this case: "if(1)/x/"

master ~/work/acorn $ bin/acorn regr003-google.js 
{
  "type": "Program",
  "start": 0,
  "end": 8,
  "body": [
    {
      "type": "IfStatement",
      "start": 0,
      "end": 8,
      "test": {
        "type": "Literal",
        "start": 3,
        "end": 4,
        "value": 1,
        "raw": "1"
      },
      "consequent": {
        "type": "ExpressionStatement",
        "start": 6,
        "end": 8,
        "expression": {
          "type": "Literal",
          "start": 6,
          "end": 8,
          "value": {},
          "raw": "x/"        <<<< missing first slash
        }
      },
      "alternate": null
    }
  ]
}

master ~/work/acorn $ git describe --tags
v0.1-6-g9a55d60

master ~/work/acorn $ cat regr003-google.js 
if(1)/x/

master ~/work/acorn $ esparse --raw regr003-google.js 
{
    "type": "Program",
    "body": [
        {
            "type": "IfStatement",
            "test": {
                "type": "Literal",
                "value": 1,
                "raw": "1"
            },
            "consequent": {
                "type": "ExpressionStatement",
                "expression": {
                    "type": "Literal",
                    "value": {},
                    "raw": "/x/"   <<<< not missing slash
                }
            },
            "alternate": null
        }
    ]
}

Single line comments

Comments that use the // syntax don't seem to show up in the AST even when trackComments is true.

Parse failure on MooTools

MooTools 1.4.1 does not get parsed correctly. It gives the following error:

"Unsyntactic break (2332:45)"

where line 2332 is

if (name == '*' && this.brokenStarGEBTN) break simpleSelectors;

Most likely simpleSelectors was not in the list of detected labels?
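
For reference, a labelled break of this shape is valid JavaScript whenever the break sits inside the statement carrying the label. The function below is a hypothetical reduction written for this example, not MooTools code.

```javascript
// Count how many names are visited before a "*" ends the whole search.
function visitedBeforeStar(groups) {
  let visited = 0
  simpleSelectors: for (const group of groups) {
    for (const name of group) {
      visited++
      // Leaves both loops at once by targeting the outer label.
      if (name === "*") break simpleSelectors
    }
  }
  return visited
}

const visitedCount = visitedBeforeStar([["a", "b"], ["*", "c"], ["d"]])
```

A parser has to keep the label active while it parses the labelled statement's body, including nested loops, for the break to be accepted.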

html comments are valid in JS

I've been having an interesting discussion over on literalizer issue #19 about whether HTML comments should be parsed as "valid JS".

If you check it, the following code is totally valid in most JS engines:

var a = 2;
<!-- a = 3;
--> a = 4;
console.log(a); // 2

More specific info from @mathiasbynens can be found on this StackOverflow question.

Briefly, <!-- and --> are treated as single-line comment markers. But, --> has to be the first non-whitespace, non-multi-line-terminator content on a line to be treated as such. This comes not from strict ES spec, but from the extended "Web ES" spec which JS engines in browsers adhere to.
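
A rough sketch of the position rule for -->, under a deliberately simplified assumption: only whitespace may precede it on its line, and only \n counts as a line break. Multi-line comments before it and the other line terminators are ignored here. isHtmlCloseComment is a name made up for this example.

```javascript
// Returns true when "-->" at index pos would start a single-line comment
// under the simplified rule: nothing but whitespace precedes it on its line.
function isHtmlCloseComment(source, pos) {
  if (source.slice(pos, pos + 3) !== "-->") return false
  const lineStart = source.lastIndexOf("\n", pos - 1) + 1
  return /^\s*$/.test(source.slice(lineStart, pos))
}

const sample = "var a = 2;\n  --> a = 4;\nx --> y"
const atLineStart = isHtmlCloseComment(sample, sample.indexOf("-->"))
const midLine = isHtmlCloseComment(sample, sample.lastIndexOf("-->"))
```

In this sample the first --> qualifies as a comment marker, while the second one follows other code on its line and would be read as the operators -- and >.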

Even node.js handles these, since it uses V8.

However, it appears that most JS tools (acorn, esprima, traceur, etc) don't handle such things.

Why does this matter?

var a = 1, b = 1; a <!--b;

That code would be parsed differently by these tools than by the JS engines. That's a "bad thing"™.

So, should(n't) JS tools adhere to what the browser engines will do with JS rather than the pure academic ES spec?

Moreover, what happens if some tools allow these and some don't? Is that a healthy thing, or will it hurt tool interop, so that we should all be unanimous one way or the other?

Support for some ES6 features

I would like to parse some JS with acorn, but it uses some ES6 features supported by the SpiderMonkey engine, specifically const and let. I made a quick-and-dirty patch to implement support for both, and it seems to be working fine. I was wondering if there is any interest in merging such changes. If so, I can clean up my patch and submit a PR.

RangeError: Maximum call stack size exceeded on a seemingly simple file

Parsing a file with a single variable declaration initialized to a big string causes RangeError to be raised.

var acorn = require("./acorn.js");
var fs = require("fs");
fs.readFile("./test/jquery-string.js", function(err, data) {
    if (err) throw err;
    acorn.parse(data);
});
master ~/work/acorn $ node test1.js 

/home/denis/work/acorn/acorn.js:804
        return finishToken(_string, String.fromCharCode.apply(null, rs_str));
                                                        ^
RangeError: Maximum call stack size exceeded
    at readString (/home/denis/work/acorn/acorn.js:804:57)
    at getTokenFromCode (/home/denis/work/acorn/acorn.js:646:14)
    at readToken (/home/denis/work/acorn/acorn.js:692:15)
    at next (/home/denis/work/acorn/acorn.js:936:5)
    at eat (/home/denis/work/acorn/acorn.js:1013:7)
    at parseVar (/home/denis/work/acorn/acorn.js:1370:19)
    at parseStatement (/home/denis/work/acorn/acorn.js:1251:14)
    at parseTopLevel (/home/denis/work/acorn/acorn.js:1073:18)
    at Object.exports.parse (/home/denis/work/acorn/acorn.js:42:12)
    at /home/denis/work/acorn/test1.js:5:11

master ~/work/acorn $ node --version
v0.8.20

master ~/work/acorn $ git describe --tags 
v0.1-3-g782259b

Passing --stack-size=1000 to node seems to fix it, but esprima handles the file without any errors and without any additional options, so I'm not sure if that's expected behavior or a bug.
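
The stack trace points at String.fromCharCode.apply being called with one argument per character of the string. One possible workaround, sketched here as an assumption and not as the fix Acorn adopted, is to convert the character codes in bounded chunks. codesToString and the chunk size are made up for this example.

```javascript
// Build a string from an array of character codes without passing the
// whole array as a single argument list.
function codesToString(codes, chunkSize = 8192) {
  let out = ""
  for (let i = 0; i < codes.length; i += chunkSize) {
    out += String.fromCharCode.apply(null, codes.slice(i, i + chunkSize))
  }
  return out
}

// Half a million "a" characters, converted chunk by chunk.
const codes = new Array(500000).fill(97)
const big = codesToString(codes)
```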

Safe tokenizer API

Is there any hope to get a safe (as in, that won't reset the internal state in some cases)
tokenizer with Esprima-compatible output?

I would love to be able to compare Esprima's tokenizer() function that
I'm working on at https://github.com/espadrine/esprima

The basic API that I'm looking for is:

tokenizer(inputSource :: String, options :: Object)
:: Array of {type :: String, value :: String, (optional loc field for line information)}

That would output all the tokens in one shot.

Obviously, we can discuss how to deal with /.
Tim Disney seems to have found a good way to work on it at https://github.com/mozilla/sweet.js/wiki/design

Quick Question: Rebuilding the source

Hi Marijn,

I've done a quick search and couldn't find anything. Essentially, I'm looking for a walker which is the reverse of parse: fn(Object) -> String

Any clues 😄 ?
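
As a rough idea of what such a reverse walker looks like, here is a toy sketch covering only five node types and a hand-written AST. It is an illustration of the shape fn(Object) -> String, not a usable code generator; a complete tool such as Escodegen handles the full node set, precedence, and formatting.

```javascript
// Toy code generator: turns a tiny subset of AST nodes back into source text.
function generate(node) {
  switch (node.type) {
    case "Program":
      return node.body.map(generate).join("\n")
    case "ExpressionStatement":
      return generate(node.expression) + ";"
    case "CallExpression":
      return generate(node.callee) + "(" + node.arguments.map(generate).join(", ") + ")"
    case "Identifier":
      return node.name
    case "Literal":
      return node.raw
    default:
      throw new Error("Unsupported node type: " + node.type)
  }
}

// Hand-written AST for: funcall(1, 2);
const tree = {
  type: "Program",
  body: [{
    type: "ExpressionStatement",
    expression: {
      type: "CallExpression",
      callee: {type: "Identifier", name: "funcall"},
      arguments: [
        {type: "Literal", value: 1, raw: "1"},
        {type: "Literal", value: 2, raw: "2"}
      ]
    }
  }]
}

const regenerated = generate(tree)
```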

AST serialization

First, thank you for such a great project.
Is there any way to serialize resulting AST, e.g. after code instrumentation?

Code generation?

Not an issue, just a question. Do you know of any projects that can do code generation based on an acorn AST?

Thanks,

Chris

`locations` option doesn't work with `parse_dammit`

I can get it to work with parse, but not parse_dammit. By "doesn't work" I mean that the loc object is not included on any nodes.

I'd be happy to attempt to fix this myself, but I'm curious to know if this is currently supposed to work or not.

acorn_loose on really broken code parsing

Awesome lib, thanks so much for it.

I want to handle parsing of really broken code, and tried parsing the code below with parse_dammit.

getDummyLoc(dummy) throws an error, as there is no valid token and I have enabled options.locations.

I don't want to fiddle with this code as I'm not worthy, so won't even attempt a pull request.
In the meantime I modified my version to skip setting loc.end if loc.start == undefined

function

** =

function a()
{
}

Can not load acorn in a web worker

Using acorn in a web worker:

importScripts("thirdparty/acorn/acorn.js");

results in the following error: Uncaught ReferenceError: window is not defined

Tag version 0.3.2

0.3.2 appears to be the latest version. It'd be good to tag it on GitHub so it can be installed through Bower. Thanks!

`throw \n 1;` is a parse error

Currently acorn accepts the following program, which should result in a parse error (throw is a restricted production):

throw
1
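
The check a parser needs here is small: no line terminator may appear between the throw keyword and its argument. The sketch below works on raw offsets and ignores comments between the two tokens; checkThrow, its parameters, and the error message are all made up for this example.

```javascript
// Line terminators recognised by ECMAScript.
const lineBreak = /\r\n?|\n|\u2028|\u2029/

// throwEnd: offset just past the "throw" keyword.
// argStart: offset where the next token begins.
function checkThrow(source, throwEnd, argStart) {
  if (lineBreak.test(source.slice(throwEnd, argStart))) {
    throw new SyntaxError("Illegal newline after throw")
  }
}

// Helper for the demonstration: report whether the check rejects the input.
function rejects(source, throwEnd, argStart) {
  try {
    checkThrow(source, throwEnd, argStart)
    return false
  } catch (e) {
    return e instanceof SyntaxError
  }
}

const rejectsNewline = rejects("throw\n1", 5, 6)
const rejectsSameLine = rejects("throw 1", 5, 6)
```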

Feature request: parse multiple files into a single AST

This is useful in UglifyJS to compress multiple files and generate a proper source map.

Basically, the parser would receive a Program node and will append new statements to its body instead of creating a new Program. The tricky part is that besides start/end the "loc" property needs also to contain "source" (specified in the SpiderMonkey parser API).

guardedHandlers

Not sure if this is a bug or I'm missing something. According to https://developer.mozilla.org/en-US/docs/SpiderMonkey/Parser_API
TryStatement is always expected to have "guardedHandlers" property.

But on this link https://bugzilla.mozilla.org/show_bug.cgi?id=742612 "guardedHandlers" is marked as optional. The latter document could be outdated though.

So, esprima always adds an empty array for "guardedHandlers". Should acorn do this too?

Escodegen assumes this property is always present in the AST.

Bugs in the parser

Hello!

I am using acorn on a project I am working on, and in the process I found a few bugs. I also have quick fixes for some of them, so I am submitting them to you.

I am reporting all of them here, but if you prefer me to open separate issues, please tell me and I will be happy to do that.

However, the bugs I found are:

  1. an if statement built like this: if(expression)throw"Error";else do_something_else() throws an "unexpected token" error.
    REASON: acorn "closes" the token after the closing quote, so the semicolon is interpreted as an EmptyStatement, which causes the IfStatement to close. Thus, the "else" token becomes "unexpected".
    FIX: Adding the line if(type === "ThrowStatement") eat(_semi) to the function finishNode(node, type) (line #924) fixed the issue for me.

  2. the string "something\u0026bsomething else" throws "Bad character escape sequence".
    REASON: readInt(radix, len) with radix=16 (hexadecimal) stops reading only when it reaches a character that is not a hex digit. So the next char ("b", which is a valid hex digit) is appended to the digits of the ampersand escape (\u0026), producing "0026b", which is not a valid unicode escape.
    FIX: if a len parameter is passed to readInt(radix, len) (line #674), read only "len" chars.
    What I did was adding in that function an if(len!=null).
    Then, in the if branch, changing the for(;;) to for (var i = 0; i < len; i++).
    In the else branch, instead, I left the for(;;) as it was before.

  3. statement labeled with the same name as another one throws "Label <label_name> is already declared".
    REASON: in JavaScript (at least in Google Chrome) it is allowed to use the same label name for different statements, as long as they are not nested. In acorn this seems to be forbidden.
    I am sorry but I don't have a fix for this yet.
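
The fix proposed in item 2 can be sketched as a standalone function. This is an illustration of the idea (stop after len digits when len is given), not the code in acorn.js; readHex, hexValue, and the return shape are assumptions made for this example.

```javascript
// Value of a single hex digit, or -1 when ch is not a hex digit.
function hexValue(ch) {
  return /^[0-9a-fA-F]$/.test(ch) ? parseInt(ch, 16) : -1
}

// Read a hexadecimal number starting at pos.
// With len given, read exactly len digits; otherwise read as many as possible.
// Returns {value, end}, or null when the digits are missing or too few.
function readHex(input, pos, len) {
  const start = pos
  const limit = len == null ? Infinity : len
  let total = 0
  for (let i = 0; i < limit; i++) {
    const val = pos < input.length ? hexValue(input[pos]) : -1
    if (val < 0) break
    total = total * 16 + val
    pos++
  }
  if (pos === start || (len != null && pos - start !== len)) return null
  return {value: total, end: pos}
}

// The digits that follow "\u" in "something\u0026bsomething else".
const afterEscape = "0026bsomething else"
const bounded = readHex(afterEscape, 0, 4)
const unbounded = readHex(afterEscape, 0)
```

With the length limit, the escape yields code 0x26 (the ampersand) and stops before the "b". Without it, the "b" is consumed as a fifth digit, which is the behaviour reported above.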

Hope this helps,
thank you for your work, I found it really useful!!

Cheers,
Luca

acorn.walk, descending into MemberExpression.property

The base walker code for MemberExpression is

 base.MemberExpression = function(node, st, c) {
    c(node.object, st, "Expression");
    if (node.computed) c(node.property, st, "Expression");
  };

Is there any reason to descend into node.property only when the node is marked as "computed" (the [...] accessor)?

Comments are reported twice when using strict mode.

This code should log the comment "Comment" only once, as it does without the 'use strict' statement.

var content = "\n\
function plop() {\n\
    'use strict';\n\
    /* Comment */\n\
}";

acorn.parse(content, {
  onComment: function (block, text) {
    console.log(text);
  }
});

Tracking whether comments are same line or not

When 'un-parsing' the AST, it would be nice to know whether comments begin on a new line or not. Right now it is not possible to tell. The following code:

var acorn = require('acorn');
var code1 = 'funcall();//comment same line\n';
var code2 = 'funcall();\n//comment next line\n';
var ast1 = acorn.parse(code1,{trackComments:true});
var ast2 = acorn.parse(code2,{trackComments:true});
console.log(JSON.stringify(ast1));
console.log(JSON.stringify(ast2));

outputs:

{"type":"Program","start":0,"end":10,"body":[{"type":"ExpressionStatement","start":0,"end":10,"expression":{"type":"CallExpression","start":0,"callee":{"type":"Identifier","start":0,"end":7,"name":"funcall"},"arguments":[],"end":9,"commentsAfter":["//comment same line"]}}]}
{"type":"Program","start":0,"end":10,"body":[{"type":"ExpressionStatement","start":0,"end":10,"expression":{"type":"CallExpression","start":0,"callee":{"type":"Identifier","start":0,"end":7,"name":"funcall"},"arguments":[],"end":9,"commentsAfter":["//comment next line"]}}]}

Question

Is it thread safe?
I mean is it safe to parse different pieces of code at the same time?

Problems with occurrence of 'self' in code.

In paper.js we are loading acorn.js through the vm.runInContext(source, context, uri); method. Executing it this way leads to the following error:

ReferenceError: self is not defined
    at ../../lib/acorn.js:26:7
    at ../../lib/acorn.js:27:3
    at vm.createContext.include (……/node_modules/paper/src/node/index.js:45:6)
    at core/PaperScript.js:19:7
    at vm.createContext.include (……/node_modules/paper/src/node/index.js:45:6)
    at new <anonymous> (paper.js:126:7)
    at paper.js:32:13
    at Context.vm.createContext.include (……/node_modules/paper/src/node/index.js:45:6)
    at Object.<anonymous> (……/node_modules/paper/src/node/index.js:51:9)
    at Module._compile (module.js:456:26)

Replacing 'self' with 'this' solves the issue, and should still work in browsers too. Would this make sense?
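
For comparison, a common way to find the global object without assuming that either self or window exists is to probe for each name with typeof. This is a general sketch, not the change made in acorn.js; getGlobal is a name made up here, and globalThis is only available in newer engines.

```javascript
// Locate the global object in browsers, workers, Node, and vm contexts.
function getGlobal() {
  if (typeof globalThis !== "undefined") return globalThis
  if (typeof self !== "undefined") return self
  if (typeof window !== "undefined") return window
  // Last resort: in sloppy-mode function code, `this` is the global object.
  return Function("return this")()
}

const root = getGlobal()
```

Because typeof never throws on an undeclared name, this avoids both "window is not defined" in workers and "self is not defined" in contexts that lack it.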

paperjs/paper.js#205

Semicolons after variable declarations

Hi Marijn,

I noticed that for var x = 0;, the variable declaration's end location does not include the semicolon:

> acorn.parse("var x = 0;")
{ type: 'Program',
  start: 0,
  end: 10,
  body: 
   [ { type: 'VariableDeclaration',
       start: 0,
       end: 9,
       declarations: [Object],
       kind: 'var' } ] }

That has been an issue for me when using acorn to break JavaScript code up into top-level statements, because the statements in the output list lack semicolons after variable declarations even when they were present in the original code.

I started a pull request, but noticed that other tests in the suite treat dropping the semicolon as correct. For example, the test for var x /* comment */; wants the variable declaration to end at column 5.

Should I change the other tests to expect the semicolon?

Thanks.

Parse error: "\0" in "strict" mode

In strict mode it doesn't like "\0":

"use strict";
var x = "\0";

errors out with:

/.../acorn/acorn.js:158
    throw new SyntaxError(message);
          ^
SyntaxError: Octal literal in strict mode (2:9)
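
The distinction the tokenizer has to make is that "\0" on its own is the NUL escape, which strict mode allows, while any other escape starting with an octal digit is a legacy octal escape. A minimal sketch of that test follows; isForbiddenOctalEscape is a hypothetical helper, and pos is assumed to point just past the backslash.

```javascript
// Decide whether the escape whose body starts at input[pos] must be
// rejected in strict mode as a legacy octal escape.
function isForbiddenOctalEscape(input, pos) {
  const match = /^[0-7]{1,3}/.exec(input.slice(pos, pos + 3))
  if (!match) return false
  // "\0" not followed by a decimal digit is the NUL escape and is allowed.
  const next = input[pos + 1] || ""
  if (match[0] === "0" && !/[0-9]/.test(next)) return false
  return true
}

// Escape bodies as they appear after the backslash in the source text.
const nulEscape = isForbiddenOctalEscape('0";', 0)
const octalEscape = isForbiddenOctalEscape('01";', 0)
```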

Invalid use of tokLineStart

In readToken_plus_min, you have the following line of code:

if (next == 45 && input.charCodeAt(tokPos + 2) == 62 && lastEnd < tokLineStart) {

The problem is tokLineStart has no meaning unless options.locations is on, and that check is not being done here. So either you have to always update tokLineStart irrespective of options.locations or find another way to make this test.

Track comments misses comments within an expression

The following example:

var acorn = require('acorn');
var code1 = 'funcall(1/*mid expression comment*/,2);';
var ast1 = acorn.parse(code1,{trackComments:true});
console.log(JSON.stringify(ast1));

outputs:

{"type":"Program","start":0,"end":39,"body":[{"type":"ExpressionStatement","start":0,"end":39,"expression":{"type":"CallExpression","start":0,"callee":{"type":"Identifier","start":0,"end":7,"name":"funcall"},"arguments":[{"type":"Literal","start":8,"end":9,"value":1,"raw":"1"},{"type":"Literal","start":36,"end":37,"value":2,"raw":"2"}],"end":38}}]}

Track comments duplicates a comment in commentsBefore and commentsAfter

If a comment shows up in commentsAfter on one node, it would be great if it did not show up in commentsBefore on another node. Duplicating comments makes 'unparsing' the ast with comments tricky, as you have to somehow track whether you emitted the comment yet or not. The following example:

var acorn = require('acorn');
var code1 = 'funcall();/*comment between statements*/funcall2();';
var ast1 = acorn.parse(code1,{trackComments:true});
console.log(JSON.stringify(ast1));

outputs:

{"type":"Program","start":0,"end":51,"body":[{"type":"ExpressionStatement","start":0,"end":10,"expression":{"type":"CallExpression","start":0,"callee":{"type":"Identifier","start":0,"end":7,"name":"funcall"},"arguments":[],"end":9,"commentsAfter":["comment between statements"]}},{"type":"ExpressionStatement","start":40,"end":51,"commentsBefore":["comment between statements"],"expression":{"type":"CallExpression","start":40,"callee":{"type":"Identifier","start":40,"end":48,"name":"funcall2"},"arguments":[],"end":50}}]}
