butternut's Introduction

Butternut

The fast, future-friendly minifier. Try before you buy at butternut.now.sh

Warning: this is alpha software. Test thoroughly before using in production! Consider using the check option. Please report any bugs you find!

Why?

Butternut is significantly faster than other JavaScript minifiers, and works with the latest version of JavaScript (ES2015, aka ES6, and beyond). It's typically around 3x faster than UglifyJS with default minify options, and 10-15x faster than Babili.

The compression is better than Babili and closure-compiler-js (in standard compilation mode — you can get better results with Closure in advanced mode, but only by writing your code in a very particular way). It's almost as good as Uglify in its current version.

You can test out the different tools with npm run bench.

Note: UglifyJS supports ES2015+ as of very recently — see uglify-es.

How?

The traditional approach to minification is this: parse your source code into an abstract syntax tree (AST) using something like Acorn, manipulate the AST, and finally generate code from it.

Butternut takes a different approach. It still uses Acorn to generate an AST, but instead of manipulating the AST and regenerating code from it (the second and third steps above), it edits the original source in place using magic-string, which is much less costly than AST manipulation and code generation.
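To make the edit-in-place idea concrete, here is a minimal sketch in plain JavaScript. It uses neither Acorn nor magic-string: `applyEdits` is a hypothetical helper, and the character offsets are written by hand where a real tool would read them from AST nodes.

```javascript
// Hypothetical helper: apply non-overlapping edits to the original string.
// Each edit is { start, end, text }, with offsets into the ORIGINAL source,
// which is the kind of position information an AST node carries.
function applyEdits(source, edits) {
  const sorted = [...edits].sort((a, b) => a.start - b.start);
  let out = '';
  let cursor = 0;
  for (const { start, end, text } of sorted) {
    out += source.slice(cursor, start) + text; // copy untouched code, then the replacement
    cursor = end;
  }
  return out + source.slice(cursor);
}

const source = 'var longName = 1 ;';
// Offsets written by hand for this example (assumption: a parser would supply them).
const edits = [
  { start: 4, end: 12, text: 'a' }, // rename longName to a
  { start: 12, end: 13, text: '' }, // drop the space before =
  { start: 14, end: 15, text: '' }, // drop the space after =
  { start: 16, end: 17, text: '' }, // drop the space before ;
];
const squashed = applyEdits(source, edits);
console.log(squashed); // var a=1;
```

Everything outside the edited ranges is copied through untouched, so no code generator is needed.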

Usage

The easiest way to use Butternut is to plug it into your existing build process:

Alternatively, you can use it directly via the CLI or the JavaScript API:

Command Line Interface

Install Butternut globally, then use the squash command:

npm install --global butternut # or npm i -g butternut
squash app.js > app.min.js

Run squash --help to see the available options.

JavaScript API

Install Butternut to your project...

npm install --save-dev butternut # or npm i -D butternut

...then use it like so:

const butternut = require('butternut');
const { code, map } = butternut.squash(source, options);

The options argument, if supplied, is an object that can have the following properties:

Option | CLI equivalent | Default value | Description
check | --check | false | Parse output. See below
allowDangerousEval | n/a | false | Whether to allow direct eval calls
sourceMap | -m, --sourcemap | true | Whether to create a sourcemap. Set to inline to append to the output (not recommended)
file | n/a (automatic) | null | The output filename, used in sourcemap generation
source | n/a (automatic) | null | The source filename, used in sourcemap generation
includeContent | n/a | true | Whether to include the source file in the sourcesContent property of the generated sourcemap

The check option

Since Butternut is a new project, it hasn't yet been battle-tested. It may generate code that you don't expect. If you pass check: true (or use the --check flag, if using the CLI), Butternut will parse the generated output to verify that it is valid JavaScript. If not, it means it's messed something up, in which case it will try to help you find the code that it failed to minify correctly.
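The idea behind a parse check can be sketched with Node's built-in vm module. This is not Butternut's implementation; `parsesAsScript` is a hypothetical helper that only shows what "verify that the output is valid JavaScript" can and cannot catch.

```javascript
const vm = require('vm');

// Hypothetical helper: returns true if `code` parses as a script.
// new vm.Script() compiles the code without running it and throws on a syntax error.
function parsesAsScript(code) {
  try {
    new vm.Script(code);
    return true;
  } catch (err) {
    if (err && err.name === 'SyntaxError') return false;
    throw err;
  }
}

const results = {
  // A body-less loop that kept its semicolon: fine.
  valid: parsesAsScript('function f(){while(g());}'),
  // The same loop with the semicolon stripped: caught by the check.
  missingLoopBody: parsesAsScript('function f(){while(g())}'),
  // Parses (as an arrow function with a parameter named returna), so a
  // parse-only check does NOT catch this change in meaning.
  validButWrong: parsesAsScript('function foo(){returna=>a}'),
};
console.log(results);
```

A parse check only catches output that is syntactically invalid. Output that parses but behaves differently still has to be caught by tests.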

If you find bugs while using Butternut, please raise an issue!

License

MIT

butternut's People

Contributors

etsms, jbt, kzc, lguzzon, loilo, rich-harris, rreverser

butternut's Issues

extract vars out of removed blocks

Input

function foo() {
  if (false) {
    var a = 1;
  }
  a = 2;
  return a;
}

If the if (false) block is removed by dead code elimination (DCE), the var a declaration is removed with it, so a = 2 now assigns to a global.

Expected:

function foo() { var a = 2; return a; }
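The leak can be observed directly. The sketch below uses `new Function`, whose bodies run in sloppy mode, and a made-up variable name (`hoistDemo`) so it does not collide with anything else. The three function bodies stand for the input, the output with the block naively removed, and the expected output.

```javascript
// Input: the block never runs, but `var` is hoisted to the enclosing function.
const original = new Function(`
  if (false) { var hoistDemo = 1; }
  hoistDemo = 2;
  return hoistDemo;
`);

// Block removed together with its declaration: the assignment now creates a global.
const blockRemoved = new Function(`
  hoistDemo = 2;
  return hoistDemo;
`);

// Expected output: the declaration survives the removal of the block.
const declarationKept = new Function(`
  var hoistDemo = 2;
  return hoistDemo;
`);

original();
const leakedByOriginal = 'hoistDemo' in globalThis;   // false
blockRemoved();
const leakedAfterRemoval = 'hoistDemo' in globalThis; // true
delete globalThis.hoistDemo;                          // clean up the leaked global
declarationKept();
const leakedByExpected = 'hoistDemo' in globalThis;   // false
```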

Error with returning Arrow

Input

function foo() {
  return (a) => a; // return a => a works fine though
}

Actual output

No space between return and a.

function foo(){returna=>a}

Shorthand properties fail

// input
(function () {
  var longname = 1;
  var obj = { longname };
  console.log(obj);
}());

// expected
!function(){var a=1,b={longname:a};console.log(b)}()

// actual
!function(){var a=1,b={a};console.log(b)}()

BinaryExpression literal inlining too eagerly

BinaryExpressions are being replaced with a literal whenever neither operand is UNKNOWN, which erroneously includes cases where an operand is only known to be TRUTHY or FALSY.

So for example things like these are ending up wrong:

  • {}==={} -> !0 (i.e. true - admittedly a bit contrived)
  • 'translate(' + [ x, y ] + ')' -> "translate([object Object])"
  • 'b' in {b:1} -> !1 (i.e. false)

Seems fixable by just checking both operands are definitely also not either TRUTHY or FALSY - PR coming right up
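A minimal sketch of the proposed check, under stated assumptions: the sentinel names UNKNOWN, TRUTHY and FALSY come from the issue text, while `foldBinary`, `isConcrete` and the operator table are hypothetical and not Butternut's code.

```javascript
// Sentinels for values that are only partially known at compile time.
const UNKNOWN = Symbol('UNKNOWN');
const TRUTHY = Symbol('TRUTHY'); // e.g. an object or array literal: truthy, exact value unknown
const FALSY = Symbol('FALSY');

const isConcrete = (value) => value !== UNKNOWN && value !== TRUTHY && value !== FALSY;

// A small subset of binary operators, enough for the example.
const operators = new Map([
  ['===', (a, b) => a === b],
  ['+', (a, b) => a + b],
  ['<', (a, b) => a < b],
]);

// Fold only when BOTH operands are concrete values; otherwise report UNKNOWN
// so the caller leaves the expression in the output as written.
function foldBinary(operator, left, right) {
  if (!isConcrete(left) || !isConcrete(right)) return UNKNOWN;
  const evaluate = operators.get(operator);
  return evaluate ? evaluate(left, right) : UNKNOWN;
}

const folded = foldBinary('+', 1, 2);                     // 3
const objects = foldBinary('===', TRUTHY, TRUTHY);        // UNKNOWN: {}==={} is left alone
const concatenation = foldBinary('+', 'translate(', TRUTHY); // UNKNOWN
```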

Mangle property names

This can have a substantial impact on generated code size. Would like it to be automated though — no annotations or white/blacklists, instead Butternut should ideally be smart enough to figure out whether a given property is accessible from the outside world (either because it's part of an exported object, returned from an exported function, or is part of the environment). Definitely not trivial, but would be neat

Broken code near switch statement

Not totally figured out what's going on here, but this fails:

function foo () {
  switch (a) {
    case 0: while (++i < l) { b(); } return;
  }
};

This on the other hand is fine:

function foo () {
  switch (a) {
    case 0: while (++i < l) b(); return;
  }
};

`var` declarations that clash with function should not be considered duplicates

Repro:

var x = function thing ( scope ) {
  var thing = fn();
  return thing;
};

// expected
var x=function b(a){var b=fn();return b}

// actual
var x=function b(a){b=fn();return b}

// ideal (removing the argument is debatable)
var x=function(){return fn()}

Assigning to the function name results in a TypeError ('Assigning to constant variable').
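The difference between the expected and actual output can be reproduced in isolation. In the sketch below, `fn` is a stand-in for the undefined helper in the repro, and both functions opt into strict mode so the failure is visible. In sloppy mode the assignment is silently ignored instead, and the function returns itself.

```javascript
function fn() { return 42; }

// Expected output: the inner `var b` shadows the function's own name.
const expected = function b(a) { 'use strict'; var b = fn(); return b; };

// Actual output: without `var`, `b` is the read-only binding of the function name.
const actual = function b(a) { 'use strict'; b = fn(); return b; };

let error = null;
try {
  actual();
} catch (err) {
  error = err; // TypeError in strict mode
}
const expectedResult = expected(); // 42
```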

`[].slice || ...` generates broken code

It looks like [].slice || (...anything) gets replaced with an empty string.

For example, [].slice||1; 1 becomes ,1

A less contrived example: var slice = [].slice || function() {...}, somethingElse = 1; becomes var slice=,somethingElse=1

Seems to happen with .slice and other array methods like indexOf, but not for any old property (i.e. [].foo is fine)

Closure Compiler (Java) compresses better, a lot better.

The README says: "The compression is better than Babili and Closure Compiler"

Unfortunately this claim is factually not true today, not by a long shot in fact:

https://github.com/leeoniya/domvm/blob/2.x-dev/dist/dev/domvm.dev.min.js

GCC yields 15.6 KB.
butternut comes in at 16.7 KB.

The good news is that it passes ~97% of the written tests after compilation and compresses impressively fast. I haven't checked how the final build performs in benchmarks, but GCC does a good amount of optimizations that result in faster perf vs Uglify.

Polyfill-ish code gets eliminated

While checking out #54 I completely failed to realise that the "polyfill" side of the || (i.e. if [].slice is falsy) was getting stripped out completely, breaking the point of having the || in the first place.

Guessing this is due to MemberExpression's getValue returning a truthy [].slice at compile-time.

Slice is probably a bit contrived but I imagine something like:

var includes = [].includes || function(x) { return this.indexOf(x) !== -1; };

... probably doesn't want to be reduced to var includes=[].includes just because the minifier's environment supports Array.prototype.includes.

A suggestion I had was to restrict MemberExpression's getValue to only return a known value for own properties, by adding if ( ! Object.prototype.hasOwnProperty.call( objectValue, this.property.name ) ) return UNKNOWN; here. But that breaks a lot of the nice value-inlining by CallExpression.

There're other alternatives, like having a whitelist of allowable values, or being even more explicit and only returning a foolproof value if objectValue is an array and this.property.name is numeric, but I didn't want to get too opinionated and start making PRs without checking whether it was a good idea first.
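The own-property restriction suggested above can be sketched as follows. `getMemberValue` is a hypothetical stand-in for MemberExpression's getValue, and UNKNOWN is the sentinel named in the issue. This is an illustration of the trade-off, not the project's code.

```javascript
const UNKNOWN = Symbol('UNKNOWN');

// Only trust OWN properties of a compile-time value. Inherited ones such as
// Array.prototype.includes depend on the runtime the minified code ends up in.
function getMemberValue(objectValue, propertyName) {
  if (objectValue === null || objectValue === undefined) return UNKNOWN;
  if (!Object.prototype.hasOwnProperty.call(objectValue, propertyName)) return UNKNOWN;
  return objectValue[propertyName];
}

const inherited = getMemberValue([], 'includes');          // UNKNOWN: keep the || fallback
const ownIndex = getMemberValue([10, 20], '1');            // 20
const ownKey = getMemberValue({ b: 1 }, 'b');              // 1
const inheritedMethod = getMemberValue('abc', 'toUpperCase'); // UNKNOWN
```

The last line shows the cost mentioned in the issue: inherited methods on known values are no longer resolvable, so inlining the result of calls on them would need a separate allowlist.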

DCE can result in broken code

// input
if ( "development" === "production" ) {
  console.log( "running in development mode" );
} else {
  console.log( "running in production mode" );
}

// output
{console.log("running in production mode")}

In some cases that results in a syntax error:

// input
x();

if ( "development" === "production" ) {
  console.log( "running in development mode" );
} else {
  console.log( "running in production mode" );
}

// output
x(),{console.log("running in production mode")}

Readme: How?

It would be nice if the README explained the most interesting part:
What does it do to support bleeding edge JS and yet be significantly faster than Uglify/Babili?

That might boost people's trust and confidence in the project. People like to know what a thing does :)

Plus: it might be worth providing a small setup for benchmarking Butternut, Uglify and Babili on some files, so people can try it on their own machine, and maybe run the benchmark against their own files.

But a good job! This is really useful :)

Deoptimization guard

I'm curious whether it would make sense to introduce some kind of deoptimization guard, which skips an optimization (or maybe throws an error to signal that something is wrong) if the result actually makes the code longer than it was before the change. Here's a contrived edge case example/bug:

Input:

(function({x}){doSomething(x)})

Output:

(function({x:a}){doSomething(a)})

31 B → 33 B (saved -6.5%)
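One possible shape for such a guard, as a sketch only: `guarded` is a hypothetical wrapper, and `renameDestructured` is a string-replacement stand-in that reproduces the rewrite from the example rather than a real minifier pass.

```javascript
// Hypothetical guard: keep a transform's output only if it is not longer than its input.
function guarded(source, transform) {
  const output = transform(source);
  return output.length <= source.length ? output : source;
}

// Stand-in transform that reproduces the rewrite from the example above.
const renameDestructured = (code) => code.replace('{x}', '{x:a}').replace('(x)', '(a)');

const input = '(function({x}){doSomething(x)})';
const unguarded = renameDestructured(input);        // 33 characters, longer than the input
const result = guarded(input, renameDestructured);  // falls back to the 31-character input
```

Comparing raw length is the simplest criterion. A guard could also compare gzipped size, at the cost of extra compute.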

Object literal outputting without parens causes syntax error

The following example:

if ({}.foo) {
  b();
}

gets minified to {}.foo&&b() which causes a syntax error because {}.foo needs parens to be parsed as an expression.

A few other cases where object literals / object destructuring in expressions are getting parens stripped or would need them added, but not sure if they'd be related or need tackling separately:

  • ({a, b, c} = d) -> {a,b,c}=d
  • truthyThing && {}.foo -> {}.foo

The less-contrived real-life example I'm running into is essentially the same as the first example (from setImmediate):

if ({}.toString.call(global.process) === "[object process]") { ...

For the if statements it could be as simple as making sure shouldParenthesiseTest is true if the test starts with {, but it feels like it might need a more general solution also incorporating those other cases.
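A more general rule could be sketched like this: whenever an expression ends up in statement position and its source begins with {, wrap it in parentheses. `asExpressionStatement` and `parses` are hypothetical helpers, and the validity check uses Node's vm module rather than Butternut's parser.

```javascript
const vm = require('vm');

// An expression statement that begins with `{` is parsed as a block,
// so wrap it in parentheses to force expression context.
function asExpressionStatement(expressionSource) {
  return expressionSource.trimStart().startsWith('{')
    ? `(${expressionSource})`
    : expressionSource;
}

// Parse-only validity check; nothing is executed.
function parses(code) {
  try {
    new vm.Script(code);
    return true;
  } catch (err) {
    return false;
  }
}

const broken = '{}.foo&&b()';
const fixed = asExpressionStatement(broken);                 // ({}.foo&&b())
const destructuring = asExpressionStatement('{a,b,c}=d');    // ({a,b,c}=d)
const untouched = asExpressionStatement('a&&b()');           // a&&b()
```

Wrapping the whole statement costs two characters. Wrapping only the leading object literal would be equally valid and is a matter of taste.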

compress options and timings

Congrats on the new project! The world can never have enough JS minifiers.

Just want to make a small point about compress options and timings. To squeeze an extra couple percent in file size it's not uncommon to take 2 or 3 times as long to compute.

Consider the timings and gzipped file sizes if compress is disabled altogether in uglify and only mangle is enabled:

$ bin/squash -v
Butternut version 0.2.0

$ node_modules/.bin/uglifyjs -V
uglify-js 2.8.23
$ /usr/bin/time bin/squash test/fixture/input/three.js | gzip | wc -c
        1.92 real         2.02 user         0.09 sys
  130312

$ /usr/bin/time node_modules/.bin/uglifyjs test/fixture/input/three.js -m | gzip | wc -c
        1.86 real         2.04 user         0.07 sys
  129135
$ /usr/bin/time bin/squash test/fixture/input/d3.js | gzip | wc -c
        1.42 real         1.49 user         0.07 sys
   73812

$ /usr/bin/time node_modules/.bin/uglifyjs test/fixture/input/d3.js -m | gzip | wc -c
        1.47 real         1.64 user         0.06 sys
   73483

With compress disabled, the uglify timings and sizes are comparable to butternut's.

If compress is enabled in uglify it's naturally going to be slower to produce a smaller result:

$ /usr/bin/time node_modules/.bin/uglifyjs test/fixture/input/three.js -mc | gzip | wc -c
        4.31 real         5.08 user         0.12 sys
  127816

but it can be sped up somewhat by disabling certain compress options at the expense of output size:

$ /usr/bin/time node_modules/.bin/uglifyjs test/fixture/input/three.js -mc reduce_vars=false,collapse_vars=false | gzip | wc -c
        3.68 real         4.30 user         0.11 sys
  128466

My only point is that everything is a tradeoff. The most bang for the compute buck comes from variable name mangling.

Minifier options

Will butternut get minifier options that can be opted out of?
For example: preserve comments, preserve newlines and so on.

deopt for direct eval calls

Not sure how often direct evals are used now. Just putting this here in case.

function foo() {
  var a = 1;
  eval("console.log(a)");
}
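Why a direct eval call forces a deopt can be shown with a hand-mangled copy of the function. The names `secretValue`, `original` and `mangled` are made up for this sketch. The point is that the string passed to eval is opaque to the minifier, so renaming the local breaks the lookup.

```javascript
// Original: the string passed to eval refers to the local by its source name.
function original() {
  var secretValue = 1;
  return eval('secretValue');
}

// After naive mangling: the local is renamed, but the string is not.
function mangled() {
  var a = 1;
  return eval('secretValue');
}

let mangledError = null;
try {
  mangled();
} catch (err) {
  mangledError = err; // ReferenceError: secretValue is not defined
}
const originalResult = original(); // 1
```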

Class Decl with a method in IIFE throws

// works
class A {}

// also works
class A {
  foo() {}
}

// also works
(function() {
  class A {}
})()

// throws an error - this.scope is undefined
(function () {
  class A {
    foo() {}
  }
})();
/Users/brajaa/workspace/butternut/dist/butternut.cjs.js:3015
		this.scope.mangle( code );
		          ^

TypeError: Cannot read property 'mangle' of undefined

Non-empty loops being collapsed if first node is EmptyStatement

e.g. while (a) { ; b(); } -> while(a);;

I tried making some test cases and fixing up the check for empty nodes by checking every descendant node rather than just the first, but my do-while case is failing completely for magic-string reasons I don't fully understand.

Additionally, all cases where a curly-braced loop could be reduced to a single-statement body aren't working because I can't figure out how to trim leading/trailing EmptyStatements.

Incidentally it looks like there's a few more cases where leading empty statements in a block change the output (e.g. if (a) { b(); } vs if (a) { ; b(); }) but the loops are the only case where it's doing something it definitely shouldn't.
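The "check every statement, not just the first" idea can be sketched on plain objects in ESTree shape. The nodes below are written by hand, and `isEmpty` and `withoutEmptyStatements` are hypothetical helpers. The magic-string side of the fix (which ranges to remove) is not covered here.

```javascript
// A body is empty only if EVERY statement in it is empty, not just the first one.
function isEmpty(node) {
  if (node.type === 'EmptyStatement') return true;
  if (node.type === 'BlockStatement') return node.body.every(isEmpty);
  return false;
}

// Drop empty statements (and empty nested blocks) so that `{ ; b(); }`
// is left with a single statement and can lose its braces.
function withoutEmptyStatements(block) {
  return { ...block, body: block.body.filter((child) => !isEmpty(child)) };
}

const call = { type: 'ExpressionStatement' }; // stands in for `b();`
const empty = { type: 'EmptyStatement' };

const loopBody = { type: 'BlockStatement', body: [empty, call] }; // { ; b(); }
const reallyEmpty = {
  type: 'BlockStatement',
  body: [empty, { type: 'BlockStatement', body: [] }],            // { ; {} }
};
const trimmed = withoutEmptyStatements(loopBody);
```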

Note

Just saw Rich-Harris/butternut.now.sh in my feed and immediately went to check what it was about.

On behalf of every sentient human being on Earth, thank you Rich for all the impressive work and thought you put into every repo you open on GitHub or GitLab.

Hopefully I'm not over-reacting (or should I say over-svelting to avoid runtime overhead?) here, but it seems to me that your work is consistently of the highest quality, usefulness, and intention, and like so, it deserves continuous praise and encouragement.

May it never be enough! Thank you!

Unused identifiers in object/array patterns are not removed

Edited: no longer a bug, but suboptimal output

// input
(function () {
  var {foo,bar} = obj;
  console.log(foo);
}())

// expected
!function(){var {foo:a}=obj;console.log(a)}()

// or maybe
!function(){console.log(obj.foo)}()

// actual
!function(){var {foo:a,bar:b}=obj;console.log(a)}()

Generator in FunctionExpression is not preserved.

Input:

foo = function*() {
  yield 42;
};

Output:

foo=function(){yield 42}

This throws a SyntaxError: Unexpected strict mode reserved word because the yield is now in a normal (non-generator) function. The expected output is:

foo=function*(){yield 42}

Minify/remove top-level declarations in a module

If there's an import/export declaration, we know that this is a module and that it's therefore safe to a) mangle/remove top-level declarations, and b) remove 'use strict' pragmas.

Probably makes sense to have a module: true option to force it, in cases where there's no import/export.
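A sketch of the detection, assuming an ESTree Program node as input. `isModule` is a hypothetical helper, and `options.module` mirrors the option proposed above rather than an existing Butternut option.

```javascript
// ESTree node types that can only appear in a module.
const MODULE_NODE_TYPES = new Set([
  'ImportDeclaration',
  'ExportNamedDeclaration',
  'ExportDefaultDeclaration',
  'ExportAllDeclaration',
]);

// `program` is an ESTree Program node; `options.module` is the proposed override.
function isModule(program, options = {}) {
  if (options.module === true) return true;
  return program.body.some((node) => MODULE_NODE_TYPES.has(node.type));
}

// Hand-written Program nodes, reduced to the fields this check reads.
const script = { type: 'Program', body: [{ type: 'VariableDeclaration' }] };
const esModule = {
  type: 'Program',
  body: [{ type: 'ImportDeclaration' }, { type: 'VariableDeclaration' }],
};
```

Import and export declarations are only valid at the top level, so scanning `program.body` is enough and no deep traversal is needed.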

(The mangling part also applies to CommonJS but I'm not sure it's worth worrying about that — it's rare to minify stuff for Node, and CommonJS will die out soon enough anyway.)

The keyword 'await' is reserved

Run this specimen through https://butternut.now.sh/?version=0.4.1

{
  await: function await() {}
}

// -> Error: The keyword 'await' is reserved (2:2)  bundle.js:324

Both await occurrences are perfectly valid and are parsed correctly by Chrome (V8), but butternut refuses to minify this.

This code is (essentially) used by the AsyncGenerator babel transform plugin.

Body-less loop with semicolon as last line in block

If you have a loop (for, while) that is body-less and is terminated with a semicolon (rather than braces), and it's the last line in its block, the semicolon is removed, resulting in invalid code.

function f() {
  while (g());
}

produces

function f(){while(g())}

which is invalid. The REPL reproduces this, although its use of check: true prevents you from actually seeing the output.

Remove unused variables

Hey,

Great lib 👍. I was checking the online tool and I noticed that:

(function (xyz) {});

gave the output:

(function(a){})

I expected to receive:

(function(){})

`for (;;)` messes up scope

This is only reproducible on latest master, not on the last released version, 0.3.5.

(() => {
  let foo = 0;
  for (;;) {
    f(foo);
  }
})();

squashes to

!(()=>{for(;;)f(foo)})()

If any of the three parts of the head of the for loop contain some sort of expression, this looks to work correctly.

!(()=>{let a=0;for(x;;)f(a)})()

This can be easily worked around by just using while (true) { ... } but I don't think that for (;;) { ... } is all that obscure of an idiom.

New expressions aren't minified

Repro — whitespace inside the parentheses isn't removed as it is with call expressions:

// input
var foo1 = new Foo(
  1 + 1
);

var foo2 = Foo(2 + 2)

// expected
var foo1=new Foo(2),foo2=Foo(4)

// actual
var foo1=new Foo(
  2
),foo2=Foo(4)

Anonymous fn in export default throws

Input:

export default function () {}

throws an error -

/Users/brajaa/workspace/butternut/dist/butternut.cjs.js:1320
		this.id.declaration = this;
		                    ^

TypeError: Cannot set property 'declaration' of null

"linebreak-style" eslint rule is a footgun on Windows

Freshly cherry-picked repo is full of warnings about incorrect line endings in the repo.

This is because Git by default checks out text files with native line endings (CR LF) and commits them back normalised to LF.

If you want to enforce specific line endings even on user's copy of the repo, the usual way to fix this is to add .gitattributes which can tell Git to always check out with preferred line endings instead of using native ones on the target.

However, in this case, test/fixture/input/Rx.js needs to have all line endings changed too (seems that they're currently CR LF even in the repo).

An alternative would be to add a .gitattributes which ensures that line endings are always correctly converted (text=auto) and to remove the ESLint rule.

In either case, it would be nice to have this fixed so that devs on Windows could contribute too 😄 Just let me know which approach sounds better to you.

Remove dead code blocks

In this example:

let foo = false;
if (foo) console.log('ohno');

The whole if block should get removed. This is very useful for removing development only code blocks from production builds. Example:

if (process.env.NODE_ENV === 'development') {
  // do stuff
}

// before minification this gets transformed into
if ('production' === 'development') {
  // do stuff
}

Unused declarations should not be mangled

Repro:

// input
(function () {
  var foo = 1, bar = 2, baz = 3;
  console.log(baz);
}());

// expected
!function(){var a=3;console.log(a)}()

// actual
!function(){var c=3;console.log(c)}()

Having a and b assigned to unused vars means we get to aa etc sooner, increasing output size.
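The expected behaviour can be sketched as "hand out short names only to declarations that are referenced". Both helpers below are hypothetical, and the name generator is deliberately simplified (lowercase letters only, no reserved-word check).

```javascript
// Turn 0, 1, 2, ... into a, b, ..., z, aa, ab, ...
// Simplified: a real mangler would use a larger alphabet and skip reserved words.
function shortName(index) {
  let name = '';
  let n = index;
  do {
    name = String.fromCharCode(97 + (n % 26)) + name;
    n = Math.floor(n / 26) - 1;
  } while (n >= 0);
  return name;
}

// declarations: [{ name, references }] in source order. Unused declarations get
// no alias, so they do not consume the shortest names.
function assignAliases(declarations) {
  const aliases = new Map();
  let next = 0;
  for (const { name, references } of declarations) {
    if (references > 0) aliases.set(name, shortName(next++));
  }
  return aliases;
}

const aliases = assignAliases([
  { name: 'foo', references: 0 },
  { name: 'bar', references: 0 },
  { name: 'baz', references: 1 },
]);
```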

Optimise assignments in if blocks

// input
if ( a ) {
  obj = {}
} else {
  obj = null;
}

// ideal
obj=a?{}:null

// actual
a?(obj={}):(obj=null)

Uglify and Closure both get to the ideal version.

Mangling function/class declaration names is dangerous

Code might refer to fn.name. Uglify, Closure and Babili all mangle function names by default, so maybe we should too. Either way we should certainly have the option to preserve function and class names.

(In an ideal world we might be able to determine whether it's safe to mangle a given function's name — e.g. in the following snippet there is no possibility of the name mattering, and we could determine as much statically):

function foo () {
  function bar () {
    console.log('the name of this function does not matter');
  }

  bar();
}

Fails with duplicate var declarations

This fails:

function x () {
  for ( var i = 0; i < 10; i += 1 ) {
    console.log(i);
  }

  for ( var i = 0; i < 10; i += 1 ) {
    console.log(i);
  }
}

Separately, this could be better optimised:

var i = 1;
var i = 2;

Fold constants in template literals

This gets folded...

// input
console.log( `running in ${true ? 'development' : 'production'} mode` );

// output
console.log("running in development mode");

...but this doesn't:

// input
console.log( `running in ${true ? 'development' : 'production'} mode. The time is ${Date.now()}` );

// output
console.log(`running in ${'development'} mode. The time is ${Date.now()}`)

// ideal
console.log(`running in development mode. The time is ${Date.now()}`)
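Partial folding could work along these lines. This is a sketch: `foldTemplate` is a hypothetical helper, and the `{ known, value }` / `{ known, source }` records are an assumed representation of "already evaluated" versus "dynamic" expressions, not Butternut's internal one.

```javascript
// quasis: the literal string chunks between placeholders.
// expressions: { known: true, value } for a constant that has been evaluated,
// or { known: false, source } for code that must stay dynamic.
// As in an ESTree TemplateLiteral, quasis.length === expressions.length + 1.
// Simplification: inlined values are not escaped, so a value containing a
// backtick, a backslash or "${" would need extra handling.
function foldTemplate(quasis, expressions) {
  let out = '`' + quasis[0];
  expressions.forEach((expression, i) => {
    out += expression.known
      ? String(expression.value)        // inline the constant into the text
      : '${' + expression.source + '}'; // keep dynamic parts as placeholders
    out += quasis[i + 1];
  });
  return out + '`';
}

const folded = foldTemplate(
  ['running in ', ' mode. The time is ', ''],
  [
    { known: true, value: 'development' },
    { known: false, source: 'Date.now()' },
  ]
);
console.log(folded);
```

When every expression is known, the result contains no placeholders and could be emitted as an ordinary string literal instead, which is what the first example in this issue already does.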

Feature: accept a source map object that the output source map will be based on

Hello! Great work with butternut!

I was trying to add sourcemap support to butternut-webpack-plugin (balthazar/butternut-webpack-plugin#2) but I don't know how to pass in a sourcemap.

Maybe something like what babel does will be good:

babel.transform(input, {
  inputSourceMap
})

https://babeljs.io/docs/usage/api/#babel-transform-code-string-options-object-
https://babeljs.io/docs/usage/api/#options

Also, I'm not sure if this is the right place for this request. Maybe it should be in https://github.com/Rich-Harris/magic-string ?

Take memory usage into account

Although butternut appears to be much faster than UglifyJS, it would be nice to also see memory comparisons.
I've got a few tasks where UglifyJS gives me no problems, but butternut runs out of memory unless I increase --max-old-space-size.
