clarinet

clarinet is a sax-like streaming parser for JSON. it works in the browser and node.js. clarinet is inspired by (and forked from) sax-js. just as you shouldn't use sax when you need dom, you shouldn't use clarinet when you need JSON.parse. for a more detailed introduction and a performance study, please refer to this article.

design goals

clarinet is very much like yajl but written in javascript:

  • written in javascript
  • portable
  • robust (~110 tests pass before even announcing the project)
  • data representation independent
  • fast
  • generates verbose, useful error messages including context of where the error occurs in the input text.
  • can parse json data off a stream, incrementally
  • simple to use
  • tiny

motivation

the reason behind this work was to create better full text support in node. creating indexes out of large (or many) json files doesn't require a full understanding of the json file, but it does require something like clarinet.

installation

node.js

  1. install npm
  2. npm install clarinet
  3. var clarinet = require('clarinet');

browser

  1. minify clarinet.js
  2. load it into your webpage

usage

basics

var clarinet = require("clarinet")
  , parser = clarinet.parser()
  ;

parser.onerror = function (e) {
  // an error happened. e is the error.
};
parser.onvalue = function (v) {
  // got some value.  v is the value. can be string, double, bool, or null.
};
parser.onopenobject = function (key) {
  // opened an object. key is the first key.
};
parser.onkey = function (key) {
  // got a key in an object.
};
parser.oncloseobject = function () {
  // closed an object.
};
parser.onopenarray = function () {
  // opened an array.
};
parser.onclosearray = function () {
  // closed an array.
};
parser.onend = function () {
  // parser stream is done, and ready to have more stuff written to it.
};

parser.write('{"foo": "bar"}').close();

// stream usage
// takes the same options as the parser
var fs = require("fs")
  , options = {} // parser settings; see "arguments" below
  , stream = require("clarinet").createStream(options)
  ;
stream.on("error", function (e) {
  // unhandled errors will throw, since this is a proper node
  // event emitter.
  console.error("error!", e)
  // clear the error
  this._parser.error = null
  this._parser.resume()
})
stream.on("openobject", function (key) {
  // same argument as onopenobject above: the first key
})
// pipe is supported, and it's readable/writable
// same chunks coming in also go out.
fs.createReadStream("file.json")
  .pipe(stream)
  .pipe(fs.createWriteStream("file-altered.json"))
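
the handlers in the basics example can be combined to rebuild a plain javascript value, which shows how the events fit together. this is a minimal sketch, not part of clarinet: attachBuilder and its onDone callback are hypothetical names, and it assumes events arrive in document order with the first key of an object delivered through openobject.

```javascript
// hypothetical helper (not part of clarinet): rebuilds a plain value
// from the parser events described in this README.
function attachBuilder(parser, onDone) {
  var stack = []; // open containers, innermost last: { container, key }
  var root;       // the finished top-level value

  // store a value in the innermost open container, or as the root
  function add(value) {
    if (stack.length === 0) { root = value; return; }
    var top = stack[stack.length - 1];
    if (Array.isArray(top.container)) top.container.push(value);
    else top.container[top.key] = value;
  }

  parser.onopenobject = function (firstKey) {
    var obj = {};
    add(obj);
    stack.push({ container: obj, key: firstKey });
  };
  parser.onkey = function (key) {
    // later keys of the same object replace the pending key
    stack[stack.length - 1].key = key;
  };
  parser.onopenarray = function () {
    var arr = [];
    add(arr);
    stack.push({ container: arr });
  };
  parser.oncloseobject = parser.onclosearray = function () {
    stack.pop();
  };
  parser.onvalue = add;
  parser.onend = function () { onDone(root); };
}

// usage with the real parser (assumes `npm install clarinet`):
//   var parser = require("clarinet").parser();
//   attachBuilder(parser, function (value) { console.log(value); });
//   parser.write('{"foo": ').write('"bar"}').close();
```

if you need the whole value in memory like this, JSON.parse is the better tool; the sketch is only meant to make the event order concrete.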

arguments

pass the following arguments to the parser function. all are optional.

opt - object bag of settings regarding string formatting. all default to false.

settings supported:

  • trim - boolean. whether or not to trim text and comment nodes.
  • normalize - boolean. if true, then turn any whitespace into a single space.

methods

write - write bytes onto the stream. you don't have to do this all at once. you can keep writing as much as you want.

close - close the stream. once closed, no more data may be written until it is done processing the buffer, which is signaled by the end event.

resume - to gracefully handle errors, assign a listener to the error event. then, when the error is taken care of, you can call resume to continue parsing. otherwise, the parser will not continue while in an error state.
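
the three methods can be combined as follows. this is a hedged sketch: writeChunks is a hypothetical helper, and it relies only on write, close, resume, the error event and parser.error as described in this README. whether parsing can usefully continue after a given error depends on the input.

```javascript
// hypothetical helper (not part of clarinet): feeds text to a parser
// piece by piece and collects error messages instead of stopping.
function writeChunks(parser, chunks) {
  var errors = [];
  parser.onerror = function (e) {
    errors.push(e.message);
    // clear the stored error, then resume, so later writes are accepted
    parser.error = null;
    parser.resume();
  };
  chunks.forEach(function (chunk) {
    parser.write(chunk); // chunks may split the json text anywhere
  });
  parser.close();        // signals the end of input
  return errors;
}

// usage with the real parser (assumes `npm install clarinet`):
//   var parser = require("clarinet").parser();
//   var errors = writeChunks(parser, ['{"foo": ', '"bar"}']);
```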

members

at all times, the parser object will have the following members:

line, column, position - indications of the position in the json document where the parser currently is looking.

closed - boolean indicating whether or not the parser can be written to. if it's true, then wait for the ready event to write again.

opt - any options passed into the constructor.

and a bunch of other stuff that you probably shouldn't touch.
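
the position members are handy when reporting errors. a small sketch, where describePosition is a hypothetical helper and the output format is just one possible choice:

```javascript
// hypothetical helper (not part of clarinet): formats the parser's
// current position using the line, column and position members.
function describePosition(parser) {
  return "line " + parser.line +
    ", column " + parser.column +
    " (offset " + parser.position + ")";
}

// usage with the real parser (assumes `npm install clarinet`):
//   var parser = require("clarinet").parser();
//   parser.onerror = function (e) {
//     console.error(e.message + " at " + describePosition(parser));
//   };
```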

events

all events emit with a single argument. to listen to an event, assign a function to on<eventname>. functions get executed in the this-context of the parser object. the list of supported events is also in the exported EVENTS array.

when using the stream interface, assign handlers using the EventEmitter on function in the normal fashion.

error - indication that something bad happened. the error will be hanging out on parser.error, and must be deleted before parsing can continue. by listening to this event, you can keep an eye on that kind of stuff. note: this happens much more in strict mode. argument: instance of Error.

value - a json value. argument: value, can be a bool, null, string or number

openobject - object was opened. argument: key, a string with the first key of the object (if any)

key - an object key. argument: key, a string with the current key

closeobject - indication that an object was closed

openarray - indication that an array was opened

closearray - indication that an array was closed

end - indication that the closed stream has ended.

ready - indication that the stream has reset, and is ready to be written to.
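
to see which events a given document produces, you can assign the same recording function to every on<eventname> slot. this is a sketch: recordEvents is a hypothetical helper, and the list of names is passed in by the caller (the exported EVENTS array mentioned above is one possible source).

```javascript
// hypothetical helper (not part of clarinet): records every event the
// parser emits as a [name, argument] pair, in the order received.
function recordEvents(parser, names) {
  var log = [];
  names.forEach(function (name) {
    parser["on" + name] = function (arg) {
      log.push([name, arg]);
    };
  });
  return log;
}

// usage with the real parser (assumes `npm install clarinet`):
//   var clarinet = require("clarinet");
//   var parser = clarinet.parser();
//   var log = recordEvents(parser, clarinet.EVENTS);
//   parser.write('{"foo": "bar"}').close();
//   console.log(log);
```

note that recording the error event this way replaces any error handler you had assigned, so use it for inspection rather than in production code.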

samples

some samples are available to help you get started: one creates a list of top npm contributors, and another gets a bunch of data from twitter and generates valid json.

roadmap

check issues

contribute

everyone is welcome to contribute patches, bug fixes and new features.

  1. create an issue so the community can comment on your idea
  2. fork clarinet
  3. create a new branch: git checkout -b my_branch
  4. create tests for the changes you made
  5. make sure you pass both existing and newly inserted tests
  6. commit your changes
  7. push to your branch: git push origin my_branch
  8. create a pull request

helpful tips:

check index.html. there are two env vars you can set, CRECORD and CDEBUG.

  • CRECORD allows you to record the event sequence from a new json test so you don't have to write everything.
  • CDEBUG can be set to info or debug. info will console.log all emits, debug will console.log what happens to each char.

in test/clarinet.js there are two lines you might want to change. line #8 is where you define seps; if you are isolating a test you probably just want to run one sep, so change this array to [undefined]. line #718, which says for (var key in docs) {, is where you can change the docs you want to run. e.g. to run foobar i would do something like for (var key in {foobar:''}) {.

meta

(oO)--',- in caos
