GithubHelp home page GithubHelp logo

jkehres / csv-parser Goto Github PK

View Code? Open in Web Editor NEW

This project forked from mafintosh/csv-parser

0.0 1.0 0.0 517 KB

Streaming csv parser inspired by binary-csv that aims to be faster than everyone else

License: MIT License

JavaScript 100.00%

csv-parser's Introduction

csv-parser

Streaming CSV parser that aims for maximum speed as well as compatibility with the csv-spectrum CSV acid test suite

npm install csv-parser

build status dat

csv-parser can convert CSV into JSON at at rate of around 90,000 rows per second (perf varies with data, try bench.js with your data).

Usage

Simply instantiate csv and pump a csv file to it and get the rows out as objects

You can use csv-parser in the browser with browserify

Let's say that you have a CSV file some-csv-file.csv like this:

NAME, AGE
Daffy Duck, 24
Bugs Bunny, 22

You can parse it like this:

var csv = require('csv-parser')
var fs = require('fs')

fs.createReadStream('some-csv-file.csv')
  .pipe(csv())
  .on('data', function (data) {
    console.log('Name: %s Age: %s', data.NAME, data.AGE)
  })

The data emitted is a normalized JSON object. Each header is used as the property name of the object.

The csv constructor accepts the following options as well

var stream = csv({
  raw: false,     // do not decode to utf-8 strings
  separator: ',', // specify optional cell separator
  quote: '"',     // specify optional quote character
  escape: '"',    // specify optional escape character (defaults to quote value)
  newline: '\n',  // specify a newline character
  strict: true    // require column length match headers length
})

It accepts too an array, that specifies the headers for the object returned:

var stream = csv(['index', 'message'])

// Source from somewere with format 12312,Hello World
origin.pipe(stream)
  .on('data', function (data) {
    console.log(data) // Should output { "index": 12312, "message": "Hello World" }
  })

or in the option object as well

var stream = csv({
  raw: false,     // do not decode to utf-8 strings
  separator: ',', // specify optional cell separator
  quote: '"',     // specify optional quote character
  escape: '"',    // specify optional escape character (defaults to quote value)
  newline: '\n',  // specify a newline character
  headers: ['index', 'message'] // Specifing the headers
})

If you do not specify the headers, csv-parser will take the first line of the csv and treat it like the headers.

Another issue might be the encoding of the source file. Transcoding the source stream can be done neatly with something like iconv-lite, Node bindings to iconv or native iconv if part of a pipeline.

Events

The following events are emitted during parsing.

data

For each row parsed (except the header), this event is emitted. This is already discussed above.

headers

After the header row is parsed this event is emitted. An array of header names is supplied as the payload.

fs.createReadStream('some-csv-file.csv')
  .pipe(csv())
  .on('headers', function (headerList) {
    console.log('First header: %s', headerList[0])
  })

Other Readable Stream Events

The usual Readable stream events are also emitted. Use the close event to detect the end of parsing.

fs.createReadStream('some-csv-file.csv')
  .pipe(csv())
  .on('data', function (data) {
    // Process row
  })
  .on('end', function () {
    // We are done
})

Command line tool

There is also a command line tool available. It will convert csv to line delimited JSON.

npm install -g csv-parser

Open a shell and run

$ csv-parser --help # prints all options
$ printf "a,b\nc,d\n" | csv-parser # parses input

Options

You can specify these CLI flags to control how the input is parsed:

Usage: csv-parser [filename?] [options]

  --headers,-h        Explicitly specify csv headers as a comma separated list
  --output,-o         Set output file. Defaults to stdout
  --separator,-s      Set the separator character ("," by default)
  --quote,-q          Set the quote character ('"' by default)
  --escape,-e         Set the escape character (defaults to quote value)
  --strict            Require column length match headers length
  --version,-v        Print out the installed version
  --help              Show this help

For example, to parse a TSV file:

cat data.tsv | csv-parser -s $'\t'

Related

  • neat-csv - Promise convenience wrapper

License

MIT

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.