
csvlint's Introduction

csvlint

csvlint is a library and command-line utility for linting CSV files according to RFC 4180.

It assumes that your CSV file has an initial header row.

Everything in this README file refers to the command-line utility. For information about the library, see godoc.

Installing

Standalone executables for multiple platforms are available via GitHub Releases.

You can also compile from source:

go get github.com/Clever/csvlint/cmd/csvlint

Usage

csvlint [options] /path/to/csv/file

Options

NOTE: The default settings validate that a CSV conforms to RFC 4180. If you change the settings, a passing result no longer strictly guarantees that the CSV conforms to RFC 4180.

  • delimiter: the field delimiter, can be any single unicode character
    • default: "," (comma)
    • valid options: "\t", "|", "ஃ", etc
    • if you want multi-character delimiters, you're probably doing CSVs wrong
  • lazyquotes: allow a quote to appear in an unquoted field and a non-doubled quote to appear in a quoted field. WARNING: your file may pass linting, but not parse in the way you would expect
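
Go's standard `encoding/csv` package exposes two analogous settings, `Reader.Comma` and `Reader.LazyQuotes`, so the effect of both options can be sketched with it. This is an illustration under the assumption that the options behave like those fields, not csvlint's actual implementation; `countRecords` is a hypothetical helper.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// countRecords is a hypothetical helper: it parses data with the given
// delimiter and lazy-quote setting and returns the number of records read.
func countRecords(data string, delimiter rune, lazyQuotes bool) (int, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.Comma = delimiter       // any single rune, e.g. '\t' or '|'
	r.LazyQuotes = lazyQuotes // tolerate stray quotes when true
	records, err := r.ReadAll()
	return len(records), err
}

func main() {
	// A pipe-delimited file parses once the delimiter is set to '|'.
	n, err := countRecords("id|name\n1|alice\n", '|', false)
	fmt.Println(n, err)

	// A bare quote in an unquoted field fails strict parsing...
	badQuote := "id,name\n1,ali\"ce\n"
	_, err = countRecords(badQuote, ',', false)
	fmt.Println(err != nil)

	// ...but is accepted once lazy quotes are enabled.
	n, err = countRecords(badQuote, ',', true)
	fmt.Println(n, err)
}
```

The last call shows why the warning above matters: the file is accepted, but the stray quote silently becomes part of the field value.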

Examples

$ csvlint bad_quote.csv
Record #1 has error: bare " in non-quoted-field

unable to parse any further

$ csvlint --lazyquotes bad_quote.csv
file is valid

$ csvlint mult_long_columns.csv
Record #2 has error: wrong number of fields in line
Record #4 has error: wrong number of fields in line

$ csvlint --delimiter='\t' mult_long_columns_tabs.csv
Record #2 has error: wrong number of fields in line
Record #4 has error: wrong number of fields in line

$ csvlint one_long_column.csv
Record #2 has error: wrong number of fields in line

$ csvlint perfect.csv
file is valid

Exit codes

csvlint uses three exit codes:

  • 0 - the file is valid
  • 1 - couldn't parse the entire file
  • 2 - could parse the file, but there were lint failures
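
A calling program can branch on these codes. The sketch below assumes a `csvlint` binary on the `PATH` and a file named `perfect.csv`; `describeExit` and `exitCode` are hypothetical helpers, not part of csvlint.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// describeExit maps the exit codes documented above to their meanings.
func describeExit(code int) string {
	switch code {
	case 0:
		return "file is valid"
	case 1:
		return "could not parse the entire file"
	case 2:
		return "parsed, but with lint failures"
	default:
		return "unexpected exit code"
	}
}

// exitCode extracts the exit status from the error returned by (*exec.Cmd).Run.
// ok is false when the command could not be started at all.
func exitCode(err error) (code int, ok bool) {
	if err == nil {
		return 0, true
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), true
	}
	return 0, false
}

func main() {
	// Assumption: csvlint is installed and perfect.csv exists.
	err := exec.Command("csvlint", "perfect.csv").Run()
	if code, ok := exitCode(err); ok {
		fmt.Println(describeExit(code))
	} else {
		fmt.Println("could not run csvlint:", err)
	}
}
```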

Vendoring

Please view the dev-handbook for instructions.

csvlint's People

Contributors

azylman, bgveenstra, chrisscotmartin, drhurd, johnhuangclever, mohit, natebrennand, nathanleiby, peggyl, prime-time, renatoprime, rgarcia, taylor-sutton

csvlint's Issues

Make the command-line help syntax match the syntax in the readme.md file

The readme.md file has these examples:

$ csvlint bad_quote.csv
$ csvlint --lazyquotes bad_quote.csv
$ csvlint mult_long_columns.csv
$ csvlint --delimiter=tab mult_long_columns_tabs.csv
$ csvlint one_long_column.csv
$ csvlint perfect.csv

whereas the command line help shows this:

Usage of csvlint:
  -delimiter="comma": field delimiter in the file. options: comma, tab
  -help=false: print help and exit
  -lazyquotes=false: try to parse improperly escaped quotes

McAfee Endpoint Security 10.7 on Windows 10 says that it is malicious

McAfee Endpoint Security 10.7 on Windows 10 prevents csvlint.exe from running, reporting it as malicious with a threat severity of "Critical". The screenshot did not capture the full details, so here is the event copied as text:

Adaptive Threat Protection repaired D:\copyNrun\cmdTools\csvlint.exe TargetType, because its reputation (Known Malicious) is below the configured Clean threshold.
Analyzer / Detector
Product name: McAfee Endpoint Security
Product version: 10.7.0.1929
Feature name: Real Protect Cloud

Threat
Action taken: Clean
Threat category: Malware Detected
Threat event ID: 35107
Threat handled: Yes
Threat name: Real Protect-XGPE!D8FF91EB72FC
Threat severity: Critical
Threat timestamp: 8/16/2021 2:19 PM
Threat type: Trojan

Source
Source access time: 8/16/2021 2:18 PM
Source create time: 4/12/2018 5:04 AM
Source file path: C:\WINDOWS\SysWOW64
Source file size: 232960
Source hostName: DDO-SECTION
Source modify time: 4/12/2018 5:04 AM
Source process name: cmd.exe
Source user name: DDO-SECTION\SECTION-04

Target
Target hash: d8ff91eb72fcc0f7b029f60c38ddf718
Target host name: DDO-SECTION
Target name: csvlint.exe
Target path: D:\copyNrun\cmdTools

Other
Vector type: Local System
Detection message: Adaptive Threat Protection Detection
Detection quarantine ID: {9BC8D7C7-FA76-4FDC-968B-1ACCBC7E5689}

An online scan at https://www.virustotal.com/gui/home/upload reports the file as clean. The engines listed there include McAfee-GW-Edition, which also reports it as clean. The issue seems to be limited to McAfee Endpoint Security!

Edit: I've sent an email requesting them to check the file.

Please add support for the "pipe" delimiter ("|"). Also: colon and semicolon.

The pipe delimiter "|" is very popular and is the default for applications like SQLite, which I use. Please add the pipe to csvlint as a delimiter. Next in line are probably colon (":") and then semicolon (";"). Adding them while you are in there might save time in the future. Since I don't have a Go compiler, please also update the Windows executable "csvlint.exe".

UTF-8 CSV files with BOM aren't parsed correctly if the first header field contains quotes

I have a CSV file encoded as UTF-8 with a BOM:

"first_column","second_column"
"Hello","how are you"

This is a valid CSV file, but this is the result:

Record #0 has error: bare " in non-quoted-field

The issue occurs with a file encoded as UTF-8 with a BOM when the first header field is surrounded by quotes.

Suggestion: This could be solved by removing the UTF-8 BOM in the header line:
Pseudocode:
if (line_number == 1) { sub(/^\xef\xbb\xbf/, "", line) }
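
In Go, the same idea can be sketched by trimming the three BOM bytes before the data reaches the CSV reader. This is a minimal illustration using the standard `encoding/csv` package, not a patch against csvlint itself; `parseSkippingBOM` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// utf8BOM is the three-byte UTF-8 encoding of U+FEFF.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// parseSkippingBOM is a hypothetical helper that drops a leading UTF-8 BOM,
// if present, before handing the data to the CSV reader.
func parseSkippingBOM(data []byte) ([][]string, error) {
	data = bytes.TrimPrefix(data, utf8BOM)
	return csv.NewReader(bytes.NewReader(data)).ReadAll()
}

func main() {
	// The file from this report: a BOM followed by a quoted header field.
	input := append([]byte{0xEF, 0xBB, 0xBF},
		"\"first_column\",\"second_column\"\n\"Hello\",\"how are you\"\n"...)
	records, err := parseSkippingBOM(input)
	fmt.Println(records, err)
}
```

Because `bytes.TrimPrefix` returns its input unchanged when the prefix is absent, files without a BOM take the same path.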

Flag to show more verbose errors/failed row

Hi,

I'm running csvlint as follows:

./csvlint large_csv.csv

I get:

$ ./csvlint large_csv.csv
Record #13361 has error: wrong number of fields in line

Is there a flag or option we can use to find the exact error/raw lines surrounding record #13361?
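
For reference, Go's standard `encoding/csv` reader already reports where a bad record starts through `csv.ParseError.StartLine`, so a more verbose mode could plausibly surface it. The sketch below only shows that mechanism; `lintWithLines` and `lintError` are hypothetical names, and numbering records from 0 (header first) is an assumption.

```go
package main

import (
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"strings"
)

// lintError pairs a record number with the input line where that record starts.
type lintError struct {
	Record int // 0-based; the header is record 0 (assumed numbering)
	Line   int // 1-based line in the input
	Err    error
}

// lintWithLines is a hypothetical helper that reads data record by record and
// collects field-count errors along with the line each bad record starts on.
func lintWithLines(data string) ([]lintError, error) {
	r := csv.NewReader(strings.NewReader(data))
	var problems []lintError
	for record := 0; ; record++ {
		_, err := r.Read()
		if err == io.EOF {
			return problems, nil
		}
		if err == nil {
			continue
		}
		var parseErr *csv.ParseError
		if errors.As(err, &parseErr) && errors.Is(err, csv.ErrFieldCount) {
			problems = append(problems, lintError{
				Record: record,
				Line:   parseErr.StartLine,
				Err:    parseErr.Err,
			})
			continue // field-count errors are recoverable; keep reading
		}
		return problems, err // other errors stop parsing
	}
}

func main() {
	problems, err := lintWithLines("a,b\n1,2\n3\n4,5\n6,7,8\n")
	for _, p := range problems {
		fmt.Printf("Record #%d (line %d) has error: %v\n", p.Record, p.Line, p.Err)
	}
	if err != nil {
		fmt.Println("unable to parse any further:", err)
	}
}
```

With the line number in hand, the surrounding raw lines can be printed with any text tool.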

Add support for reading CSV from shell stdin via the `-` file input

I have a shell script that tests the conversion of JSON to CSV that I'd like to use with csvlint but I don't want to force the script to save (and later delete) an actual file. Rather I'd like to pass the CSV in via stdin in the same way most command-line utilities work.

I tried:

cat file.csv | csvlint -

but it fails with the error: file '-' does not exist.
