specs's People

Contributors

lizzie

specs's Issues

Commit Message Format

Format

<type>: subject
<emptyline>
Long description (if useful)
<emptyline>
Closes gh-<pullRequestNumber>
Fixes #<issueNumber>
Ref #<issueNumber>

Type

  • feat (feature)
  • fix (bug fix)
  • docs (documentation)
  • style (formatting, missing semicolons, …)
  • refactor
  • test (when adding missing tests)
  • chore (maintenance)
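
The format and type list above can be checked mechanically, for example in a commit hook. The following is a minimal sketch, not part of the convention itself: the function name `check_commit_message` and the exact rules it enforces (a lowercase type from the list, a non-empty subject, an empty second line) are assumptions made for illustration.

```python
import re

# Types listed in the convention above.
ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# "<type>: subject" on the first line; the subject must be non-empty.
HEADER_RE = re.compile(r"^(?P<type>[a-z]+): (?P<subject>\S.*)$")


def check_commit_message(message):
    """Return a list of problems; an empty list means the message passes.

    Hypothetical helper: the checks are one reading of the template above.
    """
    lines = message.splitlines()
    if not lines:
        return ["message is empty"]

    problems = []
    match = HEADER_RE.match(lines[0])
    if match is None:
        problems.append("first line must look like '<type>: subject'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type: {match.group('type')}")

    # The template puts an empty line between the subject and what follows.
    if len(lines) > 1 and lines[1].strip() != "":
        problems.append("second line must be empty")

    return problems
```

Calling `check_commit_message("feat: add CSV parser\n\nFixes #12")` returns an empty list, while a message starting with `feature:` is reported as using an unknown type.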

Refer

CSV Specification

http://www.ietf.org/rfc/rfc4180.txt

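1. Introduction

The comma-separated values (CSV) format is commonly used to exchange and convert data between spreadsheet programs. Although the format is very common, it had never been formally documented before this specification. The MIME type for CSV files had also never been registered; this document specifies "text/csv" as the MIME type for CSV files.

2. Definition of the CSV Format

  1. Each record is on its own line, terminated by a line break (CRLF).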

    aaa,bbb,ccc CRLF
    zzz,yyy,xxx CRLF
    
  2. The last record in the file may omit the trailing CRLF.

    aaa,bbb,ccc CRLF
    zzz,yyy,xxx
    
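  3. The first line may be a header line. Each header field names the corresponding field in the records below, with the same number of fields in the same order. Whether a header line is present can be indicated by the optional header parameter of the MIME type.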

    field_name,field_name,field_name CRLF
    aaa,bbb,ccc CRLF
    zzz,yyy,xxx CRLF
    
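  4. Each line (including the header line) may contain one or more fields, separated by commas. Every line should contain the same number of fields. Spaces are part of a field and must not be ignored. The last field in a record must not be followed by a comma.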

    aaa,bbb,ccc
    
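  5. Each field may or may not be enclosed in double quotes (some programs, such as Microsoft Excel, do not use double quotes at all). A field that is not enclosed in double quotes must not contain double quotes.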

    "aaa","bbb","ccc" CRLF
    zzz,yyy,xxx
    
  6. Fields that contain a line break (CRLF), a double quote, or a comma should be enclosed in double quotes.

    "aaa","b CRLF
    bb","ccc" CRLF
    zzz,yyy,xxx
    
  7. If a field is enclosed in double quotes, any double quote inside the field must be escaped by preceding it with another double quote.

    "aaa","b""bb","ccc"
    

ABNF grammar:

file = [header CRLF] record *(CRLF record) [CRLF]
header = name *(COMMA name)
record = field *(COMMA field)
name = field
field = (escaped / non-escaped)
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
non-escaped = *TEXTDATA
COMMA = %x2C
CR = %x0D ;as per section 6.1 of RFC 2234 [2]
DQUOTE =  %x22 ;as per section 6.1 of RFC 2234 [2]
LF = %x0A ;as per section 6.1 of RFC 2234 [2]
CRLF = CR LF ;as per section 6.1 of RFC 2234 [2]
TEXTDATA =  %x20-21 / %x23-2B / %x2D-7E
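
To make the grammar concrete, here is a minimal sketch of a reader and writer that follow it. The names `parse_csv` and `format_record` are hypothetical, and the parser is deliberately lenient: it accepts any character outside quotes instead of enforcing TEXTDATA, and it does not check that all records have the same number of fields. For real work, a tested library such as Python's built-in csv module is the safer choice.

```python
def parse_csv(text):
    """Parse RFC 4180 style text into a list of records (lists of strings)."""
    records, record, field = [], [], []
    in_quotes = False
    pending = False  # True once the current record has consumed any character
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')  # 2DQUOTE: an escaped double quote
                    i += 1
                else:
                    in_quotes = False  # closing DQUOTE
            else:
                field.append(ch)  # COMMA, CR and LF are literal inside quotes
            pending = True
        elif ch == "\r" and i + 1 < n and text[i + 1] == "\n":
            # CRLF outside quotes ends the current record.
            record.append("".join(field))
            records.append(record)
            record, field, pending = [], [], False
            i += 1
        else:
            if ch == '"' and not field:
                in_quotes = True  # opening DQUOTE of an escaped field
            elif ch == ",":
                record.append("".join(field))
                field = []
            else:
                field.append(ch)
            pending = True
        i += 1
    if in_quotes:
        raise ValueError("unterminated quoted field")
    if pending:  # the CRLF after the last record is optional
        record.append("".join(field))
        records.append(record)
    return records


def format_record(fields):
    """Serialize one record, quoting only the fields that need it."""
    out = []
    for value in fields:
        if any(c in value for c in ',"\r\n'):
            value = '"' + value.replace('"', '""') + '"'
        out.append(value)
    return ",".join(out)
```

For example, `format_record(["aaa", 'b"bb', "c,c"])` produces `aaa,"b""bb","c,c"`, and passing that line back through `parse_csv` recovers the three original fields.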

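MIME type registration for text/csv

  • MIME media type name: text
  • MIME subtype name: csv
  • Required parameters: none
  • Optional parameters: charset, header
    • charset is commonly US-ASCII, but any character set that IANA defines for the text type may be used.
    • header indicates whether a header line is present. Valid values are "present" and "absent".
  • Encoding considerations: as described in section 4.1.1 of RFC 2046, this media type uses CRLF as the line break. Implementors should be aware that some implementations may use other line breaks.
  • Security considerations: CSV files contain plain text data. In theory, however, malicious binary data may be embedded in them. Private data may also be exposed when it is shared in this format.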
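
As an illustration of the optional parameters, the sketch below reads charset and header from a Content-Type value using Python's standard email.message module. The function name `csv_media_params` is hypothetical, and falling back to US-ASCII when charset is missing is an assumption based on the common usage noted above, not a rule stated in the registration.

```python
from email.message import Message


def csv_media_params(content_type):
    """Extract the charset and header parameters from a text/csv Content-Type.

    Hypothetical helper. Returns a dict with "charset" (defaulting to
    US-ASCII, an assumption) and "header" ("present", "absent", or None
    when the parameter is not given).
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/csv":
        raise ValueError("not a text/csv media type")

    charset = msg.get_param("charset", "US-ASCII")
    header = msg.get_param("header")  # None if the parameter is missing
    if header is not None and header not in ("present", "absent"):
        raise ValueError("header must be 'present' or 'absent'")

    return {"charset": charset, "header": header}
```

When the header parameter is absent, the caller has to decide for itself whether the first line is a header line, so the sketch returns None in that case rather than guessing.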
