burntsushi / xsv
A fast CSV command line toolkit written in Rust.
License: The Unlicense
As per the subject. I'd like to keep the delimiter if possible
In more than one context I've observed that when xsv is pointed at a path that doesn't exist, it prints this error message:
No such file or directory (os error 2)
In the context of a big script that crunches through a few hundred files, that means that you can't tell which path was the problem.
Wow, xsv is cool!
I was exploring Pachyderm for large-scale data ingestion tasks, and one use-case popped up fairly often. This is basically a map/reduce implementation using an immutable, versioned cluster file system, and worker jobs running inside of Docker containers. You read files from /pfs/$INPUT_NAME/* and write them to /pfs/out.
Imagine an input file which looks like:
value,count
FL,11957
CA,11816
TX,10157
IL,5633
OH,5556
GA,4535
NY,4088
MI,4008
NJ,3890
CO,3690
I would love to be able to write something like:
xsv partition value /pfs/out
...and create a file named /pfs/out/FL containing 11957, a file named /pfs/out/CA containing 11816, and so on. It would also be OK if the files contained the partitioning column: FL,11957 and CA,11816. If there are more than two columns, they should all be included. All rows with the same value in the value column should end up in the appropriate output file.
Does xsv have support for anything like this? Would you like support for something like this? If not, no worries, it would be a trivial standalone tool using your csv library. But I thought I'd check whether you wanted it upstream first.
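For what it's worth, the requested behaviour is easy to sketch. Below is an illustrative Python version of the proposed partition semantics, not xsv's implementation; the partition function name and keep_column flag are hypothetical, and in-memory buffers stand in for files under /pfs/out:

```python
import csv
import io
from collections import defaultdict

def partition(lines, column, keep_column=True):
    """Split CSV rows into per-value chunks keyed by `column`'s value.

    A real tool would write each chunk to a file named after the
    (sanitized) key; here we collect them in StringIO buffers.
    """
    rows = csv.reader(lines)
    header = next(rows)
    idx = header.index(column)
    out = defaultdict(io.StringIO)
    for row in rows:
        key = row[idx]
        w = csv.writer(out[key], lineterminator="\n")
        w.writerow(row if keep_column else row[:idx] + row[idx + 1:])
    return {k: buf.getvalue() for k, buf in out.items()}

data = "value,count\nFL,11957\nCA,11816\nFL,3\n".splitlines()
parts = partition(data, "value")
# parts["FL"] == "FL,11957\nFL,3\n"
```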
When I do xsv stats -s COLUMN --everything FILE, the metrics I can see are min, max, min_length, max_length, mean, stddev, median, mode, and cardinality. How can I get sum? Is it unimplemented or hiding somewhere?
BTW, thank you for the nice work!
cargo build --release yields the following error on macOS:
/Users/<username>/.cargo/registry/src/github.com-1ecc6299db9ec823/filetime-0.1.2/src/lib.rs:83:29: 83:57 error: unresolved import `std::os::macos::fs::MetadataExt`. Could not find `macos` in `std::os`
/Users/<username>/.cargo/registry/src/github.com-1ecc6299db9ec823/filetime-0.1.2/src/lib.rs:83 use std::os::$i::fs::MetadataExt;
Cannot build on macos because of this issue. Any tips/hints would be greatly appreciated
The final output would have the union of all the field names across all input files.
Fields without data (e.g., that column was missing from the original) could be left blank or a user-supplied value (e.g., "--nodata='NA'").
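A sketch of the requested union-of-headers semantics in Python (illustrative only, not xsv code; the nodata default mirrors the "--nodata='NA'" suggestion above):

```python
import csv
import io

def cat_rows_union(files_text, nodata="NA"):
    """Concatenate CSVs; output the union of all headers, filling missing cells."""
    tables = []
    header = []
    for text in files_text:
        rows = list(csv.reader(io.StringIO(text)))
        tables.append(rows)
        for name in rows[0]:          # build the header union in first-seen order
            if name not in header:
                header.append(name)
    out = io.StringIO()
    w = csv.writer(out, lineterminator="\n")
    w.writerow(header)
    for rows in tables:
        pos = {name: i for i, name in enumerate(rows[0])}
        for row in rows[1:]:
            w.writerow([row[pos[n]] if n in pos else nodata for n in header])
    return out.getvalue()

merged = cat_rows_union(["f1,f2\n1,2\n", "f1,f3\n3,4\n"])
# merged == "f1,f2,f3\n1,2,NA\n3,NA,4\n"
```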
I was comparing xsv join to GNU join and noticed that xsv seems to be doing random seeks on its input. This is very slow for even moderate-sized files. From the docs ("Uses a simple hash index to make it fast") I assumed it reads one file into memory and then streams through the other, but perhaps the design was meant to operate out-of-core? (Explicitly building indexes does not help either.)
#!/bin/bash
export LC_ALL=C LINES=1000000 SORT_MEM=1G
seq $LINES | awk -vOFS=, "{print \$1, int(rand() * $LINES)}" > a.csv
seq $LINES | awk -vOFS=, "{print \$1, int(rand() * $LINES)}" > b.csv
time join -t, -j2 -o 1.1,1.2,2.1,2.2 <(sort -S"$SORT_MEM" -t, -k2,2 a.csv) <(sort -S"$SORT_MEM" -t, -k2,2 b.csv) | wc -l
#time (xsv index a.csv ; xsv index b.csv)
time xsv join -n 2 a.csv 2 b.csv | wc -l
rm -f a.csv b.csv a.csv.idx b.csv.idx
$ ./join.sh
1999368
real 0m0.830s
user 0m0.480s
sys 0m0.076s
1999368
real 0m27.801s
user 0m3.004s
sys 0m24.836s
If it is meant to operate out-of-core, an option to operate in-core would be awesome.
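For reference, the in-core strategy the reporter expected (read one input into a hash index, then stream the other once) looks roughly like this in illustrative Python; this is not xsv's actual code:

```python
import csv
import io
from collections import defaultdict

def hash_join(left_text, right_text, left_key, right_key):
    """Inner join: index the left input in memory, stream the right input once."""
    left = list(csv.reader(io.StringIO(left_text)))
    index = defaultdict(list)
    for row in left[1:]:              # skip header
        index[row[left_key]].append(row)
    out = []
    right = csv.reader(io.StringIO(right_text))
    next(right)                       # skip header
    for row in right:
        for match in index.get(row[right_key], []):
            out.append(match + row)
    return out

joined = hash_join("id,k\n1,x\n2,y\n", "id,k\n9,x\n8,z\n", 1, 1)
# joined == [["1", "x", "9", "x"]]
```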
SELECT, JOIN and ORDER BY seem to be in already, but filtering rows (WHERE) or replacing groups of rows with stats seems not to be implemented.
I expect the interface to be like this:
xsv filter Population lt 1000 worldcitiespop.csv | xsv group Region | xsv select Region,Latitude/avg,Longitude/median | xsv table
Can xsv eventually replace q? For fun: my most complicated q usage is this:
q 'SELECT COUNT(a.c3),SUM((a.c3 - b.c3)/1000/1000),SUM((a.c3 - b.c3)/1000/1000*(a.c3 - b.c3)/1000/1000) FROM l.txt a INNER JOIN l.txt b ON a.c2 = b.c2 WHERE a.c1 = "o" AND (b.c1 = "s" OR b.c1 = "i")'
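The proposed filter/group interface maps onto a plain filter-then-aggregate pass. A minimal Python sketch of those semantics (all names here are hypothetical, not xsv commands):

```python
import csv
import io
from collections import defaultdict
from statistics import mean

def filter_group(text, where, group_col, aggs):
    """where: predicate over a row dict; aggs: {column: fn over grouped floats}."""
    groups = defaultdict(list)
    for r in csv.DictReader(io.StringIO(text)):
        if where(r):
            groups[r[group_col]].append(r)
    return {
        key: {col: fn([float(r[col]) for r in rows]) for col, fn in aggs.items()}
        for key, rows in groups.items()
    }

text = "Region,Population,Latitude\nA,500,1.0\nA,900,3.0\nB,2000,5.0\n"
out = filter_group(text, lambda r: float(r["Population"]) < 1000,
                   "Region", {"Latitude": mean})
# out == {"A": {"Latitude": 2.0}}
```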
Amongst the widely used options of grep and other regexp-based search programs is -i, --ignore-case, which allows ignoring case distinctions in the regexp pattern as well as in the input. I think that adding it to the search command would be a nice enhancement, since it implements a feature not easily achievable in other ways, and most people (I guess) are already familiar with it from using grep or other tools (and might even expect its presence in such a context).
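As a likely workaround today: the Rust regex crate that xsv uses supports inline flags, so a pattern like (?i)boston should already match case-insensitively, and the -i flag would just make that ergonomic. The equivalent semantics in Python:

```python
import re

# Compile the search pattern with case-insensitivity enabled,
# mirroring what a -i / --ignore-case flag would do.
pattern = re.compile("boston", re.IGNORECASE)

rows = ["Boston,MA", "chicago,IL", "BOSTON,MA"]
matches = [r for r in rows if pattern.search(r)]
# matches == ["Boston,MA", "BOSTON,MA"]
```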
Hello,
Does the sample flag just return whatever it loads first? Running xsv search -s Population '[0-9]' worldcitiespop.csv | xsv select Country,AccentCity,Population | xsv sample 10 | xsv table a few times results in different data each time. Is there a method to make this stable every time, or is that not really desired?
Thanks!
Currently, even when foo is detected as integer-valued, search -s foo 42 matches the values of foo which contain 42 in their digits, instead of matching on the value 42, as one could possibly expect. A workaround is search -s foo '^42$', or even search -s foo '^0*42$' in case of leading zeros in the file. It could also be useful to search for fields greater or less than a given value. Would it make sense to implement these operations?
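The difference between substring and exact matching can be demonstrated with the workaround patterns above (Python re here; the patterns behave the same in the Rust regex crate):

```python
import re

values = ["42", "142", "420", "042"]

# Unanchored: matches any value containing the digits 42 anywhere.
substring = [v for v in values if re.search("42", v)]

# Anchored, tolerating leading zeros: matches only the number 42.
exact = [v for v in values if re.search("^0*42$", v)]
# substring == ["42", "142", "420", "042"]; exact == ["42", "042"]
```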
My specific use case is unzip -p data.csv.zip | xsv some_transform, but more generally it'd be nice to be able to just pipe straight into xsv, to make it more unix-y
I feel almost ashamed to ask for a feature that has such an easy workaround, but sometimes, I need to manipulate a CSV file, select rows and columns based on their headers, and output a table without headers. The --no-headers option is of no help because I rely on the headers of the input file. The workaround on Unix is | tail -n+2, but I believe that an option in xsv to implement this behaviour could make sense as well.
Is it worth considering adding this feature?
First, thanks for the great tool!
I was wondering if it would be possible to add some rudimentary support for NULL values?
I want to use xsv to get the column types for a CSV file before loading it into a database. This way I would know the correct column types when creating the database table. While xsv is very fast at calculating file column statistics, my files often have int/float columns with NULL values (e.g. "-", ".", "\N", "NA"), and xsv reports those as Unicode. It would be nice to have an option like --na-values, so that xsv could skip NULL values when calculating statistics, and could report whether a column contains at least one NULL value.
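A sketch of the proposed --na-values behaviour: skip NA markers during type inference and report whether any were seen. This is illustrative Python; the function name and default marker list are assumptions, not xsv's:

```python
def infer_type(values, na_values=("-", ".", "\\N", "NA", "")):
    """Infer a column type, skipping NA markers; report whether NAs were seen."""
    has_null = False
    kinds = set()
    for v in values:
        if v in na_values:
            has_null = True
            continue
        try:
            int(v)
            kinds.add("Integer")
            continue
        except ValueError:
            pass
        try:
            float(v)
            kinds.add("Float")
        except ValueError:
            kinds.add("Unicode")
    # Widen to the most general kind observed.
    if "Unicode" in kinds:
        t = "Unicode"
    elif "Float" in kinds:
        t = "Float"
    else:
        t = "Integer" if kinds else "NULL"
    return t, has_null

t, has_null = infer_type(["1", "NA", "3"])
# t == "Integer", has_null is True: the NA no longer forces Unicode
```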
Hi,
I'm wondering if a windows binary is currently on the agenda. I have tried compiling the project there a while ago, but ran into some issues. I wanted to follow up on that once I understood Rust a bit more.
Last week, I saw you added AppVeyor CI and thought you would release a Windows binary any moment now. It's been almost 2 weeks now, however, so I feel that there is a chance you have missed the artifacts feature of Appveyor that lets you download the binaries it compiles.
Would you be interested in a pull request in that direction? I assume you didn't already hit a wall with that attempt...
Kind regards,
Matthias
The help message printed by xsv stats --help mentions the --no-headers option:
-n, --no-headers When set, the first row will NOT be interpreted
as column names. i.e., They will be included
in statistics.
And yet I'm seeing the tool use the first row as headers in cases where I specify --no-headers:
% head -n 1 2015-3-18_CA_5E63F293-8C7A-4737-A6CE-9E53DFB7DAF7.txt
209,0000000
% xsv stats --no-headers 2015-3-18_CA_5E63F293-8C7A-4737-A6CE-9E53DFB7DAF7.txt
field,type,min,max,min_length,max_length,mean,stddev
209,Integer,209,951,3,3,650.419291,216.581229
0000000,Integer,0,9999999,7,7,5714603.611216,2326579.121568
% xsv slice --no-headers -i 0 2015-3-18_CA_5E63F293-8C7A-4737-A6CE-9E53DFB7DAF7.txt
209,0000000
Contrast with xsv frequency, which behaves as I expect:
% xsv frequency --no-headers 2015-3-18_CA_5E63F293-8C7A-4737-A6CE-9E53DFB7DAF7.txt
field,value,count
1,310,1562992
1,714,1406572
1,916,1322587
1,818,1291127
1,760,1285504
1,408,1185066
1,805,1120246
1,619,1113284
1,510,1104539
1,415,1100438
2,0000000,22
2,9999999,20
[...]
:~/xsv-0.8.14-x86_64-unknown-linux-gnu$ ./xsv
./xsv: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.18' not found (required by ./xsv)
:~$ ldd --version
ldd (Ubuntu EGLIBC 2.15-0ubuntu10.10) 2.15
Basically, I want to be able to run xsv impute and have it replace blank values (or values with an out-of-bounds marker you can specify on the command line, such as "NA"). Obvious choices to fill in with would be a fixed value specified on the command line, or the same value as the previous cell in a row or column.
Value: It's hard to do this using sed or such, when a "blank value" could be any of the strings ,$ and ,""$ and ,, and "", and so on.
Not sure how far one wants to go with this, since it'd be easy to feature creep "well why don't we add the option to fill in blanks with an average" "well why don't we add interpolation" "well why don't we let you put in an arbitrary equation" and so on. Which is obviously not something xsv should care about.
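A fixed-value fill, the simplest of the options above, is only a few lines; this is an illustrative Python sketch, since xsv impute is a proposal rather than an existing command:

```python
import csv
import io

def impute(text, fill, na=("", "NA")):
    """Replace blank/NA cells with a fixed value.

    Sketch only: a real tool would leave the header row alone and
    could offer previous-cell fill as an alternative strategy.
    """
    out = io.StringIO()
    w = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        w.writerow([fill if cell in na else cell for cell in row])
    return out.getvalue()

fixed = impute("a,b\n1,\nNA,2\n", "0")
# fixed == "a,b\n1,0\n0,2\n"
```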
Hi, I am the author of a csv-plugin for vim and am considering using xsv, if available, to align csv files nicely into tables. At first I thought xsv table would be all I need; however, it seems that xsv table just makes the fields larger than needed and does not consider header lines. So my question is: is there a recommended way to align csv files so that each field is as large as needed but not larger?
Hi,
I've been playing with frequency a bit, but got odd results, and here is what I found. Take this simple csv file:
alpha,num
"a",1
"b",2
"b",1
"a",2
Run xsv frequency -s alpha,num. The result:
field,value,count
alpha,b,2
alpha,a,2
num,2,2
num,1,2
Then run xsv frequency -s num,alpha. The result:
field,value,count
num,a,2
num,b,2
alpha,1,2
alpha,2,2
I get the correct result only if I put the fields in the same order as the file header.
The case is when I use xsv count myfile.csv -d "##". How can I use a multi-character delimiter? The relevant check is:
if s.len() != 1 {
let msg = format!("Could not convert '{}' to a single \
ASCII character.", s);
return Err(d.error(&*msg));
}
I'm not sure whether rewriting this would work, and I do not know why a multi-character delimiter cannot be supported.
An error is raised if records are different lengths (differing number of columns/fields) across input files, but it is silent if the number of columns/fields is the same even if the names are different.
Could we have a flag that forces an error if the input files have columns of different names, even if the number is the same?
Example:
data1.csv:
f1,f2,f3
1, 2, 3
4, 5, 6
data2.csv:
f1,f4,f5
1, 2, 3
4, 5, 6
Calling xsv cat rows --check-field-names data1.csv data2.csv would raise an error.
csvkit uses -t to specify tab as the separator, instead of using -d "\t". It's a really sweet shortcut.
First, this is a great piece of work.
I wanted to know whether stats functionality will in the future detect date time data types.
I'm planning to use this library to generate SQL statements for Postgres.
Thanks
Ranjan
Hello,
Notably, the same type of "statistics" command in another CSV command line toolkit takes about 2
Your link is to https://csvkit.readthedocs.org/en/0.9.6/ which does not exist.
correct link:
https://csvkit.readthedocs.org/en/0.9.1/
Thanks!
I think it would be handy to have a command to exclude matching columns from 2 csvs and print the remainder. For example:
source.csv:
id,boolean
1,yes
2,no
3,yes
exclusions.csv:
id,boolean
2,yes
4,no
The syntax could be similar to join: xsv exclude <columns1> <input1> <columns2> <input2>, so:
$ xsv exclude id exclusions.csv id source.csv
id,boolean
1,yes
3,yes
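That is essentially an anti-join on the key column. A Python sketch of the proposed semantics (the exclude name mirrors the proposal above; this is not xsv code):

```python
import csv
import io

def exclude(source_text, exclusions_text, key):
    """Drop rows from source whose key value appears in exclusions."""
    excl = csv.reader(io.StringIO(exclusions_text))
    k = next(excl).index(key)
    seen = {row[k] for row in excl}          # key values to exclude
    src = csv.reader(io.StringIO(source_text))
    header = next(src)
    sk = header.index(key)
    return [header] + [row for row in src if row[sk] not in seen]

kept = exclude("id,boolean\n1,yes\n2,no\n3,yes\n",
               "id,boolean\n2,yes\n4,no\n", "id")
# kept == [["id", "boolean"], ["1", "yes"], ["3", "yes"]]
```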
Hello,
Thank you for xsv and the desire to have one tool do well with a large amount of data.
I noticed you have this handy page:
https://github.com/BurntSushi/xsv/blob/master/BENCHMARKS.md
You may want to consider displaying data for less beefy systems as well, just so you can see if there's any performance differences between revisions of your code, etc.
Here's some data from running time on a FreeBSD 10.1 DigitalOcean droplet with 1 GB of RAM:
xsv stats worldcitiespop.csv --everything  12.62s user 1.28s system 16% cpu 1:24.95 total
xsv table  0.00s user 0.00s system 0% cpu 1:24.95 total
Certainly not as noteworthy as your results, and I understand you wouldn't really want to do heavy data science on just a gig of RAM. Do you have any input on what would contribute to the length of time it took? Or do you think this is adequate?
Thanks!
fiatjaf@spooner ~> curl -sOL https://github.com/BurntSushi/xsv/releases/download/0.8.14/xsv-0.8.14-x86_64-unknown-linux-gnu.tar.gz
fiatjaf@spooner ~> tar xf xsv-0.8.14-x86_64-unknown-linux-gnu.tar.gz
fiatjaf@spooner ~> cd xsv-0.8.14-x86_64-unknown-linux-gnu/
fiatjaf@spooner ~/xsv-0.8.14-x86_64-unknown-linux-gnu> ./xsv --version
Failed to execute process './xsv'. Reason:
exec: Exec format error
The file './xsv' is marked as an executable but could not be run by the operating system.
I tested it in two very different machines I have access to with the same result (but coincidentally they both are running Ubuntu 14.04).
And maybe that's intended, but it seems like it would be useful to (at least have the option to) say join on "fileA.col1 = fileB.col2 and fileA.col2 = fileB.col1". Right now these both give the same results:
xsv join 1,2 a.txt 1,2 b.txt
xsv join 1,2 a.txt 2,1 b.txt
I expected the second to match column 1 from a.txt to column 2 from b.txt.
By the way, thanks for a great (and very fast) tool.
I expected something like
for i in /proc/*/status ; do cat "$i"; echo '#'; done | xsv unflatten | xsv sort -s Threads: | xsv select Name:,Threads:
Can I also override the row separator to handle /proc/*/environ or find -print0?
A feature that I don't see in 0.9.6 that might be useful to add: insert new fields into a data set with a constant value.
Example use case: you have a set of files, each of which has data that pertains to a different date. The dates are in the file names but not in any field in the file. You would like to merge them into one data set, but still be able to tell which record came from which file. So you'd do something like:
for date in `generate-dates 2015-04-01 2015-06-30`
do
xsv insert columns file_date:"$date" file_${date}.csv > file_${date}_with_file_date.csv
done
xsv cat rows file_*_with_file_date.csv > file_2015Q2.csv
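The proposed xsv insert columns step reduces to prepending a constant cell to every record. A minimal Python sketch (the command and its syntax above are the proposal, not an existing xsv feature):

```python
import csv
import io

def insert_column(text, name, value):
    """Prepend a constant-valued column to every record of a CSV."""
    rows = csv.reader(io.StringIO(text))
    out = io.StringIO()
    w = csv.writer(out, lineterminator="\n")
    w.writerow([name] + next(rows))      # extend the header
    for row in rows:
        w.writerow([value] + row)        # same constant for every record
    return out.getvalue()

tagged = insert_column("a,b\n1,2\n", "file_date", "2015-04-01")
# tagged == "file_date,a,b\n2015-04-01,1,2\n"
```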
Hi,
I'm fairly new to using xsv, so I may have missed something, but I couldn't figure out how to force it to surround all values with double quotes when exporting data. Is there an option I missed?
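I'm not aware of such an option in xsv (that may have changed). As a reference for the requested behaviour, Python's csv module calls this quoting style QUOTE_ALL:

```python
import csv
import io

out = io.StringIO()
# QUOTE_ALL forces quotes around every field, even when not needed.
w = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
w.writerow(["id", "name"])
w.writerow(["1", "Ada"])
quoted = out.getvalue()
# quoted == '"id","name"\n"1","Ada"\n'
```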
Reading the README brought to mind the thought "What if xsv could export to an SQL database?".
SQLite is the most logical one, since it's small.
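A minimal sketch of the CSV-to-SQLite idea using Python's stdlib; columns are left untyped, which SQLite allows. This is a sketch of the concept, not a proposed xsv design:

```python
import csv
import io
import sqlite3

def csv_to_sqlite(text, table, conn):
    """Create a table from the CSV header and bulk-insert the rows."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    cols = ", ".join(f'"{c}"' for c in header)
    conn.execute(f"CREATE TABLE {table} ({cols})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)

conn = sqlite3.connect(":memory:")
csv_to_sqlite("id,name\n1,Ada\n2,Grace\n", "people", conn)
count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
# count == 2
```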
This could be me doing something wrong...
$ cargo build --release
Compiling byteorder v0.3.7
Compiling threadpool v0.1.4
Compiling streaming-stats v0.1.23
Compiling regex v0.1.28
Compiling rustc-serialize v0.3.12
Compiling libc v0.1.6
Compiling log v0.3.1
/Users/luis.casillas/.cargo/registry/src/github.com-1ecc6299db9ec823/streaming-stats-0.1.23/src/lib.rs:1:1: 1:41 error: unstable feature
/Users/luis.casillas/.cargo/registry/src/github.com-1ecc6299db9ec823/streaming-stats-0.1.23/src/lib.rs:1 #![feature(collections, core, std_misc)]
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: this feature may not be used in the beta release channel
error: aborting due to previous error
Build failed, waiting for other jobs to finish...
Could not compile `streaming-stats`.
$ git log | head -n 5
commit c19cf0bf43d001ee0555d943eb6703c058152877
Author: Andrew Gallant <[email protected]>
Date: Thu Apr 16 17:50:25 2015 -0400
crates.io merit badge
I often chain several searches with pipes:
$ echo -e "foo,bar\nbli,blo" | xsv search -s foo bla | xsv search -s bar blo
Selector name 'bar' does not exist as a named header in the given CSV data.
Here, the first search does not find any result, so nothing is sent to the second one, which will issue a warning because the column bar is not defined any more. Isn't it a bit strange, semantically? Especially in comparison to stats or frequency, which always print a header line even on an empty input. Shouldn't search always print the header line of the input file as well?
OK, can I just say first of all that I'm IN LOVE with this toolkit!? It is an absolute joy and it fills a gaping void for me! ;]
Anyway, I work with a lot of TSV files (tab-separated) and it's kind of annoying to have to type -d "\t" for every single command I run (but thank you for providing the option!). I'd love to be able to change the default separator from , to \t.
Unfortunately it's not easy to alias, since the -d must come AFTER the command. I suppose I could do something like xsvtab() { xsv $1 -d "\t" ${@:2}; } (and maybe even alias xsv=xsvtab if I'm feeling really lazy), but then I lose the ability to override the delimiter if I actually do have a different type of file, and I have to run it with \xsv to use the original command instead of the alias (specifying -d again with the above alias results in an Invalid arguments. error because now -d is specified twice). That all feels a little clunky. It does kind of work for my use case, but it would be slicker to be able to override the default.
XSV is failing to build on Windows:
src\util.rs:17:9: 17:30 error: unresolved name `libc::GetSystemInfo` [E0425]
src\util.rs:17 ::libc::GetSystemInfo(&mut sysinfo);
That's from num_cpus. If I hard-code num_cpus to return 4 it builds fine. It looks to me like GetSystemInfo was removed from libc for Windows in this commit.
xsv stats gives a lot of useful information, and I use it a lot.
I'm currently missing one thing that may not be too much of a hassle to add.
Would you consider adding the count of empty/non empty values for each field?
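For reference, the requested statistic is cheap to compute in one pass; an illustrative Python sketch (empty_counts is a hypothetical name, not an xsv feature):

```python
import csv
import io

def empty_counts(text):
    """Per field: [empty, non-empty] cell counts in one pass."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    counts = {name: [0, 0] for name in header}
    for row in rows:
        for name, cell in zip(header, row):
            counts[name][0 if cell == "" else 1] += 1
    return counts

counts = empty_counts("a,b\n1,\n,2\n3,4\n")
# counts == {"a": [1, 2], "b": [1, 2]}
```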
Hello,
Could you explain what role sample.csv plays in this example of yours?
$ xsv join --no-case Country sample.csv Abbrev countrynames.csv | xsv table
If I run xsv join --no-case Country worldcitiespop.csv Abbrev countrynames.csv | xsv table my shell runs out of memory and dies.
Here's some output of it processing the above:
load: 0.99 cmd: xsv 15196 [runnable] 2.71r 0.17u 0.05s 11% 102328k
load: 0.99 cmd: xsv 15196 [runnable] 6.80r 0.44u 0.12s 21% 243884k
load: 0.99 cmd: xsv 15195 [runnable] 14.83r 0.80u 0.70s 39% 4632k
load: 1.05 cmd: xsv 15196 [runnable] 46.44r 2.55u 0.77s 29% 659956k
zsh: done xsv join --no-case Country worldcitiespop.csv Abbrev countrynames.csv |
zsh: killed xsv table
sample.csv is also mentioned here:
xsv join --no-case Country sample.csv Abbrev countrynames.csv \
| xsv select 'Country[1],AccentCity,Population' \
| xsv table
Thanks!
I have a line of valid CSV data, which is selected regardless of which columns I select, whether I use inverted matching, or which search term I use.
I installed version 0.11.0 with cargo install.
Line of CSV data (y.csv):
1970-01-01 00:00:51.000000,1970-01-01 00:31:34.000000,222.91.247.73,172.31.13.174,17,"{0,0,0,10}"
I would expect that not all of these invocations will print the line:
xsv search -v 999999999999999 /tmp/y.csv
xsv search 999999999999999 /tmp/y.csv # prints the line even though there is no match
xsv search 1970 /tmp/y.csv
xsv search -v 1970 /tmp/y.csv # prints the line even though there is a match and inverted filtering is on
It would be nice that it would be possible to set a missing value string when using join, so it is possible to put for example "NA" instead of empty fields.
Using xsv sample -d $'\t' it correctly reads a TSV (Tab Separated Values) input, but then writes CSV as output. In my case I want to keep the input format and sample it. An option to control the output delimiter would be very useful.
Hello,
Sample data here:
http://www.sharecsv.com/s/9096d32f98aa0ac671a1cca16fa43be8/SalesJan2009.csv
Running xsv stats SalesJan2009.csv --everything | xsv table, why does it say the price max is 1200 and the min is 800? The max is actually 7500 and I see a record for 250.
Thanks!
The documentation in the README.md doesn't explain xsv's support or policy for character encodings. I think it really ought to.
Looking through the code for xsv and the csv crate, it looks like there isn't a consistent policy:
- Some commands use the byte_records() function.
- xsv search, however, uses the records() function, which interprets the data as UTF-8.
- Some places call str::from_utf8() on byte data.
- The select module uses String to represent field names, which is UTF-8. What happens when you try to xsv select from a file that has Latin-1 field names?

I caught this panic while performing a join:
xsv join --no-case url images.csv url machines.csv
thread '<main>' panicked at 'index out of bounds: the len is 0 but the index is 9', src/select.rs:352
Image,index,url,Year,Manufacturer,Model,Title,Serial,Stock,Pricing,Description,index,url
I was able to get by it by flipping the two CSVs to be joined.
Right now it says there are only linux binaries available. Could it be updated to mention that there are MacOS and Windows binaries available too?
Is there any way to show doubled double quotes as a single quote character?
Test:
echo '1,"Test ""22"""'|xsv table -p 10
Result:
1 "Test ""22"""
Wanted:
1 Test "22"
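The doubled quotes are standard CSV escaping for a literal double quote inside a quoted field, so any conforming parser already produces the wanted form when it decodes the field. For example, in Python:

```python
import csv
import io

# The raw field "Test ""22""" uses doubled quotes to escape literal quotes.
raw = '1,"Test ""22"""\n'
row = next(csv.reader(io.StringIO(raw)))
# row == ["1", 'Test "22"']  (the escaping is undone on parse)
```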
~/s/g/xsv (master=) rustc --version
rustc 0.13.0-nightly (636663172 2014-12-28 16:21:58 +0000)
~/s/g/xsv (master=) cargo --version
cargo 0.0.1-pre-nightly (fd5d7a9 2014-12-25 04:28:40 +0000)
~/s/g/xsv (master=) cargo build --release
Updating registry `https://github.com/rust-lang/crates.io-index`
Downloading streaming-stats v0.1.8
Downloading rustc-serialize v0.1.5
Downloading regex v0.1.4
Downloading docopt v0.6.20
Downloading csv v0.12.0
Downloading tabwriter v0.1.5
Compiling streaming-stats v0.1.8
Compiling regex v0.1.4
Compiling rustc-serialize v0.1.5
Compiling csv v0.12.0
Compiling docopt v0.6.20
Compiling tabwriter v0.1.5
Compiling xsv v0.8.7 (file:///Users/evanchan/src/github/xsv)
warning: using multiple versions of crate `regex`
/Users/evanchan/src/github/xsv/src/main.rs:8:1: 8:18 note: used here
/Users/evanchan/src/github/xsv/src/main.rs:8 extern crate csv;
^~~~~~~~~~~~~~~~~
note: crate name: regex
/Users/evanchan/src/github/xsv/src/main.rs:9:1: 9:21 note: used here
/Users/evanchan/src/github/xsv/src/main.rs:9 extern crate docopt;
^~~~~~~~~~~~~~~~~~~~
note: crate name: regex
/Users/evanchan/src/github/xsv/src/config.rs:41:25: 41:39 error: type `char` does not implement any method in scope named `to_ascii_opt`
/Users/evanchan/src/github/xsv/src/config.rs:41 match c.to_ascii_opt() {
^~~~~~~~~~~~~~
/Users/evanchan/src/github/xsv/src/config.rs:42:49: 42:64 error: the type of this value must be known in this context
/Users/evanchan/src/github/xsv/src/config.rs:42 Some(ascii) => Ok(Delimiter(ascii.as_byte())),
^~~~~~~~~~~~~~~
error: aborting due to 2 previous errors
Could not compile `xsv`.
When I issue the command xsv help, this happens:
% xsv help
Could not match 'help' with any of the allowed variants: ["Cat", "Count", "FixLengths", "Flatten", "Fmt", "Frequency", "Headers", "Index", "Join", "Sample", "Search", "Select", "Slice", "Sort", "Split", "Stats", "Table"]
Because xsv is structured around the xsv <command> [<args> ...] paradigm, I find myself running into this all the time, and I can't imagine I'm the only one who will.
I'm guessing it's low effort and high value to:
- Add an xsv help subcommand
- Alias xsv help <command> to the same code as xsv <command> --help (when <command> exists)
- Fall back to xsv --help when the user issues xsv help or xsv help <command> for a non-existent command.

I am a heavy csvkit user, gradually transitioning to xsv now, with only one exception: in2csv often makes me come back to csvkit. A subcommand like import would be very nice for importing csv from Excel (xlsx or xls).