aggr(1)                                                            aggr(1)



NAME
       aggr - aggregate CSV-data and print on the standard output


SYNOPSIS
       aggr [OPTIONS] COLUMNS...


DESCRIPTION
       aggr aggregates CSV data (comma-separated values) from a file or
       standard input using well-known aggregate functions and writes the
       result to standard output. It tries to simulate some of the
       functionality provided by SQL's "GROUP BY" clause.


OPTIONS
       -f INPUTFILE
              read data from INPUTFILE instead of standard input


       -d DELIMITER
              use DELIMITER instead of Tab as field delimiter  (for  input
              and output)


COLUMNS
       The  columns that will be printed in output are defined by the COL‐
       UMNS arguments. There are two kinds of columns:


       Grouping columns
              define which columns are used to group the input data.  They
              are  simply  specified by their column number. These columns
              are the attributes that one would mention in  the  GROUP  BY
              statement in SQL.


       Aggregated columns
              are  columns  whose  values  are  calculated by an aggregate
              function. The syntax is AGGREGATEFUNCTION:COLUMNNUMBER.  See
              below for supported aggregate functions.


       The columns appear in the output in the same order in which they
       were specified in the arguments. It is possible to output the same
       column several times (with different aggregate functions). The
       column numbering scheme is 0-based: the first column has column
       number 0, the second 1, and so on.
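
       The two kinds of COLUMNS arguments can be told apart by the colon.
       The following Python sketch illustrates that syntax only; it is
       not aggr's actual implementation, and the names ColumnSpec and
       parse_column_spec are hypothetical.

```python
from typing import NamedTuple, Optional


class ColumnSpec(NamedTuple):
    function: Optional[str]  # None marks a grouping column
    column: str              # column number as text, or "*" as used with count


def parse_column_spec(arg: str) -> ColumnSpec:
    """Split one COLUMNS argument into (function, column)."""
    if ":" in arg:
        function, column = arg.split(":", 1)
        return ColumnSpec(function, column)
    return ColumnSpec(None, arg)


# The arguments of the first example in EXAMPLES: "1 avg:2"
specs = [parse_column_spec(a) for a in ["1", "avg:2"]]
grouping = [int(s.column) for s in specs if s.function is None]
print(specs)
print(grouping)
```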



AGGREGATE FUNCTIONS
       These aggregate functions are supported: count, sum, avg, min, max,
       first, last, median.
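
       As a rough illustration of what these functions compute, the
       Python sketch below maps each name to a standard-library
       equivalent and applies it to the three ages listed for Hans in the
       example file. The mapping is an assumption, not taken from aggr's
       source: in particular, first and last are assumed to refer to
       input order, and statistics.median averages the two middle values
       for an even number of inputs, which aggr may or may not do.

```python
from statistics import mean, median

# Hypothetical mapping from aggr's function names to Python callables.
AGGREGATES = {
    "count": len,
    "sum": sum,
    "avg": mean,
    "min": min,
    "max": max,
    "first": lambda values: values[0],   # assumed: first value in input order
    "last": lambda values: values[-1],   # assumed: last value in input order
    "median": median,
}

ages = [42.0, 63.0, 71.0]  # the ages of the three "Hans" rows
for name, function in AGGREGATES.items():
    print(name, function(ages))
```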



EXAMPLES
       Let's say there is a file name_city_age.csv with the following con‐
       tent (list of persons with name, city and age):

       Hans;Regensburg;42
       Heidi;Straubing;40
       Peter;Straubing;20
       Else;Regensburg;10
       Hans;Augsburg;63
       Hans;Ingolstadt;71

       Then

       aggr -f name_city_age.csv -d ";" 1 avg:2

       will return:

       Regensburg;26.0
       Straubing;30.0
       Augsburg;63.0
       Ingolstadt;71.0

       The grouping criterion is the second column (column number 1),
       which holds the city names. The aggregated column is the average
       age of the listed inhabitants of each city. In SQL the query would
       be something like SELECT city, AVG(age) FROM persons GROUP BY city;.
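
       The grouping and averaging in this example can be reproduced with
       a few lines of Python. This is a minimal sketch of the idea, not
       aggr's implementation; the function name average_by_group is
       hypothetical, and groups are kept in order of first occurrence to
       match the output shown above.

```python
import csv
import io
from statistics import mean

DATA = """\
Hans;Regensburg;42
Heidi;Straubing;40
Peter;Straubing;20
Else;Regensburg;10
Hans;Augsburg;63
Hans;Ingolstadt;71
"""


def average_by_group(text, group_col, value_col, delimiter=";"):
    """Group rows by one column and average another (0-based numbers)."""
    groups = {}  # dicts keep insertion order: groups appear as first seen
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return [(key, mean(values)) for key, values in groups.items()]


for city, avg_age in average_by_group(DATA, 1, 2):
    print(f"{city};{avg_age}")
```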

       aggr -f name_city_age.csv -d ";" avg:2 1 | sort -rn

       This does basically the same as the previous example, but outputs
       the average-age column first and uses sort -rn to print the rows
       in descending order of average age.
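
       Putting the numeric column first is what lets sort -rn order the
       rows by it. The Python sketch below mimics that ordering on the
       averages from the previous example; the input lines are assumed
       from that example's output with the two columns swapped.

```python
# Assumed output of "aggr ... avg:2 1" before sorting.
lines = [
    "26.0;Regensburg",
    "30.0;Straubing",
    "63.0;Augsburg",
    "71.0;Ingolstadt",
]

# Sort by the leading number, largest first, as sort -rn would here.
ordered = sorted(lines, key=lambda line: float(line.split(";", 1)[0]), reverse=True)
for line in ordered:
    print(line)
```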

       aggr -f name_city_age.csv -d ";" 0 count:*

       This time the name column is used for grouping and count is the
       aggregate function. The result is the number of times each name
       appears in the list.
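
       The same counting can be sketched in Python with
       collections.Counter. The numbers follow from the example file;
       the exact output format aggr uses for counts is not documented
       here, so the "name;count" lines printed below are an assumption.

```python
from collections import Counter

# First column (column number 0) of name_city_age.csv, in file order.
names = ["Hans", "Heidi", "Peter", "Else", "Hans", "Hans"]

counts = Counter(names)  # keeps names in order of first occurrence
for name, count in counts.items():
    print(f"{name};{count}")
```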

       Find more examples in the wiki:
       https://github.com/stefanschramm/aggr/wiki



MAJOR DIFFERENCES FROM SQL'S GROUP BY
       In aggr an aggregate function can only be applied to a single
       column. That means expressions like SUM(price * amount) in SQL are
       not possible here. To achieve this, first calculate the product
       into a single column with some other command-line utility and then
       pipe its output to aggr.
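
       One way to do that preprocessing step is a small filter that
       appends the product as an extra column. The Python sketch below
       assumes hypothetical input with the columns item, price and
       amount; the function name with_product_column and the sample data
       are illustrative only.

```python
def with_product_column(lines, price_col, amount_col, delimiter=";"):
    """Yield each line with price * amount appended as a new last column."""
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        product = float(fields[price_col]) * float(fields[amount_col])
        yield delimiter.join(fields + [str(product)])


# Hypothetical order data: item;price;amount
orders = ["apples;0.5;4", "pears;2.0;3", "apples;0.5;10"]
for line in with_product_column(orders, 1, 2):
    print(line)
```

       The product then sits in column number 3, so the filtered lines
       could be piped into something like aggr -d ";" 0 sum:3 to get the
       total per item.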


AUTHOR
       Stefan Schramm



                               NOVEMBER 2010                       aggr(1)
