JX

JX is a tiny utility script to eXecute J commands on the output from *nix command-line pipes. The syntax is,

linux-command1 | linux-command2 ... | jx `verb`

The `verb` argument above is any J command to be executed on the piped data (stdin in J parlance). Typical use cases are provided in the Examples section.

Features

The following packages are automatically preloaded each time jx is called,

  1. stats
  2. stats/distribs
  3. math/fftw

More libraries can easily be added to the beginning of the script. Check the addons page for further information.

Currently jx can only deal with numeric data types. Use it on other types at your own risk!
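Because only numeric input is supported, it may help to check a stream before handing it to jx. The following is a minimal sketch, not part of jx: `is_numeric_csv` is a hypothetical helper name, and the pattern it uses for "numeric" (optional sign, digits, optional decimal point and exponent) is an assumption about what jx accepts.

```shell
# Hypothetical helper (not part of jx): succeeds only if every
# comma-separated field on stdin looks like a plain number.
is_numeric_csv() {
  awk -F, '
    {
      for (i = 1; i <= NF; i++)
        if ($i !~ /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?$/) {
          bad = 1
          exit            # jump to END on the first non-numeric field
        }
    }
    END { exit (bad ? 1 : 0) }'
}

# A header row makes the check fail; numeric rows pass.
printf 'a,b\n1,2\n' | is_numeric_csv || echo "non-numeric field found"
printf '1,2.5\n-3,4e2\n' | is_numeric_csv && echo "all fields numeric"
```

Header rows and text columns fail this check, which is why the examples in this README select numeric columns and drop the header before calling jx.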

Requirements

J needs to be installed and available on the system. ijconsole is used in jx and is assumed to be in /usr/bin/ijconsole. Replace the path in the script if this is not the case on your system.
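A quick way to see whether the assumed path holds on a given system is sketched below. `find_ijconsole` is a hypothetical helper name used only for illustration; the only path taken from this README is /usr/bin/ijconsole.

```shell
# Hypothetical helper: print the location of ijconsole, preferring the
# path that jx assumes, and fall back to whatever is on PATH.
find_ijconsole() {
  if [ -x /usr/bin/ijconsole ]; then
    echo /usr/bin/ijconsole
  elif command -v ijconsole >/dev/null 2>&1; then
    command -v ijconsole
  else
    echo "ijconsole not found" >&2
    return 1
  fi
}

find_ijconsole || echo "install J or adjust the path inside jx"
```

If the printed path differs from /usr/bin/ijconsole, that is the path to substitute in the script.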

This script has only been tested on WSL with Ubuntu 18.04.1 LTS (bionic).

Why?

J is a wonderful tool for rapid prototyping. It is especially well suited for data analysis and transformations. Its vast ecosystem and terse notation make it a natural fit for command-line data analysis. However, I could not find any solutions that run J commands on CSV-type data.

The page on shell scripting provided an excellent starting point. I hope that this script benefits other beginners like me who are interested in leveraging the power of J. There are few languages that provide such potency in so few characters. Happy scripting!

Installation

Download the file called jx and give it exec permissions (chmod +x).
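The examples below call `jx` without a leading `./`, which only works if the script is on the PATH. One possible way to arrange that is sketched here; `install_jx` is a hypothetical helper and `$HOME/.local/bin` is an assumed destination, not something this project prescribes.

```shell
# Hypothetical helper: copy a downloaded jx script into a directory
# and mark it executable. The default destination is an assumption.
install_jx() {
  jx_src="$1"
  jx_dest="${2:-$HOME/.local/bin}"
  if [ ! -f "$jx_src" ]; then
    echo "no such file: $jx_src" >&2
    return 1
  fi
  mkdir -p "$jx_dest" &&
    cp "$jx_src" "$jx_dest/jx" &&
    chmod +x "$jx_dest/jx"
}

# Usage, assuming jx was downloaded to the current directory and the
# destination directory is already on PATH:
#   install_jx ./jx
```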

Alternatively, open the package.org file in Emacs, tangle to get the jx file and give it exec permissions.

Examples

Ensure the shell is set to bash,

bash

A typical use case might be,

for i in $(seq 1 10); do echo $i; done | jx "*/"
3628800
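In J, `*/` inserts multiplication between the items, so the result is the product 1 × 2 × ... × 10. The loop above emits the same lines as `seq 1 10`. As a sanity check that does not need J at all, the same product can be computed with awk; `product_of_lines` is a hypothetical name used only for this sketch.

```shell
# Cross-check of the */ (product) example using awk instead of J.
product_of_lines() {
  awk 'BEGIN { p = 1 } { p *= $1 } END { print p }'
}

seq 1 10 | product_of_lines   # prints 3628800
```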

Let’s consider a more realistic example using the classic iris dataset (see accompanying .csv for a copy). The excellent csvkit utility is used to extract information from the csv.

The dataset contains the following columns,

csvcut -n iris.csv
  1:
  2: Sepal.Length
  3: Sepal.Width
  4: Petal.Length
  5: Petal.Width
  6: Species

Columns 2-5 are numeric,

csvcut -c 2-5 iris.csv | head
Sepal.Length,Sepal.Width,Petal.Length,Petal.Width
5.1,3.5,1.4,0.2
4.9,3,1.4,0.2
4.7,3.2,1.3,0.2
4.6,3.1,1.5,0.2
5,3.6,1.4,0.2
5.4,3.9,1.7,0.4
4.6,3.4,1.4,0.3
5,3.4,1.5,0.2
4.4,2.9,1.4,0.2

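(+/ % #) is the mean in J: the sum (+/) divided by (%) the number of items (#). To calculate the mean of each column, the header row is dropped with tail -n +2 (jx expects numeric data only) and jx is used as follows,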

csvcut -c 2-5 iris.csv | tail -n +2 |  ./jx "(+/ % #)"
5.84333,3.05733,3.758,1.19933
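For readers who want to verify such results without J, the same per-column mean can be sketched in awk. `col_means` is a hypothetical helper, and the `%.6g` format is chosen here only to mimic the six significant digits seen in the output above.

```shell
# Hypothetical cross-check: per-column mean of headerless CSV on stdin.
col_means() {
  awk -F, '
    {
      for (i = 1; i <= NF; i++) sum[i] += $i
      if (NF > ncols) ncols = NF
    }
    END {
      for (i = 1; i <= ncols; i++)
        printf "%s%.6g", (i > 1 ? "," : ""), sum[i] / NR
      printf "\n"
    }'
}

# tail -n +2 drops the header line, as in the jx pipeline.
printf 'a,b\n1,2\n3,4\n' | tail -n +2 | col_means   # prints 2,3
```

Replacing the `printf` with `csvcut -c 2-5 iris.csv` should give figures comparable to the jx output, assuming the same input file.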

Since the stats addon is preloaded, several statistics can be calculated in one call,

csvcut -c 2-5 iris.csv | tail -n +2 |  jx "(#, mean, median, stddev)"
150,150,150,150
5.84333,3.05733,3.758,1.19933
5.8,3,4.35,1.3
0.828066,0.435866,1.7653,0.762238

Or using dstat (descriptive statistics),

csvcut -c 2-5 iris.csv | tail -n +2 |  jx "dstat"
"sample size:       150","sample size:       150","sample size:        150","sample size:        150"
"minimum:           4.3","minimum:             2","minimum:              1","minimum:            0.1"
"maximum:           7.9","maximum:           4.4","maximum:            6.9","maximum:            2.5"
"median:            5.8","median:              3","median:            4.35","median:             1.3"
"mean:          5.84333","mean:          3.05733","mean:             3.758","mean:           1.19933"
"std devn:     0.828066","std devn:     0.435866","std devn:        1.7653","std devn:      0.762238"
"skewness:     0.311753","skewness:     0.315767","skewness:     _0.272128","skewness:     _0.101934"
"kurtosis:      2.42643","kurtosis:      3.18098","kurtosis:       1.60446","kurtosis:       1.66393"

There are many more functions in the addon. See the stats page for further details.

One of the major advantages of using jx is that the entire J ecosystem is available. This facilitates calculations not normally available in many other command-line statistical packages.

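For example, the cumulative standard deviation (the standard deviation of the first row, then of the first two rows, and so on) is easily calculated with J's prefix adverb \. The backslash is doubled so that bash passes a single \ to jx,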

csvcut -c 2-5 iris.csv | tail -n +2 |  jx "stddev \\" | head -n 10
0,0,0,0
0.141421,0.353553,0,0
0.2,0.251661,0.057735,3.39935e_17
0.221736,0.216025,0.0816497,0
0.207364,0.258844,0.0707107,0
0.288097,0.343026,0.13784,0.0816497
0.294392,0.313202,0.127242,0.0786796
0.274838,0.290012,0.119523,0.0744024
0.308671,0.316228,0.113039,0.0707107
0.291357,0.307137,0.108012,0.0788811
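To make the prefix computation concrete, here is a rough single-column equivalent in awk. It is a sketch under stated assumptions: `running_stddev` is a hypothetical helper, it uses Welford's update rather than whatever J does internally, and it prints 0 for a single value to match the first output row above. The divisor n - 1 (sample standard deviation) is inferred from the second output row, where 5.1 and 4.9 give 0.141421.

```shell
# Hypothetical helper: running sample standard deviation of one
# numeric column on stdin, one output line per input line.
running_stddev() {
  awk '
    {
      n++
      delta = $1 - mean
      mean += delta / n
      m2 += delta * ($1 - mean)      # running sum of squared deviations
      printf "%.6g\n", (n > 1 ? sqrt(m2 / (n - 1)) : 0)
    }'
}

# First three Sepal.Length values from the listing earlier.
printf '5.1\n4.9\n4.7\n' | running_stddev
# prints:
# 0
# 0.141421
# 0.2
```

These three lines agree with the first column of the first three rows of the jx output.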

To calculate the FFT of each column,

csvcut -c 2-5 iris.csv | tail -n +2 |  jx "fftw" | head -n 10
876.5,458.6,563.7,179.9
_6.1619j53.5655,20.0516j_17.2772,_37.1193j145.149,_9.47561j62.0101
1.06414j32.9092,_7.05433j_9.32116,9.866j74.9139,3.54867j31.8348
_10.4029j_2.34397,_1.39779j_0.140479,_5.56232j_2.05679,1.27583j_0.313196
_3.82541j11.3016,1.11997j_3.65419,_10.001j32.7223,_0.713453j16.687
_7.80278j14.757,_6.34392j_0.147459,_1.59774j31.7484,_0.652352j17.7651
4.35145j_1.51585,0.4113j1.55174,4.72981j_2.98952,_0.141559j3.7895
_5.89409j6.59066,0.728365j_1.2031,_7.02916j14.2964,_3.42976j8.93904
0.639293j16.2784,_0.697862j2.37043,_2.56524j20.8029,_1.72254j8.12098
5.64899j_3.28078,6.46114j0.0480785,2.27507j_7.72158,0.337604j_1.55001
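The first output row is the zero-frequency term, which for a discrete Fourier transform is simply the column sum; this is consistent with 150 × 5.84333 ≈ 876.5 for the first column. As an illustration of what each row represents, the sketch below computes a naive O(n²) DFT of one column in awk and prints only the magnitude of each bin, which does not depend on the sign convention of the transform. `dft_magnitudes` is a hypothetical helper and is not how the fftw addon works internally.

```shell
# Hypothetical helper: magnitudes of a naive DFT of one numeric
# column on stdin. Quadratic time; meant for small inputs only.
dft_magnitudes() {
  awk '
    { x[NR - 1] = $1 }
    END {
      pi = atan2(0, -1)
      for (k = 0; k < NR; k++) {
        re = 0; im = 0
        for (n = 0; n < NR; n++) {
          angle = 2 * pi * k * n / NR
          re += x[n] * cos(angle)
          im -= x[n] * sin(angle)
        }
        printf "%.6g\n", sqrt(re * re + im * im)
      }
    }'
}

printf '1\n2\n3\n4\n' | dft_magnitudes
# prints:
# 10
# 2.82843
# 2
# 2.82843
```

The first printed value (10) is the sum of the inputs, mirroring the role of the first row in the jx output.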

Contributions

This is a tiny but useful tool that should nicely supplement other solutions available today. Of course, there are many avenues for expanding it. Please fork this repo and continue development!

This code is released under GPLv3.
