oshka's Introduction

oshka - Recursive Quoted Language Expansion

Project Status: WIP - Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

Programmable Non-Standard Evaluation

Non-Standard Evaluation (NSE hereafter) occurs when R expressions are captured and then evaluated differently from how they would be if executed without intervention. subset is a canonical example; here we use it with the built-in iris data set:

subset(iris, Sepal.Width > 4.1)
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 16          5.7         4.4          1.5         0.4  setosa
## 34          5.5         4.2          1.4         0.2  setosa

Sepal.Width does not exist in the global environment, yet this works because subset captures the expression and evaluates it within iris.

A limitation of NSE is that it is difficult to use programmatically:

exp.a <- quote(Sepal.Width > 4.1)
subset(iris, exp.a)
## Error in subset.data.frame(iris, exp.a): 'subset' must be logical
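The failure can be reproduced step by step in base R. The sketch below assumes subset captures its argument unevaluated (as subset2 further down does with substitute). It shows that the captured symbol exp.a evaluates to a call rather than a logical vector, and then shows one base-R workaround that splices the stored expression into the call with bquote before evaluating it.

```r
exp.a <- quote(Sepal.Width > 4.1)

# What subset() sees: the symbol `exp.a`, not the expression stored in it
captured <- quote(exp.a)

# Evaluating that symbol within iris finds no such column, falls back to the
# enclosing environment, and returns the stored call instead of a logical vector
res <- eval(captured, iris, environment())
is.call(res)

# Base-R workaround: build the full call with bquote(), then evaluate it
sub <- eval(bquote(subset(iris, .(exp.a))))
nrow(sub)
```

The workaround is effective but must be applied by the caller at every call site, which is the inconvenience that programmable NSE aims to remove.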

oshka::expand facilitates programmable NSE, as with this simplified version of subset:

subset2 <- function(x, subset) {
  sub.exp <- expand(substitute(subset), x, parent.frame())
  sub.val <- eval(sub.exp, x, parent.frame())
  x[!is.na(sub.val) & sub.val, ]
}
subset2(iris, exp.a)
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 16          5.7         4.4          1.5         0.4  setosa
## 34          5.5         4.2          1.4         0.2  setosa

expand is recursive:

exp.b <- quote(Species == 'virginica')
exp.c <- quote(Sepal.Width > 3.6)
exp.d <- quote(exp.b & exp.c)

subset2(iris, exp.d)
##     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
## 118          7.7         3.8          6.7         2.2 virginica
## 132          7.9         3.8          6.4         2.0 virginica
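To make the recursion concrete, here is a minimal base-R sketch of the idea. It is not oshka's implementation: expand_sketch is a hypothetical name, and the sketch assumes that symbols matching a column of the data are left alone, that every other symbol is replaced only when it points to a language object, and that the function position of a call is never expanded. It does not handle omitted arguments such as x[, 1] or symbols that refer to themselves.

```r
# Hypothetical, simplified stand-in for recursive expansion (base R only)
expand_sketch <- function(expr, data, env) {
  if (is.symbol(expr)) {
    name <- as.character(expr)
    # Symbols that name a column of the data are kept as they are
    if (name %in% names(data)) return(expr)
    if (exists(name, envir = env)) {
      val <- get(name, envir = env)
      # A symbol bound to quoted language is replaced by that language,
      # which is then expanded in turn
      if (is.language(val)) return(expand_sketch(val, data, env))
    }
    expr
  } else if (is.call(expr)) {
    # Expand the arguments only; the function position is left untouched
    args <- lapply(as.list(expr)[-1], expand_sketch, data = data, env = env)
    as.call(c(list(expr[[1]]), args))
  } else {
    expr  # constants and other objects are returned unchanged
  }
}

exp.b <- quote(Species == 'virginica')
exp.c <- quote(Sepal.Width > 3.6)
exp.d <- quote(exp.b & exp.c)

full.exp <- expand_sketch(exp.d, iris, environment())
full.exp
matches <- iris[eval(full.exp, iris), ]
```

Here exp.d is expanded to a single call on Species and Sepal.Width, which can then be evaluated within iris like any ordinary expression.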

We abide by R semantics so that programmable NSE functions are almost identical to normal NSE functions, with programmability as a bonus.

Documentation

  • Intro vignette for a more in-depth introduction to oshka, including a brief comparison to rlang.
  • NSE Functions with oshka in which we recreate simplified versions of dplyr and data.table that implement programmable NSE with oshka::expand.

Installation

This package is proof-of-concept. If it elicits enough interest we will re-write the internals in C and add helper functions for common use patterns.

install.packages('oshka')
# or development version
devtools::install_github('brodieg/oshka@development')

Feedback is welcome, particularly if you are aware of some NSE pitfalls we may be ignoring.

Acknowledgements

About

Brodie Gaslam is a hobbyist programmer based on the US East Coast.

The name of this package is derived from "matryoshka", the Russian nesting dolls.

oshka's People

Contributors

brodieg

oshka's Issues

More extensive helper functions

We purposefully reduced the footprint of the package to make it simpler. However, it might be useful to add the following functions:

  • evalr, runs recsub on the expression before evaluating
  • evalqr, quotes expression and runs recsub on the expression before evaluating
  • evalbqr, back quotes expression and runs recsub on the expression before evaluating

Align search path lookup to standard semantics

Don't start at the top level with each substitution. If we find a symbol at a certain level in the search path, we should only look for subsequent symbols further down the search path.

Not 100% sure this is the right thing to do, but I think this is necessary to conform to standard R semantics. That said, this is somewhat debatable since there is no real analog with normal expressions. Since this will be a real pain to implement (and will likely be slow) we should really make sure it is needed.

Started to think about the normal eval comparable:

env1 <- new.env()
env2 <- new.env(parent=env1)

env2$x <- 2

But then stopped when I realized that normal evaluation only substitutes a symbol once so what should happen on recursive evaluation is not really defined.

Mechanism for (un)shielding symbols from substitution

Implement as an additional arg, e.g. shield.with='.', which then would allow things like:

recsub(quote(f(a + .(b))))

where b would be shielded from sub.

Potentially could add a second argument to allow skipping matching symbols, e.g. in a situation like:

Species <- quote(Species == 'versicolor')
Sepal.Width <- runif(150)
b <- quote(Sepal.Width > 2)
recsub(quote(iris[.(Species, 1) & b,]), iris)

where we do not want the Species in iris to prevent expansion, but we do want normal behavior for everything else (hmm, this is not a very good example).

recsub should recurse through lists and pairlists

Currently recsub stops as soon as it hits an object that is not a language object, but there is really no reason why it can't keep going through lists and pairlists and substitute any language they contain. Could it be an issue for attributes? This came up when implementing super_df:

s.exp <- quote(weighted.mean(Income, Population))
sd <- as.super_df(state.data)
sd[f.exp, s.exp, by=Region]
yy <- list(s.exp)
sd[f.exp, yy, by=Region]

Where yy is not expanded because it points to a list.

Expand with missing symbols

> expand(quote(df[, y]))
Error in exists(symb.chr, envir = envir, inherits = FALSE) : 
  invalid first argument
Error in get_with_env(symb.as.chr, envir = envir, mode = mode) : 
  Internal error: exists failed, envir type: environment
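A likely explanation, offered as an assumption rather than a confirmed diagnosis, is the omitted row index in df[, y]: R stores it in the call as the empty symbol, whose character form is "", and exists rejects an empty name. The base-R sketch below reproduces that behavior and shows a hypothetical guard that skips the lookup for empty names.

```r
e <- quote(df[, y])

# The call has four elements: `[`, df, the empty symbol, and y
length(e)

# The omitted row index converts to an empty string
empty.chr <- as.character(e[[3]])
nzchar(empty.chr)

# exists() signals an error when given an empty name
res <- tryCatch(
  exists(empty.chr, envir = environment(), inherits = FALSE),
  error = function(err) err
)
inherits(res, "error")

# Hypothetical guard: only look up names that are not empty
lookup.ok <- nzchar(empty.chr) &&
  exists(empty.chr, envir = environment(), inherits = FALSE)
```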

Investigate Potential Performance Problems from Large Objects on Call Stack

From vignette:

One drawback of the eval/bquote/.() pattern is that the actual objects inside .() are placed on the call stack. This is not an issue with symbols, but can be bothersome with data or functions.

Per Hadley, this is apparently more than just a cosmetic issue: it can lead to performance problems (not specified further) of the kind ggplot2 suffered from, although in that case they came from using do.call.
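The difference between a symbol and an object on the call stack can be seen in base R. In the sketch below show_call is a hypothetical helper that returns the call it was invoked with; the example only illustrates how data ends up embedded in a call and makes no claim about the size of any resulting slowdown.

```r
# Hypothetical helper that reports the call it was invoked with
show_call <- function(x) sys.call()

# Passing a symbol: the recorded call contains only the name `iris`
by.name <- do.call("show_call", list(quote(iris)))
is.symbol(by.name[[2]])

# Passing the value: the whole data frame is embedded in the recorded call
by.value <- do.call("show_call", list(iris))
is.data.frame(by.value[[2]])

# bquote()/.() embeds objects the same way when .() is given a value
embedded <- bquote(nrow(.(iris)))
is.data.frame(embedded[[2]])

# The call carrying the data is larger than the one carrying the symbol
object.size(by.value) > object.size(by.name)
```

Anything that deparses or prints such a call (tracebacks, error messages, match.call output) has to process the embedded object, which is one plausible route from cosmetic annoyance to real cost.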
