vetr's Issues

`vetr` eval much slower within `knitr`

Run from the command line:

> secant <- function(f, x, dx) (f(x + dx) - f(x)) / dx
> 
> secant_valaddin <- valaddin::firmly(secant, list(~x, ~dx) ~ is.numeric)
> secant_stopifnot <- function(f, x, dx) {
+   stopifnot(is.numeric(x), is.numeric(dx))
+   secant(f, x, dx)
+ }
> secant_vetr <- function(f, x, dx) {
+   vetr(x=numeric(), dx=numeric())
+   secant(f, x, dx)
+ }
> microbenchmark(
+   secant_valaddin(log, 1, .1),
+   secant_stopifnot(log, 1, .1),
+   secant_vetr(log, 1, .1)
+ )
Unit: microseconds
                          expr     min       lq      mean   median       uq
  secant_valaddin(log, 1, 0.1) 123.589 129.4050 160.07218 139.3860 154.9075
 secant_stopifnot(log, 1, 0.1)   9.168  11.3390  16.11966  13.1475  14.4380
      secant_vetr(log, 1, 0.1)  14.780  16.7175  22.29358  20.1230  21.2600
     max neval
 328.443   100
  57.431   100
  64.839   100

Run in knitr:

secant <- function(f, x, dx) (f(x + dx) - f(x)) / dx

secant_valaddin <- valaddin::firmly(secant, list(~x, ~dx) ~ is.numeric)
secant_stopifnot <- function(f, x, dx) {
  stopifnot(is.numeric(x), is.numeric(dx))
  secant(f, x, dx)
}
secant_vetr <- function(f, x, dx) {
  vetr(x=numeric(), dx=numeric())
  secant(f, x, dx)
}
microbenchmark(
  secant_valaddin(log, 1, .1),
  secant_stopifnot(log, 1, .1),
  secant_vetr(log, 1, .1)
)

## Unit: microseconds
##                           expr     min       lq      mean
##   secant_valaddin(log, 1, 0.1) 132.051 148.1325 201.55885
##  secant_stopifnot(log, 1, 0.1)  10.504  13.4080  21.38084
##       secant_vetr(log, 1, 0.1)  32.141  37.5715  57.02122
##   median       uq     max neval
##  166.021 228.5780 616.862   100
##   16.137  23.4070  82.141   100
##   43.360  66.7915 213.646   100

with_vetr?

Implement something that takes a function and transforms it into a vetter function:

fun_v <- with_vetter(fun, x=numeric(), y=character())
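
A minimal sketch of what such a wrapper could look like, in base R only. `with_vetter` is the hypothetical name used above; the check itself (comparing `mode()` against the template) is a crude stand-in for vetr's template matching, assumed here only so the block runs without the package.

```r
# Hypothetical `with_vetter`: returns a version of `fun` that checks its
# arguments against the supplied templates before calling `fun`.
with_vetter <- function(fun, ...) {
  force(fun)
  templates <- list(...)
  function(...) {
    args <- list(...)
    # Match positional and named arguments to the formals of `fun`
    matched <- as.list(match.call(fun, as.call(c(list(quote(fun)), args))))[-1]
    for (nm in intersect(names(templates), names(matched))) {
      # Stand-in check: only the mode of the value is compared (assumption)
      if (mode(matched[[nm]]) != mode(templates[[nm]])) {
        stop(
          sprintf(
            "Argument `%s` should be mode \"%s\" (is \"%s\")",
            nm, mode(templates[[nm]]), mode(matched[[nm]])
          ),
          call. = FALSE
        )
      }
    }
    do.call(fun, matched)
  }
}

fun <- function(x, y) paste(y, x)
fun_v <- with_vetter(fun, x = numeric(), y = character())
ok <- fun_v(1, "a")                          # passes both checks
bad <- try(fun_v("a", "b"), silent = TRUE)   # `x` is not numeric
```

A real implementation would presumably delegate the check to the package's own vetting logic rather than to `mode()`.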

Forgetting `.(` Can Be Confusing

If we do something like:

validate(character() && !any(is.na(.)))

instead of the intended:

validate(character() && .(!any(is.na(.))))

the error message risks being very confusing, since the value of the second token is used as a template instead of being interpreted as a value.

NULL being wildcard problematic?

Very convenient in most instances, but the common case of expecting an argument to be "X" or NULL falls flat on its face. What's the workaround?

Consistency between Passing Quoted Objects And Putting them In

> validate(quote(quote(a + b)), quote(x2 + x3))
Error in validate(quote(quote(a + b)), quote(x2 + x3)) : 
  `quote(x2 + x3)[[1]]` should be a call to `quote` (is a call to `+`)

unitizer> validate(quote(a + b), quote(x2 + x3))
[1] TRUE

> x <- quote(quote(a + b))
> validate(x, quote(x2 + x3))
[1] TRUE

`..` Does not cause `.` to be Substituted

Somehow the escaping of the dot doesn't permit substitution to happen with a . variable in the substitution environment.

Also, clarify whether only things that are all dots, or just all leading dots must be escaped. Should probably be the latter.

Multi Option Error Message

> fun1(matrix(1:9, ncol = 3), "fail", "fail")
Error in fun1(x = matrix(1:9, ncol = 3), y = "fail", z = "fail") : 
  Argument meet at least one of the following:
  - `y` should be type "integer-like" (is "character")
  - `y` should be "NULL" (is "character")
  - `y` should be type "logical" (is "character")

Use "could" instead of "should"?

Internal INTEGER error in track hash

Can't reproduce this consistently. Seems like it only happens the first time the code is run.

> vetr:::track_hash(letters[1:5], 2L)
Error in vetr:::track_hash(letters[1:5], 2L) : 
  INTEGER() can only be applied to a 'integer', not a 'NULL'

| Value mismatch: 

< .ref           > .new         
@@ 1 @@          @@ 1 @@        
< [1] 1 1 4 1 8  > NULL         

| Conditions mismatch: 

< .REF$conditions                                                         
> .NEW$conditions                                                         
@@ 1 / 1,3 @@                                                             
< Empty condition list                                                    
> Condition list with 1 condition:                                        
> 1. Error in vetr:::track_hash(letters[1:5], 2L) : INTEGER() can only be 
>    applied to a 'integer', not a 'NULL'                      

Remove ggplot2 suggests

Causes a massive installation. Think about how to test `abstract` without including this package.

`alike` options

Should be similar to the alike_settings business, and should obviously include the alike options, for example, turning off the integer-like numerics matching integer templates.

More verbose `validate_args` error

Consider dumping either the str output or a snippet of the print output of the object alongside the error message to speed up debugging, since inspecting the object is typically the very first thing one does upon seeing the error. For complex objects we could even pull out the nested element that does not match, though that becomes more difficult, particularly since alike doesn't return the coordinates.

Provide Context About Vetting Expression in Failure

Ideally would return full vetting expression, and the token that triggered the failure, potentially as an attribute for vet, and as part of the error message for vetr?

For example, in:

a <- quote(integer() && . > 0)
b <- quote(logical(1L) && !is.na(.))
c <- quote(a || b)

vet(c, -1)

The returned attribute might be structured as:

list(
  vet.exp=quote((integer() && . > 0) || (logical(1L) && !is.na(.))),
  fail.tokens=list(vet.exp[[2]][[2]][[3]], vet.exp[[3]][[2]][[2]])
)

although the vet.exp part in fail.tokens may need to be expanded.
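
As written, the `list()` call above refers to `vet.exp` before it exists, so here is a runnable version of the same idea in base R. It only shows how the positional indices pick out the failing tokens; how vetr would actually attach this information is not specified in the issue.

```r
# Combined vetting expression from the example above
vet.exp <- quote((integer() && . > 0) || (logical(1L) && !is.na(.)))

# Tokens that fail for the value -1, located by position within `vet.exp`
fail.tokens <- list(vet.exp[[2]][[2]][[3]], vet.exp[[3]][[2]][[2]])

# Deparse for display: the tokens are `. > 0` and `logical(1L)`
fail.text <- vapply(
  fail.tokens, function(tok) paste(deparse(tok), collapse = ""), character(1)
)
```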

Error Message Should Match Original Call

Right now we throw error with matched call:

analyze(laps.1)   # Invalid object
# Error in analyze.laps(x = laps.1): 
#   Argument `x` should be "car" at index [[1]] for "names" (is "lap")

but maybe it should be with the actual call. This might be better as is, though.

COPYRIGHT/LICENSE Issues

  1. Description comment about seeing COPYRIGHTS seems out of date
  2. Make sure license info in every file

Substituting Arg When Combining `.` and Diff Arg Name

fun2 <- function(x, y)
  validate_args(
    x=integer(),
    y=character() && length(x) == length(.)
  )
fun2(1:3, letters[1:4])
## Error in fun2(x = 1:3, y = letters[1:4]) : 
##  For argument `y`, `length(x) == length(letters[1:4])` is not TRUE (FALSE)

Would be nice if x was also subbed so the above is consistent? As it is this is almost worse than:

length(x) == length(y)
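
A small base R sketch of substituting both symbols. It uses `substitute()` through `do.call()` and is only an illustration of the desired message text, not the package's internal mechanism.

```r
# Validation token as written by the user
tok <- quote(length(x) == length(.))

# Expressions supplied for each argument in the call `fun2(1:3, letters[1:4])`
arg.exprs <- list(x = quote(1:3), . = quote(letters[1:4]))

# Substitute `.` and `x` alike so the message reads consistently
tok.sub <- do.call(substitute, list(tok, arg.exprs))
msg <- deparse(tok.sub)
```

With both symbols replaced, the message would read `length(1:3) == length(letters[1:4])`.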

Ensure `match.call` corner cases handled properly

Now that we've switched away from 'match_call', need to verify all the cases we wrote 'match_call' for are properly handled.

Also, some errors in existing tests:

> fun7 <- function(x, y = z + 2) {
+     z <- "boom"
+     vetr(x = TRUE, y = 1L)
+ }
> fun7a <- function(x, y = z + 2) {
+     z <- 40
+     vetr(x = TRUE, y = 1L)
+ }
> z <- 1

# fail because z in fun is character

> fun7(TRUE)
Error in vetr(x = TRUE, y = 1L) : 
  Need to implement deparsing of tag since this could be lang now

| Conditions mismatch: 

< .REF$conditions                                                               
> .NEW$conditions                                                               
@@ 1,3 / 1,3 @@                                                                 
  Condition list with 1 condition:                                              
< 1. Error in fun7(x = TRUE, y = z + 2) : Argument `y` produced error during    
> 1. Error in vetr(x = TRUE, y = 1L) : Need to implement deparsing of tag since 
<    evaluation; see previous error.                                            
>    this could be lang now                                                     

unitizer> N

# works

> fun7a(TRUE)
Error in vetr(x = TRUE, y = 1L) : 
  Need to implement deparsing of tag since this could be lang now

| Value mismatch: 

< .ref      > .new    
@@ 1 @@     @@ 1 @@   
< [1] TRUE  > NULL    

| Conditions mismatch: 

< .REF$conditions                                                               
> .NEW$conditions                                                               
@@ 1 / 1,3 @@                                                                   
< Empty condition list                                                          
> Condition list with 1 condition:                                              
> 1. Error in vetr(x = TRUE, y = 1L) : Need to implement deparsing of tag since 
>    this could be lang now         

Better Mechanism For Token Messages

How do we attach the message "be TRUE or FALSE" to:

logical(1L) && !is.na(.)

Right now we can't do:

identity(logical(1L) && !is.na(.))

because from that point forward the expression stops making sense, and make_val_token

Comparisons to `checkmate`

  • checkmate probably faster, but diff might not be too bad after we implement #48
  • simplicity of structural checks (and possibly speed) should be an advantage for vetr

Segfault when testing call

This could well be an alike issue:

> x <- quote(a + b)
> validate(x, 2 + 3)
Error: object 'a' not found
Error in validate(x, 2 + 3) : 
  Validation expression for argument `current` produced an error (see previous error).
> x <- quote(quote(a + b))
> validate(x, 2 + 3)
Error in validate(x, 2 + 3) : 
  Argument `current` should be type "language" (is "double")
> validate(x, quote(2 + 3))
Error in validate(x, quote(2 + 3)) : 
  Argument `current` should be "symbol" (is "double") for token `2` in: `{2}` + 3
> validate(x, quote(x2 + x3))
Warning: stack imbalance in '.Call', 6 then 4

 *** caught bus error ***
address 0x106c29ff8, cause 'non-existent physical address'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Confusing Error Messages for "NULL"

Could be interpreted as the string "NULL":

## Error in fun(x = 1, y = 2): For argument `y`, `2` should be "NULL", or type "character" (is "double")

Default arguments evaluated in wrong frame

Right now defaults are evaluated in the calling frame of the function instead of in the function frame. Unfortunately this is not completely trivial to fix, since we need to keep track of which args are defaults and which are not. One option might be to just not validate default args that have not been changed by the user (not ideal, though).
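
The difference between the two frames can be seen with base R alone. This is a sketch of the symptom, with an illustrative function `fun`; it does not reproduce how the package captures arguments.

```r
z <- 1
fun <- function(x, y = z + 2) {
  z <- 40
  default.expr <- formals(sys.function())$y  # the unevaluated default, `z + 2`
  caller <- parent.frame()
  here <- environment()
  c(
    calling.frame = eval(default.expr, caller),  # sees the caller's z
    function.frame = eval(default.expr, here)    # sees the local z
  )
}
res <- fun(TRUE)
```

R itself evaluates defaults in the function frame, so a validator that evaluates them in the calling frame can check a different value from the one the function body will see.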

Avoid Double Evaluation of Args

validate will evaluate the arguments as captured from the calls, in the correct frames. The problem is that the arguments will then also be evaluated by the function. Really, validate should force the arguments in order to validate them (need to think a little bit, though, as to where the alike call should be evaluated).
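
A base R sketch of the double evaluation, using an argument with a visible side effect. The two validator functions are hypothetical stand-ins: one re-evaluates the captured expression, the other forces the promise.

```r
counter <- new.env()
counter$n <- 0L

# Argument expression with a side effect: counts how often it is evaluated
noisy_arg <- function() {
  counter$n <- counter$n + 1L
  1L
}

# Evaluates the captured expression, then the function uses the promise too
validate_by_call <- function(x) {
  arg.expr <- substitute(x)
  caller <- parent.frame()
  val <- eval(arg.expr, caller)  # first evaluation
  stopifnot(is.integer(val))
  x                              # second evaluation, when the promise is forced
}

# Forces the promise once and validates the resulting value
validate_by_force <- function(x) {
  force(x)
  stopifnot(is.integer(x))
  x
}

validate_by_call(noisy_arg())
n.call <- counter$n

counter$n <- 0L
validate_by_force(noisy_arg())
n.force <- counter$n
```

Here `n.call` ends up as 2 and `n.force` as 1, which is the behavior the issue asks for.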

Ambiguity of when `err.msg` is used

cust.tok.2 <- quote(TRUE)
 attr(cust.tok.2, "err.msg") <- letters
 vet(cust.tok.2, TRUE)

uses cust.tok.2 as a template rather than a language object with a custom error message.

`.(` should imply `.(all`

Basically, tests should pass if the expression evaluates to all TRUEs, which allows for things such as:

integer() && .(!is.na(.))

instead of

integer() && .(all(!is.na(.)))

Seems like there is no harm to this and it saves a bit of typing.
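
One way the proposed rule could be expressed, as a base R sketch. `eval_dot_token` is a hypothetical helper, and treating NA or zero-length results as failures is an assumption, not something the issue states.

```r
# Hypothetical evaluator for the contents of a `.(...)` token: bind the value
# to `.`, evaluate, and pass only if every element of the result is TRUE.
eval_dot_token <- function(expr, value, env = parent.frame()) {
  res <- eval(expr, list(. = value), env)
  is.logical(res) && length(res) > 0L && !anyNA(res) && all(res)
}

r1 <- eval_dot_token(quote(!is.na(.)), 1:3)             # all TRUE, so passes
r2 <- eval_dot_token(quote(!is.na(.)), c(1L, NA, 3L))   # one FALSE, so fails
r3 <- eval_dot_token(quote(all(!is.na(.))), 1:3)        # explicit all() still works
```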

`validate` return value

Need to think through, right now is:

> validate(integer(1L) && NO.NA && NO.INF, 5.2)
Error in validate(integer(1L) && NO.NA && NO.INF, 5.2) : 
  Argument `current` should be type "integer-like" (is "double")

but maybe the whole "Argument `current`" preamble shouldn't show up here, so that people can use validate however they want instead of us dictating the format. For validate_args the more processed return makes sense, but here perhaps not.

Run Valgrind

There are gremlins lurking, including issue #36, and:

> unitize_dir()

Prepping Unitizers...                                                           
 *** caught bus error ***
address 0x7fd3cd81ad78, cause 'non-existent physical address'

Traceback:
 1: initialize(value, ...)
 2: initialize(value, ...)
 3: new("unitizerBrowseSubSectionFailed", show.out = TRUE, show.msg = TRUE,     items.new = [email protected][[email protected] & sect.map], show.fail = [email protected][[email protected] &         sect.map], items.ref = [email protected][[email protected][[email protected] &         sect.map]], new.conditions = [email protected][[email protected] &         sect.map], tests.result = [email protected][[email protected] &         sect.map, , drop = FALSE])
 4: .local(x, mode, ...)
 5: (function (x, mode, ...) standardGeneric("browsePrep"))(dots[[1L]][[10L]], mode = dots[[2L]][[1L]],     start.at.browser = dots[[3L]][[10L]], hist.con = 3L, interactive = TRUE)
 6: (function (x, mode, ...) standardGeneric("browsePrep"))(dots[[1L]][[10L]], mode = dots[[2L]][[1L]],     start.at.browser = dots[[3L]][[10L]], hist.con = 3L, interactive = TRUE)
 7: mapply(browsePrep, as.list(unitizers), mode = mode, start.at.browser = (identical(mode,     "review") | !to.review) & !force.update, MoreArgs = list(hist.con = hist.obj$con,     interactive = interactive.mode), SIMPLIFY = FALSE)
 8: unitize_browse(unitizers = unitizers[valid], mode = mode, interactive.mode = interactive.mode,     force.update = force.update, auto.accept = auto.accept, history = history,     global = global)
 9: doWithOneRestart(return(expr), restart)
10: withOneRestart(expr, restarts[[1L]])
11: withRestarts(unitizers[valid] <- unitize_browse(unitizers = unitizers[valid],     mode = mode, interactive.mode = interactive.mode, force.update = force.update,     auto.accept = auto.accept, history = history, global = global),     unitizerInteractiveFail = function(e) interactive.fail <<- TRUE)
12: unitize_core(test.files = test.files, store.ids = store.ids,     state = state, pre = pre, post = post, history = history,     interactive.mode = interactive.mode, force.update = force.update,     auto.accept = auto.accept, mode = "unitize")
13: unitize_dir()

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 2
Save workspace image? [y/n/c]: n

But difficult to reproduce.

Constant error message in formulas not good

alike(y ~ x ^ 2, a ~ b ^ 3)
## [1] "`(a ~ b^3)[[3]][[3]]` should have identical constant values"

Would be better to have something along the lines of "is 2, should be 3" or some such.
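
A sketch of how such a message could be assembled in base R. The wording follows the "should be ... (is ...)" pattern seen in other messages here, and it assumes the first argument to alike is the target and the second is the object being checked.

```r
target <- y ~ x ^ 2
current <- a ~ b ^ 3

# The mismatching constants sit at [[3]][[3]] of each formula
tar.const <- target[[3]][[3]]
cur.const <- current[[3]][[3]]

msg <- sprintf(
  "`%s` should be %s (is %s)",
  "(a ~ b^3)[[3]][[3]]", deparse(tar.const), deparse(cur.const)
)
```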

Rationalize token names in docs

We need better nomenclature for:

  • Templates: expressions that evaluate to R objects to use as templates
  • Custom expressions: user expressions to evaluate for truth, possibly pre-substituting . before eval
  • Validation expressions: mix of template and custom expression tokens.

Alikeness of Functions

Currently signature is required to be a possible generic to a method. In the future we might relax that or at least provide a mode that allows a more relaxed fit.

This all came from the valaddin example where we wanted to add checks that we ensured would lead to two argument functions, but they failed because the function arguments were incorrect.

Thinking about it further, it seems that a function should be callable with the arguments specified, so checking only the number of arguments is probably not a good idea. We could however provide a special object, along the lines of the elist and vlist being considered in #29, that would vet purely the number of arguments.

Implement Variable Length Lists with `elist` and `vlist`

elist (Extensible List, could be xlist too) is an extensible list, where objects are accepted assuming that they have every element that is present in the template. This is supposed to mimic S4 objects, where objects that inherit from another contain all the slots of the other. Some unresolved questions are whether the subset of elements must be first and in the same order as in the template, and whether named objects should be treated differently. In terms of implementation, elist will probably produce an S4 object that will trigger special treatment.

One question is how we do something like structure(elist(...), attra, attrb) etc. as then the return value of elist can hardly be S4 as there could be conflicts between slots and attributes.

vlist (Vector List) is a variable length list with the same template repeated n times. TBD whether we allow a repetitions argument, or whether people should use a normal list template for those. Ideally the template would allow the same syntax present at the top level (i.e. use of template and evaluated tokens, etc.).
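
A rough base R sketch of the two matching rules. `matches_elist` and `matches_vlist` are hypothetical names, elements are compared by `typeof()` only, and elist elements are matched by name regardless of order; the issue leaves that last point open, so it is an assumption here.

```r
# elist rule: every named element of the template must be present in `obj`
# with the same type; extra elements in `obj` are allowed.
matches_elist <- function(template, obj) {
  tpl.names <- names(template)
  is.list(obj) &&
    all(tpl.names %in% names(obj)) &&
    all(vapply(
      tpl.names,
      function(nm) identical(typeof(obj[[nm]]), typeof(template[[nm]])),
      logical(1)
    ))
}

# vlist rule: any number of elements, each matching the single template
matches_vlist <- function(template, obj) {
  is.list(obj) &&
    all(vapply(
      obj, function(el) identical(typeof(el), typeof(template)), logical(1)
    ))
}

tpl <- list(a = integer(), b = character())
e1 <- matches_elist(tpl, list(a = 1L, b = "x", c = TRUE))  # extra element is fine
e2 <- matches_elist(tpl, list(a = 1L))                     # missing `b`
v1 <- matches_vlist(character(), list("a", "b", "c"))
v2 <- matches_vlist(character(), list("a", 1))
```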

Properly check for numeric overflows

Right now we check that numbers wrap, and although that works in theory we should really be checking against INT_MAX and the like since the wrapping is not defined behavior.
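
The issue presumably concerns the package's compiled code, but the idea of testing against the limit before the operation can be sketched in R with `.Machine$integer.max`. `checked_add` is a hypothetical helper for scalar integers.

```r
# Adds two scalar integers, refusing to proceed if the result would fall
# outside the representable range. The bounds are tested before adding,
# so the check never relies on the result wrapping.
checked_add <- function(a, b) {
  stopifnot(
    is.integer(a), is.integer(b),
    length(a) == 1L, length(b) == 1L,
    !is.na(a), !is.na(b)
  )
  int.max <- .Machine$integer.max
  if ((b > 0L && a > int.max - b) || (b < 0L && a < -int.max - b)) {
    stop("integer overflow")
  }
  a + b
}

s1 <- checked_add(1L, 2L)
s2 <- try(checked_add(.Machine$integer.max, 1L), silent = TRUE)
s3 <- try(checked_add(-.Machine$integer.max, -1L), silent = TRUE)
```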
