brodieg / vetr



Trust, but Verify

R 29.34% HTML 13.37% C 56.51% CSS 0.78%

r argument-checks input-validation

vetr's Issues

type_alike docs

Docs are out of date with the switch to `vetr_settings`.

`vetr` eval much slower within `knitr`

Run from the CLI:

> secant <- function(f, x, dx) (f(x + dx) - f(x)) / dx
> 
> secant_valaddin <- valaddin::firmly(secant, list(~x, ~dx) ~ is.numeric)
> secant_stopifnot <- function(f, x, dx) {
+   stopifnot(is.numeric(x), is.numeric(dx))
+   secant(f, x, dx)
+ }
> secant_vetr <- function(f, x, dx) {
+   vetr(x=numeric(), dx=numeric())
+   secant(f, x, dx)
+ }
> microbenchmark(
+   secant_valaddin(log, 1, .1),
+   secant_stopifnot(log, 1, .1),
+   secant_vetr(log, 1, .1)
+ )
Unit: microseconds
                          expr     min       lq      mean   median       uq
  secant_valaddin(log, 1, 0.1) 123.589 129.4050 160.07218 139.3860 154.9075
 secant_stopifnot(log, 1, 0.1)   9.168  11.3390  16.11966  13.1475  14.4380
      secant_vetr(log, 1, 0.1)  14.780  16.7175  22.29358  20.1230  21.2600
     max neval
 328.443   100
  57.431   100
  64.839   100

Run in knitr:

secant <- function(f, x, dx) (f(x + dx) - f(x)) / dx

secant_valaddin <- valaddin::firmly(secant, list(~x, ~dx) ~ is.numeric)
secant_stopifnot <- function(f, x, dx) {
  stopifnot(is.numeric(x), is.numeric(dx))
  secant(f, x, dx)
}
secant_vetr <- function(f, x, dx) {
  vetr(x=numeric(), dx=numeric())
  secant(f, x, dx)
}
microbenchmark(
  secant_valaddin(log, 1, .1),
  secant_stopifnot(log, 1, .1),
  secant_vetr(log, 1, .1)
)

## Unit: microseconds
##                           expr     min       lq      mean
##   secant_valaddin(log, 1, 0.1) 132.051 148.1325 201.55885
##  secant_stopifnot(log, 1, 0.1)  10.504  13.4080  21.38084
##       secant_vetr(log, 1, 0.1)  32.141  37.5715  57.02122
##   median       uq     max neval
##  166.021 228.5780 616.862   100
##   16.137  23.4070  82.141   100
##   43.360  66.7915 213.646   100

with_vetr?

Implement something that takes a function and transforms it into a vetter function:

fun_v <- with_vetter(fun, x=numeric(), y=character())
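
One way such a wrapper could look, as a rough base R sketch. `with_vetter` is the hypothetical name from this issue, not an existing vetr function, and the check here only compares `mode()` of each argument against its template, which is far cruder than vetr's structural matching.

```r
# Hypothetical sketch in base R; `with_vetter` does not exist in vetr.
with_vetter <- function(fun, ...) {
  templates <- list(...)
  force(fun)
  function(...) {
    caller <- parent.frame()
    # Match the supplied arguments against the formals of `fun`
    matched <- as.list(match.call(fun, call = sys.call()))[-1L]
    args <- lapply(matched, eval, envir = caller)
    # Compare each templated argument by mode (assumption: a stand-in for
    # real template matching)
    for (nm in intersect(names(templates), names(args))) {
      want <- mode(templates[[nm]])
      got <- mode(args[[nm]])
      if (!identical(want, got))
        stop("Argument `", nm, "` should be mode \"", want,
             "\" (is \"", got, "\")", call. = FALSE)
    }
    do.call(fun, args, quote = TRUE)
  }
}

fun <- function(x, y) paste(x, y)
fun_v <- with_vetter(fun, x = numeric(), y = character())
fun_v(1, "a")
```

Note that this sketch evaluates the arguments eagerly, so it gives up lazy evaluation in the wrapped function.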

Infinite Recursion in Symbol Substitution in Template

> b <- quote(a + b)
> library(vetr)
> vet(b, 1)
Error: C stack usage  7969192 is too close to the limit
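
A possible guard, sketched in base R under the assumption that substitution looks symbols up in a single environment and recurses into any language object it finds. `sub_symbols` is a hypothetical helper and not vetr's actual substitution code; it tracks the symbols already expanded and stops when one reappears.

```r
# Hypothetical sketch; not vetr's implementation. Calls with empty
# arguments (e.g. x[, 1]) are not handled.
sub_symbols <- function(expr, env, seen = character()) {
  if (is.symbol(expr)) {
    nm <- as.character(expr)
    if (nm != "." && exists(nm, envir = env, inherits = FALSE)) {
      val <- get(nm, envir = env, inherits = FALSE)
      if (is.language(val)) {
        # A symbol that expands to an expression containing itself would
        # otherwise recurse until the C stack is exhausted
        if (nm %in% seen)
          stop("symbol `", nm, "` refers to itself during substitution")
        return(sub_symbols(val, env, c(seen, nm)))
      }
    }
    expr
  } else if (is.call(expr)) {
    # Keep the function name as is; substitute in the arguments only
    args <- lapply(as.list(expr)[-1L], sub_symbols, env = env, seen = seen)
    as.call(c(list(expr[[1L]]), args))
  } else {
    expr
  }
}

env <- new.env()
assign("b", quote(a + b), envir = env)
res <- try(sub_symbols(quote(b), env), silent = TRUE)  # errors instead of overflowing
```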

Add how-to guide for S3 Object Templates

Forgetting `.(` Can Be Confusing

If we do something like:

validate(character() && !any(is.na(.)))

instead of the intended:

validate(character() && .(!any(is.na(.))))

the error message risks being very confusing, since the value of the second token is used as a template instead of being interpreted as a value.

NULL being wildcard problematic?

Very convenient in most instances, but the common case of expecting an argument to be "X" or NULL falls flat on its face. What's the workaround?

Consistency between Passing Quoted Objects And Putting them In

> validate(quote(quote(a + b)), quote(x2 + x3))
Error in validate(quote(quote(a + b)), quote(x2 + x3)) : 
  `quote(x2 + x3)[[1]]` should be a call to `quote` (is a call to `+`)

unitizer> validate(quote(a + b), quote(x2 + x3))
[1] TRUE

> x <- quote(quote(a + b))
> validate(x, quote(x2 + x3))
[1] TRUE

Add a `frame` Argument to `vet`

This would make it possible to substitute a provided vetting expression in an alternate environment.

`..` Does not cause `.` to be Substituted

Somehow the escaping of the dot doesn't permit substitution to happen with a . variable in the substitution environment.

Also, clarify whether only things that are all dots, or just all leading dots must be escaped. Should probably be the latter.

Internal INTEGER error in track hash

Can't reproduce this consistently. Seems like it only happens the first time the code is run.

> vetr:::track_hash(letters[1:5], 2L)
Error in vetr:::track_hash(letters[1:5], 2L) : 
  INTEGER() can only be applied to a 'integer', not a 'NULL'

| Value mismatch: 

< .ref           > .new         
@@ 1 @@          @@ 1 @@        
< [1] 1 1 4 1 8  > NULL         

| Conditions mismatch: 

< .REF$conditions                                                         
> .NEW$conditions                                                         
@@ 1 / 1,3 @@                                                             
< Empty condition list                                                    
> Condition list with 1 condition:                                        
> 1. Error in vetr:::track_hash(letters[1:5], 2L) : INTEGER() can only be 
>    applied to a 'integer', not a 'NULL'

Do we need both `substitute(target)` and sys.call in `vet`?

Timing Implications of Recursive Substitution

Haven't actually tested how fast/slow these are. Could potentially be important enough to warrant documentation and guidance.

Remove ggplot2 suggests

Causes massive installation. Think about how to test abstract without including this package.

`alike` options

Should be similar to the alike_settings business, and should obviously include the alike options, for example, turning off the integer-like numerics matching integer templates.

More verbose `validate_args` error

Consider including either str output or a snippet of the printed object alongside the error message, since inspecting the object is typically the very first thing one does upon seeing the error. For complex objects we could even pull out the nested element that is not matching, though that becomes more difficult, particularly since alike doesn't return the coordinates.

Provide Context About Vetting Expression in Failure

Ideally would return full vetting expression, and the token that triggered the failure, potentially as an attribute for vet, and as part of the error message for vetr?

For example, in:

a <- quote(integer() && . > 0)
b <- quote(logical(1L) && !is.na(.))
c <- quote(a || b)

vet(c, -1)

The returned attribute might be structured as:

list(
  vet.exp=quote((integer() && . > 0) || (logical(1L) && !is.na(.))),
  fail.tokens=list(vet.exp[[2]][[2]][[3]], vet.exp[[3]][[2]][[2]])
)

although the vet.exp part in fail.tokens may need to be expanded.
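
To make the indexing concrete, here is what those two positions select in base R. Because `vet.exp` cannot refer to itself inside the same `list()` call, the sketch builds the expression first; the variable names simply mirror the proposal above.

```r
vet.exp <- quote((integer() && . > 0) || (logical(1L) && !is.na(.)))

# [[2]] and [[3]] are the parenthesized operands of `||`; the inner [[2]]
# steps through the `(` call to reach the `&&` expression
fail.tokens <- list(vet.exp[[2]][[2]][[3]], vet.exp[[3]][[2]][[2]])

deparse(fail.tokens[[1]])  # ". > 0"
deparse(fail.tokens[[2]])  # "logical(1L)"
```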

Error Message Should Match Original Call

Right now we throw error with matched call:

analyze(laps.1)   # Invalid object
# Error in analyze.laps(x = laps.1): 
#   Argument `x` should be "car" at index [[1]] for "names" (is "lap")

but maybe it should be with the actual call. This might be better as is, though.

COPYRIGHT/LICENSE Issues

Description comment about seeing COPYRIGHTS seems out of date
Make sure license info in every file

Collapse multi-errors into one when they are the same.

Validators such as:

validate(x = integer(1L) || integer(2L))

lead to annoying error messages as the failure is the same for both (argument x is not integer-like).
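
A minimal sketch of the collapsing step, assuming the failure messages for each alternative are available as a character vector. `collapse_failures` is a hypothetical helper, and the message text is illustrative only.

```r
# Hypothetical helper: drop duplicate failure messages before joining them
collapse_failures <- function(msgs) paste(unique(msgs), collapse = ", or ")

msgs <- c(
  "argument `x` is not integer-like",
  "argument `x` is not integer-like"
)
collapse_failures(msgs)
```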

Word-wrap in multi option error messages

Substituting Arg When Combining `.` and Diff Arg Name

fun2 <- function(x, y)
  validate_args(
    x=integer(),
    y=character() && length(x) == length(.)
  )
fun2(1:3, letters[1:4])
## Error in fun2(x = 1:3, y = letters[1:4]) : 
##  For argument `y`, `length(x) == length(letters[1:4])` is not TRUE (FALSE)

Would be nice if x was also subbed so the above is consistent? As it is this is almost worse than:

length(x) == length(y)

Ensure `match.call` corner cases handled properly

Now that we've switched away from 'match_call', need to verify all the cases we wrote 'match_call' for are properly handled.

Also, some errors in existing tests:

> fun7 <- function(x, y = z + 2) {
+     z <- "boom"
+     vetr(x = TRUE, y = 1L)
+ }
> fun7a <- function(x, y = z + 2) {
+     z <- 40
+     vetr(x = TRUE, y = 1L)
+ }
> z <- 1

# fail because z in fun is character

> fun7(TRUE)
Error in vetr(x = TRUE, y = 1L) : 
  Need to implement deparsing of tag since this could be lang now

| Conditions mismatch: 

< .REF$conditions                                                               
> .NEW$conditions                                                               
@@ 1,3 / 1,3 @@                                                                 
  Condition list with 1 condition:                                              
< 1. Error in fun7(x = TRUE, y = z + 2) : Argument `y` produced error during    
> 1. Error in vetr(x = TRUE, y = 1L) : Need to implement deparsing of tag since 
<    evaluation; see previous error.                                            
>    this could be lang now                                                     

unitizer> N

# works

> fun7a(TRUE)
Error in vetr(x = TRUE, y = 1L) : 
  Need to implement deparsing of tag since this could be lang now

| Value mismatch: 

< .ref      > .new    
@@ 1 @@     @@ 1 @@   
< [1] TRUE  > NULL    

| Conditions mismatch: 

< .REF$conditions                                                               
> .NEW$conditions                                                               
@@ 1 / 1,3 @@                                                                   
< Empty condition list                                                          
> Condition list with 1 condition:                                              
> 1. Error in vetr(x = TRUE, y = 1L) : Need to implement deparsing of tag since 
>    this could be lang now

Better Mechanism For Token Messages

How do we attach the message "be TRUE or FALSE" to:

logical(1L) && !is.na(.)

Right now we can't do:

identity(logical(1L) && !is.na(.))

because from that point forward the expression stops making sense, and make_val_token

Comparisons to `checkmate`

checkmate probably faster, but diff might not be too bad after we implement #48
simplicity of structural checks (and possibly speed) should be an advantage for vetr

Segfault when testing call

This could well be an alike issue:

> x <- quote(a + b)
> validate(x, 2 + 3)
Error: object 'a' not found
Error in validate(x, 2 + 3) : 
  Validation expression for argument `current` produced an error (see previous error).
> x <- quote(quote(a + b))
> validate(x, 2 + 3)
Error in validate(x, 2 + 3) : 
  Argument `current` should be type "language" (is "double")
> validate(x, quote(2 + 3))
Error in validate(x, quote(2 + 3)) : 
  Argument `current` should be "symbol" (is "double") for token `2` in: `{2}` + 3
> validate(x, quote(x2 + x3))
Warning: stack imbalance in '.Call', 6 then 4

 *** caught bus error ***
address 0x106c29ff8, cause 'non-existent physical address'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

Confusing Error Messages for "NULL"

Could be interpreted as the string "NULL":

## Error in fun(x = 1, y = 2): For argument `y`, `2` should be "NULL", or type "character" (is "double")

Implement Helper Funs such as `allBw`

Idea is to have more functions like anyNA to minimize overhead of value checks.
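
As a statement of the intended semantics only, a pure R version might look like the following. The name `all_bw_sketch` and its NA handling are assumptions; a low-overhead version would presumably be written in C so that no intermediate logical vectors are allocated.

```r
# Hypothetical semantics sketch: TRUE only if every value lies in [lo, hi]
# and there are no NAs
all_bw_sketch <- function(x, lo, hi) {
  !anyNA(x) && all(x >= lo & x <= hi)
}

all_bw_sketch(c(0.1, 0.5, 0.9), 0, 1)
all_bw_sketch(c(0.1, 1.5), 0, 1)
```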

Default arguments evaluated in wrong frame

Right now defaults are evaluated in the calling frame of the function instead of in the function frame. Unfortunately this is not completely trivial to fix, since we need to keep track of which args are defaults and which are not. One option might be to just not validate default args that have not been changed by the user (not ideal though).
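
The difference between the two frames can be seen in plain R, without vetr. The function `f` below is a made-up example; `missing()` is one base R way to tell whether an argument is still at its default.

```r
z <- 1
f <- function(x, y = z + 2) {
  z <- 40
  default.expr <- formals(sys.function())$y
  list(
    used.default = missing(y),                       # TRUE when y was not supplied
    in.caller = eval(default.expr, parent.frame()),  # sees the caller's z
    in.function = y                                  # sees the function's own z
  )
}
res <- f(TRUE)
res$in.caller    # 3
res$in.function  # 42
```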

Issue Special `vetr` Condition

Should inherit from simpleError, and could be helpful in certain contexts.
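
A sketch of what such a condition could look like in base R. The class name `vetr_error` and the constructor are assumptions made for illustration; the point is only that handlers can then target vetting failures specifically while generic `error` handlers still work.

```r
# Hypothetical constructor; the class name "vetr_error" is an assumption
vetr_error <- function(msg, call = NULL) {
  structure(
    class = c("vetr_error", "simpleError", "error", "condition"),
    list(message = msg, call = call)
  )
}

res <- tryCatch(
  stop(vetr_error("Argument `x` should be type \"integer-like\"")),
  vetr_error = function(e) paste("caught:", conditionMessage(e))
)
```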

Ensure language stored in `.` by user is properly substituted

Avoid Double Evaluation of Args

validate will evaluate the arguments as captured from the calls, in the correct frames. The problem is that the arguments will then also be evaluated by the function. Really validate should force the argument promises and validate the forced values (need to think a little bit, though, as to where the alike call should be evaluated).
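
The double evaluation is easy to reproduce in base R with an argument that has a side effect. Both checker functions below are hypothetical stand-ins for the two strategies, not vetr code.

```r
state <- new.env()
state$n <- 0
bump <- function() {
  state$n <- state$n + 1  # side effect: count evaluations
  state$n
}

# Strategy 1: re-evaluate the captured expression, then let the function use x
check_by_reeval <- function(x) {
  val <- eval(substitute(x), parent.frame())  # first evaluation
  stopifnot(is.numeric(val))
  x                                           # forcing the promise: second evaluation
}

# Strategy 2: force the promise once and check the forced value
check_by_force <- function(x) {
  force(x)
  stopifnot(is.numeric(x))
  x
}

check_by_reeval(bump())
n.reeval <- state$n
state$n <- 0
check_by_force(bump())
n.force <- state$n
```

Here `n.reeval` ends up larger than `n.force` because the first strategy runs `bump()` twice.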

Ambiguity of when `err.msg` is used

cust.tok.2 <- quote(TRUE)
attr(cust.tok.2, "err.msg") <- letters
vet(cust.tok.2, TRUE)

uses cust.tok.2 as a template rather than a language object with a custom error message.

CRAN rcheck and SAN warnings

Should additional args for `vet` be set via options?

Potentially useful if say we want vet to always stop rather than return text.

`.(` should imply `.(all`

Basically, tests should pass if expression evaluates to all TRUEs, allows for things such as:

integer() && .(!is.na(.))

instead of

integer() && .(all(!is.na(.)))

Seems like there is no harm to this and it saves a bit of typing

Implement `tev` for compatibility with magrittr

Just swaps target and current arguments so we can do stuff like:

1:10 %>% tev(numeric())

Use static more with C funs

`validate` return value

Need to think through, right now is:

> validate(integer(1L) && NO.NA && NO.INF, 5.2)
Error in validate(integer(1L) && NO.NA && NO.INF, 5.2) : 
  Argument `current` should be type "integer-like" (is "double")

but maybe the whole current blah blah shouldn't show up here to allow people to use validate however they want instead as we're dictating here. For validate_args the more processed return makes sense, but here perhaps not.

Run Valgrind

There are gremlins lurking, including issue #36, and:

> unitize_dir()

Prepping Unitizers...                                                           
 *** caught bus error ***
address 0x7fd3cd81ad78, cause 'non-existent physical address'

Traceback:
 1: initialize(value, ...)
 2: initialize(value, ...)
 3: new("unitizerBrowseSubSectionFailed", show.out = TRUE, show.msg = TRUE,     items.new = [email protected][[email protected] & sect.map], show.fail = [email protected][[email protected] &         sect.map], items.ref = [email protected][[email protected][[email protected] &         sect.map]], new.conditions = [email protected][[email protected] &         sect.map], tests.result = [email protected][[email protected] &         sect.map, , drop = FALSE])
 4: .local(x, mode, ...)
 5: (function (x, mode, ...) standardGeneric("browsePrep"))(dots[[1L]][[10L]], mode = dots[[2L]][[1L]],     start.at.browser = dots[[3L]][[10L]], hist.con = 3L, interactive = TRUE)
 6: (function (x, mode, ...) standardGeneric("browsePrep"))(dots[[1L]][[10L]], mode = dots[[2L]][[1L]],     start.at.browser = dots[[3L]][[10L]], hist.con = 3L, interactive = TRUE)
 7: mapply(browsePrep, as.list(unitizers), mode = mode, start.at.browser = (identical(mode,     "review") | !to.review) & !force.update, MoreArgs = list(hist.con = hist.obj$con,     interactive = interactive.mode), SIMPLIFY = FALSE)
 8: unitize_browse(unitizers = unitizers[valid], mode = mode, interactive.mode = interactive.mode,     force.update = force.update, auto.accept = auto.accept, history = history,     global = global)
 9: doWithOneRestart(return(expr), restart)
10: withOneRestart(expr, restarts[[1L]])
11: withRestarts(unitizers[valid] <- unitize_browse(unitizers = unitizers[valid],     mode = mode, interactive.mode = interactive.mode, force.update = force.update,     auto.accept = auto.accept, history = history, global = global),     unitizerInteractiveFail = function(e) interactive.fail <<- TRUE)
12: unitize_core(test.files = test.files, store.ids = store.ids,     state = state, pre = pre, post = post, history = history,     interactive.mode = interactive.mode, force.update = force.update,     auto.accept = auto.accept, mode = "unitize")
13: unitize_dir()

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 2
Save workspace image? [y/n/c]: n

But difficult to reproduce.

Constant error message in formulas not good

alike(y ~ x ^ 2, a ~ b ^ 3)
## [1] "`(a ~ b^3)[[3]][[3]]` should have identical constant values"

Would be better to have something along the lines of "is 2, should be 3" or some such.

Add a mechanism to prevent recursive substitution

Maybe something like:

vet(a && ._(c && d), x)

Where ._ can be escaped with .._.

Error message mis-construction

Here is an example:

Error in validate(LGL.1, 1:2 == 1:2) : 
  Argument `current` should should be length 1 (is 2)

Rationalize token names in docs

We need better nomenclature for:

Templates: expressions that evaluate to R objects to use as templates
Custom expressions: user expressions to evaluate for truth, possibly pre-substituting . before eval
Validation expressions: mix of template and custom expression tokens.

Alikeness of Functions

Currently signature is required to be a possible generic to a method. In the future we might relax that or at least provide a mode that allows a more relaxed fit.

This all came from the valaddin example where we wanted to add checks that we ensured would lead to two argument functions, but they failed because the function arguments were incorrect.

Thinking about it further, it seems that a function should be callable with the arguments specified, so checking only the number of arguments is probably not a good idea. We could however provide a special object, along the lines of the elist and vlist being considered in #29, that would vet purely the number of arguments.

Implement Variable Length Lists with `elist` and `vlist`

elist (Extensible List, could be xlist too) is an extensible list, where objects are accepted provided that they have every element that is present in the template. This is supposed to mimic S4 objects, where objects that inherit from another contain all the slots of the other. Some unresolved questions are whether the subset of elements must be first and in the same order as in the template, and whether named objects should be treated differently. In terms of implementation, elist will probably produce an S4 object that will trigger special treatment.

One question is how we do something like structure(elist(...), attra, attrb) etc. as then the return value of elist can hardly be S4 as there could be conflicts between slots and attributes.

vlist (Vector List) is a variable length list with the same template repeated n times. TBD whether we allow a repetitions argument, or whether people should use a normal list template for those. Ideally the template would allow the same syntax present at the top level (i.e. use of template and evaluated tokens, etc.).
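
To pin down the intended semantics, here is a rough base R sketch of both checks. The function names are hypothetical, matching is by name and `typeof()` only, and the open questions above (element order, unnamed elements, repetition counts) are left unresolved.

```r
# Hypothetical checkers, base R only; not vetr API.

# elist: every named template element must be present with the same type;
# extra elements are allowed
matches_elist <- function(template, x) {
  is.list(x) &&
    all(names(template) %in% names(x)) &&
    all(vapply(
      names(template),
      function(nm) identical(typeof(x[[nm]]), typeof(template[[nm]])),
      logical(1L)
    ))
}

# vlist: any number of elements, each matching the single template
matches_vlist <- function(template, x) {
  is.list(x) &&
    all(vapply(
      x, function(el) identical(typeof(el), typeof(template)), logical(1L)
    ))
}

tpl <- list(a = integer(), b = character())
matches_elist(tpl, list(a = 1L, b = "x", c = TRUE))
matches_vlist(character(), list("a", "b", "c"))
```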

Properly check for numeric overflows

Right now we check that numbers wrap, and although that works in theory we should really be checking against INT_MAX and the like since the wrapping is not defined behavior.
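
The check-before-operating pattern, expressed in R for illustration since the C sources are not shown here. `safe_add` is a hypothetical helper, and `.Machine$integer.max` plays the role that `INT_MAX` would play in C: the bound is tested before the addition, so nothing ever has to wrap.

```r
# Hypothetical helper illustrating a pre-check against the integer bounds
safe_add <- function(a, b) {
  stopifnot(is.integer(a), is.integer(b), length(a) == 1L, length(b) == 1L)
  int.max <- .Machine$integer.max
  # Rearranged so that the comparisons themselves cannot overflow
  if ((b > 0L && a > int.max - b) || (b < 0L && a < -int.max - b))
    stop("integer overflow")
  a + b
}

safe_add(1L, 2L)
```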

Allow subbing in of User Expressions in Validation Tokens

Things like:

INT.1 <- quote(integer(1L) && .(!any(is.na(.))))
validate(x = INT.1 || NULL)

would ideally work.
