r-lib / lobstr

Understanding complex R objects with tools similar to str()

Home Page: https://lobstr.r-lib.org/

License: Other

R 69.09% C++ 30.91%

lobstr's Issues

Don't print 0 in complex constants

lobstr::ast(5i)
#> 0+5i

Created on 2018-10-29 by the reprex package (v0.2.1)

Equivalent of .Internal(inspect(x))

e.g.

lobstr::sxp(f(1))
█─LANGSXP <0x12345>
├─SYMSXP  <0x12345> `f` 
└─INTSXP  <0x12345> [1]

lobstr::sxp(f(1), pairlist = TRUE)
█─LANGSXP <0x12345>
├─SYMSXP  <0x12345> `f`
└─█─LANGSXP <0x12345>
  ├─INTSXP  <0x12345> [1] 
  └─NILSXP

lobstr::sxp(f(1), pairlist = TRUE, symbol = TRUE)
█─LANGSXP <0x12345>
├─█─SYMSXP  <0x12345> `f`
│ └─STRSXP  <0x12345> "f"
└─█─LANGSXP <0x12345>
  ├─INTSXP  <0x12345> [1] 
  └─NILSXP

e <- new.env(parent = emptyenv())
e$e <- e

lobstr::str(e)
█─ENVSXP <0x12345>
└─e = █─ENVSXP <0x12345> ...

Variant of ast that shows where symbols are found

Would either take expression + env, or quosure.

Prints the parent environments at the end, with an index into them (numbers/colours) shown next to each symbol.

Add a plot method

Since a lot of the point of this package seems to be to visualize the structure of objects, it would be nice to have a plot method to show the AST/CST as a network (either with ggraph or threejs).

It looks like ast() returns a character vector though; so the package might need a data frame representation of the tree first. (Or it could be done from utils::getParseData().)

ast layers display as 'X' on Windows

'X' makes it hard to read. The blocks on Linux/Mac look much better. I think the Windows font problem was recently fixed in the skimr package.

Edit: Here is the skimr code that fixes Unicode on Windows: https://github.com/ropenscilabs/skimr/blob/master/R/utils.R

Release lobstr 1.1.0

Prepare for release:

Check that description is informative
Check licensing of included files
usethis::use_cran_comments()
devtools::check()
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
Polish pkgdown reference index

Submit to CRAN:

usethis::use_version('minor')
Update cran-comments.md
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version()

ast() should print quosures specially

library(rlang)
library(lobstr)

q1 <- new_quosure(expr(x), env(x = 1))
q2 <- new_quosure(expr(x), env(x = 10))

lobstr::ast(!!q1 + !!q2)
#> Warning: `quo_expr()` is soft-deprecated as of rlang 0.2.0.
#> Please use `quo_squash()` instead
#> This warning is displayed once per session.
#> █─`+` 
#> ├─x 
#> └─x

Created on 2018-11-08 by the reprex package (v0.2.1)

(Probably what I was thinking in #6)

`format()` method for `lobstr_bytes`

The carrier package used to call format(pryr::object_size(x)). This stopped working when pryr started returning lobstr objects, because these don't have a format method.

Short name for `obj_inspect()`

Maybe lobstr::sxp()?

Rewrite obj_addr to avoid taking references

As it's (understandably) causing confusion

Release lobstr 1.1.1

Prepare for release:

devtools::check()
Check in Latin 1 local
Polish NEWS

Submit to CRAN:

usethis::use_version('patch')
Update cran-comments.md
devtools::submit_cran()
Approve email

Wait for CRAN...

Accepted 🎉
usethis::use_github_release()
usethis::use_dev_version()

object_size fails with "unimplemented type" for some types

When trying to use object_size to get the true size of a "hash" object (instantiated with library(hash)), I get the following error:

class(hash_table)
[1] "hash"
attr(,"package")
[1] "hash"
str(hash_table)
Formal class 'hash' [package "hash"] with 1 slot
..@ .xData:<environment: 0x24f72ec8>
object_size(hash_table)
type: 17
Error: Unimplemented type

obj_size causing segfault with an example from Advanced R

When I try to run the example in Exercise 2 of Section 2.4.1 in the second edition of Advanced R, I am getting a segfault on a call to lobstr::obj_size. The example code is shown below. I would be happy to work on a fix if it turns out to not be just a dependency issue or something similar, and given some direction.

devtools::session_info("lobstr")
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.4.4 (2018-03-15)
#>  os       Ubuntu 16.04.5 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_US                       
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2018-11-17                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package * version date       lib source        
#>  crayon    1.3.4   2017-09-16 [1] CRAN (R 3.4.1)
#>  lobstr    1.0.0   2018-11-04 [1] CRAN (R 3.4.4)
#>  Rcpp      1.0.0   2018-11-07 [1] CRAN (R 3.4.4)
#>  rlang     0.3.0.1 2018-10-25 [1] CRAN (R 3.4.4)
#> 
#> [1] /home/dpritch/R/x86_64-pc-linux-gnu-library/3.4
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

x <- list(mean, sd, var)
lobstr::obj_size(x)
#>  *** caught segfault ***
#> address 0x2, cause 'memory not mapped'
#> 
#> Traceback:
#>  1: .Call(`_lobstr_obj_size_`, objects, base_env, sizeof_node, sizeof_vector)
#>  2: obj_size_(dots, env, size_node(), size_vector())
#>  3: lobstr::obj_size(x)
#>  4: eval(expr, envir, enclos)
#>  5: eval(expr, envir, enclos)
#>  6: withVisible(eval(expr, envir, enclos))
#>  7: withCallingHandlers(withVisible(eval(expr, envir, enclos)), warning = wHandler,     error = eHandler, message = mHandler)
#>  8: doTryCatch(return(expr), name, parentenv, handler)
#>  9: tryCatchOne(expr, names, parentenv, handlers[[1L]])
#> 10: tryCatchList(expr, classes, parentenv, handlers)
#> 11: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 7

`obj_size` returns different values before and after accessing an element

I am chasing memory leaks and/or high memory fragmentation in a Shiny app. I have noticed that after closing the Shiny app and returning to the R prompt, gc() still reports some 3.8 GB of memory used even after emptying all environments (except baseenv()) and unloading all namespaces, which is quite a bit. (For context, Linux top reports some 11 GB of virtual memory used.)

I have started looking at the remaining elements in baseenv(), and found that almost all of the memory is in baseenv()$.__S3MethodsTable__. (disclaimer: I have no idea what this is for and why this could ever be this big):

lobstr::obj_size(baseenv()$.__S3MethodsTable__.)
# 3,314,445,280 B

So I went ahead to see which of its elements is the culprit:

sizes <- lapply(names(baseenv()$.__S3MethodsTable__.), \(name) lobstr::obj_size(baseenv()$.__S3MethodsTable__.[[name]]))
names(sizes) <- names(baseenv()$.__S3MethodsTable__.)
sizes <- sizes[order(unlist(sizes))]

To my astonishment, after I did this, the reported size of baseenv()$.__S3MethodsTable__. decreased drastically:

lobstr::obj_size(baseenv()$.__S3MethodsTable__.)
# 157,085,600 B

The same happens when I inspect members of baseenv()$.__S3MethodsTable__.:

sizes
# $`[[.SQL`
# 93,406,504 B
# 
# $`[.SQL`
# 93,406,504 B
# 
# $toString.Id
# 93,411,712 B

lobstr::obj_size(baseenv()$.__S3MethodsTable__.$toString.Id)
# 93,411,712 B
baseenv()$.__S3MethodsTable__.$toString.Id
# function(x, ...) {
#   paste0("<Id> ", paste0(names(x@name), " = ", x@name, collapse = ", "))
# }
# <bytecode: 0x55a1f4ea97c8>
# <environment: namespace:DBI>
lobstr::obj_size(baseenv()$.__S3MethodsTable__.$toString.Id)
# 297,264 B

What is going on here? I almost feel like the memory leak is playing whac-a-mole. Is this a bug in obj_size, expected behavior, or actually a sign of memory leaks?

~~Interestingly, in a clean R session, this element has yet another size:~~ (That was comparing Windows with Linux - it's consistent on Linux.)

library(DBI)
baseenv()$.__S3MethodsTable__.$toString.Id
# function (x, ...)
# {
#   paste0("<Id> ", paste0(names(x@name), " = ", x@name, collapse = ", "))
# }
# <bytecode: 0x00000000194d22f0>
# <environment: namespace:DBI>
lobstr::obj_size(baseenv()$.__S3MethodsTable__.$toString.Id)
# 4,320 B

modify ast tree

Hi,
My aim is to replace & with the function min() and | with max(). So, a & (b | c) should be rewritten as min(a, max(b, c)). I thought modifying the syntax tree would be the way to go. Is it possible to modify the ast() output and then convert it back into an expression?

I'm sharing a small example:

a <- 5
b <- 10
c <- 4
express <- a & (b | c)
express_adv <- (c & b) & (a & (b | c))

The rewritten express should evaluate to 5 and express_adv should evaluate to 4. Expected expressions are :

min(a, max(b,c))  # modified express 
min( min(c,b), min(a, max(b,c)))  # modified express_adv

So, this is before:

ast( a & (b | c) )

█─`&` 
├─a 
└─█─`(` 
  └─█─`|` 
    ├─b 
    └─c

and this is after:

ast(min(a,(max(b,c))))

█─min 
├─a 
└─█─`(` 
  └─█─max 
    ├─b 
    └─c
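
One way to approach this, sketched under the assumption that the input is captured with quote() instead of being evaluated: lobstr::ast() only prints the tree, so the rewriting here is done on the underlying call object with base R. The helper name rewrite_logic() is hypothetical and not part of lobstr, and the sketch does not handle calls with empty arguments such as x[, 1].

```r
# Hypothetical helper (not part of lobstr): rewrite `&` -> min() and
# `|` -> max() in a quoted expression, dropping redundant parentheses.
rewrite_logic <- function(x) {
  if (!is.call(x)) {
    return(x)  # symbols and constants are returned unchanged
  }
  # Rewrite every element of the call (function position and arguments)
  out <- as.call(lapply(as.list(x), rewrite_logic))
  fn <- out[[1]]
  if (identical(fn, quote(`&`))) {
    out[[1]] <- quote(min)
  } else if (identical(fn, quote(`|`))) {
    out[[1]] <- quote(max)
  } else if (identical(fn, quote(`(`))) {
    return(out[[2]])  # the grouping is already encoded by the nested call
  }
  out
}

a <- 5
b <- 10
c <- 4
express <- rewrite_logic(quote(a & (b | c)))
express_adv <- rewrite_logic(quote((c & b) & (a & (b | c))))

print(express)
print(eval(express))
print(eval(express_adv))
```

The rewritten objects are ordinary calls, so they can be passed to eval() or to lobstr::ast(!!express) for inspection.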

Bad binding access

I'm seeing this error pop up inside of RStudio when attempting to call lobstr::obj_size(). I'm working on attempting to make a reprex. Just wanted to see if this is a known issue that anyone else has come across:

Error in obj_size_(dots, env, size_node(), size_vector()) : 
  bad binding access
Calls: obj_size -> obj_size_
Backtrace:
    █
 1. └─lobstr::obj_size(object)
 2.   └─lobstr:::obj_size_(dots, env, size_node(), size_vector())

Remove dependency on rlang remotes?

I'm having a hard time installing the development version of packages on my Windows machine. As far as I can tell, the CRAN version of rlang is up-to-date enough that removing rlang from Remotes will fix these installation issues for lobstr.

I imagine the changes in devtools will eventually fix the errors in the dev version of rlang (and by extension lobstr) but they currently don't work for Windows. Not filing the issue in those packages because it seems like the work may already be in progress.

Not entirely sure how to reprex installation issues, but here's what I've got:

> devtools::find_rtools()
WARNING: Rtools is required to build R packages, but no version of Rtools compatible with R 3.5.1 was found. (Only the following incompatible version(s) of Rtools were found:3.4)

Please download and install the appropriate version of Rtools from http://cran.r-project.org/bin/windows/Rtools/.
> pkgbuild::find_rtools()
[1] TRUE
> devtools::install_github("r-lib/rlang")
Using GitHub PAT from envvar GITHUB_PAT
Downloading GitHub repo r-lib/rlang@master
from URL https://api.github.com/repos/r-lib/rlang/zipball/master
WARNING: Rtools is required to build R packages, but no version of Rtools compatible with R 3.5.1 was found. (Only the following incompatible version(s) of Rtools were found:3.4)

Please download and install the appropriate version of Rtools from http://cran.r-project.org/bin/windows/Rtools/.
Installing rlang

These seem to be related:
r-lib/devtools#1772
r-lib/rlang#510

Let me know if there's something I can do on my end to help with this!

Move object_size() and the bytes class from pryr to lobstr

ast() docs should mention quasiquotation

Release lobstr 1.0.1

Prepare for release:

Submit to CRAN:

usethis::use_version('1.0.1')
devtools::check_win_devel() (again!)
devtools::submit_cran()
Approve email

Wait for CRAN...

usethis::use_release_tag()
usethis::use_dev_version()
Tweet

object.size() vs obj_size()

Sorry, I couldn't understand the difference from the documentation.

library(lobstr)
x <- 1:1e6
y <- list(x, x, x)
ref(y)
#> █ [1:0x7fee181b95e8] <list> 
#> ├─[2:0x7fee142b72a8] <int> 
#> ├─[2:0x7fee142b72a8] 
#> └─[2:0x7fee142b72a8]

object.size(x)
#> 4000048 bytes
obj_size(x)
#> 680 B


object.size(y)
#> 12000224 bytes
obj_size(y)
#> 760 B


ref(x)
#> [1:0x7fee142b72a8] <int>
ref(y)
#> █ [1:0x7fee181b95e8] <list> 
#> ├─[2:0x7fee142b72a8] <int> 
#> ├─[2:0x7fee142b72a8] 
#> └─[2:0x7fee142b72a8]

Why is there such a huge difference between the two, and why does object.size() triple the size for y when obj_size() doesn't? Does it mean the memory allocation has been tripled?
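
A small base R illustration may help with the question above. According to its documentation, object.size() does not detect whether list elements are shared, so a list holding the same vector three times is reported as three copies. The very small obj_size(x) figure arises because 1:1e6 is stored as a compact ALTREP sequence in R 3.5 and later. The name x_full is introduced only for this sketch, and the lobstr calls run only if the package is installed.

```r
x <- 1:1e6              # in R >= 3.5, a compact (ALTREP) sequence
x_full <- x + 0L        # arithmetic returns an ordinary, fully allocated vector
y <- list(x_full, x_full, x_full)

size_x <- as.numeric(object.size(x_full))
size_y <- as.numeric(object.size(y))

# object.size() counts each list element separately, shared or not
print(size_y / size_x)

# lobstr counts the shared vector once (only run if lobstr is installed)
if (requireNamespace("lobstr", quietly = TRUE)) {
  print(lobstr::obj_size(x_full))
  print(lobstr::obj_size(y))
}
```

No extra memory is allocated for y beyond the list itself; the three elements point at the same vector, as the identical addresses in the ref() output indicate.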

Feature request: Logging eval_tidy

As discussed in r-lib/rlang#678, I recently needed to log which symbols are getting hit in my data_mask when using eval_tidy. @lionel- thought this might be a more appropriate home.

I have a prototype using makeActiveBinding, although Lionel suggested there's a cleaner alternative using objectable.

The prototype is here: https://gist.github.com/cfhammill/4244e29bc667455a2406bf972e4287c6

I'm happy to open this as a PR with the active bindings version, although I'm not sure if/when I would be able to switch this to objectable.

ast() fails for `=` assignment

lobstr::ast() and pryr::ast() both return an error, when I want to assign a value via =

> lobstr::ast(a = 1)
Error in lobstr::ast(a = 1) : unused argument (a = 1)
> pryr::ast(a = 1)
Error in pryr::ast(a = 1) : unused argument (a = 1)

In contrast the <- assignment is accepted in both

> lobstr::ast(a <- 1)
X-`<-` 
+-a 
\-1 
> pryr::ast(a <- 1)
\- ()
  \- `<-
  \- `a
  \-  1

Edit: This is probably because a = makes R look for an argument named a, which does not exist, since x is the only argument of lobstr::ast(). The best I can come up with is a workaround involving extra brackets:

> lobstr::ast(x = 1)
1 
> lobstr::ast(a = 1)
Error in lobstr::ast(a = 1) : unused argument (a = 1)
> lobstr::ast(x = a = 1)
Error: unexpected '=' in "lobstr::ast(x = a ="
> lobstr::ast(x = {a = 1})
X-`{` 
\-X-`=` 
  +-a 
  \-1 
> lobstr::ast(x = (a = 1))
X-`(` 
\-X-`=` 
  +-a 
  \-1
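
The behaviour above comes from ordinary argument matching and can be reproduced without lobstr. In this sketch, capture() is a hypothetical stand-in for any function with a single argument x that quotes its input. Calling `=` in prefix form appears to be another way to pass the assignment without extra brackets.

```r
# Hypothetical stand-in for a quoting function with one argument `x`
capture <- function(x) substitute(x)

# `a = 1` is parsed as a named argument, which capture() does not have
res <- tryCatch(capture(a = 1), error = function(e) conditionMessage(e))
print(res)

# Wrapping in parentheses, or calling `=` in prefix form, passes a call
wrapped <- capture((a = 1))
prefix <- capture(`=`(a, 1))
print(wrapped)
print(prefix)
```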

Remove prim_length

obj_size crashes R if run on parent of global env

Example:

obj_size(parent.env(.GlobalEnv), env = emptyenv())

Use consistent type abbreviations

make_type_abrev() and type_sum() are inconsistent, and I think neither is correct. If it's a vector (how do we tell?) we should probably use vctrs::type_sum() (+ vctrs::vec_size() ?) and otherwise use something like R7:::obj_desc().

obj_size kills R session if asked to measure size of an object with linear models

While preparing a reprex for someone else, I created an object with some linear models in a list-column. My goal was to show the size of the object, so I used lobstr::obj_size(), but calling it kills the R session. pryr::object_size() works without an issue.

Minimal reprex with session info:

library(magrittr)

models <- tibble::as_tibble(mtcars) %>%
  dplyr::group_by(carb) %>%
  tidyr::nest() %>%
  dplyr::mutate(model = purrr::map(data, ~lm(disp ~ ., data = .x))) %>%
  tidyr::unnest(data, .drop = FALSE)

pryr::object_size(models)
#> 90 kB

## this would kill R session
#lobstr::obj_size(models)

Created on 2019-01-04 by the reprex package (v0.2.1)

Session info

devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.5.2 (2018-12-20)
#>  os       Ubuntu 18.04.1 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_US                       
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2019-01-04                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version     date       lib source                         
#>  assertthat    0.2.0       2017-04-11 [1] CRAN (R 3.5.0)                 
#>  backports     1.1.3       2018-12-14 [1] CRAN (R 3.5.1)                 
#>  callr         3.1.1       2018-12-21 [1] CRAN (R 3.5.2)                 
#>  cli           1.0.1       2018-09-25 [1] CRAN (R 3.5.1)                 
#>  codetools     0.2-16      2018-12-24 [4] CRAN (R 3.5.2)                 
#>  crayon        1.3.4       2017-09-16 [1] CRAN (R 3.5.1)                 
#>  desc          1.2.0       2018-05-01 [1] CRAN (R 3.5.0)                 
#>  devtools      2.0.1       2018-10-26 [1] CRAN (R 3.5.1)                 
#>  digest        0.6.18      2018-10-10 [1] CRAN (R 3.5.1)                 
#>  dplyr         0.7.99.9000 2018-11-25 [1] local                          
#>  evaluate      0.12        2018-10-09 [1] CRAN (R 3.5.1)                 
#>  fs            1.2.6       2018-08-23 [1] CRAN (R 3.5.1)                 
#>  glue          1.3.0       2018-10-14 [1] Github (tidyverse/glue@4e74901)
#>  highr         0.7         2018-06-09 [1] CRAN (R 3.5.0)                 
#>  htmltools     0.3.6       2017-04-28 [1] CRAN (R 3.5.0)                 
#>  knitr         1.21        2018-12-10 [1] CRAN (R 3.5.1)                 
#>  magrittr    * 1.5         2014-11-22 [1] CRAN (R 3.5.0)                 
#>  memoise       1.1.0       2017-04-21 [1] CRAN (R 3.5.0)                 
#>  pillar        1.3.1       2018-12-15 [1] CRAN (R 3.5.2)                 
#>  pkgbuild      1.0.2       2018-10-16 [1] CRAN (R 3.5.1)                 
#>  pkgconfig     2.0.2       2018-08-16 [1] CRAN (R 3.5.1)                 
#>  pkgload       1.0.2       2018-10-29 [1] CRAN (R 3.5.1)                 
#>  prettyunits   1.0.2       2015-07-13 [1] CRAN (R 3.5.0)                 
#>  processx      3.2.1       2018-12-05 [1] CRAN (R 3.5.1)                 
#>  pryr          0.1.4       2018-02-18 [1] CRAN (R 3.5.2)                 
#>  ps            1.3.0       2018-12-21 [1] CRAN (R 3.5.2)                 
#>  purrr         0.2.5       2018-05-29 [1] CRAN (R 3.5.0)                 
#>  R6            2.3.0       2018-10-04 [1] CRAN (R 3.5.1)                 
#>  Rcpp          1.0.0       2018-11-07 [1] CRAN (R 3.5.1)                 
#>  remotes       2.0.2       2018-10-30 [1] CRAN (R 3.5.1)                 
#>  rlang         0.3.0.1     2018-10-25 [1] CRAN (R 3.5.1)                 
#>  rmarkdown     1.11        2018-12-08 [1] CRAN (R 3.5.1)                 
#>  rprojroot     1.3-2       2018-01-03 [1] CRAN (R 3.5.0)                 
#>  sessioninfo   1.1.1       2018-11-05 [1] CRAN (R 3.5.1)                 
#>  stringi       1.2.4       2018-07-20 [1] CRAN (R 3.5.1)                 
#>  stringr       1.3.1       2018-05-10 [1] CRAN (R 3.5.0)                 
#>  testthat      2.0.1       2018-10-13 [1] CRAN (R 3.5.1)                 
#>  tibble        1.4.2       2018-01-22 [1] CRAN (R 3.5.0)                 
#>  tidyr         0.8.2       2018-10-28 [1] CRAN (R 3.5.1)                 
#>  tidyselect    0.2.5       2018-10-11 [1] CRAN (R 3.5.1)                 
#>  usethis       1.4.0       2018-08-14 [1] CRAN (R 3.5.1)                 
#>  withr         2.1.2       2018-03-15 [1] CRAN (R 3.5.0)                 
#>  xfun          0.4         2018-10-23 [1] CRAN (R 3.5.1)                 
#>  yaml          2.2.0       2018-07-25 [1] CRAN (R 3.5.0)                 
#> 
#> [1] /home/misha/R/x86_64-pc-linux-gnu-library/3.5
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

I noticed that if I omit the last line of the pipeline, tidyr::unnest(data, .drop = FALSE), the same problem occurs, but it takes ~10 seconds for the R session to die. With this line present, the R session dies immediately.

RStudio info:
Version 1.2.1114
Build 1104 (2e0f7658)
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) QtWebEngine/5.10.1 Chrome/61.0.3163.140 Safari/537.36

Tools for tracking memory "leaks"

obj_size() provides the basic structure for the recursion. The main difference would be that each call returns a nested list containing the current object, and a (potentially named) list of its children. Would want options to control the maximum depth to recurse (so you could explore iteratively, if needed), whether or not to show CHARSXPs, and how to handle ALTREP. By default, would show environments in the same way as lists, rather than exposing the details of their internal hash tables (but needs option to show full details for debugging other code).

Need a small S3 class to capture an object: its address (string), type (string), NAMED (integer?), id (integer if seen before, NULL otherwise), and children (named list). Its print method would take care of generating the tree display.

This would effectively become a replacement for .Internal(inspect(x)), with a better display (i.e. using colour and unicode tree characters) and displaying a spanning tree (instead of potentially getting stuck in infinite loops). Would replace internal obj_formals() etc, and altrep().
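
A minimal base R sketch of the shape described above, offered as an assumption about how such a class might look and not as lobstr's implementation. The names new_obj_node(), the class "obj_node", and the fields type, length and children are all hypothetical. Addresses, NAMED, ids for previously seen objects, CHARSXPs, ALTREP and environments are left out, so cycles are not handled here.

```r
# Hypothetical constructor: records type and length, and recurses into
# lists until max_depth is reached.
new_obj_node <- function(x, max_depth = 2L, depth = 0L) {
  children <- list()
  if (is.list(x) && depth < max_depth) {
    children <- lapply(x, new_obj_node, max_depth = max_depth, depth = depth + 1L)
  }
  structure(
    list(type = typeof(x), length = length(x), children = children),
    class = "obj_node"
  )
}

# One line per node, indented by depth; child names are shown when present.
format.obj_node <- function(x, indent = 0L, name = "", ...) {
  prefix <- if (nzchar(name)) paste0(name, " = ") else ""
  line <- paste0(strrep("  ", indent), prefix, "<", x$type, "[", x$length, "]>")
  kids <- x$children
  kid_names <- names(kids)
  if (is.null(kid_names)) kid_names <- rep("", length(kids))
  kid_lines <- Map(
    function(kid, nm) format(kid, indent = indent + 1L, name = nm),
    kids, kid_names
  )
  c(line, unlist(kid_lines, use.names = FALSE))
}

print.obj_node <- function(x, ...) {
  cat(format(x), sep = "\n")
  invisible(x)
}

tree <- new_obj_node(list(a = 1:3, b = list("x")))
print(tree)
```

Keeping the tree as a plain nested list means the maximum depth can be raised step by step to explore a large object iteratively.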

obj_size() formatting for large objects

When I reach for obj_size, I am typically using it on relatively large objects (MBs or GBs). I am curious whether there is a recommended way to switch the printed output from bytes to kB, MB, GB, etc. I can convert manually with a combination of division, sprintf, paste0, round, etc.

There are a few r-lib packages that provide nice printing of bytes, but as far as I can tell none has a method for lobstr_bytes objects. Does it make sense to have prettyunits operate natively on lobstr_bytes via a method, or even to have a function argument inside obj_size() for unit conversion/prettier printing?

Thank you for your time reviewing. 🖖

library(prettyunits)
library(lobstr)
library(magrittr)

car_size <- obj_size(mtcars)

# lobstr_bytes format
class(car_size)
#> [1] "lobstr_bytes"

car_size_num <- obj_size(mtcars) %>% 
  as.numeric()

# numeric
class(car_size_num)
#> [1] "numeric"

# scales::number_bytes() is deprecated
# and only works on numeric converted obj_size
scales::number_bytes(car_size)
#> Error in UseMethod("round_any"): no applicable method for 'round_any' applied to an object of class "lobstr_bytes"
scales::number_bytes(as.numeric(car_size))
#> [1] "7 KiB"
scales::number_bytes(unclass(car_size))
#> [1] "7 KiB"

# scales::label_bytes() works for numeric or unclassed objects
# but feels a bit awkward outside of ggplot2
scales::label_bytes()(unclass(car_size))
#> [1] "7 kB"

# prettyunits::pretty_bytes works well
# but again requires numeric or unclassed conversion
prettyunits::pretty_bytes(car_size)
#> Error in as.data.frame.default(x[[i]], optional = TRUE): cannot coerce class '"lobstr_bytes"' to a data.frame
prettyunits::pretty_bytes(unclass(car_size))
#> [1] "7.21 kB"

Created on 2021-08-27 by the reprex package (v2.0.0)
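
The manual conversion mentioned above can be sketched in base R as follows. pretty_size() is a hypothetical helper, not a function from lobstr or prettyunits, and it assumes decimal units (powers of 1000). It accepts either a plain number or a classed value such as the result of obj_size(), since it strips the class first.

```r
# Hypothetical helper: format a byte count using decimal units.
pretty_size <- function(bytes, digits = 2) {
  bytes <- as.numeric(unclass(bytes))
  units <- c("B", "kB", "MB", "GB", "TB")
  # Number of unit thresholds each value has reached (0 = bytes, 4 = TB)
  idx <- findInterval(bytes, 1000^(1:4))
  paste(round(bytes / 1000^idx, digits), units[idx + 1])
}

print(pretty_size(7208))
print(pretty_size(structure(3314445280, class = "lobstr_bytes")))
```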

Request: make `ast()` not automatically unwrap quosures

I was using rlang::blast(), and trying to understand how it works.

lazyeval::ast_() shows the actual call:

blast <- function(expr, env = caller_env()) {
  eval_bare(enexpr(expr), env)
}
exp <- quo(a + b)

f <- function(x) {
  lazyeval::ast_(sys.call())
  invisible()
}
blast(f(!!exp))
#> ┗ ()
#>  ┗ `f
#>  ┗ ()
#>   ┗ `~
#>   ┗ ()
#>    ┗ `+
#>    ┗ `a
#>    ┗ `b

However, lobstr::ast(!!x) unwraps the quosures, so you don't see the actual call:

g <- function(x) {
  lobstr::ast(!!sys.call())
  invisible()
}
blast(g(!!exp))
#> █─g 
#> └─█─`+` 
#>   ├─a 
#>   └─b

A better str for simple objects

Starting to noodle on this idea because I think it will be useful for the data structures chapter in R4DS.

# Atomic vectors ----------------------------------------------------------

# Compactly displays type and length
somethingstr(1:100)
#> int[100]
somethingstr(letters)
#> chr[26]

# Also displays attributes. 
# Class gets special handling
somethingstr(factor(letters))
#> int[26] <factor>
#> @ levels: chr[26]

# (I imagine this being used once you've taught basic data structures
# and purrr, so it's useful to see the details, instead of the helpful
# lies that str() tells you.)

somethingstr(Sys.time())
#> dbl[1] <POSIXct, POSIXt>
#> @ tzone: chr[1]  

# Lists -------------------------------------------------------------------

# Shows hierarchy
x <- list(
  list(
    1, 
    2
  ),
  list(
    3,
    4
  )
)
somethingstr(x)
#> list[2]
#> - 1: list[2]
#>    - 1: dbl[1]
#>    - 2: dbl[1]
#> - 2: list[2]
#>    - 1: dbl[1]
#>    - 2: dbl[1]

# Very long lists are truncated
x <- replicate(100, list(runif(5)))
somethingstr(x)

#> list[100]
#> - 1: dbl[5]
#> - 2: dbl[5]
#> - 3: dbl[5]
#> - 4: dbl[5]
#> - 5: dbl[5]
#> ...

# So are very deep lists
x <- list()
for (i in 1:100) x$x <- list(x)
somethingstr(x)

#> list[1]
#> 1. list[1]
#>    1. list[1]
#>       1. list[1]
#>          1. ...

# And length and depth interplay in some complicated way. Maybe the way
# to think about it is that you want to (say) print at most 100 lines.  
# How should you allocate those lines to best display the structure of
# the object? I don't think simple cut-offs for length vs. depth will
# work in general. 

# Think about something() on a data frame containing models etc.
# Maybe can assume unnamed lists are generally homogeneous?

# Names get special treatment
somethingstr(mtcars)
#> list[11] <data.frame>
#> $ mpg : dbl[32]
#> $ cyl : dbl[32]
#> $ disp: dbl[32]
#> $ hp  : dbl[32]
#> $ drat: dbl[32]
#> $ wt  : dbl[32]
#> $ qsec: dbl[32]
#> $ vs  : dbl[32]
#> $ am  : dbl[32]
#> $ gear: dbl[32]
#> $ carb: dbl[32]
#> @ row.names: chr[32]

# Very long names get truncated
x <- list(this_is_a_very_very_very_very_long_name = 1:10)
somethingstr(x)
#> list[1]
#> $ this_is_a_very_...: int[10]


# Environments ----------------------------------------------------------------

# Need some way to control recursion into environments. Probably don't
# want it on by default because there are too many objects that have 
# (possibly big) environments attached (e.g. formulas)
somethingstr(globalenv())
#> env[2] [R_GlobalEnv]

somethingstr(globalenv(), show_env = 0L)
#> env[2] [R_GlobalEnv]
#> $ df: list[1] <data.frame>
#>       $x: int[100]
#>       @row.names: int[1]
#> $ i:  int[10]
#> @parent.env: env[10] [tools:rstudio]

# show_env = 0L would also show the contents of parent.env.

# Functions ---------------------------------------------------------------

somethingstr(function(x = 1:10, y = x) {})
#> func[2] 
#>   $x: `1:10
#>   $y: `x
#> * env: env[4] [R_GlobalEnv]

cc @jennybc, @lionel-

Move `master` branch to `main`

The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.

That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.

The purpose of this issue is to:

Help us firm up the list of targeted repositories
Make sure all maintainers are aware of what's coming
Give us an issue to close when the job is done
Give us a place to put advice for collaborators re: how to adapt

message id: euphoric_snowdog

Use the ALTREP inspect method in sxp

To print ALTREP objects. The inspect method typically provides more information about the object. E.g.

❯ .Internal(inspect(1:100))
@7fa458bec718 13 INTSXP g0c0 [NAM(7)]  1 : 100 (compact)

❯ lobstr::sxp(1:100)
[1:0x7fa458be6a60] <INTSXP[100]> (altrep named:7)

❯ .Internal(inspect(tl))
@7fa458bec8d8 13 INTSXP g1c0 [MARK,NAM(7)]  ALTREP progress::tick_along
  @7fa458bec868 13 INTSXP g1c0 [MARK,NAM(7)]  1 : 10000 (compact)

❯ lobstr::sxp(tl)
[1:0x7fa458bec8d8] <INTSXP[10000]> (altrep named:7)

Explore changes to object.size in R-devel

i.e. how should ALTREP be handled?

Display quosure environment

So you know it's a quosure

Cannot ref shiny reactiveValues

The ref method calls as.list.reactiveValues for shiny reactive values (such as session), which causes errors.

List all parent environments for a given environment

to help explain how mockr/mockery works.
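
As a rough illustration of what such a helper could return, here is a base R sketch. env_chain() is a hypothetical name, not an existing lobstr function. It walks parent.env() until it reaches the empty environment.

```r
# Hypothetical helper: list an environment followed by all of its ancestors.
env_chain <- function(env = parent.frame()) {
  stopifnot(is.environment(env))
  chain <- list(env)
  while (!identical(env, emptyenv())) {
    env <- parent.env(env)
    chain <- c(chain, list(env))
  }
  chain
}

chain <- env_chain(globalenv())
# Anonymous environments have an empty name
print(vapply(chain, environmentName, character(1)))
```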

Don't crash with large pairlists

lobstr::obj_size(as.pairlist(1:1E6))

Does avoiding incrementing NAMED still matter?

tracemem() does a better job of telling you when a copy occurs, and the exact meaning of obj_refs() is likely to experience some churn in the next few years.

obj_sizes gives NA for objects larger than 2^31 bytes

library(lobstr)
n <- 1.343E8
largeframe <- data.frame(runif(n), 
                         sample(LETTERS, n, T), 
                         sample(letters, n, T))
largecopy <- largeframe
obj_size(largeframe, largecopy)
obj_sizes(largeframe, largecopy)
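
One plausible explanation, offered as an assumption and not a confirmed diagnosis, is that the size is held in a 32-bit R integer somewhere along the way. Integer values above .Machine$integer.max overflow to NA, while doubles represent them exactly:

```r
limit <- .Machine$integer.max               # 2147483647, i.e. 2^31 - 1
overflowed <- suppressWarnings(limit + 1L)  # integer arithmetic overflows to NA
as_double <- as.numeric(limit) + 1          # doubles hold 2^31 exactly

print(overflowed)
print(as_double)
```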

List R-core as contributor in DESCRIPTION

Since the object_size() code is derived from object.size()

Relate AST nodes back to parse data

(I'm happy to take a stab at this if it seems useful! cc @filipsch who is interested in this issue)

Really enjoyed the NYC R conference talk on lobstr! This might be outside the scope of lobstr, but it seems like being able to enhance the AST with line / column info could lead to useful ways of allowing people to explore the AST, and its relationship back to their code.

I worked briefly on connecting AST and parse data while flying somewhere, but never finished. I think the key points are...

getParseData returns a data.frame, but there is a handy function in https://github.com/halpo/parser to turn it into a tree (and plot it)
In the AST, a node's first child is always the function called, but in the parse data their order corresponds to their position in the code (shown below).
As far as I can tell, the layers of the AST and parse data graph match, so relating them requires...

filtering parse data nodes corresponding to "expr" tokens
potentially finding a non-"expr" token corresponding to leftmost AST child node in other tokens (e.g. in 1 + 2, the + is not an expr token)

There may be a much easier way to do this. Here is a comparison of the parse data for a binary operator.

Code used to graph parsed data

# taken from https://github.com/halpo/parser/blob/master/R/plot.parser.R
library(igraph)
plot.parser <- function(x, ...){
  y = getParseData(x)
  stopifnot(require(igraph))
  y$new.id <- seq_along(y$id)
  h <- graph.tree(0) + vertices(id = y$id, label= y$text)
  for(i in 1:nrow(y)){
    if(y[i, 'parent'])
      h <- h + edge(c(y[y$id == y[i, 'parent'], 'new.id'], y[i, 'new.id']))
  }
  plot(h, layout=layout.reingold.tilford, ...)   
}

create_parse_graph <- function(pd) {
  pd$new.id <- seq_along(pd$id)
  h <- graph.tree(0) + vertices(id = pd$id, label= pd$text)
  for(i in 1:nrow(pd)){
    if(pd[i, 'parent'])
      h <- h + edge(c(pd[pd$id == pd[i, 'parent'], 'new.id'], pd[i, 'new.id']))
  }
  
  h
}

Some notes on matching nodes

# Steps to map to AST from parse data
#   1. start at root AST node (call ast), and PD node (call pd)
#
#   2a. if is.atomic(ast), build atomic node by returning ast[1]. Otherwise...
#   2b. if is.call(ast) and is.name(ast[[1]]), then ast[[1]] is child token of pd
#     i. standard call will make it the first child token
#    ii. binary will be in the middle (so easier to search for ast[[1]])
#   2c. if is.call(ast) and is.call(ast[[1]]), then it is first child of pd
#
#   3. Get all child exprs of pd (in order), move one matching ast[[1]] to front
#
#   4. You have matched up ast to pd! Recurse
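
The ordering difference behind step 3 can be seen with base R alone. This sketch covers only the top level of 1 + 2 and is not a full matcher. The variable names are illustrative.

```r
code <- "1 + 2"
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Parse data: children of the root `expr`, ordered by position in the source
root_id <- pd$id[pd$parent == 0]
children <- pd[pd$parent == root_id, ]
children <- children[order(children$line1, children$col1), ]
print(children$token)

# AST: the function being called always comes first
ast <- quote(1 + 2)
print(as.list(ast))

# Step 3 of the notes: move the token matching ast[[1]] to the front
fn_text <- as.character(ast[[1]])
is_fn <- children$text == fn_text
reordered <- rbind(children[is_fn, ], children[!is_fn, ])
print(reordered$token)
```

After the reordering, the parse-data children line up one-to-one with the elements of the call, which is the point at which the recursion in step 4 could start.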

Release lobstr 1.0.0

Prepare for release:

Wait for rlang release
Remove remotes
Review description
rhub::check_for_cran()
rhub::check(platform = "solaris-x86-patched")
rhub::check(platform = "ubuntu-rchk")
rlang 0.3.0 CRAN windows binary available
devtools::check_win_devel()

Perform release:

Bump version (in DESCRIPTION and NEWS)
devtools::check_win_devel() (again!)
devtools::submit_cran()
Approve email

Wait for CRAN...

Tag release
Bump dev version
Write blog post
Tweet

Template from r-lib/usethis#338

>= vs >

for ALTREP

Determine why size of S4 objects is different

test_that("size of S4 objects same as base", {
  
  Z <- methods::setClass("Z", slots = c(x = "integer"))
  z <- Z(x = 1L)
  
  expect_same(z)
})
