Comments (5)

MilesMcBain commented on September 13, 2024

I am now finally starting to get interested in this idea too, now that I can snapshot quickly.

I've made the lockfiles generated by capshot() minifiable, which might help. That said, the ideal solution would work in both HTML and PDF documents. The first thing that comes to mind is encoding the lockfile as an image.

from capsule.

cderv commented on September 13, 2024

Hi Miles,

I haven't closely followed why capshot() exists or how it is implemented, so I'm not sure which function produces the minified version. Is it capshot_str()? So far I have only read the post about it (https://milesmcbain.micro.blog/2021/07/15/unlocking-fast-rstats.html). I would have thought that adjusting the type argument of renv::snapshot() would be enough to get faster lockfile creation suited to your workflow. Or perhaps adjusting the snapshot.type setting, so that renv:::lockfile (unfortunately not exported) could create a lockfile more quickly for your use case.

Anyway, regarding the embedding, did you see that renv has a new mechanism that could be adapted to an Rmd file, or even a single script outside of a project? It is renv::use(); see the article for usage: https://rstudio.github.io/renv/articles/use.html

There is also an experimental function called renv::embed() to help insert a lockfile into a document. For Rmd, it inserts the lockfile as a renv::use() call in a chunk.

I am just sharing this in case you don't know about it. It is always interesting to see what others do. I know you also have the using package, which probably does something similar to renv::use(), but with a different mechanism and purpose.

Regarding the main topic here, how do you see things working with the lockfile embedded as an image? Would you call capsule::run_*() or a new capsule::render_*(), and it would find the lockfile to use?

Initially, I had in mind something not necessarily embedded, but an easy way of rendering a document in a capsule. Something like capsule::render(rmd_file, lockfile, ...), or simply capsule::run() where we could pass the lockfile or packages.R script. However, something embedded could be more interesting for a single Rmd file. There is always the limitation of rmarkdown's own dependencies, which you need in order to run a chunk. renv::use() in a chunk works OK for this, but I did not look at all the implications when it runs in an interactive session (because the temporary library directory is still active after rendering with rmarkdown::render()). When run in a background session or job (as with the Knit button), it works quite well. Overall, I find it a good approach to dependency management for a single file.

These thoughts are a bit unstructured, but I am taking the opportunity of this issue to discuss them.
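For illustration, here is a minimal sketch of what such a chunk might contain, following the usage pattern shown in the renv article linked above. The package names and versions are placeholders, and the lockfile path in the commented alternative is hypothetical.

```r
# Placed in the first chunk of an Rmd file, before other packages are loaded.
# Package names and versions below are placeholders for illustration only.
renv::use(
  "digest@0.6.30",
  "rlang@1.1.0"
)

# Alternative (hypothetical path): point at an existing lockfile instead.
# renv::use(lockfile = "path/to/renv.lock")
```

Running this installs the requested versions into a temporary library, so it needs network access the first time.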


MilesMcBain commented on September 13, 2024

Thanks for the info Christophe!

So I've had a few performance issues with {renv} snapshotting, which I describe here: rstudio/renv#774

Part of it seems to be with using certain kinds of repos, e.g. r-universe, but even without those deps it's sluggish. I guess a lot of this is validation, and sometimes it makes network calls to do that.

Yeah using has similar syntax and does some similar things, but it still has the fundamental difference that it doesn't mess with your .libPaths(). I remember seeing an issue for renv::use but I didn't know it was implemented, so thanks for pointing that out.

Regarding rendering, you're thinking very much in the same direction as me. A function to render using a lockfile, like capsule::render, is a great idea.

I was talking about a different kind of embedding though. Not embedding a lockfile to be used internally by {rmarkdown}/{knitr} as the library, but embedding the lockfile as a full description of the packages used, in case that becomes useful in the future. For example, you need to reproduce a result from someone's report, but you get a slightly different output. You could make extracting and diffing a lockfile for your env vs the author's a one-liner.
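To make the diffing idea concrete, here is a rough sketch in R. It is not part of capsule or renv: the helper names lock_versions() and diff_lockfiles() are hypothetical, and it assumes both files are renv-style JSON lockfiles with a top-level "Packages" entry whose records carry a "Version" field. It also assumes the jsonlite package is installed.

```r
library(jsonlite)

# Read a lockfile and return a named character vector: package -> version.
lock_versions <- function(path) {
  lock <- jsonlite::fromJSON(path, simplifyVector = FALSE)
  vapply(lock$Packages, function(pkg) pkg$Version, character(1))
}

# Compare two lockfiles and return only the packages whose versions differ,
# including packages present in one file but missing from the other (NA).
diff_lockfiles <- function(mine, theirs) {
  a <- lock_versions(mine)
  b <- lock_versions(theirs)
  pkgs <- sort(union(names(a), names(b)))
  out <- data.frame(
    package = pkgs,
    mine = unname(a[pkgs]),
    theirs = unname(b[pkgs]),
    stringsAsFactors = FALSE
  )
  changed <- is.na(out$mine) | is.na(out$theirs) | out$mine != out$theirs
  out[changed, , drop = FALSE]
}
```

With two lockfiles on disk, the comparison is then a single call such as diff_lockfiles("renv.lock", "author-renv.lock"), where both paths are placeholders.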

Thinking about this some more though, this is kind of moot if you have access to the author's source repository, since you could look at it as at the report date and read the lockfile from there if it was committed.


MilesMcBain commented on September 13, 2024

> renv::use() in a chunk works OK for this, but I did not look at all the implications when it runs in an interactive session (because the temporary library directory is still active after rendering with rmarkdown::render()). When run in a background session or job (as with the Knit button), it works quite well. Overall, I find it a good approach to dependency management for a single file.

Yes! The way capsule::run() used to work was that it hot-swapped the library paths for the current session, a bit like how I think the isolate option for renv::use() works. This turned out to be a reproducibility issue, since you could get 'leakage' between libraries. For example, if you used renv::use() inside an Rmd and rendered it in the interactive session, all of rmarkdown's deps would already be loaded from the original library paths. Those are the versions that get used, not any different versions specified inside the Rmd with use(), because a package that is already loaded won't be loaded again from the new library.

I hope that makes sense. I eventually rewrote capsule::run to always run in a separate session to avoid this.
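The leakage described above can be seen with base R alone. The sketch below is only an illustration of the general mechanism, not how capsule or renv are implemented: it uses the base package 'tools' as a stand-in for any already-loaded dependency, and an empty temporary directory as a stand-in for an isolated library.

```r
# Load a package from the current library paths.
loadNamespace("tools")
path_before <- getNamespaceInfo("tools", "path")

# Hot-swap the library paths within the same session.
old_paths <- .libPaths()
new_lib <- file.path(tempdir(), "isolated-lib")
dir.create(new_lib, showWarnings = FALSE)
.libPaths(new_lib)

# Asking for the package again does not reload it: the namespace that was
# loaded before the swap is reused, from its original location.
loadNamespace("tools")
path_after <- getNamespaceInfo("tools", "path")

# Restore the original library paths.
.libPaths(old_paths)

identical(path_before, path_after)
```

Running the code in a fresh session avoids this, because nothing is loaded before the library paths are set.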


MilesMcBain commented on September 13, 2024

So actually this leakage issue is something you need to be careful of with use. Any code higher up in the script than the use call might attach packages that can't be reattached by use. So while it's fine to say 'always put it at the top of your script' you need to make sure the receiver of the script isn't doing anything in their R profile?
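One way to check for this before calling use is to list which non-base packages are already loaded in the session. This is a small sketch using base R only; the helper name preloaded_packages() is hypothetical and not part of renv, using, or capsule.

```r
# List namespaces already loaded in this session, excluding base R packages.
# Anything returned here was loaded before the check ran (for example by an
# R profile) and would not be swapped out by a later library change.
preloaded_packages <- function() {
  base_pkgs <- rownames(installed.packages(priority = "base"))
  sort(setdiff(loadedNamespaces(), base_pkgs))
}

preloaded_packages()
```

An empty result at the top of a script suggests nothing has leaked in ahead of the use call.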

