Comments (37)
@joshmgross is working on this
from toolkit.
Hi all, we expect it to be fully released by GitHub Actions GA on Nov 13th. It's likely we'll release some preview version some time this week or next, and I'll update this issue when that occurs.
from toolkit.
Yes, caching is on the backlog however it won't be a toolkit feature. It will be a new action with runner support. I'll leave open though to track the enhancement. It's definitely on the feature backlog short list.
from toolkit.
Hi all! actions/cache is now in preview! Go check it out at https://github.com/actions/cache and we would love any feedback.
Re: cache size limits, 200MB is the compressed size so at least for node_modules you should be okay for a lot of use cases. That being said, we are working on increasing that limit fairly soon.
I'll reply to actions/cache#6 with more info.
Thanks again!
Closing this issue, all new issues related to caching should go to https://github.com/actions/cache/issues
from toolkit.
We don't have caching yet, but we definitely will - personally, I agree that I want that.
For artifacts, we have upload and download artifact actions that I think do pretty much exactly what you're looking for. So you could have a bunch of test jobs that depend on the build job, where the build job uploads your build and the test jobs download it again. https://github.com/actions/upload-artifact and https://github.com/actions/download-artifact
from toolkit.
https://github.com/actions/cache 🥳
It's limited to 200mb, though - so it might exclude most node_modules folders 😬
from toolkit.
The third-party stopgap solution I was working on involves sending data to a different data centre, which may be why it's not performing well enough for use. I have no idea where GitHub is hosted these days, but it was sometimes taking longer to retrieve the compressed node modules cache from a server on Azure than to just npm install (which itself took 3 minutes).
Looks like I'm abandoning that idea for now while waiting for an official solution from GitHub.
from toolkit.
If that sounds right, can we maybe rename this to something like "Add Caching Support" and keep it open to track that? That way we can communicate out when that feature is added
from toolkit.
Thinking of putting something together for caching to mimic CircleCI's functionality.
- save_cache:
    key: v1-npm-dependency-cache-{{ checksum "package-lock.json" }}
    paths:
      - node_modules
Does this cover the bases?
- specify a key, including the ability to checksum a file, and the option to increment a version to bust the cache
- archive up one or more paths
- store the archive in S3 or similar, perhaps using The Go Cloud Development Kit. Unlike an implementation provided and hosted by GitHub, this would require some secrets set and passed along to the environment.
- restore and unarchive the paths into the appropriate locations with a separate action
CircleCI also has the ability to specify multiple restore keys, though I'm not sure how useful that is.
I'm still looking into how the Node.js Actions are written to see if this can be done in a nice reusable way without having to write the go get steps.
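For illustration only, the key scheme from the list above (a version prefix plus a checksum of the lock file) could be sketched roughly like this in TypeScript. The helper name `buildCacheKey` and the sample lock file strings are hypothetical, and SHA-256 is an assumption; this is not how CircleCI or any GitHub action is confirmed to compute keys.

```typescript
import { createHash } from "crypto";

// Hypothetical helper, not part of any published action: builds a key such as
// "v1-npm-dependency-cache-<sha256 of the lock file contents>".
// SHA-256 is an assumed digest; any stable checksum would serve the same role.
function buildCacheKey(version: string, name: string, lockFileContents: string): string {
  const checksum = createHash("sha256").update(lockFileContents).digest("hex");
  return `${version}-${name}-${checksum}`;
}

// Made-up lock file contents, used only to show how the key reacts to changes.
const lockA = '{"lockfileVersion":1,"dependencies":{"left-pad":{"version":"1.3.0"}}}';
const lockB = '{"lockfileVersion":1,"dependencies":{"left-pad":{"version":"1.3.1"}}}';

const keyA = buildCacheKey("v1", "npm-dependency-cache", lockA);
const keyB = buildCacheKey("v1", "npm-dependency-cache", lockB); // lock file changed
const keyBusted = buildCacheKey("v2", "npm-dependency-cache", lockA); // version bumped

console.log(keyA);
console.log(keyB);
console.log(keyBusted);
```

An unchanged lock file yields the same key on every run, while editing the lock file or incrementing the version prefix yields a new key and so busts the cache.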
from toolkit.
We're looking for a stopgap solution while waiting for GitHub's official support for caching.
In the meantime, I've put together this command line tool (written in Go) that can checksum files and compress a folder, such as package-lock.json and node_modules.
https://github.com/RobotsAndPencils/cache-money-client
It's only a partial solution because it communicates with a private server for file storage. But it should be fairly easy for someone to implement their own server, or to fork the project and plug in The Go Cloud Development Kit to target S3 and the like.
As of now, the client is 555 lines of Go code, including tests, and only requires the standard library.
from toolkit.
@joshmgross is working on this
Any indication on when this will be available? Will this for example be available when GitHub Actions goes to GA?
from toolkit.
@damccorm I changed the title as suggested
from toolkit.
Sure, but I'm asking about putting it inside the action. To use the action you currently do:
- name: Setup Flutter ${{ matrix.dart-version }}
  uses: DanTup/gh-actions/[email protected]
  with:
    channel: ${{ matrix.dart-version }}
I don't think the person using the action should have to start knowing what goes on inside the action to know what to cache (which is how it sounds if caching "is a new action" and "won't be a toolkit feature"?). I think an actions implementation should be able to read and write from a cache (for example by exposing a single boolean flag on the action that says whether to use caching) so that it can provide the benefit of caching without overhead to the user of the action.
from toolkit.
Exactly :) There are already a bunch of GitHub-authored setup-xxx actions (node, ruby, dotnet, etc.) that could benefit from this (but if users have to do it, it won't happen). Most installers are probably small enough it's not a big deal, some things (for ex. an Android SDK) can be quite large.
I'm not saying there shouldn't also be an action for caching - but it does feel like something that would be useful to have also in this toolkit for action-authors to use.
from toolkit.
@tvdeyen GitHub never stated that caching would be available by November 13th.
from toolkit.
https://github.com/actions/cache/blob/master/src/save.ts#L60
200mb for the archived (tar) size
from toolkit.
@joshmgross done, works like a charm... I will try to do a PR with this feature
from toolkit.
Should we be having these discussions in the cache repo? That's where other folks will look for cache discussions. I think we should freeze this thread.
from toolkit.
caching is on the backlog however it won't be a toolkit feature. It will be a new action with runner support.
I don't know what "runner support" means, but will this allow actions to use caching internally? For example I have a setup-flutter action that currently takes a few minutes because it does a bunch of downloading when it's first invoked (it downloads additional modules it needs - the URLs for these are all internal to it and from the workflow I don't know or care). It'd be great if my setup-flutter action could deal with this all internally (download it, invoke it so it downloads what it needs, then cache its folder under a version stamp).
from toolkit.
@DanTup I'm not sure what the GitHub implementation will look like, but with other systems the caching layer just compresses/stores a folder and then puts it back.
As an example, I would cache node modules something like this:
run: |
  cache restore ...
  npm install
  cache store ...
So in this case npm install runs every time, but it should run significantly faster with a node_modules/ folder restored from the cache than if it were empty. So if your setup-flutter task is written in a similar way, it should run quickly when it sees that everything is already downloaded.
Furthermore, a cache has a key like "v1-{{ checksum "package-lock.json" }}" to find and restore the right cache. The key also allows it to skip compressing/storing the folder if it was already cached in the past.
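To make that flow concrete, here is a rough TypeScript sketch of restore, install, then store, with the store step skipped when the key already exists. Everything in it is hypothetical: the `Map` stands in for a remote cache service, `runWithCache` is an invented name, and "installing" is modelled as adding package names to a set. It only illustrates the behaviour described above, not GitHub's eventual implementation.

```typescript
// In-memory stand-in for a remote cache service. Every name below is
// hypothetical and only illustrates the flow described above.
type CacheStore = Map<string, string[]>;

interface RunResult {
  cacheHit: boolean;   // a cached folder was restored for this key
  stored: boolean;     // the folder was archived and saved on this run
  installed: string[]; // packages the install step actually had to fetch
}

function runWithCache(store: CacheStore, key: string, wanted: string[]): RunResult {
  // 1. Restore: look up the folder contents saved under this key, if any.
  const restored = store.get(key);
  const folder = new Set(restored ?? []);

  // 2. Install always runs, but only fetches what is missing from the folder.
  const installed = wanted.filter((pkg) => !folder.has(pkg));
  installed.forEach((pkg) => folder.add(pkg));

  // 3. Store: skipped when the key was already present in the cache.
  const cacheHit = restored !== undefined;
  if (!cacheHit) {
    store.set(key, Array.from(folder));
  }
  return { cacheHit, stored: !cacheHit, installed };
}

const store: CacheStore = new Map();
const first = runWithCache(store, "v1-abc123", ["react", "babel"]);
const second = runWithCache(store, "v1-abc123", ["react", "babel"]);
console.log(first, second);
```

On the first run nothing is restored, both packages are fetched, and the folder is saved; on the second run with the same key the folder is restored, the install step has nothing left to fetch, and the save is skipped.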
from toolkit.
@DanTup I like that idea. So someone would publish an action for "npm install" or "yarn install" that has caching built in, and is aware of the lock files and folders to cache. That makes for a nice API for the users.
Not sure what GitHub will do, but I'd be for a solution like that (so long as making a new action is relatively painless).
from toolkit.
It goes GA on Nov 13th, according to the email from GH: "Thanks for being a part of the GitHub Actions and GitHub Package Registry beta programs! We're excited to announce that GitHub Actions and GitHub Package Registry will be moving to general availability on November 13."
from toolkit.
Here's a slight hint about getting it, from ethomson (GitHub Staff)
from toolkit.
Maybe the 200mb limit is only there in the preview version?
from toolkit.
The 200mb limit is not only an issue for projects using npm/Yarn. Projects using CocoaPods can easily exceed the 200mb limit as well, and so can projects using RubyGems.
from toolkit.
Yeah my node modules is 500mb so this is a no-go for me. :(
from toolkit.
There is an issue on the cache repo for the small file size limit, maybe it's better to mention it there? actions/cache#6
from toolkit.
@joshmgross hi! i'm having issues with yarn workspaces... some packages are not found with action-cache, like babel when I need to compile my code. Thanks!
from toolkit.
@chemitaxis Hey! Can you check out the updated example here: actions/cache#70 and see if that helps?
Please file an issue if you still need help
from toolkit.
With workspaces, there's a node_modules not only in the top-level directory, but also in each package directory that has CLI tools (such as babel-cli babel) or a non-hoisted dep.
If you cache only the top-level node_modules and then run yarn install, unfortunately yarn won't re-add the node_modules in the package directories; it will just tell you that there is nothing to install.
Maybe this is your issue.
from toolkit.
Thanks @Bnaya and @joshmgross, I will check everything on Monday and if it fails, I will create an issue with all the configuration. We are migrating from CircleCI to GitHub Actions and we have found some "problems", but we think we have more options for a monorepo with GH actions.
from toolkit.
@Bnaya could I add multiple cache paths, in a separate action for each workspace, and skip installing dependencies if the result outputs of all the actions are true? Thanks!
from toolkit.
@joshmgross is it possible to use the cache from within an action built using the toolkit? (for ex. if you're building a reusable action like setup-dart, it would be nice for it to be able to cache its SDK downloads without the user having to fill their workflow file with caching info for each action).
from toolkit.
@chemitaxis Yes, you can do multiple cache actions for each path and && the outputs
@DanTup Not yet, no. See actions/cache#55 for tracking that, and if you could provide your use case there that would be great.
from toolkit.
Locking thread, please file issues at https://github.com/actions/cache/issues for further discussions
from toolkit.