Docker caching (cache repository issue, open, 48 comments)

actions commented on June 1, 2024 188
Docker caching

from cache.

Comments (48)

joshmgross commented on June 1, 2024 89

👋 Hi all, right now we are focusing on other priorities and there are no updates for docker caching. We appreciate your feedback and revisit priorities based on user feedback, so please continue sending us your input.

steebchen commented on June 1, 2024 83

Since docker is now natively integrated in GitHub Actions, I think it would be nice if there was an option to enable docker caching globally, like CircleCI does.

tuler commented on June 1, 2024 70

@steebchen I agree. Docker caching should be natively supported, not through this action.

steebchen commented on June 1, 2024 25

What's bothering me most is that all GitHub Actions (which are based on Docker) are always rebuilt from scratch, in every single action. I've never worked in projects as big as GitHub, but I can't imagine how caching is not the solution here – just for the reason that it costs bandwidth to fetch the images each time. I'd also be happy to upgrade to a more expensive plan or pay for the cache storage costs.

(edit: I just realised I already commented in this thread... 😄)

Sytten commented on June 1, 2024 23

Any update on the progress?

kaka-ruto commented on June 1, 2024 20

Hoping this bubbles up the priority list soon!

torreytsui commented on June 1, 2024 18

Would love to see this feature in 2023.

joshmgross commented on June 1, 2024 16

The cache file limit has been increased from 400 MB to 2 GB, so this may be more viable than previously due to docker layers being larger than the previous limit.

jeacott1 commented on June 1, 2024 14

My CI processes use several pre-built docker images, some of them very large (3.5 GB!).
GitHub Actions pulls these images again and again on every run, burning a lot of time and CPU.
A built-in cache with a reasonable expiry behind each repo is really required here. So much of dealing with GitHub Actions seems to be spent working around paradigm/API failures or missing features.
This is a big one, please give us some better options.

dtinth commented on June 1, 2024 14

Hi, I wrote the article mentioned in this thread, which is now outdated and has been superseded by another article from Evil Martians.

TL;DR: Docker has published an official action, docker/build-push-action, which has an option to make Docker Buildx use the GitHub Cache API directly as cache storage, skipping the need for a local-filesystem-based cache such as actions/cache entirely. With these settings my build time dropped from 5 minutes to 1 minute.

          cache-from: type=gha
          cache-to: type=gha,mode=max

Nefcanto commented on June 1, 2024 12

What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?

Example:

      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: user/app:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

Links: https://docs.docker.com/build/ci/github-actions/cache/#github-cache https://github.com/docker/build-push-action

It does not cache pulled images. For us, that is the most time spent in our actions.

rahulbirari-cpi commented on June 1, 2024 9

Would love to see this feature as well in 2022.

potatoqualitee commented on June 1, 2024 7

Would love to see this built-in as well.

peter-evans commented on June 1, 2024 6

Opened PR #37 with an example for Docker layer caching.

mikelax commented on June 1, 2024 6

Hello, I am wondering if this issue has bubbled back up on actions/cache's priority list? Is there a difficult technical challenge blocking this feature? On its surface, it seems like it would both be a huge benefit to users running actions and reduce load on GH infrastructure.

DannyBen commented on June 1, 2024 5

"@steebchen I agree. Docker caching should be natively supported, not through this action."

Seems like a lot of people agree with that statement. Where is the proper place to post such a request? In the runner repo? Perhaps someone in the inner circle should post it as a feature request in the appropriate place and post a link here?

Kurt-von-Laven commented on June 1, 2024 5

"Should we try to use this cache action to cache docker layers, doing trickery with docker save and docker load, or are you working on a different path for Docker caching?"

My company, ScribeMD, faced a similar issue and just released a dirt simple docker-cache action, which caches all Docker images whether built or pulled using said docker save/docker load trickery. Please file an issue if you have any feedback!
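
For readers unfamiliar with the docker save / docker load approach mentioned above, here is a minimal sketch of the general idea using actions/cache directly. It is not the implementation of ScribeMD/docker-cache; the workflow name, the image (postgres:15), the tarball path, and the cache key are all placeholder assumptions.

```yaml
# Sketch only: cache one pulled image as a tarball via docker save / docker load.
# The image name, tarball path, and cache key are placeholder assumptions.
name: docker-image-cache-sketch
on: push

jobs:
  test:
    runs-on: ubuntu-latest
    env:
      IMAGE: postgres:15
    steps:
      - name: Restore image tarball from the cache
        id: image-cache
        uses: actions/cache@v3
        with:
          path: ~/docker-images
          key: ${{ runner.os }}-docker-images-postgres-15

      # On a cache miss, pull the image and write it to the cached directory.
      # The post step of actions/cache uploads the directory when the job succeeds.
      - name: Pull and save the image on a cache miss
        if: steps.image-cache.outputs.cache-hit != 'true'
        run: |
          mkdir -p ~/docker-images
          docker pull "$IMAGE"
          docker save --output ~/docker-images/image.tar "$IMAGE"

      # On a cache hit, load the tarball instead of pulling from the registry.
      - name: Load the image on a cache hit
        if: steps.image-cache.outputs.cache-hit == 'true'
        run: docker load --input ~/docker-images/image.tar
```

Whether this beats a plain docker pull depends on the image size and on how fast the cache restores, so it is worth timing both variants on your own workflow.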

DannyBen commented on June 1, 2024 4

"The cache option on docker/build-push-action works great for building images."

Well, that is not that great in the minimalist's opinion. The syntax is very convoluted and, as you noticed, it only partially works. I hope that some day $ docker build will work on the GitHub Actions runners the same way it works on any other long-running machine, with the docker layer cache built in and mounted automatically, with simple or no config.

piotrkubisa commented on June 1, 2024 3

"@mikelax Would https://github.com/satackey/action-docker-layer-caching work for your case? It's built on top of this action."

The linked action says it uses the following method to take care of caching:

"This GitHub Action uses the docker save / docker load command and the @actions/cache library."

A while ago, @dtinth published an article, Caching Docker builds in GitHub Actions: Which approach is the fastest? 🤔 A research. (https://dev.to/dtinth/caching-docker-builds-in-github-actions-which-approach-is-the-fastest-a-research-18ei). He found that the docker save/load method is far from the best solution. It is worth a look if you are trying to shave off some seconds.

ErikSchierboom commented on June 1, 2024 3

While the setup-buildx suggestion works when you're building your own docker images, AFAICT it doesn't work when you're using docker images as container services. It would be great if this was supported.

dudicoco commented on June 1, 2024 3

What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?

Example:

      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: user/app:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

Links: https://docs.docker.com/build/ci/github-actions/cache/#github-cache https://github.com/docker/build-push-action

This forces you to use the docker action rather than relying on a generic docker caching solution.

rickerp commented on June 1, 2024 2

"What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?"

"It does not cache pulled images. For us, that is the most time spent in our actions."

This 👆

Kurt-von-Laven commented on June 1, 2024 1

No, you didn't make a mistake. Please give docker-cache a try.

joshmgross commented on June 1, 2024

You're welcome to use this for caching docker layers! If you get something figured out we can add it as an example usage of this action.

"are you working on a different path for Docker caching?"

Nothing is in scope right now for v1 of this action that's specific to the Docker scenario, but it might already work.

Related: https://github.com/actions/toolkit/issues/197

crazy-max commented on June 1, 2024

I think I can do something with ghaction-docker-buildx action and use the local --output.

tuler commented on June 1, 2024

I tried, but I hit the file limit discussed in #6.

peaceiris commented on June 1, 2024

@tuler also tried to cache with a tarball in #33. It seems to be a good solution.

#37 was also opened.

hamelsmu commented on June 1, 2024

cc: @neovintage

joshmgross commented on June 1, 2024

@DannyBen See #81 where we're tracking Docker caching of actions

jimgreer commented on June 1, 2024

+1

crazy-max commented on June 1, 2024

Working example for me using buildx here: #260 (comment)

dhadka commented on June 1, 2024

@mikelax Would https://github.com/satackey/action-docker-layer-caching work for your case? It's built on top of this action.

cdhanna commented on June 1, 2024

I would also love to see a solution for this. I've had a few actions spend hundreds of billable minutes trying to pull docker images, and it's costing a significant amount of money. I'd much rather have the docker layers cached.

But when I try to cache the docker folder, I get this warning.
Warning: EACCES: permission denied, scandir '/var/lib/docker'

basicdays commented on June 1, 2024

The setup-buildx line of actions is also a beast to use if you have your test services and container under test set up through docker-compose. There is no obvious way to get the docker-provided actions to work with the cache.

roj1512 commented on June 1, 2024

Need this.

evencheng-invitae commented on June 1, 2024

Upvote

thisismydesign commented on June 1, 2024

The cache option on docker/build-push-action works great for building images.

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-upload-docker-image:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: read
      packages: write
    outputs:
      docker-tag: ${{ steps.meta.outputs.tags }}

    steps:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Log in to the Container registry
        uses: docker/login-action@v1
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@v3
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          push: true
          file: Dockerfile.prod
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: |
            type=registry,ref=${{ steps.meta.outputs.tags }}
            type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:master
          cache-to: type=inline

However, I have a use case for running tests in a dockerized setup.

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    needs: [build-and-upload-docker-image]

    steps:
    - uses: actions/checkout@v2

    - name: Log in to the Container registry
      uses: docker/login-action@v1
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.PACKAGE_ACCESS_TOKEN }}

    - name: Run tests
      run: docker-compose run web yarn test

Here every run pulls dependencies defined in docker-compose (e.g. postgres, redis, etc). Could caching of those be supported natively or by an action?

thisismydesign commented on June 1, 2024

@DannyBen Yeah it's verbose but if it bothers you, create an action to wrap these into a one-liner.

l1553k commented on June 1, 2024

Just leaving an idea, haven't seen it discussed yet:

      - uses: actions/cache@v3
        with:
          path: ~/.cache/docker
          key: ${{ runner.os }}-docker
          restore-keys: |
            ${{ runner.os }}-docker
      - name: Mount docker inside ~/.cache/docker
        run: |
          sudo systemctl stop docker.service
          sudo systemctl stop docker.socket
          # Note: the -g/--graph flag was removed in Docker 23.0; --data-root is its replacement.
          sudo sed -i "s!ExecStart=/usr/bin/dockerd!ExecStart=/usr/bin/dockerd --data-root $HOME/.cache/docker!" /lib/systemd/system/docker.service
          sudo systemctl daemon-reload
          sudo systemctl start docker

...

      - name: Make cache accessible to caching action (put this at the end of yml)
        run: |
          sudo chown $(whoami):$(whoami) -R ~/.cache

It has some problems which I hope someone smarter will know how to solve and share that knowledge:

Post job cleanup.
/usr/bin/tar --posix --use-compress-program zstd -T0 -cf cache.tzst -P -C /home/runner/work/my-repo/my-repo --files-from manifest.txt
/usr/bin/tar: ../../../.cache/docker/overlay2/7aca327c7588ed39f3f42dee1459c9ca3e42d699db7f03c0152212c9b8d347be/work/work: Cannot open: Permission denied
...
/usr/bin/tar: ../../../.cache/docker/overlay2/9873a6171ad3ecbd9d5541d49b8ca221e1d6e39871e356ff07f5ff22715f8959/work/work: Cannot open: Permission denied
/usr/bin/tar: Exiting with failure status due to previous errors
Warning: Tar failed with error: The process '/usr/bin/tar' failed with exit code 2

Soooo... it does not work (as I said, this is an idea, not a solution :) ).

Kurt-von-Laven commented on June 1, 2024

Others and I have run into a similar issue. I am not aware of another way around the permission errors currently besides using docker save and docker load. Running Docker in rootless mode addresses some Docker permission issues, but not these. If someone knows of a better approach, please feel free to send a PR to ScribeMD/docker-cache.

Kurt-von-Laven commented on June 1, 2024

I have since learned of a much faster approach provided you have control over where the Docker image is pulled from. You can simply upload the Docker images you are pulling to the GitHub Container Registry, which spares you the cost of docker save + cache upload on cache miss. For those, like us, who don't have such control, docker-cache is probably your best bet for now.
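
As a rough illustration of the mirroring idea described above, a manually triggered workflow could copy an upstream image into the GitHub Container Registry so that later jobs pull it from there. This is only a sketch under stated assumptions: the workflow name, the upstream image (postgres:15), and the mirror name are placeholders, and the mirror path assumes the repository owner's name is lowercase, since GHCR image names must be lowercase.

```yaml
# Sketch only: mirror an upstream image to GHCR once, then pull from GHCR in CI.
# The upstream image and the mirror name are placeholder assumptions.
name: mirror-image-to-ghcr-sketch
on: workflow_dispatch

jobs:
  mirror:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    env:
      UPSTREAM: postgres:15
      # Assumes a lowercase owner name; GHCR rejects uppercase image names.
      MIRROR: ghcr.io/${{ github.repository_owner }}/postgres:15
    steps:
      - name: Log in to the GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      # Pull from the original registry, retag, and push the copy to GHCR.
      - name: Copy the image to GHCR
        run: |
          docker pull "$UPSTREAM"
          docker tag "$UPSTREAM" "$MIRROR"
          docker push "$MIRROR"
```

Other workflows would then reference the mirrored name instead of the upstream one, which only helps if you control where the image is pulled from.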

EpsilonAlpha commented on June 1, 2024

I ran into the same issue: I tried to cache /var/lib/docker and got this result after cleanup (which means saving the cache, I guess):

Post job cleanup.
Warning: EACCES: permission denied, scandir '/var/lib/docker'

I don't know if I made a mistake in the cache step of the workflow file:

    - name: Cache Docker Layer
      id: cache-docker
      uses: actions/cache@v3
      env:
        cache-name: cache-docker-layer
      with:
        path: /var/lib/docker
        key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/image/overlay2/repositories.json') }}
        restore-keys: |
          ${{ runner.os }}-build-${{ env.cache-name }}-
          ${{ runner.os }}-build-
          ${{ runner.os }}-

EpsilonAlpha commented on June 1, 2024

"No, you didn't make a mistake. Please give docker-cache a try."

Thanks, I wanted to try it out before reporting back. It works well, but not the way I expected:

  • The first run with docker-cache took a whopping 8m6s; I guess it needed to build up the cache. Acceptable, in my opinion, for the initial build.
  • The second run with docker-cache took 1m50s.
  • But for comparison: without any caching method, the workflow (which builds the image) runs in 26s to 35s.

So at this point caching is a disadvantage for me: it drastically increases the runtime of the workflow. Reducing the workflow runtime was my goal in caching the layers.
The description of docker-cache also states: "Note that this action does not perform Docker layer caching."
And that is exactly what I wanted.

So thanks for the alternative, but I will skip caching for now. I will give it another try if the solution evolves.

Kurt-von-Laven commented on June 1, 2024

Apologies; I overlooked the fact that you were looking for Docker layer caching, which indeed is not the intended purpose of docker-cache. As your experiment demonstrated, docker-cache is geared towards those who are pulling images, not building them. I suggest using the official Docker build push action, which performs layer caching and/or hosting your images in the GitHub Container Registry (possibly in addition to anywhere else they might be hosted).

EpsilonAlpha commented on June 1, 2024

No problem, that's why I tested whether it would work for my use case. And thanks for pointing out the Build Push Action and the GitHub Container Registry. Maybe pushing from a GH runner to that registry is faster than pushing to resources outside.

I will also try out whether an image hosted on GHCR can be pulled faster than from Docker Hub. Thanks for the idea 👍🏼 This may also speed up the process if it works.

Kurt-von-Laven commented on June 1, 2024

Yes, credit to Thai Pangsakulyanont for his excellent blog post showing that pulling from GHCR is as fast as pulling from the cache. Since it's a pure docker pull, you don't even need the subsequent docker load step required when restoring from the GitHub Actions cache. Similarly, you can push to GHCR without the preceding docker save step required when saving to the cache, which I expect is the primary reason saving the cache took 8 minutes during your experiment.

If you are building images, you most likely have many Docker images with a lot of overlapping layers between them, which makes docker save extremely inefficient. When you are only pulling, you have only the precise images you pulled, so performance tends to be far better, as you end up pushing far less data.

As with any performance-related work, I recommend that anyone else reading this measure the results in their specific case. To summarize, docker-cache works best for those who are only pulling images from a registry other than GHCR and don't control which registry the images are housed at.

github-actions commented on June 1, 2024

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

lukasz-mitka commented on June 1, 2024

What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?

Example:

      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: user/app:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

Links:
https://docs.docker.com/build/ci/github-actions/cache/#github-cache
https://github.com/docker/build-push-action
