Comments (30)

rohanmars avatar rohanmars commented on July 28, 2024 2

I've implemented the feature. Please test it out on the v1.0.1 or v1 tag/release. I also added a dry-run option if you want to simulate everything first.

from ghcr-cleanup-action.

rohanmars avatar rohanmars commented on July 28, 2024 1

Thanks!
That's a good idea. Let me look into it.

Something like?

    steps:
      - uses: dataaxiom/[email protected]
        with:
          keep-n-tagged: 10
          exclude-tags: dev
          token: ${{secrets.GITHUB_TOKEN}}

ManiMatter avatar ManiMatter commented on July 28, 2024 1

I think your proposal is perfect

ManiMatter avatar ManiMatter commented on July 28, 2024 1

pleasure.

Now it passes, but I see these warnings:

Warning: couldn't find image digest sha256:b25c425cb7b29392a93342a9f9b2d3f3655bf1bba0626d4949ccb29d9a2f2d94 in repository, skipping
Warning: couldn't find image digest sha256:9290ba706a8173b39325236be6e0803cbb0835b7cf4022aa63315a3709628539 in repository, skipping
Warning: couldn't find image digest sha256:45db222b16bd5b922b6cb9e7b08ea6486b9b72cf9d6a321041ec688348114a07 in repository, skipping
Warning: couldn't find image digest sha256:30d7d0a3727d7a82ccfb835281495bf45d96829efc1cb043308917fceab4c71d in repository, skipping
deleting package id: 195021580 digest:sha256:2d49d6d1a3fa6a9188d4f23ce3e4a1b1987ea57aa29dcb97e8fd4c9c64bcfab5 tag:v1.31.0
Warning: couldn't find image digest sha256:bb1dd8101ad4e1d372470cd5451ee20b9d46b7cfc680df682970eaddb8f49b80 in repository, skipping
Warning: couldn't find image digest sha256:f5715c3bb6f7886d201ebc7278e58e674a5eaa3f92fa07dfe9bdb5e0d7819213 in repository, skipping
Warning: couldn't find image digest sha256:1f9ee2ae5b0eb06322458d3e8c506e696f2ddbf72996cd909630071081e3c79a in repository, skipping
Warning: couldn't find image digest sha256:e272e95bbfd56c14318d38d36719b7a7d85055cf0f1c904a0259acb1e15a3c13 in repository, skipping
deleting package id: 195021156 digest:sha256:9273a4198824a8a15d1e0856e56e498a6f6f938116121e756dd9e2148ceb4fac tag:v1.30.0
Warning: couldn't find image digest sha256:7a4ba8b4397c604529532a4c945984e93d66dd1524a1a690b75494d9baf4b0aa in repository, skipping
Warning: couldn't find image digest sha256:7e9b63975b3803fe421b628987fd942f351985496f13355e15bb6911a2984ef0 in repository, skipping
Warning: couldn't find image digest sha256:0a39d1fa1509f0885a9a990a3515f42b69c59e9cef87fe295f0cc5cc82bead8b in repository, skipping
Warning: couldn't find image digest sha256:890a89fca7d0507d484f96d1f742a87d84f180331f149dd491136256e768f5cb in repository, skipping
deleting package id: 197541388 digest:sha256:07fd4058a1c233662ebdbc6c5bbd008fde13e1e9494451197ec96811dbf8e1f3

https://github.com/ManiMatter/decluttarr/actions/runs/9085371392/job/24968536043

rohanmars avatar rohanmars commented on July 28, 2024 1

Let me think about your latest suggestion some more; I'll get back to you. As long as it can support all the cleanup modes, I agree some better validation should be used.

ManiMatter avatar ManiMatter commented on July 28, 2024 1

just wanted to again thank you - fab work. really enjoying the exchange with you here on this :)

rohanmars avatar rohanmars commented on July 28, 2024 1

thanks for being my first external user/tester. it's validated that it's currently needed.

rohanmars avatar rohanmars commented on July 28, 2024 1

Hey
I haven't yet implemented a partial-manifest-removal mode. I feel like it pulls into scope images that you may not want deleted (especially for the modes other than keep-n-tagged), though it may still be a valid use case. So currently you would have to run the validate mode first and then list the invalid images for deletion on the next workflow run.

I did add a few extra features:

  • Removal of ghost images in scope for the keep-n-tagged/keep-n-untagged modes, so they are not included in the count and get removed automatically.
  • I still log invalid images, but they don't show up as workflow warnings.
  • Added the 'validate' mode, which scans all images and prints out any that are only partially valid.
  • Support for using the tags option in keep-n-tagged mode, in addition to exclude-tags. That way you can pull into scope images which you know from a previous validation to be invalid.

I'm planning to build some automated tests next; it's quite a manual process to test all these scenarios. Once that is done, there are other options which could be added: deleting by date, by download count, package restoration, etc.

rohanmars avatar rohanmars commented on July 28, 2024 1

makes me think that maybe the tags option should really be delete-tags instead to be clearer and in line with the other delete- options

ManiMatter avatar ManiMatter commented on July 28, 2024 1

agree to all you said :)

typescript / github actions etc are not my forte at all, and i'm actually studying how you are coding it, and am learning a lot from it. thank you

ManiMatter avatar ManiMatter commented on July 28, 2024

Hey, super cool that you built it so quickly, much appreciated.

I let it run with these settings, and got the following error

Settings:

      - name: "Clean up docker images"
        uses: dataaxiom/ghcr-cleanup-action@v1
        with:
          keep-n-tagged: 10
          exclude-tags: dev
          dry-run: true
          token: ${{secrets.GITHUB_TOKEN}}

Error:

Error: Cannot read properties of undefined (reading 'id')

Job:
https://github.com/ManiMatter/decluttarr/actions/runs/9083315844/job/24961774859

ManiMatter avatar ManiMatter commented on July 28, 2024

Btw - "cherry on top" - does the script print out at the very end how many containers it removed?
Would be cool to know (makes one feel good to know how much crap was removed) 🧹

rohanmars avatar rohanmars commented on July 28, 2024

Yes, it outputs the following kind of log currently:

deleting tagged images, keeping 0 versions
deleting package id: 216128070 digest:sha256:a6d2b38300ce017add71440577d5b0a90460d0e57fd7aec21dd0d1b0761bbfb2 tag:ubuntu
deleting package id: 216128003 digest:sha256:2af372c1e2645779643284c7dc38775e3dbbc417b2d784a27c5a9eb784014fb8 tag:architecture amd64
deleting package id: 216128018 digest:sha256:2f63021dc56651000aa1e250d42c3aa898a5cd61120aeb8daf9e7e0fd20b84e5 tag:architecture arm
deleting package id: 216128036 digest:sha256:462e829de9164b6c066246cddc265a936071744f689f0ea73daa92b4f9feb47e tag:architecture arm64
deleting package id: 216128051 digest:sha256:6250b8edd7248ca0764e8c10069113ac1c837becd6e1e5a92991dfa14dce842f tag:architecture ppc64le
deleting package id: 216128067 digest:sha256:4b24be9d94475438fe8313d8772be9c94e7c89d4e2b2d2a7570dcb3a7f51ee80 tag:architecture s390x

I've just found some further optimizations, which I'll push and release right now. They include a wildcard matching feature for the tags/exclude-tags options.

It doesn't yet print out the total number of images deleted.

ManiMatter avatar ManiMatter commented on July 28, 2024

cheers. will test it when the next version is ready, let me know

rohanmars avatar rohanmars commented on July 28, 2024

I've pushed the latest changes, v1 or v1.0.2.
I've verified the image order and run a bunch of different tests; however, the code equality test is different between keep-n-tagged and keep-n-untagged, which I'm still investigating/testing.

rohanmars avatar rohanmars commented on July 28, 2024

Can you trigger your pipeline again and see if the id error still exists?

rohanmars avatar rohanmars commented on July 28, 2024

I think I can harden the platform image deletion. I think you might have some manifests that don't link to actual images. I'll make the change right now.

ManiMatter avatar ManiMatter commented on July 28, 2024

re-ran the job right now, ended up in the same place:
Error: Cannot read properties of undefined (reading 'id')

https://github.com/ManiMatter/decluttarr/actions/runs/9083315844/job/24964895577

Not sure I understand this one?

I've verified the image order and run a bunch of different tests; however, the code equality test is different between keep-n-tagged and keep-n-untagged, which I'm still investigating/testing.

rohanmars avatar rohanmars commented on July 28, 2024

I'm testing a fix. I've added some extra guards to handle corrupt manifests.

On the image order: I have to sort the remaining "images" once all the processing is done. What I found was that the sort function had to be reversed between the keep-n-tagged and keep-n-untagged logic, which I wouldn't expect, but my test showed it needed it. (Still need to verify it more.)
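As an illustration of why the sort direction matters for a keep-n mode, here is a minimal sketch. It is not the action's actual code: the `PackageVersion` shape, the `updatedAt` field and the function name are assumptions made up for this example.

```typescript
// Hypothetical shape; the action's real data model may differ.
interface PackageVersion {
  id: number;
  tags: string[];
  updatedAt: string; // ISO 8601 timestamp (assumed field)
}

// Return the tagged versions that fall outside the newest `keep` entries.
function selectTaggedForDeletion(
  versions: PackageVersion[],
  keep: number
): PackageVersion[] {
  const tagged = versions.filter((v) => v.tags.length > 0);
  // Sort newest first so that slicing drops the oldest versions.
  // Flipping the comparator would delete the newest versions instead.
  const newestFirst = tagged
    .slice()
    .sort((a, b) => Date.parse(b.updatedAt) - Date.parse(a.updatedAt));
  return newestFirst.slice(keep);
}
```

With the comparator reversed, the same `slice(keep)` would keep the oldest versions, so an unexpected sort order shows up directly as the wrong images being deleted.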

rohanmars avatar rohanmars commented on July 28, 2024

Thanks for the PR!!!!!
I've made some changes on main branch. Can you retest using main?

uses: dataaxiom/ghcr-cleanup-action@main

rohanmars avatar rohanmars commented on July 28, 2024

That's great. Yes, I added those warnings; you have some corrupt image manifests, but all good. It just skips over them. Any way to tell if the action kept the most recent tags?

ManiMatter avatar ManiMatter commented on July 28, 2024

number of multi architecture images deleted = 70
total number of images deleted = 357

nice one! 🥇

From what I see, the following are not removed, which looks good. I see that the next version v1.35.0 is removed, so everything looks right.

  • dev (as being on the "exclude-tags" list)
  1. latest/v1.42.0
  2. v1.41.2
  3. v1.41.1
  4. v1.41.0
  5. v1.40.0
  6. v1.39.0
  7. v1.38.0
  8. v1.37.0
  9. v1.36.0
  10. v1.35.1

Job:
https://github.com/ManiMatter/decluttarr/actions/runs/9085371392/job/24968536043

Images:
https://github.com/ManiMatter/decluttarr/pkgs/container/decluttarr/versions?filters%5Bversion_type%5D=tagged

On the warnings regarding the corrupt image manifests, would these remain there after the job has run?
Or asking differently, will these never be cleaned up?
If there was a way to kill those automatically, that'd be great (else the user sees the same message over and over again)

Warning: couldn't find image digest sha256:b25c425cb7b29392a93342a9f9b2d3f3655bf1bba0626d4949ccb29d9a2f2d94 in repository, skipping

rohanmars avatar rohanmars commented on July 28, 2024

That's great to hear it's all calculating the cleanup correctly. I would expect those warnings to appear on only that job. The image manifests which have corruptions are deleted on that run, so it won't loop through those manifests again in a future job. When new tags come up for deletion in later jobs, it would assess their manifests then, and any errors reported would be new ones.

I'm making a few minor changes (for stats output) and then I'll release 1.0.3 later today and retag v1

Thanks so much for the feature request and testing. It made me fix a bunch of things with the keep-n scenarios.

ManiMatter avatar ManiMatter commented on July 28, 2024

On the warnings regarding the corrupt image manifests, would these remain there after the job has run?
Or asking differently, will these never be cleaned up?
If there was a way to kill those automatically, that'd be great (else the user sees the same message over and over again)

I think I can answer my own question.

Here is what's happening. This github action doesn't properly work with multi-arch images, and kills the untagged images belonging to multi-arch images.

Here's an example:
https://github.com/ManiMatter/decluttarr/pkgs/container/decluttarr/195679913

This image has 3 architectures, but they were deleted by the github action.
Multi Arch Package: sha256:502b0b859a36e03643941232530b4a6ccf3a8a2bca1cb90700207f83b2b784c3

  • linux/amd64: @sha256:b25c425cb7b29392a93342a9f9b2d3f3655bf1bba0626d4949ccb29d9a2f2d94
  • linux/arm64: @sha256:9290ba706a8173b39325236be6e0803cbb0835b7cf4022aa63315a3709628539
  • unknown/unknown: @sha256:45db222b16bd5b922b6cb9e7b08ea6486b9b72cf9d6a321041ec688348114a07

Your script correctly says that the 3 latter ones can't be found (no wonder, they were deleted).
This error goes away if the parent package gets deleted (which it will, if by the settings it is in scope of deletion, for instance as in my case when it's older than 10 versions).

I would have a suggestion though here: Why not add another setting "remove-broken-multi-arch-images" (or something along these lines, or just do the following additional cleanup without even having an explicit setting for it)?
->Remove any multi arch package where the underlying manifests are not found anymore.
My thinking: A multi-arch package that points nowhere is useless... And my take is that many people switching over from the other github action to yours will probably suffer from that problem.

What do you think?

Update: Just seen that you posted this while I wrote:

The image manifests which have corruptions are deleted on that run so it won't loop through that manifest file again in a future job.

Does this mean that you have already baked the removal in (without user having to specifically put a setting), or you are removing them "by chance" because they may fall into scope (e.g. because they are older than 10 versions)?

rohanmars avatar rohanmars commented on July 28, 2024

Yeah, that's exactly what is likely happening. It's the main reason I wrote this version: none of the existing cleanup actions (even the dedicated container ones) supports multi-arch images, which corrupted some of my images, so I disabled those actions.

That's an interesting idea to proactively validate the image manifest. I almost think that should be a different standalone mode producing a report. A multi-arch image may be corrupt but still work for a given platform: if that underlying image still exists, the pull would likely work on a given host. In your case it looked like all the underlying images were gone. This issue did make me think of a scenario where multiple multi-architecture manifests may point to the same image digest. I need to walk through the code for that possibility.

I was thinking this morning that another standalone mode could be a "recovery" mode, whereby you could specify the package ids to recover any packages that were accidentally deleted, directly from the action (unrelated to the issue above). Otherwise you have to jig some web api call manually to recover them.

Does this mean that you have already baked the removal in (without user having to specifically put a setting), or you are removing them "by chance" because they may fall into scope (e.g. because they are older than 10 versions)?

Yes, that's correct for the images that fall into scope on each run. Your idea would proactively validate the 10 other versions in your scenario, but it probably shouldn't touch them.

ManiMatter avatar ManiMatter commented on July 28, 2024

you're right, there could be multi-arch containers that miss some manifests, but not others.

Would this be a valid approach?

        with:
          keep-n-tagged: 10
          exclude-tags: dev
          dry-run: true
          token: ${{ secrets.GITHUB_TOKEN }}
          partial-manifest-removal: true

  1. By default, all multi-arch containers are removed where all underlying manifests are invalid (no setting is needed for that; imo this should be the default behavior of this action)
  2. If partial-manifest-removal is true, multi-arch containers where only some manifests are invalid are removed as well

These two rules are executed before any other rules (for instance, selecting the keep-n-tagged)
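The two rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: `partial-manifest-removal` is the flag proposed in this comment (not an existing option), and the type and function names are made up for the example.

```typescript
// "ghost": none of the referenced platform images exist any more.
// "partial": some, but not all, of them exist.
type ManifestHealth = "valid" | "partial" | "ghost";

// childDigests: digests listed in the multi-arch manifest.
// existingDigests: digests still present in the repository.
function classifyManifest(
  childDigests: string[],
  existingDigests: string[]
): ManifestHealth {
  const found = childDigests.filter(
    (digest) => existingDigests.indexOf(digest) !== -1
  ).length;
  // Assumption: a manifest that lists no children at all counts as valid here.
  if (found === childDigests.length) return "valid";
  return found === 0 ? "ghost" : "partial";
}

// Rule 1: ghosts are always removed.
// Rule 2: partially broken manifests are removed only when the flag is set.
function shouldRemove(
  health: ManifestHealth,
  partialManifestRemoval: boolean
): boolean {
  if (health === "ghost") return true;
  return health === "partial" && partialManifestRemoval;
}
```

Running this classification before any keep-n selection is what would stop broken images from being counted among the versions to keep.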

In the part where the images are deleted, I would then not show the warning that the respective manifest is missing (simply skip that)

Thoughts?

rohanmars avatar rohanmars commented on July 28, 2024

Good suggestions. I agree removing the warning for in-scope deletions is probably a good idea. I'll make that change.

remove multi arch containers that do not have a single valid underlying manifest (e.g. if i have 15 images, and nr. 9 is completely broken, it would take 1-8, 10&11, rather than 1-10 when selecting 10 images to keep)

That's a good idea for fully broken images in scope. Out of scope is a bit more problematic for the other cleanup modes (not keep-n-tagged). Let me review the code.
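The quoted example (15 images, nr. 9 fully broken, keep 10) can be worked through with a small sketch. The `TaggedImage` shape and the function name are hypothetical and only serve this example; the input is assumed to be ordered newest first.

```typescript
// Hypothetical shape for this example only.
interface TaggedImage {
  name: string;
  ghost: boolean; // true when no underlying platform image exists
}

// Walk the images newest first: ghosts never count towards `keep`
// and always end up in the deletion list.
function partitionKeepN(
  newestFirst: TaggedImage[],
  keep: number
): { kept: TaggedImage[]; deleted: TaggedImage[] } {
  const kept: TaggedImage[] = [];
  const deleted: TaggedImage[] = [];
  for (const image of newestFirst) {
    if (!image.ghost && kept.length < keep) {
      kept.push(image);
    } else {
      deleted.push(image);
    }
  }
  return { kept, deleted };
}

// 15 images named "1".."15", where nr. 9 is a ghost.
const images: TaggedImage[] = [];
for (let i = 1; i <= 15; i++) {
  images.push({ name: String(i), ghost: i === 9 });
}
const result = partitionKeepN(images, 10);
// result.kept holds 1-8, 10 and 11; result.deleted holds 9 and 12-15.
```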

spit out a warning for multi arch containers that have partially missing manifests (but some are valid), and keep the container

For out-of-scope processing that might be more challenging, as the other current modes (delete by untagged, delete by tag and keep-n-untagged) operate a bit differently from keep-n-tagged. In my case I'm currently using delete untagged (the default), and expect the total number of images to grow without ever deleting released images. Maybe a "verify" option could be used to enable out-of-scope scanning/reporting, which could then be turned off once the images are 'healthy'.

You are probably right there are a lot of broken manifests out there in ghcr.io considering the lack of multi-arch support, so this kind of functionality would be useful for projects that switch to this action.

ManiMatter avatar ManiMatter commented on July 28, 2024

I've seen this new option:

Ghost Images

Multi architecture images which have no underlying platform packages are
automatically removed for the keep-n-untagged and keep-n-tagged modes and not
included in their count. Partially corrupt images are not removed by default, use
the validate option to be able to identify and then fix them.

with "fix them", does this mean it will remove those partially broken platform packages as we discussed above with the "partial-manifest-removal: true" flag? Or does this simply mean that the logs spit out which packages are broken, so that I can then go in and fix them myself?

Many thanks for clarifying :)

ManiMatter avatar ManiMatter commented on July 28, 2024

I'm impressed by how many features you have already added to this action in so little time!

Based on this statement of yours, my understanding is that packages that are fully broken (no working underlying manifests) are removed already.

Removal of ghost images in scope for (keep-n-tagged/keep-n-untagged modes. So it doesn't include them in the count and they get auto removed

However, I am not sure I understand how the partial-manifest-removal mode may conflict with the different action modes.

I feel like it pulls into scope images that you may not want to be deleted (especially for the other non- keep-n-tagged modes)

I understand there are essentially 4 modes.

  1. Remove all untagged and keep all tagged
  2. Remove n untagged and keep all tagged
  3. Remove all untagged and keep n tagged
  4. Remove specific tags

In all options, I would expect that if partial-manifest-removal is on, packages that are partially broken would be removed in any case. Meaning: Even if they are listed as "excluded", or if there are 10 untagged items of which one is partially broken and we say keep 100, it would still reduce the number to 9. I.e., the "fixing" should supersede everything.

I was thinking first whether something that is "excluded" should also be excluded from partial-manifest-removal. However, if we clearly document that partial-manifest-removal takes priority over exclusion, then I think it's fine that partial-manifest-removal supersedes. If we were to give priority to exclusion over partial-manifest-removal, it would start getting tricky because of wildcard support. If somebody is not OK with exclusion being ignored for partial-manifest-removal, then I guess the user should not use partial-manifest-removal, but instead use the "validate" option and remove these items manually.

What do you think?

I am probably missing an important aspect that you have in mind, when cautioning that the partial-manifest-removal may conflict with the other modes. Could you elaborate please? Maybe we can brainstorm it out together :)

Cheers

rohanmars avatar rohanmars commented on July 28, 2024

I have currently coded it so that exclusion takes priority over everything, and it supports wildcards. I have a preprocessing step that processes all the tags for exclusion. I think that should stay that way: ironically, although it's a cleanup action, it should take a super cautious approach to deletion and cleanup.
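A preprocessing step of this kind might look like the sketch below. It is an assumption-laden illustration, not the action's code: the thread does not state the wildcard syntax, so this example assumes that only `*` is special and matches any run of characters. The function names are made up.

```typescript
// Convert a pattern such as "v1.*" into an anchored regular expression.
// Assumption: only "*" is a wildcard; every other character is literal.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// An image is excluded as soon as any of its tags matches any pattern.
function isExcluded(tags: string[], patterns: string[]): boolean {
  const matchers = patterns.map(wildcardToRegExp);
  return tags.some((tag) => matchers.some((re) => re.test(tag)));
}
```

Anchoring the expression with `^` and `$` keeps a plain entry like `dev` from also excluding `development`, which fits the cautious approach described above.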

Your point about partial-manifest-removal in all cases for the other modes is where it gets tricky. From a user perspective it kind of implies including all images in scope, as you point out. But the untagged and tags modes currently leave all out-of-scope images alone. So it might lead to more user error unless there is a more general approach to deleting.

I would propose changing partial-manifest-removal to delete-partial-manifests. I was thinking yesterday that there may be a number of delete- options which are additive, like delete-older-then, delete-downloads-less-then and delete-partial-manifests, which get processed before the keep and tags modes. In this setup, strangely, mode 1 is really an explicit delete-untagged option, which might be the default option.

So the core logic would be to process excludes, process delete options, then process the keep options.
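A rough sketch of that ordering (excludes, then additive delete options, then keep options) is shown below. Everything in it is hypothetical: the delete options are only proposals in this thread, and the `Image` and `Options` shapes, the field names and `planDeletions` are invented for illustration. Exact-match excludes are used to keep the example short.

```typescript
// Hypothetical shapes; none of these names come from the action itself.
interface Image {
  tag: string;
  ageDays: number;
  partial: boolean; // some underlying platform images are missing
}

interface Options {
  excludeTags: string[];
  deleteOlderThanDays?: number; // stands in for the proposed delete-older-then
  deletePartialManifests?: boolean; // the proposed delete-partial-manifests
  keepNTagged?: number;
}

// Returns the tags to delete, given images ordered newest first.
function planDeletions(newestFirst: Image[], opts: Options): string[] {
  // 1. Excludes win over everything else.
  const candidates = newestFirst.filter(
    (img) => opts.excludeTags.indexOf(img.tag) === -1
  );

  const doomed: string[] = [];
  const mark = (tag: string): void => {
    if (doomed.indexOf(tag) === -1) doomed.push(tag);
  };

  // 2. Additive delete options.
  for (const img of candidates) {
    const tooOld =
      opts.deleteOlderThanDays !== undefined &&
      img.ageDays > opts.deleteOlderThanDays;
    const broken = opts.deletePartialManifests === true && img.partial;
    if (tooOld || broken) mark(img.tag);
  }

  // 3. Keep options apply to whatever survived step 2.
  if (opts.keepNTagged !== undefined) {
    const remaining = candidates.filter(
      (img) => doomed.indexOf(img.tag) === -1
    );
    remaining.slice(opts.keepNTagged).forEach((img) => mark(img.tag));
  }
  return doomed;
}
```

Because the excluded images are filtered out first, neither a delete option nor a keep count can ever pull them back into scope.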

Which then got me thinking the project needs automated testing asap. So that these other functions can be added easily and without breaking all the other modes. haha
