Comments (22)
That would work for non-production workloads, which is the typical use case for the latest tag anyway.
The other option would be to set spec.strategy.type to 'Recreate' in the deployment, which results in some downtime as well but wouldn't require changes in keel.
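For reference, a minimal sketch of that setting (deployment name and image are illustrative, not from this thread):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: Recreate   # all old pods are killed before new ones are created
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myrepo/myapp:latest
```

With Recreate there is no overlap between old and new pods, hence the downtime mentioned above.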
I'm currently trying out a very rough patch where no reset is performed; instead, an ENV variable is set on each container, resulting in a new rc each time. What is your opinion on this? I remember seeing some discussion earlier on the force-policy feature ticket.
from keel.
No, I didn't see it. Yeah, totally forgot that perms are required for deletion. Only pod deletion permissions are required, thanks!
Regarding the quay:
After a simple unit test that pretty much does the same thing as for the Zalando registry, Quay returns an error (every registry wants to be unique). Will get it fixed.
It seems that Kubernetes doesn't manage to destroy existing replicas in time. Could you try scaling down to 1 replica and then trying an update? If that solves the issue, Keel could do it for you. I imagine the workflow could be:
- Set replicas to 1 or 0
- Set tag to 0.0.0
- Set replica count and tag to whatever you had before
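That workflow could be sketched roughly like this (Python; `client`, `scale`, and `set_tag` are hypothetical names for illustration, not Keel's actual API):

```python
def force_update(client, deployment, image, tag, replicas):
    """Force a same-tag update by bouncing the deployment through a
    non-existent dummy tag and a reduced replica count."""
    client.scale(deployment, 0)                  # step 1: set replicas to 0
    client.set_tag(deployment, image, "0.0.0")   # step 2: dummy tag
    client.set_tag(deployment, image, tag)       # step 3: restore the tag...
    client.scale(deployment, replicas)           # ...and the replica count
```

The dummy tag changes the pod spec, so restoring the original tag triggers a fresh rollout even though the tag text ends up unchanged.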
Could work. Another option is to terminate pods; it could be done "slowly" so it's almost like a rolling update. Regarding non-production workloads - I guess it's reasonable to expect that production workloads would be versioned.
Regarding that patch - feel free to open a work in progress PR :)
I tested keel 0.4.7 with GKE server version 1.8, and "force update" does not work for me.
Here is the sequence of events that happened:
1. scheduler assigns the allocation to a node, and replicaset-controller creates a new pod.
2. the image is set to 0.0.0 by keel. Since the image does not exist, it shows `Failed to pull image` and `Error syncing pod`.
3. the node backs off pulling the image
4. the new pod is deleted
The notification I got said the image was updated successfully, yet the pod was not updated at all. (There is only 1 pod.)
Instead of pulling tag 0.0.0, which creates unexpected "fail to pull" events in the cluster (unless that person knows keel very well), we only have to change the replica count from N to 0 and then from 0 to N.
Seems like the k8s scheduler behaviour changed. I think force update should be reimplemented with your suggestion; it looks like a clean approach.
@taylorchu do you have to wait a little bit when you set replicas to 0, or does it terminate pods immediately?
No, I do not set the replica count to 0 myself.
@taylorchu started looking at this issue. One problem with setting replicas to 0 is that the autoscaler would stop working (it has to be unset).
What about terminating all pods? That would result in k8s recreating them. If it was done with some breaks in the process, it could even mean no downtime.
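That pod-termination idea could be sketched like this (Python; `delete_pod` and `wait_until_replaced` are hypothetical callbacks for illustration, not Keel's actual code):

```python
import time

def rolling_terminate(pod_names, delete_pod, wait_until_replaced, pause=5.0):
    """Delete pods one by one, waiting for each replacement before moving on,
    so the restart approximates a rolling update rather than a full outage."""
    for name in pod_names:
        delete_pod(name)           # k8s recreates the pod, pulling the image again
        wait_until_replaced(name)  # block until the replacement pod is ready
        time.sleep(pause)          # breathing room between terminations
```

Because only one pod is down at a time, the remaining replicas keep serving traffic while the cluster cycles through fresh pulls.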
We're just starting to use Keel (on GCP K8s 1.8.7) and are hit with this problem on 0.6.1.
It occurred to me that the only reason Keel has to do this (and b0rks it up, apparently) is because Replication Controllers already explicitly refuse to (as per the K8s docs).
As my 5c, I think emulating the rolling update would be the cleanest way to go.
Also, we're quite happy running (a carefully selected set of) latest-tagged containers in production, and some apps have a gitflow where the master branch is always deemed stable; merges to the branch are only approved once they're production ready, leading again to a valid use case for a stable/latest tag.
Hi, thanks. Will get this sorted ASAP. Do you think my suggested strategy of terminating pods would do the job? A terminated pod will always pull the new version, as I understand it.
Well AFAIK you'd need to set imagePullPolicy: Always on the Deployment, but other than that we'd be perfectly happy with it; it's pretty much what we're doing manually now.
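For reference, a minimal sketch of that setting inside a Deployment's pod template (container name and image are illustrative):

```yaml
spec:
  template:
    spec:
      containers:
        - name: myapp
          image: myrepo/myapp:latest
          imagePullPolicy: Always   # re-pull the image every time the pod starts
```

Note that `imagePullPolicy` is set per container in the pod template, and Kubernetes already defaults it to Always when the image uses the latest tag.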
Awesome, I am a bit swamped by work these days but will try to add and test this strategy either this evening or on the weekend :)
That would be awesome! We'll be more than happy to help you test the changes if you like.
Hi @The-Loeki, just pushed an alpha tag that is built based on #154. Did some testing and it seems to be a reliable way to force updates for same image tags.
It would be nice if you did more testing, as it should also solve that other issue, #153 (I even added a unit test for that specific docker registry :)).
Migrated client-go (which is now split into multiple repos) to release-6.0, which should ensure that everything works for the foreseeable future. There were a bunch of other dependency updates which required more changes (how we parse images), so any additional testing would be really welcome :)
Hi @rusenask thanks for your hard work :) Today we've done the first round of testing on the alpha tag.
The Good
- Zalando works w00t
- We've done two deployment upgrades with 'latest' tags to see how it works and it looks nice for now. We'll be doing a bunch more of those and be sure to let you know!
The Bad
- RBAC permissions need to be fixed to allow deletion of pods.
Question: Do you need to delete replicasets and replicacontrollers as well?
I'll hack up a PR
The Ugly
time="2018-03-09T10:23:29Z" level=debug msg="registry client: getting digest" registry="https://quay.io" repository=coreos/dex tag=v2.9.0
2018/03/09 10:23:30 registry failed ping request, error: Get https://quay.io/v2/: http: non-successful response (status=401 body="{\"error\": \"Invalid bearer token format\"}")
time="2018-03-09T10:23:30Z" level=debug msg="registry.manifest.head url=https://quay.io/v2/coreos/dex/manifests/v2.9.0 repository=coreos/dex reference=v2.9.0"
time="2018-03-09T10:23:30Z" level=info msg="trigger.poll.RepositoryWatcher: new watch repository tags job added" digest="sha256:c9ab4b2f064b8dd3cde614af50d5f1c49d6c45603ce377022c15bc9aa217e2db" image="quay.io/coreos/dex:v2.9.0" job_name=quay.io/coreos/dex schedule="@every 24h"
time="2018-03-09T10:23:37Z" level=debug msg="secrets.defaultGetter.lookupSecrets: pod secrets found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" provider=helm registry=quay.io secrets="[]"
time="2018-03-09T10:23:37Z" level=debug msg="secrets.defaultGetter.lookupSecrets: no secrets for image found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" pods_checked=1 provider=helm registry=quay.io
time="2018-03-09T10:23:37Z" level=debug msg="registry client: getting digest" registry="https://quay.io" repository=jetstack/cert-manager-controller tag=v0.2.3
2018/03/09 10:23:37 registry failed ping request, error: Get https://quay.io/v2/: http: non-successful response (status=401 body="{\"error\": \"Invalid bearer token format\"}")
time="2018-03-09T10:23:37Z" level=debug msg="registry.manifest.head url=https://quay.io/v2/jetstack/cert-manager-controller/manifests/v0.2.3 repository=jetstack/cert-manager-controller reference=v0.2.3"
time="2018-03-09T10:23:37Z" level=info msg="trigger.poll.RepositoryWatcher: new watch repository tags job added" digest="sha256:6bccc03f2e98e34f2b1782d29aed77763e93ea81de96f246ebeb81effd947085" image="quay.io/jetstack/cert-manager-controller:v0.2.3" job_name=quay.io/jetstack/cert-manager-controller schedule="@every 24h"
time="2018-03-09T10:24:35Z" level=debug msg="secrets.defaultGetter.lookupSecrets: pod secrets found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" provider=helm registry=quay.io secrets="[]"
time="2018-03-09T10:24:35Z" level=debug msg="secrets.defaultGetter.lookupSecrets: no secrets for image found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" pods_checked=1 provider=helm registry=quay.io
time="2018-03-09T10:25:30Z" level=debug msg="secrets.defaultGetter.lookupSecrets: pod secrets found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" provider=helm registry=quay.io secrets="[]"
curl -m 5 -Lv -H "Content-Type: application/json" https://quay.io/v2/jetstack/cert-manager-controller/manifests/v0.2.3
of course 'just works'
I'd venture from the logs that it tries to auth against Quay with an empty/nonexistent secret or something, but that's just a guess.
Hi @The-Loeki thanks for trying it out :)
Great regarding the good part.
As for the bad, maybe it's angry about empty credentials (try sending empty basic auth). Not sure what changed though. I will dig into it.
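For reference, a minimal sketch of what "empty basic auth" looks like on the wire (Python; this is an illustration, not Keel's actual registry client code):

```python
import base64

def basic_auth_header(user="", password=""):
    """Build an HTTP Basic auth header; empty credentials encode just ':'."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

With no credentials this produces `Authorization: Basic Og==` (the base64 encoding of ":"), which some registries accept for anonymous access while others reject as a malformed bearer token.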
Did you see my updated comments? I'm hacking up a PR with fixed RBAC perms, but I'm not sure if you need to be able to delete replicasets & controllers too?
We'll be deploying Harbor as our own registry service soon, so you might want to get more coffee ;)
at least it's open source :)
Apparently that error was just a log of a failed ping; the manifest was retrieved successfully. I have removed the Ping function from the registry client, as I can see that the public index.docker.io doesn't have that endpoint anymore either. A new alpha image is available. Merging into the master branch.
Looks much better indeed
[theloeki@murphy ~]$ kubectl -n kube-system logs -f keel-85f9fd6447-4gtt2 |grep quay
time="2018-03-09T12:19:13Z" level=debug msg="registry client: getting digest" registry="https://quay.io" repository=coreos/dex tag=v2.9.0
time="2018-03-09T12:19:13Z" level=debug msg="registry.manifest.head url=https://quay.io/v2/coreos/dex/manifests/v2.9.0 repository=coreos/dex reference=v2.9.0"
time="2018-03-09T12:19:14Z" level=info msg="trigger.poll.RepositoryWatcher: new watch repository tags job added" digest="sha256:c9ab4b2f064b8dd3cde614af50d5f1c49d6c45603ce377022c15bc9aa217e2db" image="quay.io/coreos/dex:v2.9.0" job_name=quay.io/coreos/dex schedule="@every 24h"
time="2018-03-09T12:19:18Z" level=debug msg="secrets.defaultGetter.lookupSecrets: pod secrets found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" provider=helm registry=quay.io secrets="[]"
time="2018-03-09T12:19:18Z" level=debug msg="secrets.defaultGetter.lookupSecrets: no secrets for image found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" pods_checked=1 provider=helm registry=quay.io
time="2018-03-09T12:19:18Z" level=debug msg="registry client: getting digest" registry="https://quay.io" repository=jetstack/cert-manager-controller tag=v0.2.3
time="2018-03-09T12:19:18Z" level=debug msg="registry.manifest.head url=https://quay.io/v2/jetstack/cert-manager-controller/manifests/v0.2.3 repository=jetstack/cert-manager-controller reference=v0.2.3"
time="2018-03-09T12:19:19Z" level=info msg="trigger.poll.RepositoryWatcher: new watch repository tags job added" digest="sha256:6bccc03f2e98e34f2b1782d29aed77763e93ea81de96f246ebeb81effd947085" image="quay.io/jetstack/cert-manager-controller:v0.2.3" job_name=quay.io/jetstack/cert-manager-controller schedule="@every 24h"
time="2018-03-09T12:20:15Z" level=debug msg="secrets.defaultGetter.lookupSecrets: pod secrets found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" provider=helm registry=quay.io secrets="[]"
time="2018-03-09T12:20:15Z" level=debug msg="secrets.defaultGetter.lookupSecrets: no secrets for image found" image=quay.io/jetstack/cert-manager-controller namespace=kube-system pod_selector="app=cert-manager,release=cert-manager" pods_checked=1 provider=helm registry=quay.io
Fixed, available from 0.7.x.