Comments (7)
QA Template
Solution
Added Security Context to the cleanup job in #1862
Testing
Install Rancher 2.7.5 on a hardened RKE2 1.24 cluster (see the issue for more information on the environment).
Upgrading to the latest Rancher should not produce any error in the fleet-cleanup-clusterregistrations job.
Additional info
Needs a new fleet RC
from fleet.
QA report
Testing considerations:
For hardening, I followed the steps detailed in this guide with a few adjustments.
Because PSP is no longer available after Kubernetes 1.24, testing was done with PSA on Kubernetes versions above 1.25.
Tested scenarios:
Scenario 1: Fresh Rancher installation on a hardened RKE2 cluster; CIS scan completes with no errors.
Setup:
- RKE2 version: v1.28.9+rke2r1
- Hardening parameters used: cis and psa.yaml
- Configured default Service account
- Fresh installation of Rancher v2.9-6d87a11ea46b7571646d7c3d7af704584c39fd62-head
- Confirmed successful Rancher deployment
- Deployed CIS benchmark rke2-cis-1.8-profile-hardened with 71 passes, 0 errors and 48 warnings
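For readers who want to reproduce the setup, the "cis and psa.yaml" hardening parameters could be expressed in the RKE2 server configuration roughly as follows. This is a minimal sketch and not the configuration actually used in the test: the file locations are assumptions, and the contents of psa.yaml are not shown in this report.

```yaml
# Sketch of /etc/rancher/rke2/config.yaml (assumed location).
# Values are illustrative assumptions, not copied from the tested cluster.

# Enable the CIS hardening profile named in the setup list.
profile: cis

# Point the API server at a Pod Security Admission configuration file.
# The path is an assumption; the report only names the file "psa.yaml".
pod-security-admission-config-file: /etc/rancher/rke2/psa.yaml
```

The RKE2 service has to be restarted for changes to this file to take effect.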
Scenario 2: Rancher 2.8 installed on a hardened RKE2 cluster; CIS scan completes with no errors. Rancher is then upgraded to 2.9 and a new CIS scan again completes with no errors.
Setup:
- RKE2 version: v1.26.15+rke2r1
- Hardening parameters used: cis and psa.yaml
- Configured default Service account
- Fresh installation of Rancher v2.8-ec76f714a7d22be1d4266cf5385f0aef62a9a653-head
- Confirmed successful Rancher deployment
- Deployed CIS benchmark rke2-cis-1.8-profile-hardened with 71 passes, 0 errors and 48 warnings
- Upgraded to Rancher v2.9-6d87a11ea46b7571646d7c3d7af704584c39fd62-head
- Deployed new CIS benchmark rke2-cis-1.8-profile-hardened, again with 71 passes, 0 errors and 48 warnings
- Checked that the fleet-cleanup-clusterregistrations job did not throw any error
from fleet.
Successfully upgraded from Rancher 2.7.5 to Rancher 2.7.6 without any errors. Output of the verification commands:
satya@opensuse15:~> kubectl get job -A
NAMESPACE NAME COMPLETIONS DURATION AGE
fleet-default test-fleet-examples-2ff03 1/1 21s 11m
kube-system helm-install-traefik 1/1 27s 45m
kube-system helm-install-traefik-crd 1/1 24s 45m
satya@opensuse15:~> kubectl get pods -n cattle-system
NAME READY STATUS RESTARTS AGE
helm-operation-czs7c 0/2 Completed 0 15m
helm-operation-f5k74 0/2 Completed 0 45m
helm-operation-grfgr 0/2 Completed 0 14m
helm-operation-hpcnv 0/2 Completed 0 45m
helm-operation-nlj28 0/2 Completed 0 44m
helm-operation-qz66m 0/2 Completed 0 45m
helm-operation-vxvzc 0/2 Completed 0 45m
rancher-569b86c8f5-7s8kn 1/1 Running 0 17m
rancher-webhook-788c48b988-rlg6d 1/1 Running 0 45m
Also, Fleet upgraded from 0.7.0 to 0.7.1 without any errors.
Please let me know if anything else needs to be checked.
from fleet.
I have been able to reproduce the issue, but only in the case when the cis-1.6 profile is enabled in the underlying RKE2 Kubernetes cluster.
Helm operation pods are continuously failing:
root@ip-172-31-26-45:/etc/rancher/rke2# kubectl get pod -n cattle-system
NAME READY STATUS RESTARTS AGE
helm-operation-2klh9 1/2 Error 0 28m
helm-operation-2n6nn 1/2 Error 0 52m
helm-operation-2vqfh 1/2 Error 0 54m
helm-operation-4xskv 1/2 Error 0 47m
helm-operation-55qhj 1/2 Error 0 7m55s
helm-operation-5mf85 1/2 Error 0 3m57s
helm-operation-5rplt 2/2 Running 0 3m52s
helm-operation-78gpf 1/2 Error 0 23m
helm-operation-8jcww 1/2 Error 0 52m
helm-operation-8ztt4 1/2 Error 0 43m
helm-operation-97l5p 1/2 Error 0 52m
helm-operation-b5sfs 1/2 Error 0 22m
helm-operation-cndwz 0/2 Completed 0 52m
helm-operation-dww9d 1/2 Error 0 52m
helm-operation-gc7cs 1/2 Error 0 8m57s
helm-operation-gg29f 1/2 Error 0 33m
helm-operation-h7wsk 1/2 Error 0 18m
helm-operation-hhvqt 1/2 Error 0 18m
helm-operation-j26v7 1/2 Error 0 37m
helm-operation-jsk4k 1/2 Error 0 51m
helm-operation-nv7zz 0/2 Completed 0 53m
helm-operation-qrqsr 1/2 Error 0 13m
helm-operation-qt7bf 1/2 Error 0 52m
helm-operation-r82vk 1/2 Error 0 46m
helm-operation-r9r8w 1/2 Error 0 28m
helm-operation-s66gv 1/2 Error 0 13m
helm-operation-v687s 1/2 Error 0 51m
helm-operation-z7q6c 1/2 Error 0 43m
helm-operation-zd8t5 1/2 Error 0 38m
helm-operation-zr559 1/2 Error 0 51m
helm-operation-zs9w2 1/2 Error 0 33m
rancher-6bf9cd485c-8d5fg 1/1 Running 1 (55m ago) 56m
rancher-6bf9cd485c-qd79p 1/1 Running 0 56m
rancher-6bf9cd485c-rjphw 1/1 Running 0 56m
rancher-webhook-998454b77-ghsgd 1/1 Running 0 52m
This is due to the failure of the fleet-cleanup-clusterregistrations pod:
root@ip-172-31-26-45:/etc/rancher/rke2# kubectl get pod -n cattle-fleet-system
NAME READY STATUS RESTARTS AGE
fleet-cleanup-clusterregistrations-vsbhq 0/1 CreateContainerConfigError 0 8m26s
fleet-controller-64f5b4585-shjjb 1/1 Running 0 52m
gitjob-58dc7cb797-wr28d 1/1 Running 0 52m
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9m33s default-scheduler Successfully assigned cattle-fleet-system/fleet-cleanup-clusterregistrations-vsbhq to ip-172-31-26-45
Warning Failed 7m39s (x12 over 9m33s) kubelet Error: container has runAsNonRoot and image will run as root (pod: "fleet-cleanup-clusterregistrations-vsbhq_cattle-fleet-system(d32d3d68-6bbf-4783-9ecc-d6200284b411)", container: cleanup)
Normal Pulled 4m32s (x26 over 9m33s) kubelet Container image "rancher/fleet-agent:v0.8.0" already present on machine
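The kubelet raises this error when a container is required to run as non-root (runAsNonRoot: true), no numeric runAsUser is set, and the image's default user is root. On a hardened cluster the non-root requirement is typically injected by the restrictive PSP. The sketch below shows one way a Job could satisfy that check. It is illustrative only and is not the manifest from the Fleet chart or from PR #1862; the UID/GID value 1000 and the extra container hardening fields are assumptions.

```yaml
# Hypothetical sketch, NOT the actual Fleet chart manifest.
apiVersion: batch/v1
kind: Job
metadata:
  name: fleet-cleanup-clusterregistrations
  namespace: cattle-fleet-system
spec:
  template:
    spec:
      restartPolicy: Never
      securityContext:
        # Declaring an explicit non-zero UID lets the kubelet verify the
        # non-root requirement without inspecting the image's default user.
        runAsNonRoot: true
        runAsUser: 1000    # assumed UID
        runAsGroup: 1000   # assumed GID
      containers:
        - name: cleanup
          image: rancher/fleet-agent:v0.8.0
          securityContext:
            # Additional hardening; assumed, not confirmed by the source.
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
```

The image must be able to run as the chosen UID; if its entrypoint needs root-owned paths, the container would start and then fail for a different reason.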
from fleet.
Also hitting this issue!
Rancher 2.7.7 and RKE2 1.24.x w/CIS Profile 1.6 enabled
from fleet.
This appears to be caused by a missing PSP on hardened clusters that use a CIS profile and run Kubernetes <=1.24. On 1.25+ the entire cattle-fleet-system namespace is exempted from PSA.
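For context, a namespace exemption on 1.25+ is declared in the Pod Security Admission configuration passed to the API server. The fragment below is a minimal sketch of what such an exemption looks like; the default levels shown are assumptions, and a real hardened configuration usually exempts many more namespaces than the single one listed here.

```yaml
# Sketch of a PSA configuration file; not the file shipped with Rancher or RKE2.
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      defaults:
        # Assumed defaults for a hardened cluster.
        enforce: "restricted"
        enforce-version: "latest"
        audit: "restricted"
        audit-version: "latest"
        warn: "restricted"
        warn-version: "latest"
      exemptions:
        usernames: []
        runtimeClasses: []
        # Pods in exempted namespaces are not evaluated against the defaults.
        namespaces:
          - cattle-fleet-system
```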
A quick workaround is to just bind the unrestricted PSP to the service account:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: unrestricted-psp
  namespace: cattle-fleet-system
rules:
- apiGroups:
  - extensions
  resourceNames:
  - system-unrestricted-psp
  resources:
  - podsecuritypolicies
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: unrestricted-psp
  namespace: cattle-fleet-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: unrestricted-psp
subjects:
- kind: ServiceAccount
  name: fleet-controller
Once the Rancher charts are fixed, the Role and RoleBinding can be deleted.
from fleet.
PR #1862 needs a backport to v0.9
from fleet.