cruise-automation / k-rail
Kubernetes security tool for policy enforcement
License: Apache License 2.0
Creating a deployment via the Kubernetes Terraform provider causes k-rail to reject a deployment that has share_process_namespace set to false. The k-rail version is release-v1.5.0.
Example Deployment:
resource "kubernetes_deployment" "deployment" {
metadata {
labels = {
name = "name"
}
name = "name"
namespace = "namespace"
}
spec {
replicas = 1
template {
metadata {
labels = {
app = "name"
}
}
spec {
container {
image = "image"
name = "name"
image_pull_policy = "Always"
}
share_process_namespace = false # also tried setting this to null.
}
}
}
}
Deployment deployment had violation: No ShareProcessNamespace: sharing the process namespace among containers in a Pod is forbidden.
Looking at the policy, it seems to simply check if the value is present in the YAML and reject on that basis, rather than looking at the actual value. Is this correct?
I am trying to install k-rail 3.4.2 by following the steps below, and the installation fails:
Steps to reproduce
helm repo add k-rail https://cruise-automation.github.io/k-rail/
helm repo update
kubectl create namespace k-rail
kubectl label namespace k-rail k-rail/ignore=true
helm install k-rail k-rail/k-rail --namespace k-rail
The version being installed is 3.4.2
Expected results
k-rail installed
Actual results
helm install --debug k-rail k-rail/k-rail --namespace k-rail
install.go:173: [debug] Original chart version: ""
Error: failed to fetch https://github.com/cruise-automation/k-rail/releases/download/k-rail-v3.4.2/k-rail-v3.4.2.tgz : 404 Not Found
helm.go:81: [debug] failed to fetch https://github.com/cruise-automation/k-rail/releases/download/k-rail-v3.4.2/k-rail-v3.4.2.tgz : 404 Not Found
helm.sh/helm/v3/pkg/getter.(*HTTPGetter).get
helm.sh/helm/v3/pkg/getter/httpgetter.go:90
helm.sh/helm/v3/pkg/getter.(*HTTPGetter).Get
helm.sh/helm/v3/pkg/getter/httpgetter.go:42
helm.sh/helm/v3/pkg/downloader.(*ChartDownloader).DownloadTo
helm.sh/helm/v3/pkg/downloader/chart_downloader.go:99
helm.sh/helm/v3/pkg/action.(*ChartPathOptions).LocateChart
helm.sh/helm/v3/pkg/action/install.go:704
main.runInstall
helm.sh/helm/v3/cmd/helm/install.go:185
main.newInstallCmd.func2
helm.sh/helm/v3/cmd/helm/install.go:120
github.com/spf13/cobra.(*Command).execute
github.com/spf13/[email protected]/command.go:852
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/[email protected]/command.go:960
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/[email protected]/command.go:897
main.main
helm.sh/helm/v3/cmd/helm/helm.go:80
runtime.main
runtime/proc.go:225
runtime.goexit
runtime/asm_amd64.s:1371
I checked the releases: up to 3.4.1 there is a k-rail-3.4.1 release asset, but there is none for 3.4.2.
Add a policy that evicts tainted pods after some configurable period has elapsed
https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/#the-eviction-api
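A rough sketch of the eviction sweep such a policy could run, with an assumed taint annotation and a pluggable evict callback standing in for the Eviction API (all names here are illustrative, not k-rail's actual interface):

package main

import (
    "fmt"
    "time"
)

// taintAnnotation is a hypothetical annotation recording when a pod was tainted.
const taintAnnotation = "k-rail.cruise-automation.github.com/taint/exec"

type pod struct {
    Name        string
    Annotations map[string]string
}

// evictExpired calls evict for every pod whose taint timestamp is older than ttl.
// In a real implementation evict would call the Eviction API linked above.
func evictExpired(pods []pod, ttl time.Duration, now time.Time, evict func(pod)) {
    for _, p := range pods {
        ts, ok := p.Annotations[taintAnnotation]
        if !ok {
            continue
        }
        tainted, err := time.Parse(time.RFC3339, ts)
        if err != nil {
            continue
        }
        if now.Sub(tainted) > ttl {
            evict(p)
        }
    }
}

func main() {
    pods := []pod{{Name: "web-1", Annotations: map[string]string{taintAnnotation: "2020-01-01T00:00:00Z"}}}
    evictExpired(pods, time.Hour, time.Now(), func(p pod) { fmt.Println("evicting", p.Name) })
}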
Hello!
The k-rail No Root User policy allows me to run a Pod only if runAsNonRoot: true is specified in both the Pod's and the container's securityContext at the same time.
Is this correct behavior, or should I be able to run a pod with runAsNonRoot: true set only in the PodSecurityContext?
Thanks in advance.
This pod is not getting deleted after its deployment is deleted. Even manual deletion gives the error below.
Error from server (InternalError): Internal error occurred: admission webhook "k-rail.cruise-automation.github.com" attempted to modify the object, which is not supported for this operation
I have added an exemption but it's still not working.
kubernetes version: v1.20
Prevent overloading etcd in between compactions
Hey guys!
We're relying on pod_trusted_repository to enforce usage of our local images on all pods across many namespaces. Sometimes we'll deploy a fresh helm chart and find that k-rail blocks an initContainer (or something we weren't able to override) due to pod_trusted_repository (which is good!)
Finding out which image the pod attempted to use is hard, though; some cases require us to make a policy exemption for the matching workload in order to let the daemonset controller create the replicaset, so that we can then interrogate it and reverse-engineer the necessary image (which is bad).
It'd be really helpful if the k-rail log output from pod_trusted_repository policy enforcement could include the details of the image which was blocked.
Thanks :)
D
Could the helm chart releases be uploaded to a helm repo?
(We run chartmuseum on https://run.pivotal.io + behind the scenes host the charts on s3 via https://github.com/starkandwayne/chartmuseum-for-cloudfoundry)
Hey guys!
This pod was created under k-rail v2.0.1:
<snip>
Volumes:
userfunc:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
And this one was created under v1.5.0:
<snip>
Volumes:
userfunc:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: 512Mi
It would seem as if the policy which enables mutating of pods to enforce EmptyDir limits ... is not enforcing :)
The relevant portion of the config (applied via helm chart) has not changed:
<snip>
- enabled: true
  name: pod_empty_dir_size_limit
  report_only: false
<snip>
policy_config:
  mutate_empty_dir_size_limit:
    default_size_limit: 512Mi
    maximum_size_limit: 1Gi
I couldn't see any obvious recent changes around this.
Thanks!
D
Hi,
I just read about k-rail yesterday and decided to take it for a test run. After installation (following the steps in the GitHub page), I then tried verifying the installation by deploying the non-compliant-deployment.yaml. However, I ran into this error:
$ k create -f non-compliant-deployment.yaml
Error from server (InternalError): error when creating "non-compliant-deployment.yaml": Internal error occurred: failed calling webhook "k-rail.cruise-automation.github.com": Post "https://k-rail.k-rail.svc:443/?timeout=1s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
After some investigation, it appears that this is due to Kubernetes 1.19 being compiled with Golang 1.15 and that starting from Golang 1.15, certificates have to populate the SANs section, something which Helm doesn't currently do. They mention using an environment variable to temporarily disable this check but it looks like even this workaround will be disabled in Golang 1.16
I verified that my build uses Golang 1.15:
$ k version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"clean", BuildDate:"2020-11-11T13:17:17Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-14T07:30:52Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}
Some related links:
helm/helm#9046
kubesphere/kubesphere#2928
As a temporary workaround, I added the following to my api-server's manifest:
env:
- name: GODEBUG
  value: x509ignoreCN=0
and it now works again. However, this is still a temporary workaround, and it looks like it won't work in future versions of Kubernetes.
Hey guys,
I just tried to upgrade my helm-based installation from v1.0 to v1.1.
The pods crash with:
{"level":"fatal","msg":"invalid log level set: \u003cnil\u003e","time":"2019-11-13T03:45:45Z"}
The helm chart's values.yaml includes the following:
config:
  log_level: info
If I manually edit the configmap that the helm chart creates and remove the log_level line, then the pods run successfully.
If I remove log_level from values.yaml, the helm chart just inserts a default into the configmap anyway, and we're back to crashlooping.
Thus far, my only workaround has been to downgrade to v1.0
Cheers!
D
When operating clusters for tenants, it may be desirable to enforce a default namespace NetworkPolicy.
A useful default NetworkPolicy would be one that prevents traffic ingress from outside of the namespace (and the Ingress controller).
In this scenario, preventing modifications to this default NetworkPolicy would ensure that tenants add additional NetworkPolicies if they need to allow additional ingress into their namespace.
Version:
cruise/k-rail:release-v1.3.1
To Reproduce
on a fresh k-rail rollout, re-configure the configMap k-rail-exemptions, like:
...
data:
  config.yml: |
    - exempt_policies:
      - '*'
      group: '*'
      namespace: kube-system
      resource_name: '*'
      username: '*'
    - exempt_policies:
      - 'pod_no_exec'
      group: '*'
      namespace: test
      resource_name: '*'
      username: '*'
...
Start a pod in the test namespace, e.g. with a busybox image, and exec into the container
Expected behavior
kubectl -n test exec should work as normal
Actual behavior
error is raised:
kubectl -n test exec -t -i test-exec -- sh
Error from server (InternalError): Internal error occurred: add operation does not apply: doc is missing path: "/metadata/annotations": missing value
AdmissionReview:
{
  "kind": "AdmissionReview",
  "apiVersion": "admission.k8s.io/v1beta1",
  "request": {
    "uid": "3bd81ba7-f13e-4518-ab8b-fc8d0b350589",
    "kind": {
      "group": "",
      "version": "v1",
      "kind": "PodExecOptions"
    },
    "resource": {
      "group": "",
      "version": "v1",
      "resource": "pods"
    },
    "subResource": "exec",
    "requestKind": {
      "group": "",
      "version": "v1",
      "kind": "PodExecOptions"
    },
    "requestResource": {
      "group": "",
      "version": "v1",
      "resource": "pods"
    },
    "requestSubResource": "exec",
    "name": "test-exec",
    "namespace": "test",
    "operation": "CONNECT",
    "userInfo": {
      "username": "admin",
      "uid": "admin",
      "groups": [
        "system:masters",
        "system:authenticated"
      ]
    },
    "object": {
      "kind": "PodExecOptions",
      "apiVersion": "v1",
      "stdin": true,
      "stdout": true,
      "tty": true,
      "container": "test-exec",
      "command": [
        "sh"
      ]
    },
    "oldObject": null,
    "dryRun": false,
    "options": null
  }
}
Thanks,
Thorsten
I've been trying to configure k-rail for a deployment which includes kube-prometheus-stack. I've gotten most of the components working with a few exemptions. Unfortunately, it seems that the exemption system does not apply to DaemonSet objects correctly, so my deployment cannot complete.
Kubernetes version: 1.19
k-rail version: v3.5.1
# Debug exemption for kube-prometheus-stack
- resource_name: "*"
namespace: "prometheus"
username: "*"
group: "*"
exempt_policies: ["*"]
DaemonSet manifest:
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app: prometheus-node-exporter
app.kubernetes.io/instance: kube-prometheus-stack
chart: prometheus-node-exporter-2.0.4
heritage: Helm
jobLabel: node-exporter
release: kube-prometheus-stack
name: kube-prometheus-stack-prometheus-node-exporter
namespace: prometheus
spec:
selector:
matchLabels:
app: prometheus-node-exporter
release: kube-prometheus-stack
template:
metadata:
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: 'true'
labels:
app: prometheus-node-exporter
chart: prometheus-node-exporter-2.0.4
heritage: Helm
jobLabel: node-exporter
release: kube-prometheus-stack
spec:
automountServiceAccountToken: false
containers:
- args:
- '--path.procfs=/host/proc'
- '--path.sysfs=/host/sys'
- '--path.rootfs=/host/root'
- '--web.listen-address=$(HOST_IP):9100'
- >-
--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
- >-
--collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
env:
- name: HOST_IP
value: 0.0.0.0
image: >-
[REDACTED]/prometheus/node-exporter:v1.2.2@sha256:a990408ed288669bbad5b5b374fe1584e54825cde4a911c1a3d6301a907a030c
imagePullPolicy: IfNotPresent
livenessProbe:
httpGet:
path: /
port: 9100
name: node-exporter
ports:
- containerPort: 9100
name: metrics
protocol: TCP
readinessProbe:
httpGet:
path: /
port: 9100
resources: {}
volumeMounts:
- mountPath: /host/proc
name: proc
readOnly: true
- mountPath: /host/sys
name: sys
readOnly: true
- mountPath: /host/root
mountPropagation: HostToContainer
name: root
readOnly: true
hostNetwork: true
hostPID: true
securityContext:
fsGroup: 65534
runAsGroup: 65534
runAsNonRoot: true
runAsUser: 65534
serviceAccountName: kube-prometheus-stack-prometheus-node-exporter
tolerations:
- effect: NoSchedule
operator: Exists
volumes:
- hostPath:
path: /proc
name: proc
- hostPath:
path: /sys
name: sys
- hostPath:
path: /
name: root
updateStrategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
And here is the error passed back from k-rail:
admission webhook "k-rail.cruise-automation.github.com" denied the request: DaemonSet kube-prometheus-stack-prometheus-node-exporter had violation: Host Bind Mounts: host bind mounts are forbidden DaemonSet kube-prometheus-stack-prometheus-node-exporter had violation: Host Bind Mounts: host bind mounts are forbidden DaemonSet kube-prometheus-stack-prometheus-node-exporter had violation: Host Bind Mounts: host bind mounts are forbidden DaemonSet kube-prometheus-stack-prometheus-node-exporter had violation: No Root user: Container node-exporter can run as the root user which is forbidden DaemonSet kube-prometheus-stack-prometheus-node-exporter had violation: No Root user: Container node-exporter can run as the root user which is forbidden DaemonSet kube-prometheus-stack-prometheus-node-exporter had violation: No Host Network: Using the host network is forbidden DaemonSet kube-prometheus-stack-prometheus-node-exporter had violation: No Host PID: Using the host PID namespace is forbidden
Following on from #23, I just tried to create a daemonset on v1.0-release, and I'm afraid the original problem (can't create exemptions for daemonsets because their pods have no resource name) persists. Here's the config I fed to k-rail:
- resource_name: "istio-cni-node"
namespace: "istio-system"
username: "*"
group: "*"
exempt_policies: ["pod_no_host_network"]
And here's the debug output when trying to deploy an istio-cni-node daemonset:
{
"enforced": true,
"kind": "Pod",
"level": "warning",
"msg": "ENFORCED",
"namespace": "istio-system",
"policy": "pod_no_host_network",
"resource": "",
"time": "2019-11-11T09:03:43Z",
"user": "system:serviceaccount:kube-system:daemon-set-controller"
}
{
"kind": "AdmissionReview",
"apiVersion": "admission.k8s.io/v1beta1",
"request": {
"uid": "e6a3f261-3121-437e-b34e-f537245f79de",
"kind": {
"group": "",
"version": "v1",
"kind": "Pod"
},
"resource": {
"group": "",
"version": "v1",
"resource": "pods"
},
"requestKind": {
"group": "",
"version": "v1",
"kind": "Pod"
},
"requestResource": {
"group": "",
"version": "v1",
"resource": "pods"
},
"namespace": "istio-system",
"operation": "CREATE",
"userInfo": {
"username": "system:serviceaccount:kube-system:daemon-set-controller",
"uid": "81af81e6-f027-48da-ac0a-70358b49a5cc",
"groups": [
"system:serviceaccounts",
"system:serviceaccounts:kube-system",
"system:authenticated"
]
},
"object": {
"kind": "Pod",
"apiVersion": "v1",
"metadata": {
"generateName": "istio-cni-node-",
"creationTimestamp": null,
"labels": {
"controller-revision-hash": "76db549475",
"k8s-app": "istio-cni-node",
"pod-template-generation": "1"
},
"annotations": {
"kubernetes.io/psp": "istio-cni-node",
"scheduler.alpha.kubernetes.io/critical-pod": "",
"seccomp.security.alpha.kubernetes.io/pod": "runtime/default",
"sidecar.istio.io/inject": "false"
},
"ownerReferences": [
{
"apiVersion": "apps/v1",
"kind": "DaemonSet",
"name": "istio-cni-node",
"uid": "d6c11e2a-c5eb-4569-bb77-8ee35a83b169",
"controller": true,
"blockOwnerDeletion": true
}
]
},
"spec": {
"volumes": [
{
"name": "cni-bin-dir",
"hostPath": {
"path": "/opt/cni/bin",
"type": ""
}
},
{
"name": "cni-net-dir",
"hostPath": {
"path": "/etc/cni/net.d",
"type": ""
}
},
{
"name": "istio-cni-token-s6fzj",
"secret": {
"secretName": "istio-cni-token-s6fzj"
}
}
],
"containers": [
{
"name": "install-cni",
"image": "registry-internal.elpenguino.net/library/istio-install-cni:1.3.3",
"command": [
"/install-cni.sh"
],
"env": [
{
"name": "CNI_NETWORK_CONFIG",
"valueFrom": {
"configMapKeyRef": {
"name": "istio-cni-config",
"key": "cni_network_config"
}
}
}
],
"resources": {},
"volumeMounts": [
{
"name": "cni-bin-dir",
"mountPath": "/host/opt/cni/bin"
},
{
"name": "cni-net-dir",
"mountPath": "/host/etc/cni/net.d"
},
{
"name": "istio-cni-token-s6fzj",
"readOnly": true,
"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent",
"securityContext": {
"capabilities": {
"drop": [
"ALL"
]
},
"allowPrivilegeEscalation": false
}
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 5,
"dnsPolicy": "ClusterFirst",
"nodeSelector": {
"beta.kubernetes.io/os": "linux"
},
"serviceAccountName": "istio-cni",
"serviceAccount": "istio-cni",
"hostNetwork": true,
"securityContext": {},
"affinity": {
"nodeAffinity": {
"requiredDuringSchedulingIgnoredDuringExecution": {
"nodeSelectorTerms": [
{
"matchFields": [
{
"key": "metadata.name",
"operator": "In",
"values": [
"wn1.kube-cluster.local"
]
}
]
}
]
}
}
},
"schedulerName": "default-scheduler",
"tolerations": [
{
"operator": "Exists",
"effect": "NoSchedule"
},
{
"operator": "Exists",
"effect": "NoExecute"
},
{
"key": "CriticalAddonsOnly",
"operator": "Exists"
},
{
"key": "node.kubernetes.io/not-ready",
"operator": "Exists",
"effect": "NoExecute"
},
{
"key": "node.kubernetes.io/unreachable",
"operator": "Exists",
"effect": "NoExecute"
},
{
"key": "node.kubernetes.io/disk-pressure",
"operator": "Exists",
"effect": "NoSchedule"
},
{
"key": "node.kubernetes.io/memory-pressure",
"operator": "Exists",
"effect": "NoSchedule"
},
{
"key": "node.kubernetes.io/pid-pressure",
"operator": "Exists",
"effect": "NoSchedule"
},
{
"key": "node.kubernetes.io/unschedulable",
"operator": "Exists",
"effect": "NoSchedule"
},
{
"key": "node.kubernetes.io/network-unavailable",
"operator": "Exists",
"effect": "NoSchedule"
}
],
"priority": 0,
"enableServiceLinks": true
},
"status": {}
},
"oldObject": null,
"dryRun": false,
"options": {
"kind": "CreateOptions",
"apiVersion": "meta.k8s.io/v1"
}
}
}
When a Pod has been exec'd into, an annotation should be added to the Pod to indicate that.
Maybe k-rail.cruise-automation.github.com/taint/exec: <timestamp>
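A sketch of the JSONPatch a mutating webhook might return for this, assuming the annotation key above. Note the extra op that creates the annotations map when it's absent; compare the "/metadata/annotations": missing value error reported in another issue on this page. The ~1 sequences escape the / characters in the key, per JSON Pointer.

package main

import (
    "encoding/json"
    "fmt"
    "time"
)

type patchOp struct {
    Op    string      `json:"op"`
    Path  string      `json:"path"`
    Value interface{} `json:"value"`
}

func main() {
    ts := time.Now().UTC().Format(time.RFC3339)
    patch := []patchOp{
        // In a real webhook this first op is emitted only when the pod has
        // no annotations yet; emitting it unconditionally would replace them.
        {Op: "add", Path: "/metadata/annotations", Value: map[string]string{}},
        {Op: "add", Path: "/metadata/annotations/k-rail.cruise-automation.github.com~1taint~1exec", Value: ts},
    }
    b, _ := json.Marshal(patch)
    fmt.Println(string(b))
}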
Block Pod execs unless an exemption allows it. This is the CONNECT operation on the pods/exec resource.
Hey guys,
I find myself needing to allow users to exec into pods in specific namespaces. Problem is, I don't know exactly what those namespaces will be called ahead of time, but I know that they'll always be prefixed with preview-.
Is it possible to permit regex matches for fields in exemptions?
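For comparison, a tiny sketch of the two matching styles: a trailing-glob match (in the spirit of the glob semantics k-rail's exemptions use today, here approximated with Go's path.Match) versus hypothetical regex support:

package main

import (
    "fmt"
    "path"
    "regexp"
)

func main() {
    ns := "preview-feature-42"
    globOK, _ := path.Match("preview-*", ns)                // glob-style match
    reOK := regexp.MustCompile(`^preview-`).MatchString(ns) // hypothetical regex support
    fmt.Println(globOK, reOK) // true true
}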
Thanks!
David
When I run:
docker build --pull -t k-rail:manual .
I get the following output:
...
go: finding golang.org/x/text v0.3.0
GO111MODULE=on CGO_ENABLED=1 go test -race -cover github.com/cruise-automation/k-rail/cmd github.com/cruise-automation/k-rail/policies github.com/cruise-automation/k-rail/policies/ingress github.com/cruise-automation/k-rail/policies/pod github.com/cruise-automation/k-rail/resource github.com/cruise-automation/k-rail/server
# runtime/cgo
exec: "gcc": executable file not found in $PATH
FAIL github.com/cruise-automation/k-rail/policies [build failed]
FAIL github.com/cruise-automation/k-rail/policies/ingress [build failed]
FAIL github.com/cruise-automation/k-rail/policies/pod [build failed]
FAIL github.com/cruise-automation/k-rail/resource [build failed]
FAIL github.com/cruise-automation/k-rail/server [build failed]
make: *** [Makefile:20: test] Error 2
But the docker build is not aborted and completes with exit code 0.
OK, the question isn't the greatest 😬 - I'll try to explain a bit more:
We have a separate set of "build" nodes for CI/CD, but this could be applied to any scenario where you have a separate set of tainted nodes.
These nodes are typically short-lived and are used to allow docker-in-docker, reducing the risk that a malicious app or user could run containers that compromise or cause issues for other containers running on the host's Docker daemon.
We use nodeSelector, taints and tolerations to ensure that build agents run on build nodes and no other workloads get scheduled there.
It'd be nice if we could specifically deny (or allow) resources to run on these nodes with k-rail, and allow docker socket mounts on these nodes only, based on label or taint. I'm not sure if this ability exists already or if it's a feature that others would be interested in.
I can write up a policy for this and submit a PR?
Hi,
I am running k-rail on my kubernetes cluster combined with linkerd as service mesh to ensure mTLS communication between pods.
linkerd will automatically inject further (init-)containers into my pod to accomplish this.
One of the injected containers requires running with runAsNonRoot: false:
...
image: cr.l5d.io/linkerd/proxy-init:v1.4.0
imagePullPolicy: IfNotPresent
name: linkerd-init
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    add:
    - NET_ADMIN
    - NET_RAW
  privileged: false
  readOnlyRootFilesystem: true
  runAsNonRoot: false
  runAsUser: 0
...
Then, of course, k-rail throws a pod_no_root_user violation.
I was wondering if there is a way to define an exemption at the container level within a pod?
Any help would be much appreciated.
If we split the processing into two phases, we will be able to catch many incorrect schemas before we do the validation, and leverage the schema validation for types.
@shivdudhani had a good point in https://github.com/nirmata/kyverno/issues/570
This enables automatic scheduling of public workloads on nodes isolated from the rest.
This requires using the k8s API to resolve Pods or some other resource from the Service that the Ingress targets.
Organisations usually come up with best practices for monitoring and managing pods. A new policy could enforce a configurable list of labels that are mandatory for a pod to run in the environment.
For example, all pods should have an app label.
The policies.Config would need to be extended with a new field.
type Config struct {
    ...
    PolicyMandatoryPodLabels []string
}
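A minimal sketch of the check such a policy might perform (illustrative only, not k-rail's actual policy interface):

package main

import "fmt"

// checkMandatoryLabels returns one violation message per missing label key.
func checkMandatoryLabels(labels map[string]string, mandatory []string) []string {
    var violations []string
    for _, key := range mandatory {
        if _, ok := labels[key]; !ok {
            violations = append(violations, fmt.Sprintf("missing mandatory label %q", key))
        }
    }
    return violations
}

func main() {
    fmt.Println(checkMandatoryLabels(map[string]string{"team": "infra"}, []string{"app", "team"}))
    // [missing mandatory label "app"]
}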
🤔 Extensions:
- Beyond pods, it makes sense to think about enforcing labels for other components as well.
- Enforcing annotations on a type. For example kubernetes.io/ingress.class must not be empty for an ingress in a multi ingress-controller environment.
Software version numbers
Client Version: v1.25.4
Kustomize Version: v4.5.7
Server Version: v1.25.4
Describe the bug
Installing the 'k-rail' helm chart fails with the error below:
helm install k-rail k-rail/k-rail --namespace k-rail
Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: unable to recognize "": no matches for kind "PodDisruptionBudget" in version "policy/v1beta1"
Expected behavior
Helm chart is successfully installed
Some incidents require preventing changes from being made. During degraded performance, this can prevent further degradation or a thundering herd after recovery.
Hello,
Evil Russian hacker here.
When I hack, if I hack into person with k8 ingress control access, I can hack traffic into cluster to go to my namespace, instead of original namespace.
Preventing hosts from jumping namespaces after creation could prevent this type of hacking. K-rail could do this, but this would make my hacking life harder.
-ERH
(I am a big Russian bear. Quirky and cozy.)
Add a mutation policy that makes terminationMessagePolicy: FallbackToLogsOnError the default for containers. This will help users determine why a container exited when looking at their logs.
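A sketch of the patch such a mutation might emit; one op per container that doesn't already set the field, with the path index computed per container (shown here hard-coded for container 0):

package main

import (
    "encoding/json"
    "fmt"
)

func main() {
    patch := []map[string]string{{
        "op":    "add",
        "path":  "/spec/containers/0/terminationMessagePolicy",
        "value": "FallbackToLogsOnError",
    }}
    b, _ := json.Marshal(patch)
    fmt.Println(string(b))
}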
Hey guys,
Trying to use this exemption with the istio-operator:
- resource_name: "istio-cni-node"
namespace: "istio-system"
username: "*"
group: "*"
exempt_policies: ["pod_trusted_repository", "pod_no_host_network"]
When the operator creates a daemonset, the exemption is applied:
{"enforced":false,"kind":"DaemonSet","level":"info","msg":"EXEMPT","namespace":"istio-system","policy":"pod_trusted_repository","resource":"istio-cni-node","time":"2019-10-31T22:05:14Z","user":"system:serviceaccount:istio-system:istio-operator-operator"}
{"enforced":false,"kind":"DaemonSet","level":"info","msg":"EXEMPT","namespace":"istio-system","policy":"pod_no_host_network","resource":"istio-cni-node","time":"2019-10-31T22:05:14Z","user":"system:serviceaccount:istio-system:istio-operator-operator"}
But when the daemonset attempts to create pods, the exemption is not applied, since the resource value is empty:
{"enforced":true,"kind":"Pod","level":"warning","msg":"ENFORCED","namespace":"istio-system","policy":"pod_trusted_repository","resource":"","time":"2019-10-31T22:05:14Z","user":"system:serviceaccount:kube-system:daemon-set-controller"}
{"enforced":true,"kind":"Pod","level":"warning","msg":"ENFORCED","namespace":"istio-system","policy":"pod_no_host_network","resource":"","time":"2019-10-31T22:05:14Z","user":"system:serviceaccount:kube-system:daemon-set-controller"}
Is this intended behaviour? :)
Thanks,
D
Hey guys,
Could we have another policy, similar to https://github.com/cruise-automation/k-rail#unique-ingress-host, which could prevent deployment of Istio VirtualServices with duplicate names? The policy would serve the same purpose - preventing the accidental (or deliberate) interception of traffic to one service simply by creating a matching virtualservice in another namespace.
I'd be happy to take a crack at duplicating policies/ingress/unique_ingress_host.go myself, but might need help adding a check to ensure that the necessary CRD to list VirtualServices even exists in the cluster.
Here's an example virtualservice record - the field we care about is spec.hosts:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  annotations:
    meta.helm.sh/release-name: minio
    meta.helm.sh/release-namespace: dev
  creationTimestamp: "2020-07-27T09:44:34Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
  name: dev-minio.elpenguino.net
  namespace: dev
  resourceVersion: "10700039"
  selfLink: /apis/networking.istio.io/v1beta1/namespaces/dev/virtualservices/dev-minio.elpenguino.net
  uid: 118e4125-20b6-4a82-b940-94c729387b62
spec:
  gateways:
  - istio-ingressgateway.istio-system.svc.cluster.local
  hosts:
  - dev-minio.elpenguino.net
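A rough sketch of the duplicate-host check itself, assuming a lookup of hosts already claimed by existing VirtualServices; building that lookup is the part that needs the CRD/list support mentioned above:

package main

import "fmt"

// duplicateHostViolations flags hosts already claimed by another namespace.
func duplicateHostViolations(claimed map[string]string /* host -> namespace */, ns string, hosts []string) []string {
    var violations []string
    for _, h := range hosts {
        if owner, ok := claimed[h]; ok && owner != ns {
            violations = append(violations, fmt.Sprintf("host %s is already claimed by namespace %s", h, owner))
        }
    }
    return violations
}

func main() {
    claimed := map[string]string{"dev-minio.elpenguino.net": "dev"}
    fmt.Println(duplicateHostViolations(claimed, "attacker-ns", []string{"dev-minio.elpenguino.net"}))
}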
Thanks!
D
DAYTONA accounts for approximately half of GCR pulls and we are hitting GCR rate limits.
Changing ImagePullPolicy: Always to IfNotPresent can help the situation.
Implement a mutation policy that can override this value for configured images.
type ImagePullPolicyConfig struct {
    Images          []string
    ImagePullPolicy string
}

type Config struct {
    ...
    PolicyImagePullPolicyOverride []ImagePullPolicyConfig
}
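A sketch of how the override might be applied, assuming glob patterns for the image list; the types are illustrative stand-ins for the real pod objects:

package main

import (
    "fmt"
    "path"
)

type ImagePullPolicyConfig struct {
    Images          []string // glob patterns, e.g. "gcr.io/daytona/*"
    ImagePullPolicy string   // e.g. "IfNotPresent"
}

type container struct {
    Image           string
    ImagePullPolicy string
}

// applyOverrides rewrites the pull policy of containers whose image matches a configured pattern.
func applyOverrides(containers []container, overrides []ImagePullPolicyConfig) {
    for i := range containers {
        for _, o := range overrides {
            for _, pattern := range o.Images {
                if ok, _ := path.Match(pattern, containers[i].Image); ok {
                    containers[i].ImagePullPolicy = o.ImagePullPolicy
                }
            }
        }
    }
}

func main() {
    cs := []container{{Image: "gcr.io/daytona/agent:v1", ImagePullPolicy: "Always"}}
    applyOverrides(cs, []ImagePullPolicyConfig{{Images: []string{"gcr.io/daytona/*"}, ImagePullPolicy: "IfNotPresent"}})
    fmt.Println(cs[0].ImagePullPolicy) // IfNotPresent
}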
I want to change the default configuration (values.yaml) for my setup, but the documentation does not explain how I can do this.
Of course I don't want to (and cannot) change the values.yaml file in the official k-rail repo. So I tried forking the repo, but all the files like index.yaml and Chart.yaml point to the official repo.
Update WebhookConfigurations to v1 https://github.com/cruise-automation/k-rail/search?q=%22admissionregistration.k8s.io%2Fv1beta1%22
v1 version of Admission should be used https://github.com/cruise-automation/k-rail/search?q=%22admission%2Fv1beta1%22&type=code
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.16.md#deprecations-and-removals
The admissionregistration.k8s.io/v1beta1 versions of MutatingWebhookConfiguration and ValidatingWebhookConfiguration are deprecated and will no longer be served in v1.19. Use admissionregistration.k8s.io/v1 instead. (#79549, @liggitt)
The MutatingWebhookConfiguration and ValidatingWebhookConfiguration APIs have been promoted to admissionregistration.k8s.io/v1:
- failurePolicy default changed from Ignore to Fail for v1
- matchPolicy default changed from Exact to Equivalent for v1
- timeout default changed from 30s to 10s for v1
- sideEffects default value is removed, the field is made required, and only None and NoneOnDryRun are permitted for v1
- admissionReviewVersions default value is removed and the field is made required for v1 (supported versions for AdmissionReview are v1 and v1beta1)
- The name field for specified webhooks must be unique for MutatingWebhookConfiguration and ValidatingWebhookConfiguration objects created via admissionregistration.k8s.io/v1
The admissionregistration.k8s.io/v1beta1 versions of MutatingWebhookConfiguration and ValidatingWebhookConfiguration are deprecated and will no longer be served in v1.19.
A metrics endpoint would be helpful for monitoring load and system health.
The Prometheus defaults alone would already be nice.
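A minimal sketch of what exposing the Prometheus defaults could look like with the standard Go client; the metric name is hypothetical:

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// violations is a hypothetical counter; promhttp.Handler() also serves
// the default Go and process metrics out of the box.
var violations = promauto.NewCounterVec(prometheus.CounterOpts{
    Name: "k_rail_policy_violations_total",
    Help: "Number of policy violations observed.",
}, []string{"policy", "namespace"})

func main() {
    violations.WithLabelValues("pod_no_host_network", "test").Inc()
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":9090", nil))
}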
TBD
This policy would mutate pods to include a given seccomp profile by default, unless an exemption is present.
Default to runtime/default:
seccomp.security.alpha.kubernetes.io/defaultProfileName=runtime/default
Add a configurable webhook to call when violations are enforced.
It should have a configurable endpoint, method, and body that can be templated via the go template format to support injection of violation parameters.
This would be useful for notifying when a policy has been enforced to a Slack channel, for example. In our experience enforced violations are rare when configured correctly, but we generally want to know about them when they happen.
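A compact sketch of such a templated call, assuming Go's text/template and net/http; the configuration shape, endpoint, and payload are illustrative:

package main

import (
    "bytes"
    "fmt"
    "net/http"
    "text/template"
)

type violation struct {
    Policy, Kind, Resource, Namespace string
}

// notify renders the configured body template and POSTs it to the endpoint.
func notify(endpoint, bodyTmpl string, v violation) error {
    t, err := template.New("body").Parse(bodyTmpl)
    if err != nil {
        return err
    }
    var buf bytes.Buffer
    if err := t.Execute(&buf, v); err != nil {
        return err
    }
    resp, err := http.Post(endpoint, "application/json", &buf)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode >= 300 {
        return fmt.Errorf("notify: unexpected status %s", resp.Status)
    }
    return nil
}

func main() {
    // Example: a Slack-style webhook payload (URL is a placeholder).
    _ = notify("https://hooks.example.com/notify",
        `{"text": "policy {{.Policy}} enforced on {{.Kind}} {{.Namespace}}/{{.Resource}}"}`,
        violation{Policy: "pod_no_host_network", Kind: "Pod", Namespace: "test", Resource: "test-exec"})
}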
Hey guys,
I'm afraid the helm chart doesn't seem to be helm3-compatible. We're trying to deploy manifests into a namespace, but the namespace is only created when the manifests are deployed. 🐔, meet 🥚
Here's how to reproduce:
helm install k-rail deploy/helm
The error you'll get in response is:
root@leeloo1:/tmp/k-rail# helm install -n k-rail k-rail deploy/helm
Error: create: failed to create: namespaces "k-rail" not found
root@leeloo1:/tmp/k-rail#
If you pre-emptively create the namespace, then helm will complain as follows:
root@leeloo1:/tmp/k-rail# helm install -n k-rail k-rail deploy/helm
Error: rendered manifests contain a resource that already exists. Unable to continue with install: existing resource conflict: kind: Namespace, namespace: , name: k-rail
root@leeloo1:/tmp/k-rail#
If you don't supply a namespace to helm, it'll fail since the default namespace already exists :)
root@leeloo1:/tmp/k-rail# helm install k-rail deploy/helm
Error: rendered manifests contain a resource that already exists. Unable to continue with install: existing resource conflict: kind: Namespace, namespace: , name: default
root@leeloo1:/tmp/k-rail#
The only way I've been able to deploy in helm3 has been to remove the namespace from the helm template, pre-create it, and then install the chart. This further caught me out because I didn't realize that I needed to label the namespace with name=k-rail in order for the namespace to be ignored by the mutatingwebhook.
May I suggest the following:
- Remove the Namespace resource from the helm template, so the chart installs cleanly under helm3 into a pre-created namespace.
- Change the ignore label from name to something like k-rail-inspection=false. It's confusing and redundant that for proper operation, I need a namespace named "k-rail", with a label "name" equal to "k-rail" 😜
If you're happy with the above approach, I'll prepare a PR for you :)
Cheers!
D
I installed k-rail, then installed tiller, and no errors stopped me. What is the scope of tiller installation being prevented? https://github.com/cruise-automation/k-rail#no-helm-tiller
My script that installs k-rail first, then tiller via [1]:
--> waiting for k-rail to be active...
pod/k-rail-6bfd6fd575-7w2s6 condition met
pod/k-rail-6bfd6fd575-cm822 condition met
pod/k-rail-6bfd6fd575-jl9w7 condition met
Install/upgrade Tiller Server for Helm
Installing helm into current Kuberenetes context...
serviceaccount/helm created
clusterrolebinding.rbac.authorization.k8s.io/helm created
Creating certificates...
$HELM_HOME has been configured at /Users/drnic/.helm.
Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Installing helm client certificates...
Tiller is ready, testing connection, and certificates:
working!
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "starkandwayne" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete.
Does k-rail allow my tiller because it is secured by TLS? [1]
Hi 👋
There seem to be inconsistencies in how policies are checked on pod specs (and possibly other resources). For example, the policy for No ShareProcessNamespace checks whether the value is not nil:
Whilst the policy for No Host Network (sic) checks whether the value is true (k-rail/policies/pod/no_host_network.go, line 41 at b6763dd).
Checking against nil will cause the policy to fire if I explicitly set shareProcessNamespace: false. It also fires if some tooling explicitly sets the value to false. Terraform does this in its kubernetes_deployment resource, and k-rail denies the deployment.
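A minimal illustration of the difference, using a *bool the way the Kubernetes API does:

package main

import "fmt"

// violatesNilCheck mirrors the current policy: any explicit value fires.
func violatesNilCheck(share *bool) bool { return share != nil }

// violatesValueCheck fires only when sharing is actually enabled.
func violatesValueCheck(share *bool) bool { return share != nil && *share }

func main() {
    f := false
    // An explicit false trips the nil check but not the value check.
    fmt.Println(violatesNilCheck(&f), violatesValueCheck(&f)) // true false
}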
Is there a reason for checking nil rather than true? Or is it just something that crept in and may be considered a 🦋 (bug)?
I'm willing to open a PR and tests once I understand the why (if it was a conscious decision) 😄 - Thanks!
I was experimenting with creating exemptions and added the following exemption to values.yaml:
- resource_name: "abc"
namespace: default
username: "kubernetes-admin"
exempt_policies: ["pod_no_exec"]
and then upgraded the helm chart. After doing so, I was able to exec into a pod named abc. However, I was surprised that this also allowed me to exec into any pod that starts with the string abc. So I could exec into pods named:
and so on.
I think this is a bug and if it is not, I think this should be more clearly stated on the GitHub page. Thanks
Update
Looking at the code, it looks like this is by design because you expect resources to be created by controllers, etc. This is what I ran into when I looked at exception.go:
// Compile returns a CompiledExemption
func (r *RawExemption) Compile() CompiledExemption {
    // if not specified, assume the field matches all
    // ensure that ResourceName has a trailing glob so it can match the IDs added by certain resource types
    // ie, Deployment pod name test-pod, ReplicaSet name test-pod-sdf932, PodName test-pod-sdf932-ew92
    if !strings.HasSuffix(r.ResourceName, "*") {
        r.ResourceName = r.ResourceName + "*"
    }
    if r.ClusterName == "" {
        r.ClusterName = "*"
    }
    if r.Namespace == "" {
        r.Namespace = "*"
    }
    if r.Username == "" {
        r.Username = "*"
    }
    if r.Group == "" {
        r.Group = "*"
    }
    if len(r.ExemptPolicies) == 0 {
        r.ExemptPolicies = []string{"*"}
    }
so it might be useful to add something about this in the README.md
Helm v3.0.2
Kubernetes Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.10-gke.0", GitCommit:"a988db14950de3628f9e21773f3de0bf52485534", GitTreeState:"clean", BuildDate:"2019-12-11T22:37:55Z", GoVersion:"go1.12.12b4", Compiler:"gc", Platform:"linux/amd64"}
K-rail 6cff60ea41d1acf9f4a87ab4a739a0eeaa1472c1 tag: v1.3.0
After a successful install (first by commenting out webhooks.yaml, installing so the pods can be provisioned, then uncommenting webhooks.yaml and upgrading), inspection of the logs shows a series of TLS errors.
kubectl logs --namespace k-rail --selector name=k-rail
......
2020/01/20 07:56:36 http: TLS handshake error from 10.1.2.35:50702: remote error: tls: bad certificate
2020/01/20 07:56:36 http: TLS handshake error from 10.1.2.35:50704: remote error: tls: bad certificate
2020/01/20 07:56:55 http: TLS handshake error from 10.1.2.36:45366: remote error: tls: bad certificate
.....
Inspection of the k-rail-cert secret shows the certificate and private key are both well-formed.
kubectl get secret -n k-rail k-rail-cert -o yaml
I'm not quite sure where this is coming from.
Following up on: #36 (comment)
The policy.Config is passed to the policy implementations with an admission request. There is currently no way to fail fast on configuration issues, for example MaximumSizeLimit < DefaultSizeLimit.
Some options to address this:
- yaml: annotation with json:
To come up with a good proposal, it would be helpful to understand how configuration changes should be handled in the future: update the config and restart (like now), or with a watchdog.
Hey!
By default, an emptyDir lacks a sizeLimit parameter and is disk-based; a Pod with access to said emptyDir can consume the Node's entire disk (i.e. the limit is unbounded) until the offending Pod is deleted or evicted, which can constitute a denial-of-service condition at the affected Node (i.e. DiskPressure). If the emptyDir is memory-backed, writes are counted against the container's memory limit, so this is probably less of a concern.
I'm happy to take up a k-rail policy for this issue. I envisage two approaches:
- if an emptyDir lacks a sizeLimit, report a violation (e.g. block)
- if an emptyDir lacks a sizeLimit, add a sane default
One might also want to check that a supplied sizeLimit is within a sane bound.
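For the mutating variant, a minimal sketch; the type here is an illustrative stand-in for the Kubernetes EmptyDirVolumeSource:

package main

import "fmt"

// emptyDir is an illustrative stand-in for the real volume source type.
type emptyDir struct {
    SizeLimit string // empty string means unset
}

// defaultSizeLimit fills in a sane default wherever sizeLimit is unset.
func defaultSizeLimit(vols []emptyDir, def string) {
    for i := range vols {
        if vols[i].SizeLimit == "" {
            vols[i].SizeLimit = def
        }
    }
}

func main() {
    vols := []emptyDir{{}, {SizeLimit: "256Mi"}}
    defaultSizeLimit(vols, "512Mi")
    fmt.Println(vols) // [{512Mi} {256Mi}]
}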
Let me know what sounds good here and I'll move forward.