coreos / coreos-kubernetes
CoreOS Container Linux + Kubernetes documentation & Vagrant installers
Home Page: https://coreos.com/kubernetes/docs/latest/
License: Apache License 2.0
That way, if you have a cluster with mixed node types, you can target specific node types using nodeSelector. This would require changes to the kubelet.service unit installed by multi-node/aws/artifacts/scripts/install-worker.sh.
Similarly, it would be a good idea to automatically add tags for other EC2 metadata such as placement/availability-zone and ami-id. The tags could be namespaced under aws.amazon.com, e.g. aws.amazon.com:instance-type.
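For illustration, here is a minimal sketch of a pod spec that targets nodes by such a label; the label key (aws.amazon.com/instance-type, written with the slash form valid for Kubernetes label keys) and the instance-type value are assumptions, not something the current installer sets:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  # Only schedule onto nodes carrying the (hypothetical) instance-type label.
  nodeSelector:
    aws.amazon.com/instance-type: g2.2xlarge
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]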
Ideally Kubernetes clusters should have highly available masters. Currently k8s nodes are auto-scaled, but the master is not. This can be achieved with the combination of 1) an ELB and 2) either the podmaster (whose spec is already included in the public artifacts), or the use of fleet to guarantee that only one copy each of the controller manager and scheduler is running at once.
All kubelets should be deployed with unique client TLS certificates signed by a shared CA.
"Failed creating cluster: NoCredentialProviders: no valid providers in chain" - keep getting this error. Noticed that there was a similar issue registered in the AWS GO SDK.
An "I wish" reported from coreos.com/kubernetes/docs/latest/deploy-master.html
I wish this page explained why register-node is false and talked about the errors in the kubelet logs.
Sounds like we need to cover two things a bit more clearly:
In the Kubernetes single-node Vagrant deploy docs, the examples show two different IP addresses (172.17.4.101 & 172.17.4.201). The Vagrantfile which is used to stand up the single-node cluster uses a specific IP address (172.17.4.99) for the node, so the docs should reflect that. This could cause confusion for folks spinning up the single-node cluster for the first time.
The ambiguous {{ETCD_ENDPOINTS}} values used by both flannel and tectonic should be different since they require different syntax.
https://github.com/coreos/kube-up/blob/master/multi-node/generic/controller-cloud-config.yaml#L9
https://github.com/coreos/kube-up/blob/master/multi-node/generic/controller-cloud-config.yaml#L29
VPC can do the networking we need, so we should not need to use the flannel overlay in AWS instructions.
It should be configurable (and probably the default set up) to launch a cluster of etcd nodes (probably 3 by default) on separate CoreOS instances and point the Kubernetes API server at them rather than having a single etcd node colocated with the API server. This is a more reliable configuration for production deployments that can tolerate potential etcd failures on a single node and provide better disaster recovery. The etcd data dirs should be located on EBS volumes (not the root device) so the instances can be terminated safely without risk of losing the data.
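A rough cloud-config sketch for one member of such a dedicated etcd cluster; the discovery URL, device name, and mount point are placeholders/assumptions, not values taken from the installer:
#cloud-config
coreos:
  etcd2:
    # <token> is a placeholder for a real discovery token (or use a static initial-cluster list).
    discovery: https://discovery.etcd.io/<token>
    advertise-client-urls: http://$private_ipv4:2379
    initial-advertise-peer-urls: http://$private_ipv4:2380
    listen-client-urls: http://0.0.0.0:2379
    listen-peer-urls: http://$private_ipv4:2380
    # Keep the data dir on a separately attached EBS volume, not the root device.
    data-dir: /var/lib/etcd2
  units:
    - name: var-lib-etcd2.mount
      command: start
      content: |
        [Mount]
        What=/dev/xvdf
        Where=/var/lib/etcd2
        Type=ext4
    - name: etcd2.service
      command: start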
The comment at:
https://github.com/coreos/kube-up/blob/master/multi-node/generic/worker-cloud-config.yaml#L22
should mention at the very least a default port. As most of the configuration has occurred via magic at this point, there is no context about what this is providing.
We should set the Docker log level to warn or above. The current Docker default outputs every API call to the log and increases CPU load by a non-trivial amount.
Related kubernetes/kubernetes#15728
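A minimal sketch of one way to do this on Container Linux, assuming the docker.service unit in use honors a DOCKER_OPTS environment variable (if it does not on a given release, the flag would need to be appended to ExecStart instead):
#cloud-config
coreos:
  units:
    - name: docker.service
      drop-ins:
        # Raise the daemon log level so routine API calls are no longer logged.
        - name: 10-docker-log-level.conf
          content: |
            [Service]
            Environment="DOCKER_OPTS=--log-level=warn"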
Deploy dex (github.com/coreos/dex) as an OIDC provider for kube-apiserver such that an end-user can use the OIDC functionality built into kubectl and kube-apiserver to self-authenticate.
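For reference, the kube-apiserver side of this would roughly be its OIDC flags pointed at the dex issuer; the issuer URL, client ID, and username claim below are placeholders, not values from this repository:
# excerpt from a kube-apiserver manifest; only the OIDC-related flags are shown
- --oidc-issuer-url=https://dex.example.com
- --oidc-client-id=kubernetes
- --oidc-ca-file=/etc/kubernetes/ssl/ca.pem
- --oidc-username-claim=email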
The DNS addon currently uses a kubeconfig to configure itself with access to the cluster. Ideally, it would use a service account, and the kubeconfig object would be unnecessary.
This is blocked on kubernetes/kubernetes#13411.
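As a rough illustration of the desired end state (a sketch only; the service account name and image tag are assumptions), the addon's pod template would declare a service account instead of mounting a kubeconfig secret:
# excerpt from the kube-dns replication controller pod template
spec:
  serviceAccountName: kube-dns
  containers:
    - name: kube2sky
      image: gcr.io/google_containers/kube2sky:1.11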
It looks like there are some plans to support AWS; I am planning on building a CloudFormation template for my own setup.
Are you interested in accepting this as pull request?
The manual instruction to start single-node etcd does not allow remote machines to access etcd.
$ sudo systemctl start etcd2
$ sudo systemctl enable etcd2
I found out that the following parameters are needed for remote access to etcd:
etcd2:
  advertise-client-urls: http://$private_ipv4:2379
  listen-client-urls: http://0.0.0.0:2379
I rolled out the single and multi-host setups and noticed the only add-on you're deploying is kube-dns.
Is there any plan to add others?
Specifically kube-ui, metrics and logging.
Even some notes on any possible issues doing so would be super handy.
Adding something similar to https://github.com/coreos/coreos-vagrant/blob/master/config.rb.sample#L90 would make it much easier to access services exposed via nodePort.
The upstream docs for the guestbook app make no mention of NodePorts. However, on platforms where there is no cloud-provider load balancer, accessing the service externally isn't very clear. See issue #65 for context.
Should add documentation or a blurb to the examples link describing how to use the NodePort that is created with the guestbook service.
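For example, here is a minimal sketch of exposing the guestbook frontend through a NodePort; the selector labels and the explicit nodePort value are assumptions (Kubernetes will also pick a port automatically if nodePort is omitted):
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: NodePort
  selector:
    app: guestbook
    tier: frontend
  ports:
    - port: 80
      nodePort: 30080
The frontend would then be reachable on any node at <node-ip>:30080.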
Even though the API endpoint can be TLS secured, service accounts currently have full access to the entire cluster and all its API endpoints. Meaning, if a single pod/container is compromised, everything is.
We're currently doing this manually (by having our own artefacts based on the ones in this repository), but it would be nice if it was possible to define the authorization plugin to use.
When using the OpenSSL documentation, running kubectl exec produces the following error:
kubectl --insecure-skip-tls-verify=false exec busybox -- nslookup kubernetes
error: Unable to upgrade connection: {
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "x509: certificate is valid for kube-worker, not coreos03",
"code": 500
}
As an insecure workaround I generated the worker csr like so:
openssl req -new -key worker-key.pem -out worker.csr -subj "/CN=*"
It seems the code currently assumes the ./clusters path doesn't exist, so it fails if you attempt to recreate a cluster.
For multi-node guides there should be documentation covering the port requirements for varying node types (controller / worker) so deployers can properly configure firewall rules.
Do you have any plans to create a guide for Google Compute Engine?
Multi-node setups (e.g. AWS) should secure access to etcd using TLS.
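A rough sketch of what the client-facing side could look like in the etcd cloud-config; the certificate paths are assumptions, the certificates themselves would have to be generated and distributed separately, and peer-to-peer TLS would need the analogous peer-* options:
etcd2:
  advertise-client-urls: https://$private_ipv4:2379
  listen-client-urls: https://0.0.0.0:2379
  cert-file: /etc/ssl/etcd/server.pem
  key-file: /etc/ssl/etcd/server-key.pem
  trusted-ca-file: /etc/ssl/etcd/ca.pem
  client-cert-auth: true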
On a master node, the kubelet will attempt to post status to the API over and over again.
We should add a note that this isn't an issue and also explain why this happens.
Create a CA and provision certificates for kube-apiserver
I notice that the cluster install scripts turn off CoreOS's update engine. Why is this necessary? This goes against one of the big benefits and motivations behind CoreOS as I understood them: to keep machines updated automatically and to encourage infrastructure that can withstand the loss of any individual machines (e.g. as they restart for an update.)
I assume there's a good reason updates are disabled for now, but once it's been explained why, I'd like to keep this issue open to track progress towards being able to turn them on.
/srv/kubernetes/manifests/kube-controller-manager.yaml
should be
/etc/kubernetes/manifests/kube-controller-manager.yaml (note the etc instead of srv).
This appears in multiple places in the document.
In addition to supporting HA masters by default (#90), it would also be good to have support for multi AZ deployments on AWS. A minimal HA setup would be 3 master nodes on 3 different AZs, with worker nodes also on different AZs.
Rather than extracting kube-proxy onto the host and running it as a systemd unit, it should be run as a static pod on each kubelet.
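A minimal sketch of such a static pod manifest, dropped into /etc/kubernetes/manifests/ on each node; the hyperkube image tag and the apiserver address are assumptions and would match whatever the installer already uses:
apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
    - name: kube-proxy
      image: gcr.io/google_containers/hyperkube:v1.0.6
      command:
        - /hyperkube
        - proxy
        # On workers this would point at the controller's secured endpoint instead.
        - --master=http://127.0.0.1:8080
      securityContext:
        privileged: true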
The kubelet cannot create the apiserver, podmaster and kube-proxy pods from /etc/kubernetes/manifests/:
Sep 23 04:23:10 c1 kubelet[1218]: E0923 04:23:10.172129 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:10 c1 kubelet[1218]: E0923 04:23:10.516622 1218 kubelet.go:1641] error getting node: node 172.17.4.101 not found
Sep 23 04:23:11 c1 kubelet[1218]: E0923 04:23:11.173514 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:11 c1 kubelet[1218]: E0923 04:23:11.174512 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:11 c1 kubelet[1218]: E0923 04:23:11.175847 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:12 c1 kubelet[1218]: E0923 04:23:12.175900 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:12 c1 kubelet[1218]: E0923 04:23:12.177491 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:12 c1 kubelet[1218]: E0923 04:23:12.177850 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.179284 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.186413 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.186447 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.206073 1218 event.go:194] Unable to write event: 'Post http://127.0.0.1:8080/api/v1/namespaces/kube-system/events: dial tcp 127.0.0.1:8080: connection refused' (may retry after sleeping)
Sep 23 04:23:14 c1 kubelet[1218]: E0923 04:23:14.188419 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:14 c1 kubelet[1218]: E0923 04:23:14.189941 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:14 c1 kubelet[1218]: E0923 04:23:14.190631 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
and no containers are up:
core@c1 /etc/systemd/system $ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
core@c1 /etc/systemd/system $
Hi,
I'm following the Kubernetes step-by-step guide.
Everything was fine until I hit an issue with the kube-controller.
It requests a cloud provider. If I provide vagrant, it fails, saying that vagrant does not implement a load balancer.
Maybe a comment about vagrant could be added at the start of the guide? Or maybe I'm simply missing something for things to work?
The following "I Wish" came from coreos.com/kubernetes/docs/latest/getting-started.html
I wish this page said what version of CoreOS is used in this tutorial; I can't find /etc/systemd/system/etcd2.service.d/40-listen-address.conf.
We probably need to add some language like "create the file..." or something similar.
Hey,
Can you write a description of how to access kube-ui with any of the authentication options of Kubernetes? How do you generate and get the tokens?
Cheers,
Luc Michalski
When setting up the kubelet, any flannel misconfiguration will bork Docker, which isn't immediately apparent.
Document some steps to 1) identify when this happens and 2) fix it.
Tried to run through the guestbook example and it worked for the most part, except there is no load balancer, so services aren't exposed by the cluster in the normal way.
I managed to get the guestbook example working since it was exposed as a NodePort, but that is still a bit confusing / limited.
I would love to use this stack for demos locally but I really need the load balancer.
Cheers
https://github.com/coreos/kube-up/blob/master/multi-node/generic/controller-cloud-config.yaml
https://github.com/coreos/kube-up/blob/master/multi-node/generic/worker-cloud-config.yaml
currently reference gists belonging to aaronlevy. We need to stop doing that.
In the document https://coreos.com/kubernetes/docs/latest/openssl.html, it states at the beginning that MASTER_HOST can be set to "the publicly routable IP or hostname of the master cluster." Then in the OpenSSL Config section it shows IP.2 = ${MASTER_HOST}.
Since IP.2 implies an IP address, and not a hostname, after discussion on IRC, IP.2 should not be set to a hostname. However we can use ${MASTER_DNS_NAME}, which is not mentioned above.
The kubelet cadvisor interface is currently only listening on localhost (via --cadvisor-port=0). We may want to expose cadvisor so metric collection tools (e.g. the heapster addon) can retrieve data from the nodes.
When using kube-aws, you are required to provide an externalDNSName, which is a DNS name you are expected to make routable to the controller instance. This is helpful, but is a very clunky way of making your cluster externally accessible. Additionally, the documentation is rather sparse and expects you to know what you're doing.
The main reason this exists is so the apiserver TLS certificate is valid for an external name before it is copied into the cluster.
This is clearly clunky, and could be simpler from the perspective of the deployer.
Following the docs I am able to start up a multi-node cluster with Vagrant but the Kubernetes bootstrap process seems to get stuck.
The etcd node comes up with no issues, but w1 and c1 are having trouble downloading the Docker images. After the podmaster image downloads, it seems that nothing else wants to download on c1. Likewise, on w1, nothing downloads after the pause image.
I also noticed that update-engine fails on w1 and c1 with this:
-- Logs begin at Sun 2015-09-20 19:36:12 UTC, end at Sun 2015-09-20 19:43:30 UTC. --
Sep 20 19:36:16 localhost systemd[1]: Starting Update Engine...
Sep 20 19:36:17 localhost update_engine[583]: [0920/193617:INFO:main.cc(155)] CoreOS Update Engine starting
Sep 20 19:36:17 localhost systemd[1]: Started Update Engine.
Sep 20 19:36:17 localhost update_engine[583]: [0920/193617:INFO:update_check_scheduler.cc(82)] Next update check in 9m23s
Sep 20 19:36:26 w1 systemd[1]: Stopping Update Engine...
Sep 20 19:36:27 w1 systemd[1]: update-engine.service: Main process exited, code=exited, status=1/FAILURE
Sep 20 19:36:27 w1 systemd[1]: Stopped Update Engine.
Sep 20 19:36:27 w1 systemd[1]: update-engine.service: Unit entered failed state.
Sep 20 19:36:27 w1 systemd[1]: update-engine.service: Failed with result 'exit-code'.
Any ideas? Maybe this is a disk space issue?
I have tested the single node Vagrant method and it is working.
It would be nice to be able to pick a different network CIDR for the Kubernetes VPC without having to recompile kube-aws & maintain a copy of the artifact files in a bucket.
When using the multi-node vagrant setup, updating one of the vars as follows has no effect.
$worker_vm_memory=2048
All VMs end up using 512 MB of memory anyway.
The only way to fix it is to modify the defaults in the Vagrantfile as follows, but this results in ALL VMs using 1024 MB.
diff --git a/multi-node/vagrant/Vagrantfile b/multi-node/vagrant/Vagrantfile
index 771b674..a85ede8 100644
--- a/multi-node/vagrant/Vagrantfile
+++ b/multi-node/vagrant/Vagrantfile
@@ -10,11 +10,11 @@ Vagrant.require_version ">= 1.6.0"
$update_channel = "alpha"
$controller_count = 1
-$controller_vm_memory = 512
+$controller_vm_memory = 1024
$worker_count = 1
-$worker_vm_memory = 512
+$worker_vm_memory = 1024
$etcd_count = 1
-$etcd_vm_memory = 512
+$etcd_vm_memory = 1024
CONFIG = File.expand_path("config.rb")
if File.exist?(CONFIG)
Template not working
Failed to apply cloud-config: Unable to decode /etc/kubernetes/ssl/ca.pem (Unable to decode base64: "illegal base64 data at input byte 64")
I provide it through the AWS form
Provide instructions for how to set up kubectl with a context from the generated ./clusters/<cluster-name>/kubeconfig file. Ideally kube-aws can merge this file with the default ~/.kube/config, or set the appropriate env vars for kubectl to find it automatically.
The instructions would become something like:
A kubectl config file will be written to ./clusters/<cluster-name>/kubeconfig, which is referenced automatically by the cluster name configured earlier.
kubectl can switch between multiple "contexts", each of which is a combination of user credentials and cluster information. Switch to your new cluster's context and then list the nodes:
$ kubectl config use-context ${CLUSTERNAME}
$ kubectl get nodes
NAME LABELS STATUS
ip-10-0-0-223.ec2.internal kubernetes.io/hostname=ip-10-0-0-223.ec2.internal Ready
ip-10-0-0-224.ec2.internal kubernetes.io/hostname=ip-10-0-0-224.ec2.internal Ready
ip-10-0-0-225.ec2.internal kubernetes.io/hostname=ip-10-0-0-225.ec2.internal Ready
I'm following the tutorial on https://coreos.com/kubernetes/docs/latest/kubernetes-on-aws.html and the command kube-aws describe is mentioned twice. However, using release 0.1.0, this subcommand does not exist:
$ kube-aws describe
Error: unknown command "describe" for "kube-aws"
Run 'kube-aws --help' for usage.
$ kube-aws --help
Manage Kubernetes clusters on AWS
Usage:
kube-aws [command]
Available Commands:
destroy Destroy an existing Kubernetes cluster
render Render a CloudFormation template
status Describe an existing Kubernetes cluster
up Create a new Kubernetes cluster
version Print version information and exit
help Help about any command
Flags:
--aws-debug[=false]: Log debug information from aws-sdk-go library
--config="cluster.yaml": Location of kube-aws cluster config file
-h, --help[=false]: help for kube-aws
Use "kube-aws [command] --help" for more information about a command.
$ kube-aws version
kube-aws version v0.1.0
I'm not sure if this command was renamed to kube-aws status or if it simply does not exist.
I'd be happy to submit a PR to fix the docs, changing describe to status or removing the mentions altogether, depending on what happened.
I have deployed RethinkDB to the Vagrant-based single-VM cluster and I am getting the following error when I try to retrieve the logs.
$ kubectl logs rethinkdb-rc-751i0
Internal Error: Unrecognized input header
It may be related to how Docker is configured to send logs to journald.
Rather, it gives worker nodes the IAMRoleController erroneously.