coreos / coreos-kubernetes
CoreOS Container Linux + Kubernetes documentation & Vagrant installers
Home Page: https://coreos.com/kubernetes/docs/latest/
License: Apache License 2.0
That way, if you have a cluster with mixed node types, you can target specific node types using nodeSelector. This would require changes to the kubelet.service unit installed by multi-node/aws/artifacts/scripts/install-worker.sh.
Similarly, it would be a good idea to automatically add tags for other EC2 metadata such as placement/availability-zone and ami-id. The tags could be namespaced under aws.amazon.com, e.g. aws.amazon.com:instance-type.
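For illustration, here is a minimal sketch of a pod spec that targets nodes by such a label; the label key (aws.amazon.com/instance-type, written with the slash form valid for Kubernetes label keys) and the instance-type value are assumptions, not something the current installer sets:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  # Only schedule onto nodes carrying the (hypothetical) instance-type label.
  nodeSelector:
    aws.amazon.com/instance-type: g2.2xlarge
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]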
Ideally Kubernetes clusters should have highly available masters. Currently k8s nodes are auto-scaled, but the master is not. This can be achieved with the combination of 1) an ELB and 2) either the podmaster (whose spec is already included in the public artifacts), or the use of fleet to guarantee that only one copy each of the controller manager and scheduler is running at once.
All kubelets should be deployed with unique client TLS certificates signed by a shared CA.
"Failed creating cluster: NoCredentialProviders: no valid providers in chain" - keep getting this error. Noticed that there was a similar issue registered in the AWS GO SDK.
An "I wish" reported from coreos.com/kubernetes/docs/latest/deploy-master.html
I wish this page explained why register-node is false and talked about the errors in the kubelet logs.
Sounds like we need to cover two things a bit more clearly:
In the Kubernetes single-node Vagrant deploy docs, the examples show two different IP addresses (172.17.4.101 & 172.17.4.201). The Vagrantfile which is used to stand up the single-node cluster uses a specific IP address (172.17.4.99) for the node, so the docs should reflect that. This could cause confusion for folks spinning up the single-node cluster for the first time.
The ambiguous {{ETCD_ENDPOINTS}} values used by both flannel and tectonic should be different since they require different syntax.
https://github.com/coreos/kube-up/blob/master/multi-node/generic/controller-cloud-config.yaml#L9
https://github.com/coreos/kube-up/blob/master/multi-node/generic/controller-cloud-config.yaml#L29
VPC can do the networking we need, so we should not need to use the flannel overlay in AWS instructions.
It should be configurable (and probably the default set up) to launch a cluster of etcd nodes (probably 3 by default) on separate CoreOS instances and point the Kubernetes API server at them rather than having a single etcd node colocated with the API server. This is a more reliable configuration for production deployments that can tolerate potential etcd failures on a single node and provide better disaster recovery. The etcd data dirs should be located on EBS volumes (not the root device) so the instances can be terminated safely without risk of losing the data.
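A rough cloud-config sketch for one member of such a dedicated etcd cluster; the discovery URL, device name, and mount point are placeholders/assumptions, not values taken from the installer:
#cloud-config
coreos:
  etcd2:
    # <token> is a placeholder for a real discovery token (or use a static initial-cluster list).
    discovery: https://discovery.etcd.io/<token>
    advertise-client-urls: http://$private_ipv4:2379
    initial-advertise-peer-urls: http://$private_ipv4:2380
    listen-client-urls: http://0.0.0.0:2379
    listen-peer-urls: http://$private_ipv4:2380
    # Keep the data dir on a separately attached EBS volume, not the root device.
    data-dir: /var/lib/etcd2
  units:
    - name: var-lib-etcd2.mount
      command: start
      content: |
        [Mount]
        What=/dev/xvdf
        Where=/var/lib/etcd2
        Type=ext4
    - name: etcd2.service
      command: start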
The comment at:
https://github.com/coreos/kube-up/blob/master/multi-node/generic/worker-cloud-config.yaml#L22
should mention at the very least a default port. As most of the configuration has occurred via magic at this point, there is no context about what this is providing.
We should set the Docker log level to warn or above. The current Docker default outputs every API call to the log and increases CPU load by a non-trivial amount.
Related kubernetes/kubernetes#15728
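A minimal sketch of one way to do this on Container Linux, assuming the docker.service unit in use honors a DOCKER_OPTS environment variable (if it does not on a given release, the flag would need to be appended to ExecStart instead):
#cloud-config
coreos:
  units:
    - name: docker.service
      drop-ins:
        # Raise the daemon log level so routine API calls are no longer logged.
        - name: 10-docker-log-level.conf
          content: |
            [Service]
            Environment="DOCKER_OPTS=--log-level=warn"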
Deploy dex (github.com/coreos/dex) as an OIDC provider for kube-apiserver such that an end-user can use the OIDC functionality built into kubectl and kube-apiserver to self-authenticate.
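For reference, the kube-apiserver side of this would roughly be its OIDC flags pointed at the dex issuer; the issuer URL, client ID, and username claim below are placeholders, not values from this repository:
# excerpt from a kube-apiserver manifest; only the OIDC-related flags are shown
- --oidc-issuer-url=https://dex.example.com
- --oidc-client-id=kubernetes
- --oidc-ca-file=/etc/kubernetes/ssl/ca.pem
- --oidc-username-claim=email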
The DNS addon currently uses a kubeconfig to configure itself with access to the cluster. Ideally, it would use a service account, and the kubeconfig object would be unnecessary.
This is blocked on kubernetes/kubernetes#13411.
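As a rough illustration of the desired end state (a sketch only; the service account name and image tag are assumptions), the addon's pod template would declare a service account instead of mounting a kubeconfig secret:
# excerpt from the kube-dns replication controller pod template
spec:
  serviceAccountName: kube-dns
  containers:
    - name: kube2sky
      image: gcr.io/google_containers/kube2sky:1.11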
It looks like there are some plans to support AWS; I am planning on building a CloudFormation template for my own setup.
Are you interested in accepting this as pull request?
The manual instruction to start single-node etcd does not allow remote machines to access etcd.
$ sudo systemctl start etcd2
$ sudo systemctl enable etcd2
I found out that the following parameters are needed for remote access to etcd:
etcd2:
  advertise-client-urls: http://$private_ipv4:2379
  listen-client-urls: http://0.0.0.0:2379
I rolled out the single and multi-host setups and noticed the only add-on you're deploying is kube-dns.
Is there any plan to add others?
Specifically kube-ui, metrics and logging.
Even some notes on any possible issues doing so would be super handy.
Adding something similar to https://github.com/coreos/coreos-vagrant/blob/master/config.rb.sample#L90 would make it much easier to access services exposed via nodePort.
The upstream docs for the guestbook app make no mention of NodePorts. However, on platforms where there is no cloud-provider load balancer, accessing the service externally isn't very clear. See issue #65 for context.
Should add documentation or a blurb to the examples link describing how to use the NodePort that is created with the guestbook service.
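For example, here is a minimal sketch of exposing the guestbook frontend through a NodePort; the selector labels and the explicit nodePort value are assumptions (Kubernetes will also pick a port automatically if nodePort is omitted):
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: NodePort
  selector:
    app: guestbook
    tier: frontend
  ports:
    - port: 80
      nodePort: 30080
The frontend would then be reachable on any node at <node-ip>:30080.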
Even though the API endpoint can be TLS secured, service accounts currently have full access to the entire cluster and all its API endpoints. Meaning, if a single pod/container is compromised, everything is.
We're currently doing this manually (by having our own artefacts based on the ones in this repository), but it would be nice if it was possible to define the authorization plugin to use.
When using the OpenSSL documentation, running kubectl exec produces the following error:
kubectl --insecure-skip-tls-verify=false exec busybox -- nslookup kubernetes
error: Unable to upgrade connection: {
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "x509: certificate is valid for kube-worker, not coreos03",
"code": 500
}
As an insecure workaround I generated the worker csr like so:
openssl req -new -key worker-key.pem -out worker.csr -subj "/CN=*"
It seems the code currently assumes the ./clusters path doesn't exist, so it fails if you attempt to recreate a cluster.
For multi-node guides there should be documentation covering the port requirements for varying node types (controller / worker) so deployers can properly configure firewall rules.
Do you have any plans to create a guide for Google Compute Engine?
Multi-node setups (e.g. AWS) should secure access to etcd using TLS.
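A rough sketch of what the client-facing side could look like in the etcd cloud-config; the certificate paths are assumptions, the certificates themselves would have to be generated and distributed separately, and peer-to-peer TLS would need the analogous peer-* options:
etcd2:
  advertise-client-urls: https://$private_ipv4:2379
  listen-client-urls: https://0.0.0.0:2379
  cert-file: /etc/ssl/etcd/server.pem
  key-file: /etc/ssl/etcd/server-key.pem
  trusted-ca-file: /etc/ssl/etcd/ca.pem
  client-cert-auth: true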
On a master node, the kubelet will attempt to post status to the API over and over again.
We should add a note that this isn't an issue and also explain why this happens.
Create a CA and provision certificates for kube-apiserver
I notice that the cluster install scripts turn off CoreOS's update engine. Why is this necessary? This goes against one of the big benefits and motivations behind CoreOS as I understood them: to keep machines updated automatically and to encourage infrastructure that can withstand the loss of any individual machines (e.g. as they restart for an update.)
I assume there's a good reason updates are disabled for now, but once it's been explained why, I'd like to keep this issue open to track progress towards being able to turn them on.
/srv/kubernetes/manifests/kube-controller-manager.yaml
should be
/etc/kubernetes/manifests/kube-controller-manager.yaml (note the etc instead of srv).
This appears in multiple places in the document.
In addition to supporting HA masters by default (#90), it would also be good to have support for multi AZ deployments on AWS. A minimal HA setup would be 3 master nodes on 3 different AZs, with worker nodes also on different AZs.
Rather than extracting kube-proxy onto the host and running it as a systemd unit, it should be run as a static pod on each kubelet.
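A minimal sketch of such a static pod manifest, dropped into /etc/kubernetes/manifests/ on each node; the hyperkube image tag and the apiserver address are assumptions and would match whatever the installer already uses:
apiVersion: v1
kind: Pod
metadata:
  name: kube-proxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
    - name: kube-proxy
      image: gcr.io/google_containers/hyperkube:v1.0.6
      command:
        - /hyperkube
        - proxy
        # On workers this would point at the controller's secured endpoint instead.
        - --master=http://127.0.0.1:8080
      securityContext:
        privileged: true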
The kubelet cannot create the apiserver, podmaster and kube-proxy pods from /etc/kubernetes/manifests/:
Sep 23 04:23:10 c1 kubelet[1218]: E0923 04:23:10.172129 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:10 c1 kubelet[1218]: E0923 04:23:10.516622 1218 kubelet.go:1641] error getting node: node 172.17.4.101 not found
Sep 23 04:23:11 c1 kubelet[1218]: E0923 04:23:11.173514 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:11 c1 kubelet[1218]: E0923 04:23:11.174512 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:11 c1 kubelet[1218]: E0923 04:23:11.175847 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:12 c1 kubelet[1218]: E0923 04:23:12.175900 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:12 c1 kubelet[1218]: E0923 04:23:12.177491 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:12 c1 kubelet[1218]: E0923 04:23:12.177850 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.179284 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.186413 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.186447 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:13 c1 kubelet[1218]: E0923 04:23:13.206073 1218 event.go:194] Unable to write event: 'Post http://127.0.0.1:8080/api/v1/namespaces/kube-system/events: dial tcp 127.0.0.1:8080: connection refused' (may retry after sleeping)
Sep 23 04:23:14 c1 kubelet[1218]: E0923 04:23:14.188419 1218 reflector.go:136] Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:14 c1 kubelet[1218]: E0923 04:23:14.189941 1218 reflector.go:136] Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services: dial tcp 127.0.0.1:8080: connection refused
Sep 23 04:23:14 c1 kubelet[1218]: E0923 04:23:14.190631 1218 reflector.go:136] Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3D172.17.4.101: dial tcp 127.0.0.1:8080: connection refused
and no containers are up:
core@c1 /etc/systemd/system $ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
core@c1 /etc/systemd/system $
Hi,
I'm following the Kubernetes step-by-step guide.
Everything was fine until I hit an issue with the kube-controller.
It requests a cloud provider. If I provide vagrant, it fails, saying that vagrant does not implement a load balancer.
Maybe a comment about vagrant could be added at the start of the guide? Or maybe I'm simply missing something for things to work?
The following "I Wish" came from coreos.com/kubernetes/docs/latest/getting-started.html
I wish this page said what version of CoreOS is used in this tutorial; I can't find /etc/systemd/system/etcd2.service.d/40-listen-address.conf.
We probably need to add some language like "create the file..." or something similar.
Hey,
Can you write a description of how to access kube-ui with any of the authentication options of Kubernetes? How do you generate and get the tokens?
Cheers,
Luc Michalski
When setting up the kubelet, any flannel misconfiguration will bork Docker, which isn't immediately apparent.
Document some steps to 1) identify when this happens and 2) fix it.
Tried to run through the guestbook example and it worked for the most part, except there is no load balancer, so services aren't exposed by the cluster in the normal way.
I managed to get the guestbook example working since it was exposed as a NodePort, but that is still a bit confusing / limited.
I would love to use this stack for demos locally but I really need the load balancer.
Cheers
https://github.com/coreos/kube-up/blob/master/multi-node/generic/controller-cloud-config.yaml
https://github.com/coreos/kube-up/blob/master/multi-node/generic/worker-cloud-config.yaml
currently reference gists belonging to aaronlevy. We need to stop doing that.
In the document https://coreos.com/kubernetes/docs/latest/openssl.html, it states at the beginning that MASTER_HOST can be set to "the publicly routable IP or hostname of the master cluster." Then in the OpenSSL Config section it shows IP.2 = ${MASTER_HOST}.
Since IP.2 implies an IP address, and not a hostname, after discussion on IRC, IP.2 should not be set to a hostname. However we can use ${MASTER_DNS_NAME}, which is not mentioned above.
The kubelet cadvisor interface is currently only listening on localhost (via --cadvisor-port=0). We may want to expose cadvisor so metric collection tools (e.g. the heapster addon) can retrieve data from the nodes.
When using kube-aws, you are required to provide an externalDNSName, which is a DNS name you are expected to make routable to the controller instance. This is helpful, but is a very clunky way of making your cluster externally accessible. Additionally, the documentation is rather sparse and expects you to know what you're doing.
The main reason this exists is so the apiserver TLS certificate is valid for an external name before it is copied into the cluster.
This is clearly clunky, and could be simpler from the perspective of the deployer.
Following the docs I am able to start up a multi-node cluster with Vagrant but the Kubernetes bootstrap process seems to get stuck.
The etcd node comes up with no issues, but w1 and c1 are having trouble downloading the Docker images. After the podmaster image downloads, it seems that nothing else wants to download on c1. Likewise, on w1, nothing downloads after the pause image.
I also noticed that update-engine fails on w1 and c1 with this:
-- Logs begin at Sun 2015-09-20 19:36:12 UTC, end at Sun 2015-09-20 19:43:30 UTC. --
Sep 20 19:36:16 localhost systemd[1]: Starting Update Engine...
Sep 20 19:36:17 localhost update_engine[583]: [0920/193617:INFO:main.cc(155)] CoreOS Update Engine starting
Sep 20 19:36:17 localhost systemd[1]: Started Update Engine.
Sep 20 19:36:17 localhost update_engine[583]: [0920/193617:INFO:update_check_scheduler.cc(82)] Next update check in 9m23s
Sep 20 19:36:26 w1 systemd[1]: Stopping Update Engine...
Sep 20 19:36:27 w1 systemd[1]: update-engine.service: Main process exited, code=exited, status=1/FAILURE
Sep 20 19:36:27 w1 systemd[1]: Stopped Update Engine.
Sep 20 19:36:27 w1 systemd[1]: update-engine.service: Unit entered failed state.
Sep 20 19:36:27 w1 systemd[1]: update-engine.service: Failed with result 'exit-code'.
Any ideas? Maybe this is a disk space issue?
I have tested the single node Vagrant method and it is working.
It would be nice to be able to pick a different network CIDR for the Kubernetes VPC without having to recompile kube-aws & maintain a copy of the artifact files in a bucket.
When using the multi-node vagrant setup, updating one of the vars as follows has no effect.
$worker_vm_memory=2048
All VMs end up using 512 MB of memory anyway.
The only way to fix it is to modify the defaults in the Vagrantfile as follows, but this results in ALL VMs using 1024 MB.
diff --git a/multi-node/vagrant/Vagrantfile b/multi-node/vagrant/Vagrantfile
index 771b674..a85ede8 100644
--- a/multi-node/vagrant/Vagrantfile
+++ b/multi-node/vagrant/Vagrantfile
@@ -10,11 +10,11 @@ Vagrant.require_version ">= 1.6.0"
$update_channel = "alpha"
$controller_count = 1
-$controller_vm_memory = 512
+$controller_vm_memory = 1024
$worker_count = 1
-$worker_vm_memory = 512
+$worker_vm_memory = 1024
$etcd_count = 1
-$etcd_vm_memory = 512
+$etcd_vm_memory = 1024
CONFIG = File.expand_path("config.rb")
if File.exist?(CONFIG)
Template not working
Failed to apply cloud-config: Unable to decode /etc/kubernetes/ssl/ca.pem (Unable to decode base64: "illegal base64 data at input byte 64")
I provide it through the AWS form
Provide instructions for how to set up kubectl with a context from the generated ./clusters/<cluster-name>/kubeconfig file. Ideally kube-aws can merge this file with the default ~/.kube/config, or set the appropriate env vars for kubectl to find it automatically.
The instructions would become something like:
A kubectl config file will be written to ./clusters/<cluster-name>/kubeconfig, which is referenced automatically by the cluster name configured earlier.
kubectl can switch between multiple "contexts", each of which is a combination of user credentials and cluster information. Switch to your new cluster's context and then list the nodes:
$ kubectl config use-context ${CLUSTERNAME}
$ kubectl get nodes
NAME LABELS STATUS
ip-10-0-0-223.ec2.internal kubernetes.io/hostname=ip-10-0-0-223.ec2.internal Ready
ip-10-0-0-224.ec2.internal kubernetes.io/hostname=ip-10-0-0-224.ec2.internal Ready
ip-10-0-0-225.ec2.internal kubernetes.io/hostname=ip-10-0-0-225.ec2.internal Ready
I'm following the tutorial on https://coreos.com/kubernetes/docs/latest/kubernetes-on-aws.html and the command kube-aws describe is mentioned twice. However, using release 0.1.0, this subcommand does not exist:
$ kube-aws describe
Error: unknown command "describe" for "kube-aws"
Run 'kube-aws --help' for usage.
$ kube-aws --help
Manage Kubernetes clusters on AWS
Usage:
kube-aws [command]
Available Commands:
destroy Destroy an existing Kubernetes cluster
render Render a CloudFormation template
status Describe an existing Kubernetes cluster
up Create a new Kubernetes cluster
version Print version information and exit
help Help about any command
Flags:
--aws-debug[=false]: Log debug information from aws-sdk-go library
--config="cluster.yaml": Location of kube-aws cluster config file
-h, --help[=false]: help for kube-aws
Use "kube-aws [command] --help" for more information about a command.
$ kube-aws version
kube-aws version v0.1.0
I'm not sure if this command was renamed to kube-aws status or if it simply does not exist.
I'd be happy to submit a PR to fix the docs, changing describe to status or removing the mentions altogether, depending on what happened.
I have deployed RethinkDB to the Vagrant-based single-VM cluster and I am getting the following error when I try to retrieve the logs.
$ kubectl logs rethinkdb-rc-751i0
Internal Error: Unrecognized input header
It may be related to how Docker is configured to send logs to journald.
Rather, it gives worker nodes the IAMRoleController erroneously.