terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦

Home Page: https://registry.terraform.io/modules/terraform-aws-modules/eks/aws

License: Apache License 2.0

Languages: HCL 95.54%, Smarty 0.75%, Shell 2.09%, PowerShell 1.62%
Topics: terraform, terraform-module, aws, kubernetes, eks, elastic-kubernetes-service, aws-eks, aws-eks-cluster

terraform-aws-eks's People

Contributors

aliartiza75, antonbabenko, archifleks, barryib, betajobot, brandonjbjelland, bryantbiggs, bshelton229, chenrui333, daroga0002, dpiddockcmp, erks, huddy, jimbeck, laverya, max-rocket-internet, nauxliu, ozbillwang, rothandrew, sc250024, sdavids13, semantic-release-bot, shanmugakarna, sidprak, sppwf, stefansedich, stevehipwell, stijndehaes, tculp, yutachaos


terraform-aws-eks's Issues

ASG workers on spot instances

I have issues

It is great that with this module I can use more than one ASG worker pool. It would be nice to also be able to use spot instances, e.g. for background jobs or any application that can recover quickly from a replaced node.

I'm submitting a...

  • feature request

What is the current behavior?

Cannot use spot instances as worker nodes (or at least do not know how)

What's the expected behavior?

I could define that one (or all) of my worker node ASGs use spot instances.
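
For reference, a minimal sketch of the kind of interface this could take, assuming the worker_groups maps were extended with a spot_price key (an assumption here; the module may not pass such a key through at this version):

worker_groups = "${list(
  map(
    "name", "on-demand",
    "instance_type", "m4.large",
    "asg_desired_capacity", "2",
  ),
  map(
    "name", "spot-background-jobs",
    "instance_type", "m4.large",
    # Assumed key: forwarded to the launch configuration's spot_price.
    "spot_price", "0.05",
    "asg_desired_capacity", "2",
  ),
)}"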

Not Clear on EC2PrivateDNSName

I have issues

  • [ * ] support request

Should I be exporting my bastion's EC2 private DNS name before I run terraform apply?
I used a bastion host to provision the EKS cluster.
I am not clear on the EC2PrivateDNSName variable.

investigate adding create_before_destroy to worker asg to prevent downtime when recreating

I have issues

I hit an error when changing the worker instance type.

I'm submitting a

  • bug report

What is the current behavior

* module.eks.aws_launch_configuration.workers: 1 error(s) occurred:

* aws_launch_configuration.workers: Error creating launch configuration: AlreadyExists: Launch Configuration by this name already exists - A launch configuration already exists with the name eks-path-prod-0
	status code: 400, request id: xxx-xxx-xxx-xxx-xxx

If this is a bug, how to reproduce? Please include a code sample

deploy EKS cluster, then change the instance type and apply again.

What's the expected behavior

The change should apply without error; the launch configuration should be replaced without a naming conflict.
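
A common Terraform workaround (a sketch only, not the module's current code; names follow those used elsewhere on this page) is to let AWS generate a unique launch configuration name and create the replacement before destroying the old one:

resource "aws_launch_configuration" "workers" {
  # name_prefix lets AWS append a unique suffix, so a replacement never
  # collides with the existing launch configuration name.
  name_prefix   = "${var.cluster_name}-workers-"
  image_id      = "${data.aws_ami.eks_worker.id}"
  instance_type = "m5.large" # illustrative

  lifecycle {
    # Build the new launch configuration before tearing down the old one.
    create_before_destroy = true
  }
}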

Environment

  • Affected module version: 1.0.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Other relevant info

Cluster and worker security group specification doesn't work

I have issues

I am creating an EKS cluster while providing a cluster_security_group_id and worker_security_group_id.

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

When specifying a security group, i.e. cluster_security_group_id = "sg-123" or worker_security_group_id = "sg-123", I get:

Error: Error running plan: 1 error(s) occurred:
module.eks.local.cluster_security_group_id: local.cluster_security_group_id: Resource 'aws_security_group.cluster' not found for variable 'aws_security_group.cluster.id'

If this is a bug, how to reproduce? Please include a code sample

Create an eks cluster with a cluster_security_group_id or worker_security_group_id specified.

Terraform does not support short-circuit evaluation in its ternary operator. The fix for the issue is specified here.

What's the expected behavior

We should be able to create an EKS cluster while specifying cluster sg or worker sg as the documentation currently specifies.

Environment

  • Affected module version: 1.1.0
  • OS:
  • Terraform version: 0.11.7

Other relevant info

asg size changes should be ignored.

I have issues

asg size changes should be ignored.

I'm submitting a

  • feature request

What is the current behavior

I updated the ASG sizes after deployment; terraform apply detects the changes, which should be ignored.

At least changes in desired_capacity should be ignored.

  ~ module.eks.aws_autoscaling_group.workers
      desired_capacity:         "2" => "1"
      max_size:                 "5" => "3"
      min_size:                 "2" => "1"

What's the expected behaviour

Ignore the changes, since we don't want the running system to be re-sized.
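
A minimal sketch of what that could look like on the module's aws_autoscaling_group.workers resource (the same idea is proposed with a snippet in a later issue on this page):

resource "aws_autoscaling_group" "workers" {
  # ... existing arguments unchanged ...

  lifecycle {
    # Don't revert sizes changed outside Terraform (manually or by an autoscaler).
    ignore_changes = ["desired_capacity"]
  }
}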

Environment

  • Affected module version: 1.0.0
  • OS: ubuntu
  • Terraform version: 0.11.7

Other relevant info

If you are fine with ignoring changes to desired_capacity, I can raise a PR for this feature; please confirm.

kube-proxy doesn't exist in the latest AWS worker node AMI

I have issues

I'm submitting a

  • bug report

What is the current behavior

kube-proxy doesn't exist in the latest AWS worker node AMI, but the userdata template tries to restart it, which results in the error below:

Failed to restart kube-proxy.service: Unit not found.

What's the expected behavior

Remove kube-proxy from the restart step.

How to define nodeSelector with autoscaling?

I have issues

How do I define a nodeSelector with autoscaling?

I'm submitting a

  • support request

What is the current behavior

Pods can be deployed to any node.

What's the expected behavior

I can manually label several nodes with the kubectl label command and use a nodeSelector, but how do I make this work in an autoscaling environment?

I found code related to worker_groups, but I am not sure how to use it for labelling; a sketch follows the quoted example below.

https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/eks_test_fixture/main.tf#L19-L34

  # the commented out worker group list below shows an example of how to define
  # multiple worker groups of differing configurations
  # worker_groups = "${list(
  #                   map("asg_desired_capacity", "2",
  #                       "asg_max_size", "10",
  #                       "asg_min_size", "2",
  #                       "instance_type", "m4.xlarge",
  #                       "name", "worker_group_a",
  #                   ),
  #                   map("asg_desired_capacity", "1",
  #                       "asg_max_size", "5",
  #                       "asg_min_size", "1",
  #                       "instance_type", "m4.2xlarge",
  #                       "name", "worker_group_b",
  #                   ),
  # )}"
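
One way per-group node labelling is commonly handled is to pass kubelet flags through the worker user data; a sketch, assuming a kubelet_extra_args-style hook like the one in the bootstrap proposal further down this page (not an input of module version 1.0.0):

worker_groups = "${list(
  map(
    "name", "worker_group_a",
    "instance_type", "m4.xlarge",
    # Assumed key: extra kubelet flags used to label every node in this group,
    # so pods can target it with a nodeSelector such as workload=apps.
    "kubelet_extra_args", "--node-labels=workload=apps",
  ),
)}"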

Environment

  • Affected module version: 1.0.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Other relevant info

aws_auth config fails to apply while getting started

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Terraform apply fails with

* module.eks.null_resource.update_config_map_aws_auth: Error running command 'kubectl apply -f ./config-map-aws-auth_beam-eks.yaml --kubeconfig ./kubeconfig_beam-eks': exit status 1. Output: error: unable to recognize "./config-map-aws-auth_beam-eks.yaml": Unauthorized

If this is a bug, how to reproduce? Please include a code sample if relevant.

This is my configuration for the eks module.

I have a really basic vpc created via terraform-aws-modules/vpc/aws.

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "beam-eks"
  subnets      = "${module.vpc.public_subnets}"
  vpc_id       = "${module.vpc.vpc_id}"
}

What's the expected behavior?

Apply succeeds

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: "1.4.0"
  • OS: MacOS 10.13.3 (17D47)
    Terraform v0.11.8
  • provider.aws v1.33.0
  • provider.http v1.0.1
  • provider.local v1.1.0
  • provider.null v1.0.0
  • provider.template v1.0.0

it works with multiple worker groups in one EKS, thanks.

I want to say thanks for the hidden feature that lets me manage multiple worker groups in one EKS cluster; the tested version is v1.3.0.

The code below works properly.
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/eks_test_fixture/main.tf#L19-L34

My use case is that I need to manage two groups of nodes: one for applications and one for the monitoring service only. Later I will add more node groups (labelled for nodeSelector) for different purposes.

I'm submitting a...

  • kudos, thank you, warm fuzzy

Any other relevant info

Should we uncomment these lines or add another test case?

Better support for multiple clusters

I'm submitting a

  • [] feature request

A couple of changes would make it easier to work with multiple clusters.

  1. Include the cluster name in the file name here by default. This way other clusters won't overwrite the same file.
  2. Include the cluster name in the configuration here. This will make some keys in here unique, which makes it easier to merge the configuration without manual adjustments.

Be able to define per ASGs tags

I have issues

Tags currently are too prescriptive. I have a use case where I need to tag different ASGs with different tags. I'm using the ability to push these tags down to node labels and taints to drive different workloads on my Kubernetes cluster. At the moment, it seems that I can only define tags once on the top-level EKS module, and these tags are used globally throughout. I would like to be able to define tags per ASG. A sensible place to provide these seems to be the list of worker_groups maps.

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Tags are defined once in var.tags and used throughout, both to tag the cluster resources themselves as well as all ASGs that are created.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

We should be able to provide tags in the list of worker_groups maps, and these should be used to tag the corresponding ASGs created for each respective worker group. If the tags are not set, they can default to the existing top-level global tag variable.
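
A sketch of the proposed interface, assuming hypothetical per-group tag keys in the worker_groups maps (the exact shape would need to be decided, since these maps only hold strings in Terraform 0.11):

worker_groups = "${list(
  map(
    "name", "apps",
    "instance_type", "m4.xlarge",
    # Hypothetical keys: a tag applied only to this worker group's ASG.
    "tag_key", "workload",
    "tag_value", "apps",
  ),
  map(
    "name", "monitoring",
    "instance_type", "m4.large",
    "tag_key", "workload",
    "tag_value", "monitoring",
  ),
)}"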

Are you able to fix this problem and submit a PR? Link here if you have already.

I can submit a PR if this is reasonable.

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

Support for the new amazon-eks-node-* AMI with bootstrap script

I have issues

The new amazon-eks-node-* AMI with bootstrap script has been released. However, it's not backward compatible with the old AMI and doesn't work with this module.

https://aws.amazon.com/blogs/opensource/improvements-eks-worker-node-provisioning/

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

This module only works with the eks-worker-* AMIs.

If this is a bug, how to reproduce? Please include a code sample if relevant.

N/A

What's the expected behavior?

This module should also work with the new amazon-eks-node-* AMI. The entire userdata.sh.tpl can be reduced to something like this:

# Allow user supplied pre userdata code
${pre_userdata}

# Bootstrap and join the cluster
/etc/eks/bootstrap.sh --b64-cluster-ca '${cluster_auth_base64}' --apiserver-endpoint '${endpoint}' --kubelet-extra-args '${kubelet_extra_args}' '${cluster_name}'

# Allow user supplied userdata code
${additional_userdata}
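
Switching to the new AMI family would also mean updating the AMI lookup. A sketch, assuming the module keeps a data source shaped like its current one:

data "aws_ami" "eks_worker" {
  most_recent = true
  owners      = ["602401143452"] # Amazon EKS AMI account

  filter {
    name   = "name"
    # New AMI naming scheme that ships /etc/eks/bootstrap.sh.
    values = ["amazon-eks-node-*"]
  }
}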

Are you able to fix this problem and submit a PR? Link here if you have already.

I can contribute, but would like to discuss on how we want to approach backward compatibility first.

Environment details

  • Affected module version: 1.4.0
  • OS: all
  • Terraform version: all

Any other relevant info

See:

Should we manage k8s resources with this module?

This is a general question about the direction of this module.

We get requests that would require this module to manage or create Kubernetes resources. Some examples:

  • Modifying CNI configuration before worker ASG creation: #96
  • Deploying cluster autoscaler: #71
  • Manage add-ons: #19

I think we should have a clear position on these types of issues.

Specify multiple cluster/worker security groups

I have issues

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

The EKS module currently only supports passing in one cluster security group and one worker security group by ID.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

I think it would make sense to support specifying an array of security group ids.
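
A sketch of the kind of input that could enable this (the variable name is hypothetical, not an existing module input):

# Hypothetical input: extra security groups attached to workers in addition
# to the group the module creates or is given.
variable "worker_additional_security_group_ids" {
  description = "Additional security group IDs to attach to the worker instances"
  type        = "list"
  default     = []
}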

Environment

  • Affected module version:
  • OS:
  • Terraform version:

Other relevant info

We have a use case where we need to attach multiple security groups, some of which are predefined.

Avoid using hardcoded value for max pod per node

Right now in the user-data script we have

sed -i s,MAX_PODS,20,g /etc/systemd/system/kubelet.service

The value 20 is hardcoded right now. Since AWS released the numbers in their CloudFormation template, I think we can extract the value and use a lookup function to get the proper value.

A proposal:

locals {
  # Mapping from the node type that we selected and the max number of pods that it can run
  # Taken from https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-06-05/amazon-eks-nodegroup.yaml
  max_pod_per_node = {
    "c4.large"    = 29
    "c4.xlarge"   = 58
    "c4.2xlarge"  = 58
    "c4.4xlarge"  = 234
    "c4.8xlarge"  = 234
    "c5.large"    = 29
    "c5.xlarge"   = 58
    "c5.2xlarge"  = 58
    "c5.4xlarge"  = 234
    "c5.9xlarge"  = 234
    "c5.18xlarge" = 737
    "i3.large"    = 29
    "i3.xlarge"   = 58
    "i3.2xlarge"  = 58
    "i3.4xlarge"  = 234
    "i3.8xlarge"  = 234
    "i3.16xlarge" = 737
    "m3.medium"   = 12
    "m3.large"    = 29
    "m3.xlarge"   = 58
    "m3.2xlarge"  = 118
    "m4.large"    = 20
    "m4.xlarge"   = 58
    "m4.2xlarge"  = 58
    "m4.4xlarge"  = 234
    "m4.10xlarge" = 234
    "m5.large"    = 29
    "m5.xlarge"   = 58
    "m5.2xlarge"  = 58
    "m5.4xlarge"  = 234
    "m5.12xlarge" = 234
    "m5.24xlarge" = 737
    "p2.xlarge"   = 58
    "p2.8xlarge"  = 234
    "p2.16xlarge" = 234
    "p3.2xlarge"  = 58
    "p3.8xlarge"  = 234
    "p3.16xlarge" = 234
    "r3.xlarge"   = 58
    "r3.2xlarge"  = 58
    "r3.4xlarge"  = 234
    "r3.8xlarge"  = 234
    "r4.large"    = 29
    "r4.xlarge"   = 58
    "r4.2xlarge"  = 58
    "r4.4xlarge"  = 234
    "r4.8xlarge"  = 234
    "r4.16xlarge" = 737
    "t2.small"    = 8
    "t2.medium"   = 17
    "t2.large"    = 35
    "t2.xlarge"   = 44
    "t2.2xlarge"  = 44
    "x1.16xlarge" = 234
    "x1.32xlarge" = 234
  }

  workers_userdata = <<USERDATA
#!/bin/bash -xe
CA_CERTIFICATE_DIRECTORY=/etc/kubernetes/pki
CA_CERTIFICATE_FILE_PATH=$CA_CERTIFICATE_DIRECTORY/ca.crt
mkdir -p $CA_CERTIFICATE_DIRECTORY
echo "${aws_eks_cluster.this.certificate_authority.0.data}" | base64 -d >  $CA_CERTIFICATE_FILE_PATH
INTERNAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.this.endpoint},g /var/lib/kubelet/kubeconfig
sed -i s,CLUSTER_NAME,${var.cluster_name},g /var/lib/kubelet/kubeconfig
sed -i s,REGION,${data.aws_region.current.name},g /etc/systemd/system/kubelet.service
sed -i s,MAX_PODS,${lookup(local.max_pod_per_node, var.workers_instance_type)},g /etc/systemd/system/kubelet.service
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.this.endpoint},g /etc/systemd/system/kubelet.service
sed -i s,INTERNAL_IP,$INTERNAL_IP,g /etc/systemd/system/kubelet.service
DNS_CLUSTER_IP=10.100.0.10
if [[ $INTERNAL_IP == 10.* ]] ; then DNS_CLUSTER_IP=172.20.0.10; fi
sed -i s,DNS_CLUSTER_IP,$DNS_CLUSTER_IP,g /etc/systemd/system/kubelet.service
sed -i s,CERTIFICATE_AUTHORITY_FILE,$CA_CERTIFICATE_FILE_PATH,g /var/lib/kubelet/kubeconfig
sed -i s,CLIENT_CA_FILE,$CA_CERTIFICATE_FILE_PATH,g  /etc/systemd/system/kubelet.service
systemctl daemon-reload
systemctl restart kubelet kube-proxy
USERDATA
}

@brandoconnor Please let me know if this is OK, I'll create a fork and a pull request later

Feature Request: Worker Configuration

I'm submitting a

  • feature request

What is the current behavior

Worker node number and instance type cannot be configured.

What's the expected behavior

Configuration options for worker node number and instance type can be specified in module inputs.

Include autoscaling related IAM policies for workers for the cluster-autoscaler

Currently we have to add the policy outside this module, but I think 90% of people will use the cluster-autoscaler, so it would be cool to have it included in this module and perhaps enabled with a variable.
kops currently has this by default here.

The policy would look something like this:

data "aws_iam_policy_document" "eks_node_autoscaling" {
  statement {
    sid    = "eksDemoNodeAll"
    effect = "Allow"

    actions = [
      "autoscaling:DescribeAutoScalingGroups",
      "autoscaling:DescribeAutoScalingInstances",
      "autoscaling:DescribeLaunchConfigurations",
      "autoscaling:DescribeTags",
      "autoscaling:GetAsgForInstance",
    ]

    resources = ["*"]
  }

  statement {
    sid    = "eksDemoNodeOwn"
    effect = "Allow"

    actions = [
      "autoscaling:SetDesiredCapacity",
      "autoscaling:TerminateInstanceInAutoScalingGroup",
      "autoscaling:UpdateAutoScalingGroup",
    ]

    resources = ["*"]

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/Name"
      values   = ["xxxx-eks_asg"]
    }
  }
}

This would allow the cluster-autoscaler the access it needs to run correctly.

What do you think?
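
If the policy document above were adopted, wiring it up to the module's worker role could look roughly like this (resource names are illustrative):

resource "aws_iam_policy" "worker_autoscaling" {
  name_prefix = "eks-worker-autoscaling-"
  policy      = "${data.aws_iam_policy_document.eks_node_autoscaling.json}"
}

resource "aws_iam_role_policy_attachment" "workers_autoscaling" {
  # aws_iam_role.workers is the worker node role the module already creates.
  role       = "${aws_iam_role.workers.name}"
  policy_arn = "${aws_iam_policy.worker_autoscaling.arn}"
}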

Security group "workers_ingress_cluster" is very limiting

Currently, in workers.tf, we have this security group:

resource "aws_security_group_rule" "workers_ingress_cluster" {
  description              = "Allow workers Kubelets and pods to receive communication from the cluster control plane."
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.workers.id}"
  source_security_group_id = "${local.cluster_security_group_id}"
  from_port                = 1025
  to_port                  = 65535
  type                     = "ingress"
  count                    = "${var.worker_security_group_id == "" ? 1 : 0}"
}

Basically, this setting makes it impossible for Kubernetes services to access pods that have containerPort set to anything below 1025, which is a huge issue since so many of them use port 80 (e.g. nginx). So from_port should be set to 0, not 1025.

I realize this is copied from CloudFormation in the official EKS guide, so I'll also submit an issue there.

Override the default ingress rule that allows communication with the EKS cluster API.

I have issues

I would prefer to use the default security groups created for the cluster, but do not want the default API/32 to be used.

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

Currently, if you use the default security groups, the module creates a security group rule that allows communication with the EKS cluster over the current API/32 CIDR.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

I want to override the API/32 cidr and specify my own.

Environment

  • Affected module version: 1.1.0
  • OS:
  • Terraform version: 0.11.7

Other relevant info

Using computed values in worker group parameters results in `value of 'count' cannot be computed` error

I have issues

I'm submitting a...

  • bug report

What is the current behavior?

terraform plan produces this output when any worker group parameters are computed values:

laverya:~/dev$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.http.workstation_external_ip: Refreshing state...
data.aws_region.current: Refreshing state...
data.aws_availability_zones.available: Refreshing state...
data.aws_iam_policy_document.cluster_assume_role_policy: Refreshing state...
data.aws_iam_policy_document.workers_assume_role_policy: Refreshing state...
data.aws_ami.eks_worker: Refreshing state...

Error: Error refreshing state: 1 error(s) occurred:

* module.eks.data.template_file.userdata: data.template_file.userdata: value of 'count' cannot be computed

If this is a bug, how to reproduce? Please include a code sample if relevant.

provider "aws" {
  version = "~> 1.27"
  region  = "us-east-1"
}

data "aws_availability_zones" "available" {}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "1.37.0"
  name    = "eks-vpc"
  cidr    = "10.0.0.0/16"
  azs     = ["${data.aws_availability_zones.available.names[0]}", "${data.aws_availability_zones.available.names[1]}", "${data.aws_availability_zones.available.names[2]}"]

  private_subnets    = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets     = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]

  tags = "${map("kubernetes.io/cluster/terraform-eks", "shared")}"
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "1.3.0"
  cluster_name = "terraform-eks"
  subnets = ["${module.vpc.private_subnets}", "${module.vpc.public_subnets}"]
  tags    = "${map("Environment", "test")}"
  vpc_id = "${module.vpc.vpc_id}"

  worker_groups = [
    {
      name          = "default-m5-large"
      instance_type = "m5.large"

      subnets = ""
      # subnets = "${join(",", module.vpc.private_subnets)}"
    },
  ]
}

Uncomment subnets = "${join(",", module.vpc.private_subnets)}" to replace subnets = "" in the worker_groups config and run terraform plan.

What's the expected behavior?

terraform plan completes and a plan is produced.

Are you able to fix this problem and submit a PR? Link here if you have already.

I have not yet identified the root cause.

Environment details

  • Affected module version: 1.3.0
  • OS: Ubuntu 16.04
  • Terraform version: Terraform v0.11.7

Any other relevant info

This makes it rather difficult to assign subnets to worker groups.
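
One workaround, assuming the module version in use supports the worker_group_count input that appears in the getting-started example further down this page, is to state the number of worker groups explicitly so count no longer depends on a computed value:

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  version      = "1.3.0"
  cluster_name = "terraform-eks"
  subnets      = ["${module.vpc.private_subnets}", "${module.vpc.public_subnets}"]
  vpc_id       = "${module.vpc.vpc_id}"

  worker_groups = [
    {
      name          = "default-m5-large"
      instance_type = "m5.large"
      subnets       = "${join(",", module.vpc.private_subnets)}"
    },
  ]

  # Static count, so Terraform does not have to derive it from the list above.
  worker_group_count = "1"
}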

Assign public IPs to EKS workers in private subnets.

I have issues

I created an EKS cluster in private subnets. We have already discussed this topic in several tickets and agreed to create EKS workers in private subnets only.

Now it's time to decide: should we keep the feature to assign public IPs to EKS workers?

If it is not required any more, I will raise a PR to remove this line directly. Otherwise, I have to guard it with a condition; which way would you prefer?

resource "aws_launch_configuration" "workers" {
  name_prefix                 = "${var.cluster_name}-${lookup(var.worker_groups[count.index], "name", count.index)}"
-  associate_public_ip_address = "${lookup(var.worker_groups[count.index], "public_ip", lookup(var.workers_group_defaults, "public_ip"))}"
  security_groups             = ["${local.worker_security_group_id}"]
  iam_instance_profile        = "${aws_iam_instance_profile.workers.id}"
  image_id                    = "${lookup(var.worker_groups[count.index], "ami_id", data.aws_ami.eks_worker.id)}"
  instance_type               = "${lookup(var.worker_groups[count.index], "instance_type", lookup(var.workers_group_defaults, "instance_type"))}"

I'm submitting a...

  • bug report
  • feature request

What is the current behavior?

When creating workers in private subnets, public IPs are still assigned to these workers.

Are you able to fix this problem and submit a PR? Link here if you have already.

Yes, I will

Environment details

  • Affected module version: v1.1.0
  • OS: ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Bug - EKS can not create load balancers after module provisioned in new AWS account

I have issues

Provisioning an EKS cluster in a new AWS account results in an error when attempting to provision a load balancer, if no load balancers of any kind have been provisioned before.

I'm submitting a...

  • bug report

What is the current behavior?

No previous load balancers ( i.e. service-link role AWSServiceRoleForElasticLoadBalancing doesn't exist)

AccessDenied: User: <MODULE-PROVISIONED-ROLE> is not authorized to perform: iam:CreateServiceLinkedRole on resource: arn:aws:iam::<ACCOUNT-ID>:role/aws-service-role/elasticloadbalancing.amazonaws.com/AWSServiceRoleForElasticLoadBalancing

This happens because EKS is attempting to create the ELB service-linked role for you, and the roles created by the module lack iam:CreateServiceLinkedRole.

If this is a bug, how to reproduce? Please include a code sample if relevant.

  • Provision EKS cluster using module into new account (or ensure service-link role AWSServiceRoleForElasticLoadBalancing doesn't exist)
  • Attempt to provision a Service of type LoadBalancer via kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.2
        ports:
        - containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: nginxservice
spec:
  type : LoadBalancer
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

What's the expected behavior?

EKS should provision load balancer.

The module should optionally provision (via a flag) an aws_iam_service_linked_role resource, or include updated IAM policies (iam:CreateServiceLinkedRole) to allow the EKS cluster to provision the required service-linked role. Alternatively, if this is deemed not the responsibility of the module, the "Assumptions" section in README.md should note the issue.
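
For reference, the opt-in resource could be sketched like this (the flag variable is hypothetical):

# Hypothetical flag; the service-linked role only needs to exist once per account.
variable "create_elb_service_linked_role" {
  default = false
}

resource "aws_iam_service_linked_role" "elasticloadbalancing" {
  count            = "${var.create_elb_service_linked_role ? 1 : 0}"
  aws_service_name = "elasticloadbalancing.amazonaws.com"
}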

Are you able to fix this problem and submit a PR? Link here if you have already.

Possibly, depending on the choice of solution (implementation change, documentation update)

Environment details

  • Affected module version: All

Any other relevant info

AWS Service Link FAQ:
https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/elb-service-linked-roles.html#create-service-linked-role

Cluster DNS does not function

I have issues

I'm submitting a...

  • support request

DNS

How is cluster DNS supposed to work? I have not been able to get pods to resolve any cluster addresses (including kubernetes.default) using EKS. I suspect it's a function of how the AWS VPC CNI works (or doesn't), and figured other people using this module must be running into the same problem; however, I can't seem to find much on the internet about this in EKS.

locals {
  worker_groups = "${list(
                  map(
                      "name", "k8s-worker",
                      "ami_id", "ami-73a6e20b",
                      "asg_desired_capacity", "5",
                      "asg_max_size", "8",
                      "asg_min_size", "5",
                      "instance_type","m4.large",
                      "key_name", "${aws_key_pair.infra-deployer.key_name}"
                      ),
  )}"
  tags = "${map("Environment", "${terraform.workspace}")}"
}

data "aws_vpc" "vpc" {
  filter {
    name   = "tag:env"
    values = ["${terraform.workspace}"]
  }

  filter {
    name   = "tag:Name"
    values = ["${terraform.workspace}-us-west-2"]
  }
}

data "aws_subnet_ids" "eks_subnets" {
  vpc_id = "${data.aws_vpc.vpc.id}"

  tags {
    env  = "${terraform.workspace}"
    Name = "${terraform.workspace}-eks*"
  }
}

module "eks" {
  source                = "terraform-aws-modules/eks/aws"
  cluster_name          = "${terraform.workspace}"
  subnets               = "${data.aws_subnet_ids.eks_subnets.ids}"
  vpc_id                = "${data.aws_vpc.vpc.id}"
  kubeconfig_aws_authenticator_env_variables = "${map("AWS_PROFILE", "infra-deployer" )}"
  map_accounts          = ["${lookup(var.aws_account_ids, "prod")}"]
  worker_groups         = "${local.worker_groups}"
  tags                  = "${local.tags}"
}

Trying DNS on a brand new cluster:

$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server:		172.20.0.10
Address:	172.20.0.10:53

** server can't find kubernetes.default: NXDOMAIN

*** Can't find kubernetes.default: No answer
$ kubectl exec -ti busybox -- cat /etc/resolv.conf
nameserver 172.20.0.10
search default.svc.cluster.local svc.cluster.local cluster.local staging.thinklumo.com us-west-2.compute.internal
options ndots:5

I've tried too many things to list here, and at this point I suspect it's an issue with EKS itself, so I'm hoping someone has been down this path already.

Are you able to fix this problem and submit a PR? Link here if you have already.

N/A

Environment details

  • Affected module version: latest
  • OS: AL2
  • Terraform version:
Terraform v0.11.7
+ provider.aws v1.25.0

Any other relevant info

Allow pre-userdata script on worker launch config

I have issues

I want to be able to run additional user data before the plugin user data on the worker launch config.
I am behind a proxy and need to configure the proxy information before anything else happens.

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

The module only provides a way to specify additional user data that runs after the module's own user data.
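
A sketch of what this could look like if the module exposed a pre-hook per worker group, mirroring the ${pre_userdata} slot shown in the bootstrap proposal elsewhere on this page (the key name here is an assumption):

worker_groups = "${list(
  map(
    "name", "proxied-workers",
    "instance_type", "m4.large",
    # Assumed key: rendered before the module's own bootstrap user data,
    # e.g. to export proxy settings.
    "pre_userdata", "export http_proxy=http://proxy.example.internal:3128",
  ),
)}"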

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

Environment

  • Affected module version: 1.1.0
  • OS:
  • Terraform version: 0.11.7

Other relevant info

Is there a way to modify the aws-k8s-cni yaml before creating the worker groups?

I have issues

The AWS CNI by default pre-allocates the max number of IPs per node which results in unnecessary depletion of my IP pool. As of CNI 1.1, you can fix this by setting WARM_IP_TARGET in the aws-k8s-cni.yaml but this needs to be applied before the EC2 instances are created.

Is there a way I can have Terraform apply a k8s config between creating the cluster and creating the worker groups? My current workaround is specifying 0 nodes in the Terraform module, applying my custom aws-k8s-cni.yaml, then changing the worker node count to my actual desired number.

Thanks!

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

The cluster is created using the release aws-k8s-cni.yaml which does not have WARM_IP_TARGET set.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: 1.3.0
  • OS: MacOS
  • Terraform version: v0.11.7

Any other relevant info

We should ignore changes to node ASG desired_capacity

I have issues

I'm submitting a

  • feature request

The reason is that after cluster creation, almost everyone will run the k8s node autoscaler. This autoscaler changes desired_capacity to suit the resources required by the cluster. So when the cluster autoscales and TF is run again later, you see something like this:

  ~ module.cluster_1.aws_autoscaling_group.workers
      desired_capacity:   "5" => "3"

You can just add a lifecycle statement to resource aws_autoscaling_group.workers:

  lifecycle {
    ignore_changes = [ "desired_capacity" ]
  }

AWS Profile in kubeconfig template

I'm submitting a

  • feature request

For my current delivery, the customer has credentials with multiple profiles and not only needs to specify different profiles per cluster, but has no default profile.

It would be great if the kubeconfig.tpl could be modified:

...
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: heptio-authenticator-aws
      args:
        - "token"
        - "-i"
        - "${cluster_name}"
      env:
        - name: AWS_PROFILE
          value: ${aws_profile}

where the default Terraform value used to populate the template would be "default", to ensure no regression.

I wanted to start a discussion before a PR to ensure best path forward on this. Thanks!
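
For what it's worth, a related knob already shows up elsewhere on this page: kubeconfig_aws_authenticator_env_variables can inject AWS_PROFILE into the generated kubeconfig. A sketch (profile name illustrative):

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "${local.cluster_name}"
  subnets      = "${module.vpc.private_subnets}"
  vpc_id       = "${module.vpc.vpc_id}"

  # Sets AWS_PROFILE in the authenticator exec block of the generated kubeconfig.
  kubeconfig_aws_authenticator_env_variables = "${map("AWS_PROFILE", "customer-profile")}"
}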

Fix for AWS EKS “is not authorized to perform: iam:CreateServiceLinkedRole”

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

After deploying EKS via this TF module in a brand new AWS account, the internet-facing k8s service I created could not create a load balancer. It turns out this is because it is a brand new AWS account in which no ELB has been created before, and the AWS user guide (as well as this module) assumes that AWSServiceRoleForElasticLoadBalancing already exists.

https://stackoverflow.com/questions/51597410/aws-eks-is-not-authorized-to-perform-iamcreateservicelinkedrole

Recommend adding

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:CreateServiceLinkedRole",
                "Resource": "arn:aws:iam::*:role/aws-service-role/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeAccountAttributes"
                ],
                "Resource": "*"
            }
        ]
    }

to the cluster role policy.

How to launch worker in private subnet

In the getting started example

module "vpc" {
  source             = "terraform-aws-modules/vpc/aws"
  version            = "1.14.0"
  name               = "test-vpc"
  cidr               = "10.0.0.0/16"
  azs                = ["${data.aws_availability_zones.available.names[0]}", "${data.aws_availability_zones.available.names[1]}", "${data.aws_availability_zones.available.names[2]}"]
  private_subnets    = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets     = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
  enable_nat_gateway = true
  single_nat_gateway = true
  tags               = "${merge(local.tags, map("kubernetes.io/cluster/${local.cluster_name}", "shared"))}"
}

module "eks" {
  source             = "../.."
  cluster_name       = "${local.cluster_name}"
  subnets            = ["${module.vpc.public_subnets}", "${module.vpc.private_subnets}"]
  tags               = "${local.tags}"
  vpc_id             = "${module.vpc.vpc_id}"
  worker_groups      = "${local.worker_groups}"
  worker_group_count = "1"
  map_roles          = "${var.map_roles}"
  map_users          = "${var.map_users}"
  map_accounts       = "${var.map_accounts}"
}

Both private and public subnets are passed to the eks module as a single variable. How does the eks module determine which subnets are public and which are private, and thus launch workers into private subnets only?

can't update launch configuration.

I have issues

Recently I upgraded from release 1.1.0 to 1.3.0 and made some changes to the launch configuration, such as the key name.

I'm submitting a...

  • bug report

What is the current behavior?

* aws_launch_configuration.workers (deposed #0): ResourceInUse: Cannot delete launch configuration project-prod-020180630105107074900000001 because it is attached to AutoScalingGroup project-prod-monitoring
	status code: 400, request id: 46f32656-8661-11e8-9e77-51ef4818b760

If this is a bug, how to reproduce? Please include a code sample if relevant.

Change the AMI image ID, add/remove the key pair name, or make any other change that requires re-creating the launch configuration.

What's the expected behavior?

The launch configuration should be updated smoothly.

Are you able to fix this problem and submit a PR? Link here if you have already.

I am still investigating this issue; if I can fix it, I will raise a PR.

Environment details

  • Affected module version: v1.1.0 -> v1.3.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Here is the fix someone mentioned:

hashicorp/terraform#532 (comment)

Assumption Missing: Install Kubectl

I have issues

I'm submitting a

  • bug report

What is the current behavior

There is no mention of the requirement to have kubectl installed before running the script. The module will fail while applying the plan.

Error: Error applying plan:

1 error(s) occurred:

* module.eks.null_resource.configure_kubectl: Error running command 'kubectl apply -f .//config-map-aws-auth.yaml --kubeconfig .//kubeconfig': exit status 127. Output: /bin/sh: 1: kubectl: not found

What's the expected behavior

Add a note and a link to the install instructions in the Assumptions section of the README.md.

AMI eks-worker-* query returned no results

I have issues

I'm submitting a...

  • bug report

What is the current behavior?

If region is set to us-west-1:

Error: Error refreshing state: 1 error(s) occurred:

  • module.eks.data.aws_ami.eks_worker: 1 error(s) occurred:

  • module.eks.data.aws_ami.eks_worker: data.aws_ami.eks_worker: Your query returned no results. Please change your search criteria and try again.

If this is a bug, how to reproduce? Please include a code sample if relevant.

module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = "test-eks-cluster"
vpc_id = "${module.vpc.default_vpc_id}"
subnets = "${module.vpc.public_subnets}"

tags = {
Environment = "test"
Terraform = "true"
}
}

What's the expected behavior?

An available AWS AMI ID.

Are you able to fix this problem and submit a PR? Link here if you have already.

Not sure how.

Environment details

  • Affected module version: 1.3.0
  • OS: Linux
  • Terraform version: 0.11.7

Any other relevant info

Generated config not saved to correct output path

I have issues with where the generated config-map is output.

I'm submitting a...

[ * ] bug report

What is the current behavior?

The generated config-map-aws-auth***.yaml and kubeconfig files are saved to the root folder, with the value of the config_output_path variable prepended to the file name rather than used as a directory.

If this is a bug, how to reproduce? Please include a code sample if relevant.

Create a cluster with a custom config_output_path

What's the expected behavior?

The config-map-aws-auth***.yaml file should be saved to the config_output_path

Are you able to fix this problem and submit a PR? Link here if you have already.

aws_auth.tf
line 3 - missing /
filename = "${var.config_output_path}/config-map-aws-auth_${var.cluster_name}.yaml"

line 9 - missing /
command = "kubectl apply -f ${var.config_output_path}/config-map-aws-auth_${var.cluster_name}.yaml --kubeconfig ${var.config_output_path}/kubeconfig_${var.cluster_name}"

kubectl.tf
line 3 - missing /
filename = "${var.config_output_path}/kubeconfig_${var.cluster_name}"

root_block_device missed

I have issues

After running the EKS cluster for a week, I hit a disk space issue; I need a feature to control the root block device and extend it to a larger size.

I'm submitting a...

  • [X ] feature request

What is the current behavior?

There is no control over the root block device, so disk space runs out quickly.

What's the expected behavior?

The root_block_device mapping supports the following:

volume_type - (Optional) The type of volume. Can be "standard", "gp2", or "io1". (Default: "standard").
volume_size - (Optional) The size of the volume in gigabytes.
iops - (Optional) The amount of provisioned IOPS. This must be set with a volume_type of "io1".
delete_on_termination - (Optional) Whether the volume should be destroyed on instance termination (Default: true).

Currently, only delete_on_termination is enabled.
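
A sketch of how this might surface in the worker_groups maps, reusing the root_volume_size key that appears in another issue on this page plus an assumed root_volume_type key:

worker_groups = "${list(
  map(
    "name", "big-disk-workers",
    "instance_type", "m4.large",
    # Larger gp2 root volume so images and logs don't exhaust the disk.
    "root_volume_size", "100",
    "root_volume_type", "gp2",
  ),
)}"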

Are you able to fix this problem and submit a PR? Link here if you have already.

Yes, I will

Environment details

  • Affected module version: v1.1.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Worker ASG names should be exposed

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Worker ASG ARNs are exposed, but not names. ASG names are used in aws_autoscaling_attachment among other things.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Both are exposed
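
A sketch of the output and a typical consumer (output and resource names are illustrative, not the module's confirmed API):

# Inside the module: expose the ASG names alongside the existing ARNs.
output "workers_asg_names" {
  value = ["${aws_autoscaling_group.workers.*.name}"]
}

# In a root configuration: attach an existing ELB to the first worker ASG.
resource "aws_autoscaling_attachment" "workers_elb" {
  autoscaling_group_name = "${element(module.eks.workers_asg_names, 0)}"
  elb                    = "${aws_elb.legacy.id}"
}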

Are you able to fix this problem and submit a PR? Link here if you have already.

PR: #77

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

No logs exported to CloudWatch

I have issues

I have difficulty troubleshooting EKS node issues, for example an OutOfDisk issue. When I go to CloudWatch, there are no instance logs or /var/log/messages logs.

OutOfDisk Unknown Fri, 13 Jul 2018 00:11:01 +0000 Fri, 13 Jul 2018 00:11:43 +0000 NodeStatusUnknown Kubelet stopped posting node status.

Currently there is no key pair set, and I can't log in to the EKS nodes to check further.

I'm submitting a...

  • feature request
  • support request

What is the current behavior?

No logs are exported from the EKS cluster.

What's the expected behavior?

I need some way to review the logs when something goes wrong.

Are you able to fix this problem and submit a PR? Link here if you have already.

I am not sure how to fix this issue; I need help.

Environment details

  • Affected module version: v1.1.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Allow adding new users, roles, and accounts to the configmap/aws-auth

I have issues

Amazon's EKS access control is managed via the aws-auth configmap, which allows multiple IAM users and roles (cross-account capable) to be granted group membership. The current implementation only allows worker node access; this should be configurable to allow more access control rules to be specified per the documentation: https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

The current implementation only allows worker node access.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

Ability to specify role/user/account mappings for group membership.
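
The map_roles, map_users, and map_accounts inputs used in the getting-started example further down this page are one shape this could take; a sketch:

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "${local.cluster_name}"
  subnets      = "${module.vpc.private_subnets}"
  vpc_id       = "${module.vpc.vpc_id}"

  # Additional IAM roles, users and accounts rendered into the aws-auth configmap.
  map_roles    = "${var.map_roles}"
  map_users    = "${var.map_users}"
  map_accounts = "${var.map_accounts}"
}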

Environment

  • Affected module version: 1.1.0
  • OS: Linux
  • Terraform version: 0.11.7

Other relevant info

Error running command to update_config_map_aws_auth

I have issues

Please help! I ran everything with defaults other than setting the VPC and subnets.

I'm submitting a...

  • bug report
  • feature request
  • [X ] support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

I get this error running the Terraform apply
Error: Error applying plan:

1 error(s) occurred:

  • null_resource.update_config_map_aws_auth: Error running command 'kubectl apply -f .//config-map-aws-auth_EKSClusterTest.yaml --kubeconfig .//kubeconfig_EKSClusterTest': exit status 1. Output: Unable to connect to the server: getting token: exec: exec: "aws-iam-authenticator": executable file not found in %PATH%

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Set the config on the instance

Are you able to fix this problem and submit a PR? Link here if you have already.

No. Can I please have help?

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

First time setting this up!

Name tags are too prescriptive; allow them more flexibility but provide sensible defaults

I have issues

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

It's not possible to define exactly what the Name tag of any resource will be. This should be user-definable.

Other relevant info

I think another variable map containing tag_defaults, or a local variable (since computation is needed), will come in handy here. I will explore this in next week's cycles.

Workstation cidr possibly not doing what's intended?

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

If this is a bug, how to reproduce? Please include a code sample if relevant.

This is much more a question than an issue. I see the workstation cidr being allowed to access 443 in the security group attached to the eks cluster. I see the same thing in the terraform eks getting started post. My assumption was this would limit access to the kubernetes api (control plane) to that cidr. It doesn't do that and these control planes are fully accessible on the internet. Was the intention to allow only that cidr to access the control plane? I really wish that were possible. I'm likely completely missing the reason to allow that ingress.

What's the expected behavior?

I expected the control plane to be limited to the IP address in the cidr. My expectation may be completely wrong in which case maybe a different variable description could help.

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

Experience with blue/green using this module?

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Currently, in order to achieve blue/green deployment with worker groups (i.e. updating to a new AMI), I have to add a new worker group with the updated AMI, let them spin up, drain the old nodes so pods transition, then scale down the old worker group (set min/max/desired to 0).

This is not a terrible way of doing it but the problem is that the old ASG (and related resources) sticks around forever and there doesn't seem to be a way to clean up the old stuff without major surgery. If I change the AMI 3 times, I now have 3 worker groups - 2 inactive and scaled to 0 and one active.

Is there a better way of doing this with this module? There's a distinct possibility I'm missing some fundamental terraform concepts but this seems like a complex issue to me. My code ends up looking like this after a new worker group is fully deployed and the old is scaled down (you can see how even semi-frequent deployments would make this list long and leave a lot of trailing garbage):

                  map(
                      "name", "k8s-worker-179fc16f",
                      "ami_id", "ami-179fc16f",
                      "asg_desired_capacity", "0",
                      "asg_max_size", "0",
                      "asg_min_size", "0",
                      ),
                  map(
                      "name", "k8s-worker-67a0841f",
                      "ami_id", "ami-67a0841f",
                      "asg_desired_capacity", "5",
                      "asg_max_size", "8",
                      "asg_min_size", "5",
                      "instance_type","${lookup(var.worker_sizes, "${terraform.workspace}")}",
                      "key_name", "${aws_key_pair.infra-deployer.key_name}",
                      "root_volume_size", "48"
                      )

Environment details

  • Affected module version: latest with customizations
  • OS: OSX and AL2
  • Terraform version: 0.11.7

Any other relevant info

I have seen other code around the internet that does blue/green ASGs, but those are for much simpler use-cases IMO - a create_before_destroy and letting it rip would bring a K8s cluster down. I have no qualms with multiple apply steps - it's the cleanup part that I'm after.

Bring your own security group

I have issues...

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

Currently the module only supports creation of the security groups for the cluster and workers from within the module itself. Some of the rules are 100% necessary and others are just commonplace and therefore useful. The rules rely on dynamic values but could be applied just the same to a security group passed to the module instead of created within the module. This would give flexibility to the module consumer to provide their own security group and a predefined set of rules that might be tighter than what the module currently prescribes.

If this is a bug, how to reproduce? Please include a code sample

NA

What's the expected behavior

The module should be able to accept a security group ID as input for both the cluster and workers with rules defined outside the module.

Environment

  • Affected module version: current (0.2.0)
  • OS: All
  • Terraform version: 0.11.x

Feature Request: Key Pair for Worker Nodes

I'm submitting a

  • feature request

What is the current behavior

I want to access the worker nodes via SSH using a Key Pair (Private-Public Key). However there is no option which allows me to specify a key pair to be installed on the worker nodes.

What's the expected behavior

Provide an option to specify a key pair (local or already in AWS) and install it on the worker nodes. It should be possible to set the key name via a variable. For inspiration, have a look at the Terraform-DC/OS module: https://github.com/dcos/terraform-dcos/tree/master/aws#configure-aws-ssh-keys
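
The key_name entry used in worker_groups elsewhere on this page covers this; a minimal sketch:

resource "aws_key_pair" "workers" {
  key_name   = "eks-workers"
  public_key = "${file("~/.ssh/id_rsa.pub")}"
}

worker_groups = "${list(
  map(
    "name", "ssh-workers",
    "instance_type", "m4.large",
    # SSH key installed on each worker node in this group.
    "key_name", "${aws_key_pair.workers.key_name}",
  ),
)}"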

It seems the cluster it is running with Authorization enabled (like RBAC) and there is no permissions for the ingress controller. Please check the configuration

I have issues when deploy alb-ingress-controller

It seems the cluster it is running with Authorization enabled (like RBAC) and there is no permissions for the ingress controller. Please check the configuration

I'm submitting a

  • bug report
  • support request

What is the current behavior

I can't deploy alb-ingress-controller with the above error.

If this is a bug, how to reproduce? Please include a code sample

I am not 100% sure it is a bug.

After creating the EKS cluster with this module, I went through the steps up to step 4, where I got this error.

This step has additional help, but I am not sure how to apply it to an EKS cluster created with this module.

Deploy the modified alb-ingress-controller.

$ kubectl apply -f alb-ingress-controller.yaml

The manifest above will deploy the controller to the kube-system namespace. If you deploy it outside of kube-system and are using RBAC, you may need to adjust RBAC roles and bindings.

What's the expected behavior

Should work without error.

Environment

  • Affected module version: latest (a80c6e6)
  • OS: AWS EKS
  • Terraform version: 0.11.7

Other relevant info

Bug: Module ignores custom AMI ID

I have issues

I want to use the Ubuntu EKS image (a custom AMI), but my settings are ignored by the module.

I'm submitting a...

  • bug report

What is the current behaviour?

A custom AMI ID is ignored by the module. I tried to set the AMI ID to the following ami_id in the workers_group_defaults section:

workers_group_defaults = {
      ...
      ami_id               = "ami-39397a46"  # Ubuntu Image
      ...
}

The specified AMI is the Ubuntu EKS image for us-east-1. Ubuntu released a statement that they will support and update an image specifically for EKS. The image AMI ID's can be found here: https://cloud-images.ubuntu.com/aws-eks/?_ga=2.56651242.1343651116.1533683680-508754220.1533683680

I would like to use the Ubuntu image instead of the Amazon AMI to make my environment more portable.

If this is a bug, how to reproduce? Please include a code sample if relevant.

Try to set the ami_id in workers_group_defaults to the Ubuntu EKS AMI ID ami-39397a46 (us-east-1) or ami-6d622015 (us-west-2).

What's the expected behaviour?

Custom AMI ID's can be used for worker nodes. Specifically, Ubuntu EKS can be used with this module.

Are you able to fix this problem and submit a PR? Link here if you have already.

Maybe this is just a misunderstanding or it is super easy to fix. If not let me know.

Environment details

  • Affected module version: 1.4.0
  • OS: Ubuntu 18.04 LTS (Container)
  • Terraform version: v0.11.7

Allow worker nodes to be created in private subnets if eks cluster has both private and public subnets

I have issues

I'm submitting a

  • bug report
  • [x ] feature request
  • support request

What is the current behavior

Based on this guide from AWS, it is recommended that you specify both public and private subnets when creating your EKS cluster, but that you only create your worker nodes in your private subnets. The current behaviour of this module uses the same subnets for creating the EKS cluster as for placing the worker nodes.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

I believe it would be a good feature to add an additional (optional) list variable to the module called worker_subnets that will be used to create the worker nodes within. This means you can add private and public subnets to the subnets variable, but only add private subnets to the worker_subnets variable.
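
A sketch of the proposed usage (worker_subnets is the proposed, not-yet-existing variable):

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "${local.cluster_name}"
  vpc_id       = "${module.vpc.vpc_id}"

  # The control plane is given both subnet tiers...
  subnets = ["${module.vpc.public_subnets}", "${module.vpc.private_subnets}"]

  # ...while workers (proposed variable) only land in the private ones.
  worker_subnets = ["${module.vpc.private_subnets}"]
}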

Environment

  • Affected module version:
  • OS:
  • Terraform version:

Other relevant info

I have a branch with this feature on a fork; I will add a PR to be looked at.

Use of name_prefix

Currently we have:

resource "aws_iam_role" "workers" {
  name_prefix        = "${aws_eks_cluster.this.name}"
  assume_role_policy = "${data.aws_iam_policy_document.workers_assume_role_policy.json}"
}

resource "aws_iam_instance_profile" "workers" {
  name_prefix = "${aws_eks_cluster.this.name}"
  role        = "${aws_iam_role.workers.name}"
}

Is there a reason to use name_prefix instead of just name? I ask because the resultant names are things like my-cluster-20180808095045107900000005.

We have to create cross account IAM policies for things like ECR and it would be nice to have a predictable and consistent name for the roles 🙂

Automatic deployment of Cluster Autoscaler

I have issues

Although worker nodes are deployed as an autoscaling group, when EKS cannot schedule more pods because of missing resources (e.g. CPU), more nodes are not started by the ASG. It would be nice to deploy the Cluster Autoscaler automatically (or at least put some information in the README on how to do this) so we can benefit from the ASG.

I'm submitting a...

  • feature request

What is the current behavior?

ASG workers are not started even when EKS cannot schedule more pods because of missing CPU and we are still below the maximum size of the workers ASG.

What's the expected behaviour?

The expected behaviour would be:

  1. Deploy the Cluster Autoscaler: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
     (these permissions need to be added to the EC2 EKS IAM role:
     "autoscaling:DescribeAutoScalingGroups",
     "autoscaling:DescribeAutoScalingInstances",
     "autoscaling:SetDesiredCapacity",
     "autoscaling:TerminateInstanceInAutoScalingGroup"
     )
  2. Scale a sample application so it needs more than the whole CPU of a single VM.
  3. See that the Autoscaler is adding more nodes.

Environment details

EKS in us-east-1

  • Terraform version:
    v0.11.7
