vultr / terraform-vultr-condor

Kubernetes Deployment Tool for Vultr

HCL 94.49% Shell 5.51%

terraform-vultr-condor's Issues

[BUG] issue during the creation of controller instance

Describe the bug
I encountered the following error when I performed terraform apply:
....
module.condor.vultr_private_network.condor_network: Creating...
module.condor.vultr_firewall_group.condor_firewall[0]: Creating...
module.condor.vultr_private_network.condor_network: Creation complete after 0s [id=81e47861-ca70-4ec1-9b1c-ff2ffcdabb0f]
module.condor.vultr_ssh_key.cluster_provisioner: Creation complete after 0s [id=fd4d4d0a-3276-41f9-a1b8-3c8c376acbae]
module.condor.vultr_firewall_group.condor_firewall[0]: Creation complete after 0s [id=7f49a818-fc77-476d-bee1-ca7185dacf8c]
module.condor.vultr_instance.workers[0]: Creating...
module.condor.vultr_instance.controllers[0]: Creating...
module.condor.vultr_instance.workers[1]: Creating...
module.condor.vultr_instance.workers[1]: Still creating... [10s elapsed]
module.condor.vultr_instance.controllers[0]: Still creating... [10s elapsed]
module.condor.vultr_instance.workers[0]: Still creating... [10s elapsed]
module.condor.vultr_instance.controllers[0]: Still creating... [20s elapsed]
module.condor.vultr_instance.workers[0]: Still creating... [20s elapsed]
module.condor.vultr_instance.workers[1]: Still creating... [20s elapsed]
module.condor.vultr_instance.workers[1]: Still creating... [30s elapsed]
module.condor.vultr_instance.workers[0]: Still creating... [30s elapsed]
module.condor.vultr_instance.controllers[0]: Still creating... [30s elapsed]
module.condor.vultr_instance.workers[0]: Still creating... [40s elapsed]
module.condor.vultr_instance.workers[1]: Still creating... [40s elapsed]
module.condor.vultr_instance.controllers[0]: Still creating... [40s elapsed]
module.condor.vultr_instance.workers[0]: Provisioning with 'file'...
module.condor.vultr_instance.controllers[0]: Provisioning with 'file'...
module.condor.vultr_instance.workers[1]: Provisioning with 'file'...
module.condor.vultr_instance.workers[1]: Still creating... [50s elapsed]
module.condor.vultr_instance.controllers[0]: Still creating... [50s elapsed]
module.condor.vultr_instance.workers[0]: Still creating... [50s elapsed]
module.condor.vultr_instance.controllers[0]: Provisioning with 'remote-exec'...
module.condor.vultr_instance.controllers[0] (remote-exec): Connecting to remote host via SSH...
module.condor.vultr_instance.controllers[0] (remote-exec): Host: 216.128.129.184
module.condor.vultr_instance.controllers[0] (remote-exec): User: root
module.condor.vultr_instance.controllers[0] (remote-exec): Password: false
module.condor.vultr_instance.controllers[0] (remote-exec): Private key: false
module.condor.vultr_instance.controllers[0] (remote-exec): Certificate: false
module.condor.vultr_instance.controllers[0] (remote-exec): SSH Agent: true
module.condor.vultr_instance.controllers[0] (remote-exec): Checking Host Key: false
module.condor.vultr_instance.controllers[0] (remote-exec): Target Platform: unix
module.condor.vultr_instance.workers[1]: Still creating... [1m0s elapsed]
module.condor.vultr_instance.controllers[0]: Still creating... [1m0s elapsed]
module.condor.vultr_instance.workers[0]: Still creating... [1m0s elapsed]
module.condor.vultr_instance.controllers[0] (remote-exec): Connected!
module.condor.vultr_instance.controllers[0] (remote-exec): + apt -y update
module.condor.vultr_instance.controllers[0] (remote-exec):
module.condor.vultr_instance.controllers[0] (remote-exec): Hit:1 http://security.debian.org/debian-security buster/updates InRelease
module.condor.vultr_instance.controllers[0] (remote-exec): Hit:2 http://deb.debian.org/debian buster InRelease
module.condor.vultr_instance.controllers[0] (remote-exec): Hit:3 http://deb.debian.org/debian buster-updates InRelease
module.condor.vultr_instance.controllers[0] (remote-exec): Reading package lists... Done
module.condor.vultr_instance.controllers[0] (remote-exec): Building dependency tree
module.condor.vultr_instance.controllers[0] (remote-exec): Reading state information... Done
module.condor.vultr_instance.controllers[0] (remote-exec): 9 packages can be upgraded. Run 'apt list --upgradable' to see them.
module.condor.vultr_instance.controllers[0] (remote-exec): + apt -y install jq gnupg2
module.condor.vultr_instance.controllers[0] (remote-exec): E: Could not get lock /var/lib/dpkg/lock-frontend - open (11: Resource temporarily unavailable)
module.condor.vultr_instance.controllers[0] (remote-exec): E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), is another process using it?

Something is definitely not right while the controller instance is being created. Can you take a look at the condor-provision.sh script?
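A hypothetical workaround sketch, not the module's actual code: on first boot, Debian's unattended-upgrades often holds the dpkg lock, which matches the "Could not get lock /var/lib/dpkg/lock-frontend" error above. Polling for the lock before installing packages, e.g. near the top of condor-provision.sh, may avoid the race:

```shell
#!/bin/sh
# wait_for_dpkg: block until the dpkg frontend lock can be taken.
# flock creates the lock file if it does not exist yet.
wait_for_dpkg() {
  lock="${1:-/var/lib/dpkg/lock-frontend}"
  until flock -n "$lock" true 2>/dev/null; do
    echo "waiting for dpkg lock on $lock..."
    sleep 5
  done
}

# Usage in the provisioning script would then be:
#   wait_for_dpkg && apt -y update && apt -y install jq gnupg2
```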

[BUG] - Kubeadm init fails with timeout

Describe the bug
I'm trying to provision a cluster using the example code in the README, but it always times out when getting to kubeadm init:

Unfortunately, an error has occurred: timed out waiting for the condition

journalctl -xeu kubelet returns a load of different errors:

eviction_manager.go:260] eviction manager: failed to get summary stats: failed to get node info: node "condor-default-6be8751bc7f10eb7-controller-0" not found

kubelet.go:2163] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://10.240.0.3:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/condor-default-6be8751bc7f10eb7-controller-0?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

kubelet_node_status.go:563] Failed to set some node status fields: failed to validate nodeIP: node IP: "10.240.0.3" not found in the host's network interfaces
trace.go:205] Trace[1635771837]: "Reflector ListAndWatch" name:k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46 (03-Sep-2021 13:40:04.283) (total time: 30000ms):

reflector.go:138] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://10.240.0.3:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dcondor-default-6be8751bc7f10eb7-controller-0&limit=500&resourceVersion=0": dial tcp 10.240.0.3:6443: i/o timeout

reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://10.240.0.3:6443/api/v1/services?limit=500&resourceVersion=0": dial tcp 10.240.0.3:6443: i/o timeout

reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.RuntimeClass: failed to list *v1.RuntimeClass: Get "https://10.240.0.3:6443/apis/node.k8s.io/v1/runtimeclasses?limit=500&resourceVersion=0": dial tcp 10.240.0.3:6443: i/o timeout

controller.go:144] failed to ensure lease exists, will retry in 7s, error: Get "https://10.240.0.3:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/condor-default-6be8751bc7f10eb7-controller-0?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

# With loads of these in between
kubelet.go:2243] node "condor-default-6be8751bc7f10eb7-controller-0" not found

My module configuration is:

module "condor" {
  source  = "vultr/condor/vultr"
  version = "1.2.0"
  cluster_vultr_api_key = var.vultr_api_key
  provisioner_public_key = chomp(file("~/.ssh/id_rsa.pub"))
  cluster_region = "lhr"
}

API key is set, whitelisted, I can connect to all 3 nodes fine.
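A quick diagnostic sketch (my assumption about where to look, not from the module docs): the kubelet errors above suggest the private-network address 10.240.0.3 was never actually configured on the controller's interface. A first check, run on the controller itself:

```shell
#!/bin/sh
# has_ip: succeed if the given IPv4 address is configured on any local interface.
has_ip() {
  ip -4 -o addr show 2>/dev/null | grep -qw "$1"
}

# If the private IP is missing, that matches the kubelet error
# 'node IP: "10.240.0.3" not found in the host's network interfaces'.
if has_ip 10.240.0.3; then
  echo "private IP configured"
else
  echo "private IP missing from host interfaces"
fi
```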

To Reproduce
Steps to reproduce the behavior:

  1. use above module code, terraform init, terraform apply
  2. wait 4-5 minutes
  3. See error

Expected behavior
For the cluster to be provisioned

Screenshots
(two screenshots from 2021-09-03 attached to the original issue)

Desktop (please complete the following information where applicable):

  • OS: Ubuntu 20.04
  • Version: Terraform v1.0.5, terraform-vultr-condor v1.2.0

Additional context

Any help will be greatly appreciated, I must be going wrong somewhere if other people have got this working out of the box.
Thanks

Repository initialization failure regarding API key error

Description of the error:
I initialized a new repository containing only a main.tf, following the exact instructions in this repository's README, then set my API key as per the documentation:

# main.tf
module "condor" {
  source                 = "vultr/condor/vultr"
  version                = "1.1.1"
  provisioner_public_key = chomp(file("~/.ssh/id_rsa.pub"))
  cluster_vultr_api_key  = var.cluster_vultr_api_key
}

I've tried:

  • Using it directly as a string on condor module.
  • Importing it as a variable declared on main.tf
  • Importing it from another file as variable
  • Using it on the terminal as an input to the Terraform CLI interface
  • Exporting it as an environment variable (using the proper TF_VAR_ syntax)

If only cluster_vultr_api_key is set as a variable, it returns this error:

╷
│ Error: Missing required argument
│
│ The argument "api_key" is required, but was not set.
╵

If an api_key is declared alongside cluster_vultr_api_key, it returns:

╷
│ Error: Unsupported argument
│
│   on main.tf line 6, in module "condor":
│    6:   api_key = var.api_key
│
│ An argument named "api_key" is not expected here.
╵

But if you only declare api_key:

╷
│ Error: Missing required argument
│
│   on main.tf line 1, in module "condor":
│    1: module "condor" {
│
│ The argument "cluster_vultr_api_key" is required, but no definition was found.
╵

My environment

Terraform v0.15.0
on linux_amd64
+ provider registry.terraform.io/hashicorp/http v2.0.0
+ provider registry.terraform.io/hashicorp/null v3.0.0
+ provider registry.terraform.io/hashicorp/random v3.0.1
+ provider registry.terraform.io/vultr/vultr v2.1.3

This is on Debian 10.9 under WSL2. I've tried doing the exact same thing inside one of Vultr's VPSes and had the same results.

I remember that I once got it working by modifying the terraform-vultr-condor module itself, but I didn't write down which changes I made.

I also work with other Terraform states/plans on this machine and none of them shows similar errors, so I am not sure what the source of this problem is. Maybe the module was originally developed prior to the v0.13 changes? I will keep working on this and present a fork if I reach something worth committing. Thank you.
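A hedged guess at a workaround, not confirmed by the module docs: the 'The argument "api_key" is required' message comes from the vultr provider itself rather than from the module, so declaring the provider in the root configuration with its own api_key may satisfy it:

```hcl
# Assumption: the root module must configure the vultr provider itself;
# the module's cluster_vultr_api_key input is consumed separately.
provider "vultr" {
  api_key = var.cluster_vultr_api_key
}

module "condor" {
  source                 = "vultr/condor/vultr"
  version                = "1.1.1"
  provisioner_public_key = chomp(file("~/.ssh/id_rsa.pub"))
  cluster_vultr_api_key  = var.cluster_vultr_api_key
}
```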

[BUG] - Coredns Pods unable to be scheduled (start) because no CNI networks defined

Describe the bug
After applying the Terraform module, the cluster is in an essentially non-working state because CNI networking is not functioning and the CoreDNS pods will not start.

Error message from kubelet service:
Aug 3 03:31:20 guest kubelet: W0803 03:31:20.052496 6938 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Aug 3 03:31:20 guest kubelet: E0803 03:31:20.619335 6938 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

To Reproduce
Steps to reproduce the behavior:

  1. terraform apply
  2. kubectl get pods --all-namespaces

Expected behavior
CoreDNS pods should be in the Running state

Additional context

Note that the Flannel pods are in a running state
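A diagnostic sketch (my assumption about where to look, based on the kubelet warning above): check whether any CNI network config actually landed on the node:

```shell
#!/bin/sh
# check_cni: report whether any CNI network config exists in the given directory
# (kubelet reads /etc/cni/net.d; Flannel normally writes 10-flannel.conflist there).
check_cni() {
  dir="${1:-/etc/cni/net.d}"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "CNI config present in $dir"
  else
    echo "no CNI config in $dir (matches the 'no networks found' kubelet warning)"
  fi
}

check_cni
```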

Terraform Errors on validate

I am getting the following errors:

[root@vultrguest ~]# terraform validate

Error: Invalid resource type

  on .terraform/modules/cluster/vultr-controllers.tf line 1, in resource "vultr_server" "controllers":
   1: resource "vultr_server" "controllers" {

The provider provider.vultr does not support resource type "vultr_server".


Error: Error in function call

  on .terraform/modules/cluster/vultr-ssh.tf line 3, in resource "vultr_ssh_key" "provisioner":
   3:   ssh_key = trimspace(file("~/.ssh/id_rsa.pub"))

Call to function "file" failed: no file exists at /root/.ssh/id_rsa.pub.


Error: Invalid resource type

  on .terraform/modules/cluster/vultr-vpc.tf line 1, in resource "vultr_network" "cluster_network":
   1: resource "vultr_network" "cluster_network" {

The provider provider.vultr does not support resource type "vultr_network".


Error: Invalid resource type

  on .terraform/modules/cluster/vultr-workers.tf line 1, in resource "vultr_server" "workers":
   1: resource "vultr_server" "workers" {

The provider provider.vultr does not support resource type "vultr_server".


Invalid value for "path" parameter: no file exists at /root/.ssh/id_rsa.pub

I tried to run everything on a new server and this time got only one error from the following command:

[root@staging ~]# terraform validate

Error: Invalid function argument

  on .terraform/modules/cluster/vultr-ssh.tf line 3, in resource "vultr_ssh_key" "provisioner":
   3:   ssh_key = trimspace(file("~/.ssh/id_rsa.pub"))

Invalid value for "path" parameter: no file exists at /root/.ssh/id_rsa.pub;
this function works only with files that are distributed as part of the
configuration source code, so if this file will be created by a resource in
this configuration you must instead obtain this result from an attribute of
that resource.

The following two files do not exist on my CentOS server:

~/.ssh/id_rsa.pub
~/.ssh/id_rsa

If I create empty files at the above paths, then I get the following errors when I run terraform apply:



Error: Error creating SSH key: Unable to create SSH Key: Invalid SSH key.  Keys should be in authorized_keys format

  on .terraform/modules/cluster/vultr-ssh.tf line 1, in resource "vultr_ssh_key" "provisioner":
   1: resource "vultr_ssh_key" "provisioner" {



Error: Error creating network: Not currently enabled on your account, enable on https://my.vultr.com/network/

  on .terraform/modules/cluster/vultr-vpc.tf line 1, in resource "vultr_network" "cluster_network":
   1: resource "vultr_network" "cluster_network" {
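For the missing-key errors, a minimal sketch of generating the keypair the configuration reads (assuming the default ~/.ssh/id_rsa path from the examples above). Empty placeholder files will not work because Vultr validates the authorized_keys format, hence the "Invalid SSH key" error:

```shell
#!/bin/sh
# gen_provisioner_key: create the RSA keypair that the configuration's
# file("~/.ssh/id_rsa.pub") call expects to read.
gen_provisioner_key() {
  keyfile="${1:-$HOME/.ssh/id_rsa}"
  mkdir -p "$(dirname "$keyfile")"
  # -N "" sets an empty passphrase; use a real passphrase if preferred.
  [ -f "$keyfile" ] || ssh-keygen -t rsa -b 4096 -f "$keyfile" -N "" -q
}

# e.g.: gen_provisioner_key && terraform apply
```

The second error ("Not currently enabled on your account") is separate: private networking must be enabled in the Vultr control panel as the message says.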


[Feature] - Block Storage in More Locations?

This is kind of a dumb request, but I'm hoping someone else might see it, since I know I've asked support about this in the past already.

I've been hoping to hear more about expansion of block/object storage into locations outside of New Jersey for some time. I'd love to try out some of this new Kubernetes stuff, but I wouldn't be able to truly utilize it in production unless a West Coast location (preferably the LA region for us) were available for those products.

I'll keep my fingers crossed that the team is already hard at work trying to add those features in other locations already!

How to use user_data?

Can I use user_data instead of ssh_key_ids?

user_data would let me quickly create a system account.

I tried user_data, but it does not work.
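Not an answer from the module docs, but for reference: the underlying vultr_instance resource does accept a user_data argument (cloud-init), separate from ssh_key_ids; whether condor passes it through depends on the module version. A hedged sketch of what that looks like on a raw instance (all names below are placeholders):

```hcl
resource "vultr_instance" "example" {
  # region/plan/os_id etc. omitted
  user_data = <<-EOT
    #cloud-config
    users:
      - name: sysadmin            # placeholder account name
        groups: sudo
        shell: /bin/bash
        ssh_authorized_keys:
          - ssh-rsa AAAA...       # placeholder public key
  EOT
}
```

Note that user_data and ssh_key_ids are not mutually exclusive; both can be set on the same instance.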

Missing required argument even during execution of `terraform plan`

Describe the bug
When executing terraform plan -out main.tfplan, terraform complains about missing required argument

To Reproduce
Steps to reproduce the behavior:

  1. Execute terraform plan -out main.tfplan using the following main.tf

    module "condor" {
      source                     = "vultr/condor/vultr"
      version                    = "1.1.0"
      cluster_vultr_api_key      = "BLAHBLAH"
      provisioner_public_key     = chomp(file("/home/me/keypair/id_vultr.pub"))
      cluster_name               = "k8s"
      cluster_region             = "ewr"
      worker_count               = 2
      condor_network_subnet      = "10.10.10.0"
      condor_network_subnet_mask = 26
    }

  2. The following error appears

Error: Missing required argument

│ The argument "api_key" is required, but was not set.

According to https://registry.terraform.io/modules/vultr/condor/vultr/1.1.0?tab=inputs, the only required inputs are
cluster_vultr_api_key
provisioner_public_key


[BUG] variable "enable_backups"

Describe the bug
I encountered the following error when I performed terraform apply:
module.condor.vultr_instance.workers[1]: Creating...
module.condor.vultr_instance.workers[0]: Creating...
module.condor.vultr_instance.controllers[0]: Creating...

│ Error: error creating server: {"error":"Backups option must be either enabled or disabled.","status":400}

│ with module.condor.vultr_instance.controllers[0],
│ on .terraform/modules/condor/main.tf line 49, in resource "vultr_instance" "controllers":
│ 49: resource "vultr_instance" "controllers" {



│ Error: error creating server: {"error":"Backups option must be either enabled or disabled.","status":400}

│ with module.condor.vultr_instance.workers[1],
│ on .terraform/modules/condor/main.tf line 94, in resource "vultr_instance" "workers":
│ 94: resource "vultr_instance" "workers" {



│ Error: error creating server: {"error":"Backups option must be either enabled or disabled.","status":400}

│ with module.condor.vultr_instance.workers[0],
│ on .terraform/modules/condor/main.tf line 94, in resource "vultr_instance" "workers":
│ 94: resource "vultr_instance" "workers" {

I suspect that the following section of variables.tf is incorrect:
variable "enable_backups" {
  description = "Enable/disable instance backups"
  type        = bool
  default     = false
}

According to this link, the "backups" argument expects "enabled" or "disabled", not a boolean "true" or "false".
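A hedged sketch of a fix (hypothetical; the module's actual code may differ): keep the boolean variable but map it to the string the provider expects wherever the instance resources set backups:

```hcl
# Hypothetical mapping inside the module's vultr_instance resources:
# the provider wants "enabled"/"disabled", the module variable is a bool.
backups = var.enable_backups ? "enabled" : "disabled"
```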
