snowdrop / k8s-infra

Information to bootstrap VMs on a dedicated server or a local machine, and to set them up using Ansible playbooks

License: Apache License 2.0

Shell 27.51% Smarty 0.26% Python 2.73% Java 4.22% Jinja 65.06% Dockerfile 0.23%
infrastructure-as-code cloud utilities openshift kubernetes ansible-playbooks ansible minikube kind

k8s-infra's Introduction

Automating the deployment of a kubernetes/ocp cluster

Introduction

This project details the prerequisites and steps necessary to automate the installation of a Kubernetes (aka k8s) cluster or OpenShift 4 on top of one of the following cloud providers:

ℹ️
kind is not a cloud provider but a tool able to run a k8s cluster on a container engine

Before you start

⚠️

All the commands mentioned in this project are to be executed from the root folder of the repository, unless stated otherwise.

Prerequisites

This project uses Ansible. Check the Ansible documentation for installation and usage instructions.

To use the scripts and playbooks that are part of this project, some prerequisites are needed. It is not mandatory to install all of them; the following chapters mention which ones are needed.

Python

Several requirements are provided as Python libraries, including Ansible, and are listed in the requirements.txt file.

Using a Python Virtual Environment is recommended and can be created using the following command:

python3 -m venv .snowdrop-venv

After creating the virtual environment, start using it with the following command:

source .snowdrop-venv/bin/activate

The venv will be in use when the (.snowdrop-venv) prefix is shown on the bash prompt.

The python requirements can be installed by executing:

pip3 install -r requirements.txt
ℹ️

For more information check the Python Virtual Env section on our Ansible README.

Ansible

Several Ansible Galaxy collections are used as part of this project and are listed in the collections/requirements.yml file. To install them execute the following command.

ansible-galaxy collection install -r ./collections/requirements.yml --upgrade

Kind

Tools: docker (or podman), kind

To automate the installation of a k8s "kind" cluster locally, as well as to set up an ingress controller or a Docker container registry, use our opinionated bash scripts :-).

You can find more information about the kind tool in the official documentation - https://kind.sigs.k8s.io/docs/user/quick-start/
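
For reference, a plain kind cluster (without this project's opinionated scripts) can also be created directly with the kind CLI; the cluster name below is only illustrative:

# Create a local k8s cluster named "snowdrop" using the default node image
kind create cluster --name snowdrop

# Verify that the cluster answers
kubectl cluster-info --context kind-snowdrop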

Minikube

Tools: docker (or podman), minikube

See the official documentation to install minikube on macOS, Linux or Windows.
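
As a quick reference, and assuming docker is used as the driver, a local minikube cluster can be started as follows (see the official documentation for the other drivers and options):

# Start a local minikube cluster backed by the docker driver
minikube start --driver=docker

# Check that the cluster components are running
minikube status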

Cloud provider

The provisioning process towards the cloud providers relies on the following assumptions:

  • Password store is installed/configured and the needed key/value entries created (see the pass sketch below)

  • Flavor, volume, capacity (cpu/ram/volume) and OS can be mapped with the playbook of the target cloud provider

  • Permissions have been set to allow provisioning a VM on top of the target cloud provider

  • An SSH key exists and has been imported (or can be created during the provisioning process)

and will include the following basic steps:

  • Create a VM, mount a volume and import the SSH key

  • Execute a post-installation script to install some needed services

  • Register the hostnames against the domain name (using Let's Encrypt and the DNS provider)

  • Deploy an ocp4 cluster and configure the different ingress routes to access the console, API, registry, etc

ℹ️
Optionally, we could also install different Kubernetes tools if we would like to access/use the VM (e.g. kubectl, oc, helm, k9s, konfig, etc. - see the tooling section).
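
As an illustration of the password store assumption above, entries can be created and read with the pass CLI; the entry names below are only illustrative, the actual keys depend on the playbook of the target cloud provider:

# One-time initialisation of the password store with your GPG key
pass init <gpg-key-id>

# Create an illustrative key/value entry consumed by a provisioning playbook
pass insert openstack/api_key

# Read it back to verify
pass show openstack/api_key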

Red Hat

This section details how to provision an OpenShift 4 cluster using one of the available Red Hat environments, such as:

OpenStack - RHOS PSI

Tools: password store, ansible

The OpenStack page explains the process using the RHOS cloud provider.

Tools: password store, ansible

Work in progress

IBM Cloud

Tools: password store, ansible

See ibm-cloud

Hetzner

Bare metal

Tools: password store, ansible, hcloud

See the hetzner page explaining how to create a VM.

Virtualized machine

Tools: password store, ansible, hcloud

See the hetzner-cloud page explaining how to create a cloud VM.

Cluster Deployment

Now that the VM is running and the Docker daemon is up, you can install your k8s distribution using one of the following approaches:

Kubernetes

You can then use the following instructions to install a Kubernetes cluster with the help of Ansible and the roles we created.

OpenShift

  • Simple: using the oc binary and the oc cluster up command within the VM (see the sketch after this list)

  • More elaborate: using Ansible and one of the following playbooks/roles:

    • oc cluster up role

    • openshift-ansible all-in-one playbook as described here
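
For the simple approach, a minimal sketch of running oc cluster up inside the VM; the public hostname and routing suffix reuse the example values appearing elsewhere in this project:

# Start an all-in-one OpenShift 3.x cluster inside the VM
oc cluster up \
  --public-hostname=192.168.99.50 \
  --routing-suffix=192.168.99.50.nip.io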

Sandbox

Material that is no longer actively maintained for creating a VM, running a k8s cluster on your desktop, or provisioning it with Istio, Jaeger, Fabric8 Launcher, the Ansible Broker catalog, etc.

k8s-infra's People

Contributors

apupier, aureamunoz, bbaassssiiee, cmoulliard, dependabot[bot], geoand, jacobdotcosta, lincolnthree, rpelisse


k8s-infra's Issues

Refactor playbook add-users with role identity_provider

Refactor the playbook add-users with the role identity_provider in order to:

  • Convert the add_users.yml playbook to a role
  • Extract from the identity_provider role into the new role the task that installs the htpasswd package if it is not there, as well as the steps that create the users/passwords
  • Be able to add a range of users and projects
  • Be able to create an admin user and password; no project should be created in that case

Delete symbolic link of the redhat cert file under docker registry

After creating a new OpenShift cluster using the Ansible OpenShift playbooks, I several times had to delete this symbolic link within the Linux VM

/etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt

in order to let the OpenShift server download the openjdk1.8 image from the registry.access.redhat.com server

Remark: that was working pretty well 2 weeks ago.
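
A sketch of the manual workaround described above; restarting the docker daemon afterwards is an assumption, not something stated in the issue:

# Remove the stale Red Hat CA symlink so image pulls from registry.access.redhat.com succeed again
sudo rm /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt

# Assumption: restart the docker daemon so the change is picked up
sudo systemctl restart docker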

Refactor post_installation playbook to use new convention -> import|include_roles

Since Ansible 2.4, we can adopt this new convention, as we can:

  • Define the task to be used from the role and then have a single role for, for example, install and uninstall

  • Pass conditions

  • Old

---
- hosts: webservers
  roles:
    - { role: foo, tags: ["bar", "baz"] }
  • New
---
- hosts: webservers
  tasks:
  - import_role:
      name: foo
    tags:
    - bar
    - baz

Don't use the logged-in user when commands creating a project are used

When we create, for example, the infra project, our oc command is executed using the currently logged-in user, who may or may not have the appropriate role to create a project, configMap, serviceaccount ....

Nevertheless, if the role linked to the user is the one used, that means that he/she will be able to manage the content of the infra project. This is not an issue for the admin user, which is cluster-wide, but it is a problem for demo users (user1, user2, ....)

To secure our platform in that case, the following parameter should be passed to the oc command when a resource is created/deleted or edited

oc --config={{ openshift.common.config_base }}/admin.kubeconfig 

where {{ openshift.common.config_base }} could be : /etc/origin/master

If we create the 'infra' project as such

- name: Create project
  command: oc --config=/etc/origin/master/admin.kubeconfig new-project {{ infra_project }}

then the user can't access the content of the infra project


Permissions 0644 for 'inventory/id_openstack.rsa' are too open

TASK [Gathering Facts] *******************************************************************************************************************************************************************************************************************************************
fatal: [10.8.241.7]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\n@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @\r\n@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\nPermissions 0644 for 'inventory/id_openstack.rsa' are too open.\r\nIt is required that your private key files are NOT accessible by others.\r\nThis private key will be ignored.\r\nLoad key \"inventory/id_openstack.rsa\": bad permissions\r\[email protected]: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).\r\n", "unreachable": true}
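
The straightforward fix for the error above is to tighten the permissions of the private key before re-running the playbook:

# Make the private key readable by the owner only, as ssh requires
chmod 600 inventory/id_openstack.rsa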

Add a var to specify the path to access a playbook - openshift ansible

Due to the refactoring of the openshift-ansible project, which no longer uses the same paths to access the byo playbook or a module playbook, we should add a var to specify the root path used to access it, according to the openshift-ansible release.

E.g. the message you will get if you git clone openshift-ansible release-3.9 and run our playbook

ansible-playbook -i inventory/cloud_host playbook/post_installation.yml -e openshift_admin_pwd=${OPENSHIFT_ADMIN_PWD} --tags "identity_provider,enable_cluster_admin"

ERROR! Unable to retrieve file contents
Could not find or access '/Users/dabou/Downloads/snowdrop-infra/ansible/openshift-ansible/playbooks/byo/openshift-cluster/service-catalog.yml'
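
A minimal sketch of what the invocation could look like once such a var exists; openshift_ansible_playbook_dir is a hypothetical variable name, not one currently defined by the playbooks:

# Hypothetical var pointing at the playbook root of the cloned openshift-ansible release
ansible-playbook -i inventory/cloud_host playbook/post_installation.yml \
  -e openshift_admin_pwd=${OPENSHIFT_ADMIN_PWD} \
  -e openshift_ansible_playbook_dir=openshift-ansible/playbooks \
  --tags "identity_provider,enable_cluster_admin"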

Include istio playbook

We should include the istio playbook in order to install it, as we do for jaeger, nexus, jenkins.
Should we consider using Ansible Galaxy to deploy the istio playbook, or using the git clone command ... ?

Persistence issue reported by nexus playbook when re-executed

When we run the nexus role more than once (due to a timeout during the step configuring nexus), then the following error will be reported

TASK [install_nexus : Enable persistence] ***********************************************************************************************************************************************************************************************************
fatal: [192.168.99.50]: FAILED! => {"changed": true, "cmd": "oc  --config /etc/origin/master/admin.kubeconfig  volumes dc/nexus --add --name 'nexus-volume-1' --type 'pvc' --mount-path '/sonatype-work/' --claim-name 'nexus-pv' --claim-size '5G' --overwrite", "delta": "0:00:00.300711", "end": "2018-05-09 18:51:21.346746", "msg": "non-zero return code", "rc": 1, "start": "2018-05-09 18:51:21.046035", "stderr": "error: persistentvolumeclaims \"nexus-pv\" already exists\ninfo: deploymentconfigs \"nexus\" was not changed", "stderr_lines": ["error: persistentvolumeclaims \"nexus-pv\" already exists", "info: deploymentconfigs \"nexus\" was not changed"], "stdout": "", "stdout_lines": []}
        to retry, use: --limit @/Users/dabou/Code/snowdrop/cloud-native/lab/tmp/openshift-infra/ansible/playbook/post_installation.retry

This error occurs because nexus has already been installed and the pvc/pv already mounted

Launcher won't work when we use `openshift_env` as a meta role

Step

ansible-playbook -i inventory/cloud_host playbook/post_installation.yml \
     --tags install-launcher \
     -e launcher_catalog_git_repo=https://github.com/snowdrop/cloud-native-catalog.git \
     -e launcher_catalog_git_branch=master \
     -e launcher_github_username=YOUR_GIT_TOKEN \
     -e launcher_github_token=YOUR_GIT_USER

Error

TASK [openshift_env : Find yml files in {{ item }}] *************************************************************************************************************************************************************************************************
fatal: [192.168.99.50]: FAILED! => {"reason": "Unable to retrieve file contents\nCould not find or access '/Users/dabou/Code/snowdrop/cloud-native/lab/tmp/openshift-infra/ansible/playbook/determine_is_openshift_config_dir.yml'"}
fatal: [192.168.99.50]: FAILED! => {"reason": "Unable to retrieve file contents\nCould not find or access '/Users/dabou/Code/snowdrop/cloud-native/lab/tmp/openshift-infra/ansible/playbook/determine_is_openshift_config_dir.yml'"}

Update cloud-init to set timezone

Times diverge between local machine and VM

  1. Machine time
14:01
  2. VM time
[root@cloud ~]# date
Tue May 8 12:01:31 UTC 2018

Solution: change the timezone during VM creation with timedatectl set-timezone Europe/Brussels
How: https://access.redhat.com/solutions/2996411

# vi user-data
#cloud-config
password: redhat
chpasswd: { expire: False }
ssh_pwauth: True
ssh_authorized_keys:
   - ssh-ed25519 AAAAC3Naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa0 [email protected]
runcmd:
   - timedatectl set-timezone UTC

cat: /tmp/htpwd_ip_patch.json: No such file or directory

When we install the jenkins role on top of OpenShift where the htpasswd identity provider is not installed, then we get this error:

TASK [install_jenkins : Get patch file] ***********************************************************************************************************************************************************************************************************
fatal: [192.168.99.50]: FAILED! => {"changed": true, "cmd": ["cat", "/tmp/htpwd_ip_patch.json"], "delta": "0:00:00.004362", "end": "2018-05-07 08:10:05.887029", "msg": "non-zero return code", "rc": 1, "start": "2018-05-07 08:10:05.882667", "stderr": "cat: /tmp/htpwd_ip_patch.json: No such file or directory", "stderr_lines": ["cat: /tmp/htpwd_ip_patch.json: No such file or directory"], "stdout": "", "stdout_lines": []}

Use centos7 and openshift ansible 3.7 with vm (hetzner, ....)

We should focus on CentOS and 3.7, as the origin repo is added automatically when the installation takes place.

  • Repo added during installation on Centos7 vm
yum repolist -v
Repo-id      : centos-openshift-origin37
Repo-name    : CentOS OpenShift Origin
Repo-revision: 1514193298
Repo-updated : Mon Dec 25 09:15:01 2017
Repo-pkgs    : 44
Repo-size    : 301 M
Repo-baseurl : http://mirror.centos.org/centos/7/paas/x86_64/openshift-origin37/
Repo-expire  : 21600 second(s) (last: Sat Mar 17 09:46:41 2018)
  Filter     : read-only:present
Repo-filename: /etc/yum.repos.d/CentOS-OpenShift-Origin37.repo

and here are the packages deployed

yum list installed | grep origin
Failed to set locale, defaulting to C
origin.x86_64                        3.7.0-1.0.7ed6862                 @centos-openshift-origin37
origin-clients.x86_64                3.7.0-1.0.7ed6862                 @centos-openshift-origin37
origin-docker-excluder.noarch        3.7.0-1.0.7ed6862                 @centos-openshift-origin37
origin-excluder.noarch               3.7.0-1.0.7ed6862                 @centos-openshift-origin37
origin-master.x86_64                 3.7.0-1.0.7ed6862                 @centos-openshift-origin37

Remark: as mentioned by Michael Gugino, we should stop using openshift_repos_enable_testing as, in the end, it will mix different rpms during playbook execution

origin.x86_64                        3.7.0-1.0.7ed6862                 @centos-openshift-origin37
origin-clients.x86_64                3.7.0-1.0.7ed6862                 @centos-openshift-origin37
origin-master.x86_64                 3.7.0-1.0.7ed6862                 @centos-openshift-origin37
[root@cloud ~]# yum list installed | grep origin
Failed to set locale, defaulting to C
...
[root@cloud ~]# yum list installed | grep origin
Failed to set locale, defaulting to C
origin.x86_64                        3.7.1-1.el7.git.0.0a2d6a1         @centos-openshift-origin37-testing
origin-clients.x86_64                3.7.1-1.el7.git.0.0a2d6a1         @centos-openshift-origin37-testing
origin-master.x86_64                 3.7.1-1.el7.git.0.0a2d6a1         @centos-openshift-origin37-testing
origin-node.x86_64                   3.7.1-1.el7.git.0.0a2d6a1         @centos-openshift-origin37-testing

Enhance inventory_cloud j2 template to support deployment on openstack vs local, hetzner

Enhance inventory_cloud j2 template to support deployment on openstack vs local, hetzner.

A few modifications are required :

a) openshift_hostname
openshift_hostname=node_name -> openshift_hostname=ip

It would be good to see if we can also define a hostname for local and hetzner deployments.
If we use the CentOS ISO we created, the hostname command executed in the terminal of the VM returns: cloud, and for hetzner -> CentOS-74-64-minimal.
We should perhaps specify it explicitly as a name.

b) Node, master, etcd

[masters]
10.8.250.104 openshift_public_hostname=10.8.250.104 openshift_hostname=172.16.195.12
Change To : -->
192.168.99.50 openshift_public_hostname=192.168.99.50 openshift_ip=192.168.99.50

[etcd]
10.8.250.104
Change To : -->
192.168.99.50 openshift_ip=192.168.99.50

[nodes]
10.8.250.104 openshift_node_labels="{'region':'infra','zone':'default', 'node-role.kubernetes.io/compute': 'true'}" \
   openshift_public_hostname=10.8.250.104 \
   openshift_hostname=172.16.195.12
   
Change To : -->

192.168.99.50 openshift_node_labels="{'region':'infra','zone':'default', 'node-role.kubernetes.io/compute': 'true'}" \
   openshift_public_hostname=192.168.99.50 \
   openshift_ip=192.168.99.50

See diff file


Make the installation of the service catalog optional

As the installation of the service catalog will certainly fail when the cluster is created

TASK [ansible_service_broker : Create the Broker resource in the catalog] **************************************************************************************************************************************************************************
fatal: [192.168.99.50]: FAILED! => {"changed": false, "failed": true, "msg": {"cmd": "/usr/bin/oc create -f /tmp/brokerout-dJmL1S -n default", "results": {}, "returncode": 1, "stderr": "error: unable to recognize \"/tmp/brokerout-dJmL1S\": no matches for servicecatalog.k8s.io/, Kind=ClusterServiceBroker\n", "stdout": ""}}

I propose that we install it separately and include it as one of the modules that we can install such as jaeger, nexus, jenkins, ...

ansible-playbook -i inventory openshift-ansible/playbooks/byo/openshift-cluster/service-catalog.yml

Move playbooks from https://gitlab.cee.redhat.com/snowdrop/devops/tree/master/ocp-deploy

We still have internal playbooks that we could integrate here, as they aren't specific to the Red Hat infrastructure and could be used to create/delete OpenStack VMs, ....

Playbooks :

  • create_vm.yml (used to create/delete a vm on openstack)
  • keys.yml could be used as a generic way to create a private/public key pair and import it within the vm ....
  • config_dns could also become part of this project

Externalize config directory as it is different between oc cluster up, minishift

Externalize config directory as it is different between oc cluster up, minishift, ....

For example, the jenkins role uses a hard-coded reference which is different from the directory used by minishift or oc cluster up

- name: Set config file
  set_fact:
    config_file: /etc/origin/master/master-config.yaml

- name: Update configuration file
  shell: |
    echo "jenkinsPipelineConfig:" >> {{ config_file }}
    echo "  autoProvisionEnabled: false" >> {{ config_file }}

Study the idea to use ansible oc module

Since Ansible 2.4, it is possible to use the oc module. I therefore propose that we study the idea of using it instead of installing our own oc client, which is then used by our playbooks. That could resolve the issue on minishift, where it is more difficult to install a package within the CentOS or boot2docker image.

Service Catalog can't be deployed after running cluster role

Steps

ansible-playbook playbook/generate_inventory.yml -e ip_address=192.168.99.50 -e type=simple
ansible-playbook -i inventory/simple_host playbook/cluster.yml --tags "up"
ansible-playbook -i inventory/simple_host openshift-ansible/playbooks/openshift-service-catalog/config.yml -e openshift_master_unsupported_embedded_etcd=true

Error

TASK [Evaluate groups - Fail if no etcd hosts group is defined] ******************************************************************************************************
fatal: [localhost]: FAILED! => {
"changed": false, 
"msg": "Running etcd as an embedded service is no longer supported. If this is a new install please define an 'etcd' group with either one, three or five hosts.
These hosts may be the same hosts as your masters. If this is an upgrade please see https://docs.openshift.com/container-platform/latest/install_config/upgrading/migrating_embedded_etcd.html for documentation on how to migrate from embedded to external etcd.
"}
        to retry, use: --limit @/Users/dabou/Code/snowdrop/cloud-native/temp/openshift-infra/ansible/openshift-ansible/playbooks/openshift-service-catalog/config.retry

jenkins role - persistence issue

Persistent volumes exist but jenkins role fails

oc get pv 
NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                   STORAGECLASS   REASON    AGE
pv001     5Gi        RWO            Recycle          Bound       openshift-ansible-service-broker/etcd                            2h
pv002     5Gi        RWO            Recycle          Available                                                                    2h
pv003     5Gi        RWO            Recycle          Available                                                                    2h
pv004     5Gi        RWO            Recycle          Available                                                                    2h
pv005     5Gi        RWO            Recycle          Available                                                                    2h
pv006     5Gi        RWO            Recycle          Available                                                                    2h
pv007     5Gi        RWO            Recycle          Available                                                                    2h
pv008     5Gi        RWO            Recycle          Available                                                                    2h
pv009     5Gi        RWO            Recycle          Bound       infra/jenkins                                                    2h
pv010     5Gi        RWO            Recycle          Available                                                                    2h

Error

TASK [install_jenkins : Install jenkins-persistent] ***********************************************************************************************************************************************************************************************
fatal: [192.168.99.50]: FAILED! => {"changed": true, "cmd": ["oc", "new-app", "JENKINS_PASSWORD=admin123", "jenkins-persistent", "-n", "infra"], "delta": "0:00:00.454552", "end": "2018-05-07 09:11:15.808450", "msg": "non-zero return code", "rc": 1, "start": "2018-05-07 09:11:15.353898", "stderr": "    error: routes.route.openshift.io \"jenkins\" already exists\n    error: persistentvolumeclaims \"jenkins\" already exists\n    error: deploymentconfigs.apps.openshift.io \"jenkins\" already exists\n    error: serviceaccounts \"jenkins\" already exists\n    error: rolebindings.authorization.openshift.io \"jenkins_edit\" already exists\n    error: services \"jenkins-jnlp\" already exists\n    error: services \"jenkins\" already exists", "stderr_lines": ["    error: routes.route.openshift.io \"jenkins\" already exists", "    error: persistentvolumeclaims \"jenkins\" already exists", "    error: deploymentconfigs.apps.openshift.io \"jenkins\" already exists", "    error: serviceaccounts \"jenkins\" already exists", "    error: rolebindings.authorization.openshift.io \"jenkins_edit\" already exists", "    error: services \"jenkins-jnlp\" already exists", "    error: services \"jenkins\" already exists"], "stdout": "--> Deploying template \"openshift/jenkins-persistent\" to project infra\n\n     Jenkins\n     ---------\n    
 Jenkins service, with persistent storage.\n     \n     NOTE: You must have persistent volumes available in your cluster to use this template.\n\n     A Jenkins service has been created in your project.  Log into Jenkins with your OpenShift account.  The tutorial at https://github.com/openshift/origin/blob/master/examples/jenkins/README.md contains more information about using this template.\n\n     * With parameters:\n        * Jenkins Service Name=jenkins\n        * Jenkins JNLP Service Name=jenkins-jnlp\n        * Enable OAuth in Jenkins=true\n        * Memory Limit=512Mi\n        * Volume Capacity=1Gi\n        * Jenkins ImageStream Namespace=openshift\n        * Jenkins ImageStreamTag=jenkins:2\n\n--> Creating resources ...\n--> Failed", "stdout_lines": ["--> Deploying template \"openshift/jenkins-persistent\" to project infra", "", "     Jenkins", "     ---------", "     Jenkins service, with persistent storage.", "     ", "     NOTE: You must have persistent volumes available in your cluster to use this template.", "", "     A Jenkins service has been created in your project. 
Log into Jenkins with your OpenShift account.  The tutorial at https://github.com/openshift/origin/blob/master/examples/jenkins/README.md contains more information about using this template.", "", "     * With parameters:", "        * Jenkins Service Name=jenkins", "        * Jenkins JNLP Service Name=jenkins-jnlp", "        * Enable OAuth in Jenkins=true", "        * Memory Limit=512Mi", "        * Volume Capacity=1Gi", "        * Jenkins ImageStream Namespace=openshift", "        * Jenkins ImageStreamTag=jenkins:2", "", "--> Creating resources ...", "--> Failed"]}

Kube DNS resolution issue to access the pod using its service or route

oc get pods
NAME                     READY     STATUS    RESTARTS   AGE
jaeger-210917857-x84ht   1/1       Running   0          19m
jenkins-1-jz6z8          1/1       Running   0          19m
nexus-1-cn6vg            1/1       Running   0          22m

oc rsh nexus-1-cn6vg

and within the bash shell

1) Fail
curl http://nexus.infra.svc:8081
curl: (6) Could not resolve host: nexus.infra.svc; Unknown error

2) Succeeded
sh-4.2$ curl http://172.17.0.2:8081
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 404 Not Found</title>
</head>
<body>
<h2>HTTP ERROR: 404</h2>
<p>Problem accessing /. Reason:
<pre>    Not Found</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
                                                
</body>
</html>
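
A few hedged checks that can help narrow this down, verifying that the pod points at the cluster DNS and that the service actually exists:

# Check which nameserver the pod is using
oc rsh nexus-1-cn6vg cat /etc/resolv.conf

# Confirm the service and its endpoints exist in the infra project
oc get svc,endpoints -n infra

# Try the fully qualified service name from inside the pod
oc rsh nexus-1-cn6vg curl http://nexus.infra.svc.cluster.local:8081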

Improve playbook to import/authorize ssh_key

To ssh as root to the VM (hetzner, local), we use these commands to import our ssh_key as an authorized key

sshpass -f pwd.txt ssh -o StrictHostKeyChecking=no [email protected] -p 5222 "mkdir ~/.ssh && chmod 700 ~/.ssh && touch ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys"
sshpass -f pwd.txt ssh-copy-id -o StrictHostKeyChecking=no -i ~/.ssh/id_rsa.pub [email protected] -p 5222

We should improve the existing playbook to:

  • Create a generic private/public key and host it on a github repo
  • Import the public ssh_key as an authorized key within the target vm (see the sketch below)
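
A hedged sketch of what the improved playbook could automate using standard OpenSSH tooling; the key path is illustrative, while the host and port reuse the values shown above:

# Generate a dedicated key pair for the environment (illustrative path)
ssh-keygen -t ed25519 -f ~/.ssh/snowdrop_id -N ""

# Import the public key as an authorized key on the target vm
sshpass -f pwd.txt ssh-copy-id -o StrictHostKeyChecking=no -i ~/.ssh/snowdrop_id.pub [email protected] -p 5222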

Interesting articles :

cluster fails to restart when service_catalog = true

When we stop the server and restart it, then we get this k8s service catalog error

Origin version : 3.9.0-alpha3
Command used:

ansible-playbook -i inventory/cloud_host playbook/cluster.yml -e openshift_node=masters -e openshift_release_tag_name=v3.9.0-alpha.3 --tags "down"
ansible-playbook -i inventory/cloud_host playbook/cluster.yml -e openshift_node=masters -e openshift_release_tag_name=v3.9.0-alpha.3 --tags "start"

Error

{
  "changed": true,
  "cmd": [
    "oc",
    "cluster",
    "up",
    "--version=v3.9.0-alpha.3",
    "--host-config-dir=/var/lib/origin/openshift.local.config",
    "--host-data-dir=/var/lib/openshift/data",
    "--host-volumes-dir=/var/lib/openshift/volumes",
    "--host-pv-dir=/var/lib/openshift/pv",
    "--use-existing-config=True",
    "--public-hostname=192.168.99.50",
    "--routing-suffix=192.168.99.50.nip.io",
    "--loglevel=1",
    "--service-catalog=True"
  ],
  "delta": "0:00:13.124961",
  "end": "2018-03-28 10:00:57.619554",
  "msg": "non-zero return code",
  "rc": 1,
  "start": "2018-03-28 10:00:44.494593",
  "stderr": "I0328 10:00:46.889982    3674 helper.go:585] Copying OpenShift config to local directory /tmp/openshift-config211463473",
  "stderr_lines": [
    "I0328 10:00:46.889982    3674 helper.go:585] Copying OpenShift config to local directory /tmp/openshift-config211463473"
  ],
  "stdout": "-- Checking OpenShift client ... \n-- Checking Docker client ... \n-- Checking Docker version ... \n-- Checking for existing OpenShift container ... \n-- Checking for openshift/origin:v3.9.0-alpha.3 image ... \n-- Checking Docker daemon configuration ... \n-- Checking for available ports ... \n-- Checking type of volume mount ... \n\n   Using nsenter mounter for OpenShift volumes\n-- Creating host directories ... \n-- Finding server IP ... \n\n   Using public hostname IP 192.168.99.50 as the host IP\n   Using 192.168.99.50 as the server IP\n-- Checking service catalog version requirements ... \n-- Starting OpenShift container ... \n\n   Starting OpenShift using container 'origin'\n   Waiting for API server to start listening\n   OpenShift server started\n-- Registering template service broker with service catalog ... \nFAIL\n   Error: cannot register the template service broker\n   Caused By:\n     Error: cannot create objects from template openshift-infra/template-service-broker-registration\n     Caused By:\n       Error: unable to recognize servicecatalog.k8s.io/v1beta1, Kind=ClusterServiceBroker: no matches for servicecatalog.k8s.io/, Kind=ClusterServiceBroker",
  "stdout_lines": [
    "-- Checking OpenShift client ... ",
    "-- Checking Docker client ... ",
    "-- Checking Docker version ... ",
    "-- Checking for existing OpenShift container ... ",
    "-- Checking for openshift/origin:v3.9.0-alpha.3 image ... ",
    "-- Checking Docker daemon configuration ... ",
    "-- Checking for available ports ... ",
    "-- Checking type of volume mount ... ",
    "",
    "   Using nsenter mounter for OpenShift volumes",
    "-- Creating host directories ... ",
    "-- Finding server IP ... ",
    "",
    "   Using public hostname IP 192.168.99.50 as the host IP",
    "   Using 192.168.99.50 as the server IP",
    "-- Checking service catalog version requirements ... ",
    "-- Starting OpenShift container ... ",
    "",
    "   Starting OpenShift using container 'origin'",
    "   Waiting for API server to start listening",
    "   OpenShift server started",
    "-- Registering template service broker with service catalog ... ",
    "FAIL",
    "   Error: cannot register the template service broker",
    "   Caused By:",
    "     Error: cannot create objects from template openshift-infra/template-service-broker-registration",
    "     Caused By:",
    "       Error: unable to recognize servicecatalog.k8s.io/v1beta1, Kind=ClusterServiceBroker: no matches for servicecatalog.k8s.io/, Kind=ClusterServiceBroker"
  ]
}

Investigate openshift ansible 3.9

For 3.7 and below, you need to do some manual preparation steps and
then the playbook you want to run is:

openshift-ansible/playbooks/byo/config.yml

For 3.9 (when the rpms will be ready)

Prerequisites: https://docs.openshift.org/latest/install_config/install/prerequisites.html
Host prep: https://docs.openshift.org/latest/install_config/install/host_preparation.html

I think some of the items in those pages are already done on atomic
host (such as installing docker).

Interesting project : https://github.com/michaelgugino/openshift-stuff/tree/master/centos

Add a var to specify how many volumes should be created

Currently, the PVs created from the template default to 3

volume:
  defaults: #define the defaults for all Persistent Volumes
      storage: 5Gi
      persistentVolumeReclaimPolicy: Recycle

volumes:
    pv001:
      storage: "{{ volume.defaults.storage }}"
      persistentVolumeReclaimPolicy: "{{ volume.defaults.persistentVolumeReclaimPolicy }}"
    pv002:
      storage: "{{ volume.defaults.storage }}"
      persistentVolumeReclaimPolicy: "{{ volume.defaults.persistentVolumeReclaimPolicy }}"
    pv003:
      storage: "{{ volume.defaults.storage }}"
      persistentVolumeReclaimPolicy: "{{ volume.defaults.persistentVolumeReclaimPolicy }}"

I suggest adding a var to allow overriding the value, removing the hard-coded list defined under volumes, and using a dynamically generated list instead (see the sketch below).
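
As an illustration of the proposal, the count could then simply be overridden at invocation time; number_of_volumes is a hypothetical variable name, not one currently defined:

# Hypothetical override: generate 10 persistent volumes instead of the default 3
ansible-playbook -i inventory/cloud_host playbook/post_installation.yml -e number_of_volumes=10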

Create 2 inventory files + template

Create 2 inventory files + templates

  1. Existing inventory
     For deployment within a CentOS vm where we will:
       • run node/master/etcd as a systemctl service AND
       • use rpms to install openshift
     we will use the existing inventory file + template.
  2. New inventory file - oc cluster up
     When we use oc cluster up, a lightweight inventory with only a few vars is required, plus a template too.

Configure the project's role and investigate changing the user's role from admin to edit

The OpenShift role assigned by our add-to-users role is hard-coded to admin:

    - name: Grant user admin priviledges
      command: oc adm policy add-role-to-user admin user{{ item }}
      with_sequence: start={{ first_extra_user_offset }} count={{ number_of_extra_users }} format=%d

As this role could be too "high" for users accessing the hetzner machine, I suggest doing 2 things:

  • Make the user's role configurable using a parameter
  • Investigate whether the edit role could be enough

FYI

The available OpenShift roles are:

oc describe clusterrole.rbac | grep Name:
Name:         admin
Name:         asb-access
Name:         asb-auth
Name:         basic-user
Name:         cluster-admin
Name:         cluster-debugger
Name:         cluster-reader
Name:         cluster-status
Name:         edit
...
Name:         view

The definition of the edit role is:

 oc describe clusterrole.rbac/edit
Name:         edit
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  openshift.io/description=A user that can create and edit most objects in a project, but can not update the project's membership.
              rbac.authorization.kubernetes.io/autoupdate=true
PolicyRule:
  Resources                                          Non-Resource URLs  Resource Names  Verbs
  ---------                                          -----------------  --------------  -----
  appliedclusterresourcequotas                       []                 []              [get list watch]
  appliedclusterresourcequotas.quota.openshift.io    []                 []              [get list watch]
  bindings                                           []                 []              [get list watch]
  buildconfigs                                       []                 []              [create delete deletecollection get list patch update watch]
  buildconfigs.build.openshift.io                    []                 []              [create delete deletecollection get list patch update watch]
  buildconfigs/instantiate                           []                 []              [create]
  buildconfigs.build.openshift.io/instantiate        []                 []              [create]
  buildconfigs/instantiatebinary                     []                 []              [create]
  buildconfigs.build.openshift.io/instantiatebinary  []                 []              [create]
  buildconfigs/webhooks                              []                 []              [create delete deletecollection get list patch update watch]
  buildconfigs.build.openshift.io/webhooks           []                 []              [create delete deletecollection get list patch update watch]
  buildlogs                                          []                 []              [create delete deletecollection get list patch update watch]
  buildlogs.build.openshift.io                       []                 []              [create delete deletecollection get list patch update watch]
  builds                                             []                 []              [create delete deletecollection get list patch update watch]
  builds.build.openshift.io                          []                 []              [create delete deletecollection get list patch update watch]
  builds/clone                                       []                 []              [create]
  builds.build.openshift.io/clone                    []                 []              [create]
  builds/details                                     []                 []              [update]
  builds.build.openshift.io/details                  []                 []              [update]
  builds/log                                         []                 []              [get list watch]
  builds.build.openshift.io/log                      []                 []              [get list watch]
  configmaps                                         []                 []              [create delete deletecollection get list patch update watch]
  cronjobs.batch                                     []                 []              [create delete deletecollection get list patch update watch]
  daemonsets.apps                                    []                 []              [get list watch]
  daemonsets.extensions                              []                 []              [get list watch]
  deploymentconfigrollbacks                          []                 []              [create]
  deploymentconfigrollbacks.apps.openshift.io        []                 []              [create]
  deploymentconfigs                                  []                 []              [create delete deletecollection get list patch update watch]
  deploymentconfigs.apps.openshift.io                []                 []              [create delete deletecollection get list patch update watch]
  deploymentconfigs/instantiate                      []                 []              [create]
  deploymentconfigs.apps.openshift.io/instantiate    []                 []              [create]
  deploymentconfigs/log                              []                 []              [get list watch]
  deploymentconfigs.apps.openshift.io/log            []                 []              [get list watch]
  deploymentconfigs/rollback                         []                 []              [create]
  deploymentconfigs.apps.openshift.io/rollback       []                 []              [create]
  deploymentconfigs/scale                            []                 []              [create delete deletecollection get list patch update watch]
  deploymentconfigs.apps.openshift.io/scale          []                 []              [create delete deletecollection get list patch update watch]
  deploymentconfigs/status                           []                 []              [get list watch]
  deploymentconfigs.apps.openshift.io/status         []                 []              [get list watch]
  deployments.apps                                   []                 []              [create delete deletecollection get list patch update watch]
  deployments.extensions                             []                 []              [create delete deletecollection get list patch update watch]
  deployments.apps/rollback                          []                 []              [create delete deletecollection get list patch update watch]
  deployments.extensions/rollback                    []                 []              [create delete deletecollection get list patch update watch]
  deployments.apps/scale                             []                 []              [create delete deletecollection get list patch update watch]
  deployments.extensions/scale                       []                 []              [create delete deletecollection get list patch update watch]
  endpoints                                          []                 []              [create delete deletecollection get list patch update watch]
  events                                             []                 []              [get list watch]
  horizontalpodautoscalers.autoscaling               []                 []              [create delete deletecollection get list patch update watch]
  imagestreamimages                                  []                 []              [create delete deletecollection get list patch update watch]
  imagestreamimages.image.openshift.io               []                 []              [create delete deletecollection get list patch update watch]
  imagestreamimports                                 []                 []              [create]
  imagestreamimports.image.openshift.io              []                 []              [create]
  imagestreammappings                                []                 []              [create delete deletecollection get list patch update watch]
  imagestreammappings.image.openshift.io             []                 []              [create delete deletecollection get list patch update watch]
  imagestreams                                       []                 []              [create delete deletecollection get list patch update watch]
  imagestreams.image.openshift.io                    []                 []              [create delete deletecollection get list patch update watch]
  imagestreams/layers                                []                 []              [get update]
  imagestreams.image.openshift.io/layers             []                 []              [get update]
  imagestreams/secrets                               []                 []              [create delete deletecollection get list patch update watch]
  imagestreams.image.openshift.io/secrets            []                 []              [create delete deletecollection get list patch update watch]
  imagestreams/status                                []                 []              [get list watch]
  imagestreams.image.openshift.io/status             []                 []              [get list watch]
  imagestreamtags                                    []                 []              [create delete deletecollection get list patch update watch]
  imagestreamtags.image.openshift.io                 []                 []              [create delete deletecollection get list patch update watch]
  ingresses.extensions                               []                 []              [create delete deletecollection get list patch update watch]
  jenkins.build.openshift.io                         []                 []              [edit view]
  jobs.batch                                         []                 []              [create delete deletecollection get list patch update watch]
  limitranges                                        []                 []              [get list watch]
  namespaces                                         []                 []              [get list watch]
  namespaces/status                                  []                 []              [get list watch]
  networkpolicies.extensions                         []                 []              [create delete deletecollection get list patch update watch]
  networkpolicies.networking.k8s.io                  []                 []              [create delete deletecollection get list patch update watch]
  persistentvolumeclaims                             []                 []              [create delete deletecollection get list patch update watch]
  poddisruptionbudgets.policy                        []                 []              [create delete deletecollection get list patch update watch]
  podpresets.settings.k8s.io                         []                 []              [create update delete get list watch]
  pods                                               []                 []              [create delete deletecollection get list patch update watch]
  pods/attach                                        []                 []              [create delete deletecollection get list patch update watch]
  pods/exec                                          []                 []              [create delete deletecollection get list patch update watch]
  pods/log                                           []                 []              [get list watch]
  pods/portforward                                   []                 []              [create delete deletecollection get list patch update watch]
  pods/proxy                                         []                 []              [create delete deletecollection get list patch update watch]
  pods/status                                        []                 []              [get list watch]
  processedtemplates                                 []                 []              [create delete deletecollection get list patch update watch]
  processedtemplates.template.openshift.io           []                 []              [create delete deletecollection get list patch update watch]
  projects                                           []                 []              [get]
  projects.project.openshift.io                      []                 []              [get]
  replicasets.apps                                   []                 []              [create delete deletecollection get list patch update watch]
  replicasets.extensions                             []                 []              [create delete deletecollection get list patch update watch]
  replicasets.apps/scale                             []                 []              [create delete deletecollection get list patch update watch]
  replicasets.extensions/scale                       []                 []              [create delete deletecollection get list patch update watch]
  replicationcontrollers                             []                 []              [create delete deletecollection get list patch update watch]
  replicationcontrollers/scale                       []                 []              [create delete deletecollection get list patch update watch]
  replicationcontrollers.extensions/scale            []                 []              [create delete deletecollection get list patch update watch]
  replicationcontrollers/status                      []                 []              [get list watch]
  resourcequotas                                     []                 []              [get list watch]
  resourcequotas/status                              []                 []              [get list watch]
  resourcequotausages                                []                 []              [get list watch]
  routes                                             []                 []              [create delete deletecollection get list patch update watch]
  routes.route.openshift.io                          []                 []              [create delete deletecollection get list patch update watch]
  routes/custom-host                                 []                 []              [create]
  routes.route.openshift.io/custom-host              []                 []              [create]
  routes/status                                      []                 []              [get list watch]
  routes.route.openshift.io/status                   []                 []              [get list watch]
  secrets                                            []                 []              [create delete deletecollection get list patch update watch]
  serviceaccounts                                    []                 []              [create delete deletecollection get list patch update watch impersonate]
  servicebindings.servicecatalog.k8s.io              []                 []              [create update delete get list watch patch]
  serviceinstances.servicecatalog.k8s.io             []                 []              [create update delete get list watch patch]
  services                                           []                 []              [create delete deletecollection get list patch update watch]
  services/proxy                                     []                 []              [create delete deletecollection get list patch update watch]
  statefulsets.apps                                  []                 []              [create delete deletecollection get list patch update watch]
  templateconfigs                                    []                 []              [create delete deletecollection get list patch update watch]
  templateconfigs.template.openshift.io              []                 []              [create delete deletecollection get list patch update watch]
  templateinstances                                  []                 []              [create delete deletecollection get list patch update watch]
  templateinstances.template.openshift.io            []                 []              [create delete deletecollection get list patch update watch]
  templates                                          []                 []              [create delete deletecollection get list patch update watch]
  templates.template.openshift.io                    []                 []              [create delete deletecollection get list patch update watch]

rolebindings.authorization.openshift.io is forbidden: User \"admin\" cannot list rolebindings.authorization.openshift.io in the namespace \"default\"

Steps to reproduce the error

git clone https://github.com/snowdrop/openshift-infra.git
cd openshift-infra/ansible
git clone -b release-3.9 https://github.com/openshift/openshift-ansible.git
ansible-playbook -i inventory/cloud_host openshift-ansible/playbooks/prerequisites.yml
ansible-playbook -i inventory/cloud_host openshift-ansible/playbooks/deploy_cluster.yml
ansible-playbook -i inventory/cloud_host playbook/post_installation.yml -e openshift_admin_pwd=admin --tags enable_cluster_admin
ansible-playbook -i inventory/cloud_host playbook/post_installation.yml -e openshift_admin_pwd=admin --tags identity_provider
ansible-playbook -i inventory/cloud_host playbook/post_installation.yml --tags add_extra_users -e number_of_extra_users=2 -e first_extra_user_offset=1 -e openshift_admin_pwd=admin

Error

TASK [add_extra_users : Grant user admin priviledges] *****************************************************************************************************************************************************************************************
failed: [192.168.99.50] (item=1) => {"changed": true, "cmd": ["oc", "adm", "policy", "add-role-to-user", "admin", "user1"], "delta": "0:00:00.215537", "end": "2018-05-09 16:20:15.113077", "item": "1", "msg": "non-zero return code", "rc": 1, "start": "2018-05-09 16:20:14.897540", "stderr": "Error from server (Forbidden): rolebindings.authorization.openshift.io is forbidden: User \"admin\" cannot list rolebindings.authorization.openshift.io in the namespace \"default\": User \"admin\" cannot list rolebindings.authorization.openshift.io in project \"default\"", "stderr_lines": ["Error from server (Forbidden): rolebindings.authorization.openshift.io is forbidden: User \"admin\" cannot list rolebindings.authorization.openshift.io in the namespace \"default\": User \"admin\" cannot list rolebindings.authorization.openshift.io in project \"default\""], "stdout": "", "stdout_lines": []}
failed: [192.168.99.50] (item=2) => {"changed": true, "cmd": ["oc", "adm", "policy", "add-role-to-user", "admin", "user2"], "delta": "0:00:00.239512", "end": "2018-05-09 16:20:15.705065", "item": "2", "msg": "non-zero return code", "rc": 1, "start": "2018-05-09 16:20:15.465553", "stderr": "Error from server (Forbidden): rolebindings.authorization.openshift.io is forbidden: User \"admin\" cannot list rolebindings.authorization.openshift.io in the namespace \"default\": User \"admin\" cannot list rolebindings.authorization.openshift.io in project \"default\"", "stderr_lines": ["Error from server (Forbidden): rolebindings.authorization.openshift.io is forbidden: User \"admin\" cannot list rolebindings.authorization.openshift.io in the namespace \"default\": User \"admin\" cannot list rolebindings.authorization.openshift.io in project \"default\""], "stdout": "", "stdout_lines": []}

Launcher role is failing

Step

ansible-playbook -i inventory/cloud_host playbook/post_installation.yml \
>      --tags install-launcher \
>      -e launcher_catalog_git_repo=https://github.com/snowdrop/cloud-native-catalog.git \
>      -e launcher_catalog_git_branch=master \
>      -e launcher_github_username=YOUR_GIT_TOKEN \
>      -e launcher_github_token=YOUR_GIT_USER 

Error

TASK [launcher : Check if project/namespace exists] *************************************************************************************************************************************************************************************************
fatal: [192.168.99.50]: FAILED! => {"changed": true, "cmd": ["oc", "get", "project/devex"], "delta": "0:00:14.985096", "end": "2018-05-09 18:55:46.441224", "msg": "non-zero return code", "rc": 1, "start": "2018-05-09 18:55:31.456128", "stderr": "Error from server (Forbidden): projects.project.openshift.io \"devex\" is forbidden: User \"user2\" cannot get projects.project.openshift.io in the namespace \"devex\": User \"user2\" cannot get project \"devex\"", "stderr_lines": ["Error from server (Forbidden): projects.project.openshift.io \"devex\" is forbidden: User \"user2\" cannot get projects.project.openshift.io in the namespace \"devex\": User \"user2\" cannot get project \"devex\""], "stdout": "", "stdout_lines": []}
...ignoring

Deployment of jenkins role fails if oc has been started using cluster role

Steps executed

git clone https://github.com/snowdrop/openshift-infra.git
cd openshift-infra/ansible
ansible-playbook playbook/generate_inventory.yml -e ip_address=192.168.99.50 -e type=simple
ansible-playbook -i inventory/simple_host playbook/cluster.yml  -e openshift_release_tag_name=v3.9.0 --tags "up"
ansible-playbook -i inventory/simple_host playbook/post_installation.yml -e openshift_admin_pwd=admin --tags "enable_cluster_admin"
ansible-playbook -i inventory/simple_host playbook/post_installation.yml --tags jenkins 

Error

TASK [install_jenkins : Get Jenkins Service Account Token] *************************************************************************************************************
fatal: [192.168.99.50]: FAILED! => {"changed": true, "cmd": ["oc", "serviceaccounts", "get-token", "jenkins"], "delta": "0:00:00.205584", "end": "2018-05-08 11:55:32.617959", "msg": "non-zero return code", "rc": 1, "start": "2018-05-08 11:55:32.412375", "stderr": "error: could not find a service account token for service account \"jenkins\"", "stderr_lines": ["error: could not find a service account token for service account \"jenkins\""], "stdout": "", "stdout_lines": []}

Add a role to create a list of users AND a role to add their openshift project

For HOL, it is required to create OpenShift users and projects, and to grant them the admin role (see the remark hereafter nevertheless)

We have created a role to change the identityProvider of OpenShift to become htpasswd [1], but we no longer have a role that creates, for each user, its OpenShift project and assigns it the specified role.

So, I propose to split the existing role into 2 and to create a new role:

Role 1: Install the httpd-tools package if not there, create the admin user, patch master-config to HTPasswdPasswordIdentityProvider, restart the cluster
Role 2: Create htpasswd users/passwords from a list OR using a range user1 ... user99
Role 3: Create, from a list or a range of users, an openshift project for each and assign it a role.

e.g

oc login -u {{ user }} -p pwd{{ pwd }}
oc new-project {{ user }}
oc login -u admin -p admin
oc adm policy add-role-to-user admin user

Remarks:

  • We can grant admin as the role for the moment but, long term, we should certainly revisit that to give fewer rights on the machine.
  • A profile should also be defined to limit cpu/memory for each project created; otherwise, we will exceed the capacity limit of the machine [2] (see the sketch after the references below)

[1] https://goo.gl/cW1ChU
[2] https://docs.openshift.com/enterprise/3.2/admin_guide/limits.html#admin-guide-limits
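
A hedged sketch of how such a per-project limit could be applied with standard oc commands; the quota values and project name are illustrative only:

# Illustrative resource quota restricting what each demo project may consume
oc create quota demo-quota --hard=cpu=2,memory=2Gi,pods=10 -n user1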
