OCP4 on VMware vSphere UPI Automation

The goal of this repo is to make deploying and redeploying a new OpenShift v4 cluster a snap. With the same repo and minor tweaks, it can be applied to any OpenShift version later than 4.4, the current version.

As it stands right now, the repo works for several installation use cases:

  • DHCP with OVA template
  • DHCP with PXE boot (needs helper node)
  • Static IPs for nodes (for when there is no isolated network in which the helper can run a DHCP server)
  • With or without a cluster-wide proxy (HTTP and SSL/TLS with certs supported)
  • Restricted network (with or without DHCP)
  • No Cloud Provider (Useful for mixed clusters with both virtual and physical Nodes)

This repo is best suited to Home Lab and Proof-of-Concept scenarios. That said, if the prerequisites (below) can be met and the vCenter service account can be locked down to access only certain resources and perform only certain actions, the same repo can be used for DEV or higher environments. Refer to this link for more details on required permissions for a vCenter service account.

Quickstart

This is a concise summary of everything you need to do to use the repo. The rest of the document goes into the details of each step.

  1. Set up a helper node or ensure that the appropriate services (DNS/DHCP/LB/etc.) are available and properly referenced.
  2. Edit group_vars/all.yml. The following must be changed, while the rest can remain the same:
    • helper_vm_ip (used for webserver to host ignition files)
    • helper_vm_port (port used by webserver)
    • pull secret
    • ip and mac addresses, host/domain names
    • enable/disable fips mode
    • networkType: OpenShiftSDN (default), OVNKubernetes (uncomment this setting to use)
    • isolationMode: NetworkPolicy (default), Multitenant, Subnet (uncomment and set to use Multitenant or Subnet)
    • master_schedulable: false (default), set to true to install 3-node cluster
      • set worker_vms to empty list (i.e. []) while removing or commenting out list items for worker node details
    • vcenter details
      • datastore name
      • datacenter name
      • username and passwords of admin/service accounts
    • verify that the current govc version is set
    • enable/disable registry/proxy/ntp with their details, as required
  3. Customize ansible.cfg and use/copy/modify staging inventory file as required
  4. Run one of the several install options

Infrastructure Prerequisites

  1. vSphere ESXi and vCenter 6.7 installed. For vCenter 6.5, see the cautionary note below.
  2. A datacenter created with a vSphere host added to it, a datastore exists and has adequate capacity
  3. The playbooks assume you are running a helper node in the same network to provide all the necessary services such as [DHCP/DNS/HAProxy as LB]. The MAC addresses for the machines must also match between the helper repo and this one. If you are not using the helper node, the minimum expectation is that the webserver and tftp server (for PXE boot) are running on the same external host, which is then treated as the helper node.
  4. The necessary services such as [DNS/LB (Load Balancer)] must be up and running before this repo can be used
  5. Ansible on the machine where this repo is cloned.
    • This repository has been tested with Ansible 2.9.x. Issues were encountered when running with a version > 2.9.
    • Before you install Ansible, install epel-release by running yum -y install epel-release

For vSphere 6.5, the files that interact with VMware/vCenter, such as this, may need their vmware_deploy_ovf module calls to include the cluster and resource-pool parameters, with values set, in order to work correctly.

Installation Steps

Set Global Variables

Pre-populated entries in group_vars/all.yml are ready to be used unless you need to customize further. Any updates described below refer to group_vars/all.yml unless otherwise specified.

  1. Get the pull secret from here. Save the pull secret as a file called pullsecret in the home directory of the user running this automation. Alternatively, modify pull_secret with the contents of the downloaded pull secret.
  2. Get the vCenter details:
    1. IP address
    2. Service account username (can be the same as admin)
    3. Service account password (can be the same as admin)
    4. Admin account username
    5. Admin account password
    6. Datacenter name (created in the prerequisites mentioned above)
    7. Datastore name
    8. Absolute path of the vCenter folder to use (optional). If this field is not populated, it is auto-populated and points to /${vcenter.datacenter}/vm/${config.cluster_name}
    9. Specify hardware version for VM compatibility. Defaults to 15.
  3. Downloadable link to govc (vSphere CLI, pre-populated)
  4. OpenShift cluster
    1. base domain (pre-populated with example.com)
    2. cluster name (pre-populated with ocp4)
  5. HTTP URL of the bootstrap.ign file (pre-populated with an example config pointing to helper node)
  6. For bootstrap_vms, master_vms, and worker_vms, when using static_ips, you can delete macaddr from the dictionary. VMware will auto-generate your MAC addresses. If you are using DHCP, defining macaddr will allow you to reserve the specified IP addresses on your DHCP server to ensure the OpenShift nodes always get the same IP address.
  7. For network_modifications: Network CIDRs default to sensible ranges. If a conflict is present (these ranges of addresses are assigned elsewhere in the organization), you may select other non-conflicting CIDR ranges by changing "enabled: false" to "enabled: true" and entering the new ranges. The ranges shown in the repository are the ones that are used by default, even if "enabled: false" is left as it is. The machine network is the network on which the VMs are created. Be sure to specify the right machine network if you set enabled: true
  8. Furnish any proxy details with a section like the one below.
    • If proxy.enabled is set to False, anything defined under proxy is ignored and the proxy setup is skipped
    • The cert_content shown below is only for illustration to show the format
    • When there is no certificate, leave the variable cert_content value empty
    proxy:
       enabled: true
       http_proxy: http://helper.ocp4.example.com:3129
       https_proxy: http://helper.ocp4.example.com:3129
       no_proxy: example.com
       cert_content: |
          -----BEGIN CERTIFICATE-----
             <certificate content>
          -----END CERTIFICATE-----
    
  9. When doing a restricted network install following the instructions in restricted.md, furnish the registry details with a section like the one below. If registry.enabled is set to False, anything defined under registry is ignored and the registry setup is skipped
    registry:
       enabled: true
       product_repo: openshift-release-dev
       product_release_name: ocp-release
       product_release_version: 4.4.0-x86_64
       username: ansible
       password: ansible
       email: [email protected]
       cert_content:
       host: helper.ocp4.example.com
       port: 5000
       repo: ocp4/openshift4
    
  10. NEW: For a disconnected/restricted install that uses an existing registry, use a registry section like the one below.
    registry:
      enabled: true
      host: registry.example.com  # if not 443, include :<port> here
      repo_release: openshift/release  # the actual OpenShift container images
      repo_images: openshift/release-images  # the OpenShift release images, i.e. 4.y.z-x86_64
    
  11. If you wish to install without enabling the Kubernetes vSphere Cloud Provider (useful for mixed installs with both virtual nodes and bare metal nodes), change provider: to none in group_vars/all.yml.
    config:
      provider: none
      base_domain: example.com
      ...
    
  12. If you wish to enable custom NTP servers on your nodes, set ntp.custom to True and define ntp.ntp_server_list to fit your requirements.
    ntp:
      custom: True
      ntp_server_list:
      - 0.rhel.pool.ntp.org
      - 1.rhel.pool.ntp.org
    
  13. Network Policy is enabled by default. To use Multitenant or Subnet, change isolationMode
    isolationMode: Multitenant
    

The bootstrap.ign file referenced in step 5 need not exist when you run the setup/installation step, so provide your best estimate of the URL and context path at which bootstrap.ign will eventually be served.
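
For the network_modifications step above, candidate ranges can be checked for conflicts before you set enabled: true. The following is a minimal sketch that uses only Python's standard ipaddress module. The dictionary keys and the example ranges are illustrative assumptions and are not read from this repo, so substitute the values you intend to put in group_vars/all.yml.

```python
import ipaddress
from itertools import combinations


def find_cidr_conflicts(ranges):
    """Return sorted pairs of names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical values for illustration only.
candidate_ranges = {
    "cluster_network": "10.128.0.0/14",
    "service_network": "172.30.0.0/16",
    "machine_network": "192.168.86.0/24",
}

# An empty list means no two ranges overlap.
print(find_cidr_conflicts(candidate_ranges))
```

To also check against ranges already assigned elsewhere in the organization, add them to the dictionary under any name before calling the function.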

Set Ansible Inventory and Configuration

Now configure ansible.cfg and the staging inventory file for your environment before picking one of the install options listed below.

Update the staging inventory file

Under the webservers.hosts entry, use one of the two options below:

  1. localhost : if the ansible-playbook is being run on the same host as the webserver that would eventually host bootstrap.ign file
  2. the IP address or FQDN of the machine that would run the webserver.

Update the ansible.cfg based on your needs

  • Running the playbook as a root user
    • If the localhost runs the webserver
      [defaults]
      host_key_checking = False
      
    • If the remote host runs the webserver
      [defaults]
      host_key_checking = False
      remote_user = root
      ask_pass = True
      
  • Running the playbook as a non-root user
    • If the localhost runs the webserver
      [defaults]
      host_key_checking = False
      
      [privilege_escalation]
      become_ask_pass = True
      
    • If the remote host runs the webserver
      [defaults]
      host_key_checking = False
      remote_user = root
      ask_pass = True
      
      [privilege_escalation]
      become_ask_pass = True
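
The four cases above differ only in two choices: whether the playbook runs as root, and whether the webserver is local or remote. As a rough sketch, the matching ansible.cfg text could be generated with Python's standard configparser module. The function name and its parameters are hypothetical helpers for illustration and are not part of this repo.

```python
import configparser
import io


def render_ansible_cfg(run_as_root, webserver_is_local):
    """Render ansible.cfg text for one of the four cases listed above."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep option names exactly as written
    cfg["defaults"] = {"host_key_checking": "False"}
    if not webserver_is_local:
        # A remote webserver needs a login user and a password prompt.
        cfg["defaults"]["remote_user"] = "root"
        cfg["defaults"]["ask_pass"] = "True"
    if not run_as_root:
        # A non-root user is prompted for the privilege escalation password.
        cfg["privilege_escalation"] = {"become_ask_pass": "True"}
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


print(render_ansible_cfg(run_as_root=False, webserver_is_local=False))
```

Review the generated text against the case that applies to you before saving it as ansible.cfg.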
      

Run Installation Playbook

# Option 1: DHCP + use of OVA template
ansible-playbook -i staging dhcp_ova.yml

# Option 2: DHCP + PXE boot
ansible-playbook -i staging dhcp_pxe.yml

# Option 3: ISO + Static IPs
ansible-playbook -i staging static_ips.yml

# Refer to restricted.md file for more details
# Option 4: DHCP + use of OVA template in a Restricted Network
ansible-playbook -i staging restricted_dhcp_ova.yml

# Option 5: Static IPs + use of ISO images in a Restricted Network
ansible-playbook -i staging restricted_static_ips.yml

# Option 6: Static IPs + use of OVA template
# Note: OpenShift 4.6 or higher required
ansible-playbook -i staging static_ips_ova.yml

# Option 7: Static IPs + use of OVA template in a Restricted Network
# Note: OpenShift 4.6 or higher required
ansible-playbook -i staging restricted_static_ips_ova.yml

Miscellaneous

  • If you are re-running the installation playbook, make sure to delete any of the existing VMs (in the ocp4 folder) listed below:

    1. bootstrap
    2. masters
    3. workers
    4. rhcos-vmware template (if not using the extra param as shown below)
  • If a template named rhcos-vmware already exists in vCenter and you want to reuse it, skipping the OVA download from Red Hat and the upload into vCenter, use the following extra param.

    -e skip_ova=true
  • If you would rather clean the bin, downloads, and install-dir folders and re-download all the artifacts, append the following to the install command you chose above

    -e clean=true
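
The install options and the two extra params combine in a predictable way, which the following sketch illustrates by assembling the command line in Python. The playbook names, the staging inventory, and the -e values come from this document. The PLAYBOOKS mapping and the build_command helper are hypothetical names introduced here for illustration.

```python
import shlex

# Playbook names as listed in the install options above.
PLAYBOOKS = {
    1: "dhcp_ova.yml",
    2: "dhcp_pxe.yml",
    3: "static_ips.yml",
    4: "restricted_dhcp_ova.yml",
    5: "restricted_static_ips.yml",
    6: "static_ips_ova.yml",
    7: "restricted_static_ips_ova.yml",
}


def build_command(option, inventory="staging", skip_ova=False, clean=False):
    """Return the ansible-playbook argument list for an install option."""
    argv = ["ansible-playbook", "-i", inventory, PLAYBOOKS[option]]
    if skip_ova:
        argv += ["-e", "skip_ova=true"]  # reuse the existing rhcos-vmware template
    if clean:
        argv += ["-e", "clean=true"]  # wipe bin, downloads, install-dir first
    return argv


# Prints: ansible-playbook -i staging dhcp_ova.yml -e skip_ova=true
print(shlex.join(build_command(1, skip_ova=True)))
```

The returned list can be passed to subprocess.run unchanged if you prefer to launch the playbook from a script.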

Expected Outcome

  1. Necessary Linux packages installed for the installation. NOTE: support for Mac client to run this automation has been added but is not guaranteed to be complete
  2. SSH key-pair generated, with private key ~/.ssh/ocp4 and public key ~/.ssh/ocp4.pub
  3. Necessary folders [bin, downloads, downloads/ISOs, install-dir] created
  4. OpenShift client, install and .ova binaries downloaded to the downloads folder
  5. Unzipped versions of the binaries installed in the bin folder
  6. In the install-dir folder:
    1. append-bootstrap.ign file with the HTTP URL of the bootstrap.ign file
    2. master.ign and worker.ign
    3. base64 encoded files (append-bootstrap.64, master.64, worker.64) for (append-bootstrap.ign, master.ign, worker.ign) respectively. This step assumes you have base64 installed and in your $PATH
  7. The bootstrap.ign is copied over to the web server in the designated location
  8. A folder is created in the vCenter under the mentioned datacenter and the template is imported
  9. The template file is edited to carry certain default settings and runtime parameters common to all the VMs
  10. VMs (bootstrap, master0-2, worker0-2) are created in the designated folder and powered on
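
To make item 6 more concrete, the sketch below shows roughly what a pointer file such as append-bootstrap.ign contains and how its base64 form can be produced. It is an illustration and not the playbook's actual implementation. It assumes the Ignition spec 2.x layout (config.append), which spec 3.x renames to config.merge. The URL and both function names are hypothetical.

```python
import base64
import json


def make_append_bootstrap(bootstrap_url, spec_version="2.2.0"):
    """Build a pointer Ignition config that pulls in bootstrap.ign over HTTP."""
    # Assumption: spec 2.x uses "append"; spec 3.x renamed the key to "merge".
    key = "append" if spec_version.startswith("2.") else "merge"
    return {
        "ignition": {
            "config": {key: [{"source": bootstrap_url}]},
            "version": spec_version,
        }
    }


def encode_ignition(config):
    """Serialize an Ignition config to JSON and return it base64 encoded."""
    raw = json.dumps(config).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# Hypothetical URL; use the bootstrap.ign location configured in group_vars/all.yml.
pointer = make_append_bootstrap("http://helper.ocp4.example.com:8080/ignition/bootstrap.ign")
print(json.dumps(pointer, indent=2))
print(encode_ignition(pointer))
```

Decoding the second output with base64 -d should give back the same JSON, which is a quick way to sanity-check the .64 files in install-dir.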

Final Check:

If everything goes well you should be able to log into all of the machines using the following command:

# Assuming you are able to resolve bootstrap.ocp4.example.com on this machine
# Replace the bootstrap hostname with any of the master or worker hostnames
ssh -i ~/.ssh/ocp4 core@bootstrap.ocp4.example.com

Once logged in to the bootstrap node, run the following command to see whether and how the masters are being set up:

journalctl -b -f -u bootkube.service

Once bootkube.service is complete, the bootstrap VM can safely be powered off and deleted. Finish by checking on the OpenShift cluster with the following commands:

# In the root folder of this repo run the following commands
export KUBECONFIG=$(pwd)/install-dir/auth/kubeconfig
export PATH=$(pwd)/bin:$PATH

# OpenShift Client Commands
oc whoami
oc get co

Debugging

To check if the proxy information has been picked up:

 # On Master
 cat /etc/systemd/system/machine-config-daemon-host.service.d/10-default-env.conf

 # On Bootstrap
 cat /etc/systemd/system.conf.d/10-default-env.conf

To check if the registry information has been picked up:

# On Master or Bootstrap
cat /etc/containers/registries.conf
cat /root/.docker/config.json

To check if your certs have been picked up:

# On Master
cat /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt

# On Bootstrap
cat /etc/pki/ca-trust/source/anchors/ca.crt

Contributors

vchintal, mallmen, gauthiersiri, cptmorgan-rh, ddreggors, jimbarlow, christianh814, dlbewley, marcno, therevoman, thoward-rh
