
nsx-t-gen's Introduction

nsx-t-gen

Concourse pipeline to install NSX-T v2.1

The concourse pipeline uses ansible scripts created by Yasen Simeonov and forked by the author of this pipeline.

There is an associated blog post detailing the features and options: Introducing nsx-t-gen: Automating NSX-T Install with Concourse

Check the FAQs for full details on handling various issues/configurations before starting the install.

Things handled by the pipeline:

  • Deploy the VMware NSX-T Manager, Controller and Edge ova images
  • Configure the Controller cluster and add it to the management plane
  • Configure hostswitches, profiles, transport zones
  • Configure the Edges and ESXi Hosts to be part of the Fabric
  • Create T0 Router (one per run, in HA vip mode) with uplink and static route
  • Configure arbitrary set of T1 Routers with logical switches and ports
  • NAT Rules setup for T0 Router
  • Container IP Pools and External IP Blocks
  • Self-signed cert generation and registration against NSX-T Manager
  • Route redistribution for T0 Router
  • HA Spoofguard Switching Profile
  • Load Balancer (with virtual servers and server pool) creation

Not handled by pipeline:

  • BGP or Static Route setup (outside of NSX-T) for T0 Routers

Warning

This is purely a trial work-in-progress and is not officially supported by anyone. Use it with caution and at your own risk.

Also, NSX-T cannot co-reside on the same ESXi Host & Cluster as one already running NSX-V. So, ensure you are either using a different set of vCenter, Clusters and hosts, or at least a cluster that does not have NSX-V. Also, the ESXi hosts should be at least version 6.5. Please refer to the NSX-T Documentation for the detailed set of requirements for NSX-T.

Pre-reqs

  • Concourse setup
  • If using docker-compose to bring up local Concourse and there is a web proxy, make sure to specify the proxy server and dns details following the template provided in docs/docker-compose.yml
  • If the webserver and the ova images are still not reachable from Concourse even without a proxy in the middle, check whether the Ubuntu firewall (ufw) got enabled. This can happen if you ran Concourse directly as well as through docker-compose. In that case, either relax the iptables rules, allow routed traffic in ufw, or just disable it:
sudo ufw allow 8080
sudo ufw default allow routed
  • There should be at least one free vmnic on each of the ESXi hosts
  • Ovftool fails to deploy the Edge VMs with the error message "Host did not have any virtual network defined" when neither a VM Network nor a standard (non NSX-T) switch is present. So, ensure at least one of them exists. Refer to Adding VM Network for detailed instructions.
  • Docker hub connectivity to pull docker image for the concourse pipeline
  • NSX-T 2.1 ova images and ovftool install bits for linux
  • Web server to serve the NSX-T ova images and ovftool
# Sample nginx server to host bits
sudo apt-get install nginx
cp <*ova> <VMware-ovftool*.bundle> /var/www/html
# Edit nginx config and start

S3 can also be used to host the bits. Change the reference to s3 so concourse can pull down the bits as needed.

  • vCenter Access
  • SSH enabled on the Hosts

Offline envs

This is only applicable if the docker image nsxedgegen/nsx-t-gen-worker:latest is unavailable or env is restricted to offline.

  • Download the VMware ovftool install bundle (linux 64-bit version) along with the nsx-t python modules (including the vapi_common, vapi_runtime and vapi_common_client libs) and copy them into the Dockerfile folder
  • Create and push the docker image using
 docker build -t nsx-t-gen-worker Dockerfile
 # To test image:  docker run --rm -it nsx-t-gen-worker bash
 docker tag  nsx-t-gen-worker nsxedgegen/nsx-t-gen-worker:latest
 docker push nsxedgegen/nsx-t-gen-worker:latest

VMware NSX-T 2.1.* bits

Download the following bits and make them available on a webserver so the pipeline can use them to install NSX-T 2.1:

# Download NSX-T 2.1 bits from
# https://my.vmware.com/group/vmware/details?downloadGroup=NSX-T-210&productId=673

#nsx-mgr-ova
nsx-unified-appliance-2.1.0.0.0.7380167.ova   

#nsx-ctrl-ova
nsx-controller-2.1.0.0.0.7395493.ova  

#nsx-edge-ova
nsx-edge-2.1.0.0.0.7395502.ova  

# Download VMware ovftool from https://my.vmware.com/group/vmware/details?productId=614&downloadGroup=OVFTOOL420#
VMware-ovftool-4.2.0-5965791-lin.x86_64.bundle  

Edit the pipelines/nsx-t-install.yml with the correct webserver endpoint and path to the files.
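
As a rough illustration of how the webserver endpoint and the file names above combine into download locations, here is a minimal Python sketch. The file names are the ones listed in this README; the `ovftool` key, the `bit_urls` helper and the example host are assumptions made for illustration and are not part of the pipeline.

```python
from urllib.parse import urljoin

# File names as listed in this README for NSX-T 2.1.0.
# The "ovftool" key is a hypothetical label chosen for this sketch.
NSX_T_BITS = {
    "nsx-mgr-ova": "nsx-unified-appliance-2.1.0.0.0.7380167.ova",
    "nsx-ctrl-ova": "nsx-controller-2.1.0.0.0.7395493.ova",
    "nsx-edge-ova": "nsx-edge-2.1.0.0.0.7395502.ova",
    "ovftool": "VMware-ovftool-4.2.0-5965791-lin.x86_64.bundle",
}


def bit_urls(webserver_base):
    """Return {resource name: download URL} for a web server base URL."""
    # Ensure exactly one trailing slash so urljoin appends instead of replacing.
    base = webserver_base.rstrip("/") + "/"
    return {name: urljoin(base, filename) for name, filename in NSX_T_BITS.items()}


if __name__ == "__main__":
    # "http://192.168.110.11" is a made-up example host.
    for name, url in bit_urls("http://192.168.110.11").items():
        print(name, url)
```

Printing the URLs first and fetching one of them by hand is a quick way to confirm that Concourse will be able to reach the bits before the pipeline is registered.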

Register with concourse

Use the sample params template file (under pipelines) to fill in the nsx-t, vsphere and other configuration details. Register the pipeline and params against concourse.

Sample setup

Copy over the sample params as nsx-t-params.yml and then use the following script to register the pipeline (after editing the concourse endpoint, target etc.)

#!/bin/bash

# EDIT names and domain
CONCOURSE_ENDPOINT=concourse.corp.local.com
CONCOURSE_TARGET=nsx-concourse
PIPELINE_NAME=install-nsx-t

alias fly-s="fly -t $CONCOURSE_TARGET set-pipeline -p $PIPELINE_NAME -c pipelines/nsx-t-install.yml -l nsx-t-params.yml"
alias fly-l="fly -t $CONCOURSE_TARGET containers | grep $PIPELINE_NAME"
alias fly-h="fly -t $CONCOURSE_TARGET hijack -b "

echo "Concourse target set to $CONCOURSE_ENDPOINT"
echo "Login using fly"
echo ""

fly --target $CONCOURSE_TARGET login --insecure --concourse-url https://${CONCOURSE_ENDPOINT} -n main

After registering the pipeline, unpause it before kicking off any job group.

Video Recording of Pipeline Execution

Follow the two part video for more details on the steps and usage of the pipeline:

  • Part 1 - Install of OVAs and bringing up VMs
  • Part 2 - Rest of install and config

Options to run

  • Run the full-install-nsx-t group for full deployment of ova's followed by configuration of routers and nat rules.

  • Run the smaller independent groups:

base-install for just deployment of the ovas and the control and management planes. This uses ansible scripts under the covers.

add-routers for creation of the various transport zones, nodes, hostswitches and T0/T1 Routers with Logical switches. This also uses ansible scripts under the covers.

config-nsx-t-extras for adding nat rules, route redistribution, HA Switching Profile, Self-signed certs. This particular job is currently done via direct api calls and does not use Ansible scripts.

nsx-t-gen's People

Contributors

alphasite

nsx-t-gen's Issues

need to add check for Edge node deployment

My pipeline failed because I did not have enough memory in my hosts to power on the edge.
I fixed the problem and restarted the pipeline, it just skipped that task and turned green while the edge was still down and not working.
Need to check that the edge is on and registered before moving on, same for controllers
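
One way the requested check could look is a generic polling loop that blocks until a node reports a ready state and fails the task otherwise. This is only a sketch: `get_status` is a hypothetical callable that would query NSX Manager for the edge or controller state, and the state names used here are placeholders, not actual NSX-T values.

```python
import time


def wait_until_ready(get_status, ready_states=("READY",), timeout_s=600,
                     interval_s=10, sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a ready state; raise on timeout.

    get_status is a hypothetical callable returning the node's current state.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in ready_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                "node still in state %r after %s s" % (status, timeout_s))
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated sequence of states; a real task would query NSX Manager here.
    states = iter(["PENDING", "POWERED_OFF", "READY"])
    print(wait_until_ready(lambda: next(states), sleep=lambda s: None))
```

Raising on timeout instead of returning quietly is the point of the design: a rerun of the pipeline would then stay red while the edge is still down, rather than skipping the task and turning green.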

Sample cluster auto install parameter

  - display_name: "{{host_uplink_prof}}"
    description: "Host Overlay Profile"
    teaming:
      active_list:
      - uplink_name: "uplink-1"
        uplink_type: PNIC
      standby_list:
      - uplink_name: "uplink-2"
        uplink_type: PNIC
      policy: FAILOVER_ORDER
    transport_vlan: "21"

We are missing the assignment of the uplinks to physical NICs

Edge stays in "Failed to power on VM"

The Edge failed to power on due to resource constraints. After fixing the issue and powering on the VM, the pipeline cannot continue due to the status of the object in NSX Manager.

Need to move away from MOIDs

The current state of the param file requires specifying the clusters, network and storage for the controllers using their MOIDs, due to requirements of the NSX Manager auto deploy feature:
controller_compute_id: "domain-c26"
controller_storage_id: "datastore-817"
controller_management_network_id: "dvportgroup-42"

This is very cumbersome and lengthy. Perhaps there's a way to bypass it? If not, we should accept the names of the objects in the param file and convert them to MOIDs automatically.

Here is a nice article from William Lam:
https://www.virtuallyghetto.com/2011/11/vsphere-moref-managed-object-reference.html
and another external one:
https://www.danilochiavari.com/2014/03/28/how-to-quickly-match-a-moref-id-to-a-name-in-vmware-vsphere/
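
The name-to-MOID conversion proposed in this issue could be sketched as a lookup step that runs before the params reach the pipeline. This is an illustration only: `resolve_moids` and the `inventory` mapping are hypothetical, the object names are made up, and in practice the inventory would have to be collected from vCenter (for example with one of the approaches in the articles linked above).

```python
def resolve_moids(params, inventory):
    """Replace object names in params with MOIDs looked up in inventory.

    inventory maps kind -> {object name: MOID}; unknown names raise KeyError.
    """
    # Which inventory kind each param-file key refers to.
    kinds = {
        "controller_compute_id": "cluster",
        "controller_storage_id": "datastore",
        "controller_management_network_id": "network",
    }
    resolved = dict(params)
    for key, kind in kinds.items():
        name = params[key].strip()
        try:
            resolved[key] = inventory[kind][name]
        except KeyError:
            raise KeyError("no %s named %r in inventory" % (kind, name)) from None
    return resolved


if __name__ == "__main__":
    # Names are invented; MOIDs are the ones quoted in this issue.
    inventory = {
        "cluster": {"Mgmt-Cluster": "domain-c26"},
        "datastore": {"vsanDatastore": "datastore-817"},
        "network": {"Mgmt-PG": "dvportgroup-42"},
    }
    params = {
        "controller_compute_id": "Mgmt-Cluster",
        "controller_storage_id": "vsanDatastore",
        "controller_management_network_id": "Mgmt-PG",
    }
    print(resolve_moids(params, inventory))
```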

Switch NSX manager cert

Need to change the certificate for the manager to the common name as the fqdn
Make this the last task in the pipeline as it will reset the connection after so other tasks won't fail
Here are the instructions:

The steps are also outlined in the specific section of the documentation:
https://docs.vmware.com/en/VMware-NSX-T/2.0/com.vmware.nsxt.admin.doc/GUID-9BBF8A54-DFBD-4B24-B7A1-492CB42DD0D5.html#GUID-9BBF8A54-DFBD-4B24-B7A1-492CB42DD0D5

In System >> Trust >> CSRs press ‘GENERATE CSR’

Generate a Certificate Signing Request (CSR), and use the NSX Manager FQDN as the Common Name (e.g. nsx-man.corp.local) and make sure it can be resolved in DNS

Then, self-sign the CSR from the UI

The new certificate is now available in the ‘Certificates’ tab

Cut and paste the UUID of the certificate into a text editor

We need to use curl or Postman to send an API call to exchange the default certificate. Quote the URL so the shell does not interpret the '&' in the query string:
curl --insecure -u 'admin:VMware1!' -X POST 'https://10.208.40.236/api/v1/node/services/http?action=apply_certificate&certificate_id=26fc8a7a-7d6c-49eb-b846-e5bacb431937'
Replace 'admin:VMware1!' with the NSX Manager username/password you are using, and the IP with the IP address of the NSX Manager.
The NSX Manager will exchange the certificate and restart the web services for the UI and the API with the new certificate.
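
If this step is folded into a pipeline task, building the request URL programmatically avoids quoting mistakes and catches a mis-pasted certificate ID early. The sketch below only constructs the URL for the endpoint shown in the curl call above; `apply_certificate_url` is a hypothetical helper name, and actually sending the POST (with credentials and TLS handling) is left out.

```python
import uuid
from urllib.parse import urlencode, urlunsplit


def apply_certificate_url(manager_host, certificate_id):
    """Build the apply_certificate URL used in the curl call above."""
    # uuid.UUID raises ValueError if the pasted ID is not a well-formed UUID.
    cert_id = str(uuid.UUID(certificate_id.strip()))
    query = urlencode({"action": "apply_certificate", "certificate_id": cert_id})
    return urlunsplit(("https", manager_host, "/api/v1/node/services/http", query, ""))


if __name__ == "__main__":
    # Host and certificate ID are the sample values from this issue.
    print(apply_certificate_url("10.208.40.236",
                                "26fc8a7a-7d6c-49eb-b846-e5bacb431937"))
```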

Load balancer is not being created

The VIPs are created but the LB is not.
This is the error in the pipeline:

{u'error_code': 23500, u'related_errors': [{u'error_code': 23721, u'error_message': u'The logical router LogicalRouter/071c4339-808e-450c-a3ac-1ebad1cc74db does not have associated edge cluster to deploy a load balancer service LoadBalancerService/fec19889-a804-424b-b92b-11bb7cb1d9da.', u'httpStatus': u'BAD_REQUEST', u'module_name': u'LOAD-BALANCER'}], u'error_message': u'Found errors in the request. Please refer to the relatedErrors for details.', u'httpStatus': u'BAD_REQUEST', u'module_name': u'LOAD-BALANCER'}
NSX Mgr updated to use newly generated CSR!!
Update response code:202
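
The useful part of an error response like the one above sits inside `related_errors`. A small helper along these lines could surface it in the task log and let the task fail instead of continuing; `related_error_messages` is a hypothetical name, and the response shape is inferred only from the single sample in this issue.

```python
def related_error_messages(response):
    """Return messages nested under 'related_errors', else the top-level one."""
    related = response.get("related_errors") or []
    messages = [err.get("error_message", "") for err in related]
    return messages or [response.get("error_message", "")]


if __name__ == "__main__":
    # Trimmed version of the response quoted in this issue.
    response = {
        "error_code": 23500,
        "error_message": "Found errors in the request. "
                         "Please refer to the relatedErrors for details.",
        "related_errors": [{
            "error_code": 23721,
            "error_message": "The logical router ... does not have associated "
                             "edge cluster to deploy a load balancer service ...",
        }],
    }
    for message in related_error_messages(response):
        print(message)
```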

This is the config

  - name: PAS-ERT-LBR
    t1_router: T1-Router-PAS-ERT # Should match a previously declared T1 Router
    size: small # Allowed sizes: small, medium, large

T1 router T1-Router-PAS-ERT does have an edge cluster in the params:

  - name: T1-Router-PAS-ERT
    switches:
    - name: PAS-ERT
      logical_switch_gw: 192.168.20.1 # Last octet should be 1 rather than 0
      subnet_mask: 24
      edge_cluster: true

But it's not getting assigned in NSX.


Update and restart pipelines

We need clear guidance on how to update the param file of the pipeline when needed. The update should allow pulling from a git repo where customers may manage their param files.
We also need guidance on how to clear the pipeline cache and start over in the shortest way possible.

/home/run.sh not running due to fly cli version

The container that runs /home/run.sh has a much lower version of fly than the Concourse server.
Need to run "fly -t nsx-concourse sync" before trying to register the pipeline.
I ran it manually with "docker exec" and it worked.

Auto install of NSX in cluster

Need to enable the capability to auto-install nsx-t and creation of TNs when a server is added to a cluster in the compute manager

SNAT Pools have not been created

The SNAT pools were not created.
These are the required params:

nsx_t_external_ip_pool_spec: |
  external_ip_pools:
  - name: snat-vip-pool-for-pas
    cidr: 10.208.40.0/24
    start: 10.208.40.10 # Should not include gateway
    end: 10.208.40.200 # Should not include gateway
  - name: snat-vip-pool-for-pks
    cidr: 10.208.50.0/24
    start: 10.208.50.10 # Should not include gateway
    end: 10.208.50.200 # Should not include gateway

Uplink logical switch vlan

The uplink logical switch "uplink-vlan-ls" was configured with a VLAN.
But couldn't find a parameter to set it.
We should set it to 0 always and allow an advanced config in the future

Have separate compute entities

We need to have the ability to deploy NSX to separate compute entities:

  1. Multiple clusters in the compute manager - In many cases customers will have multiple clusters to which they need to deploy NSX. Examples are the multi compute feature and multi AZ for PAS and PKS. For this purpose we need to support multiple clusters configured in a compute manager (or in multiple compute managers, but more likely in a single one)
  2. Separate vCenters - In some cases customers will have separate vCenters for management and compute. We need to be able to specify multiple compute managers where NSX will be installed automatically (This is not in high priority but may be easy to do with the multi cluster capability)
  3. Individual ESXi hosts - in cases where a customer is deploying the management cluster or any other cluster spread across L3 boundaries (multiple racks) we will need to allow the addition of individual ESXi hosts to install NSX on with different uplink profile
  4. Multiple uplink profile - for point 3 above, and for places where there are different northbound network configurations for ESXi hosts (such as VLAN IDs) we need to be able to specify an uplink profile that is different

LB is not created

The pipeline has created the LB virtual servers and pools but not the LB itself
That is needed for the PAS

IP block not being created

IP Blocks not being created

here are the required params:

nsx_t_container_ip_block_spec: |
  container_ip_blocks:
  - name: PAS-container-ip-block
    cidr: 10.4.0.0/16
  - name: PKS-node-container-ip-block
    cidr: 11.4.0.0/16
  - name: PKS-pod-container-ip-block
    cidr: 12.4.0.0/16
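
When debugging missing IP blocks, it can help to first rule out problems in the params themselves, such as overlapping CIDRs. This is a small sketch with the standard `ipaddress` module; `overlapping_blocks` is a hypothetical helper, and whether NSX-T itself rejects overlapping blocks is not something this issue establishes.

```python
import ipaddress
from itertools import combinations


def overlapping_blocks(blocks):
    """Return pairs of block names whose CIDRs overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in blocks.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


if __name__ == "__main__":
    # Values taken from the params quoted in this issue.
    blocks = {
        "PAS-container-ip-block": "10.4.0.0/16",
        "PKS-node-container-ip-block": "11.4.0.0/16",
        "PKS-pod-container-ip-block": "12.4.0.0/16",
    }
    print(overlapping_blocks(blocks))
```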

Edge network config

In the parameter file there are two edge networks requested:
edge_data_network_id: "network-33"
edge_management_network_id: "dvportgroup-42"

I assume this refers to the "uplink" network and the "Overlay" network.
There should be 3 networks for the edge:
Management - management
Uplink - switch communication, can be the same as management but many times is separate
OverLay - where the tunnel is created

Add Uplink ports and static route

nsx_t_t0router_spec: |
  t0_router:
    name: DefaultT0Router
    ha_mode: 'ACTIVE_STANDBY'
    # Specify the edges to be used for hosting the T0Router instance
    edge_indexes:
      # Index starts from 1 -> denoting nsx-t-edge-01
      primary: 1 # Index for primary edge to be used
      secondary: 2 # Index for secondary edge to be used
    vip: 10.13.12.103/27
    ip1: 10.13.12.101/27
    ip2: 10.13.12.102/27
    vlan_uplink: 0
    static_route:
      next_hop: 10.13.12.1
      network: 0.0.0.0/0
      admin_distance: 1
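
Because the uplink addresses and the static route's next hop have to line up, a quick consistency check on the spec values can save a failed run. The sketch below uses the standard `ipaddress` module; `uplink_subnet_report` is a hypothetical helper and not part of the pipeline.

```python
import ipaddress


def uplink_subnet_report(vip, ip1, ip2, next_hop):
    """Check that the uplink IPs share one subnet and that next_hop is in it."""
    ifaces = [ipaddress.ip_interface(addr) for addr in (vip, ip1, ip2)]
    same_subnet = len({iface.network for iface in ifaces}) == 1
    hop = ipaddress.ip_address(next_hop)
    return {
        "subnet": str(ifaces[0].network),
        "same_subnet": same_subnet,
        "next_hop_in_subnet": same_subnet and hop in ifaces[0].network,
    }


if __name__ == "__main__":
    # Values taken from the spec in this issue.
    print(uplink_subnet_report("10.13.12.103/27", "10.13.12.101/27",
                               "10.13.12.102/27", "10.13.12.1"))
```

With the sample values from this spec, the three uplink addresses fall in 10.13.12.96/27 while the next hop 10.13.12.1 lies outside that range, so either the mask or the next hop may need adjusting for a given environment.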

Route advertisement

Need to enable route advertisement on T1 routers:

Connected networks on all T1 routers

Connected + T1 LB VIPs - T1-ERT
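
As a starting point for this request, the per-router settings could be expressed as a payload builder for the `config-nsx-t-extras` job, which already uses direct API calls. Everything about the payload here is an assumption to verify against the NSX-T API guide for the version in use: the field names (`advertise_nsx_connected_routes`, `advertise_lb_vip`, `_revision`), the `AdvertisementConfig` resource type and the `/routing/advertisement` path are from memory, and `advertisement_config` is a hypothetical helper.

```python
def advertisement_path(router_id):
    """Assumed API path for a logical router's advertisement config."""
    return "/api/v1/logical-routers/%s/routing/advertisement" % router_id


def advertisement_config(revision, connected=True, lb_vip=False):
    """Build an assumed AdvertisementConfig body for a T1 router.

    revision is the current _revision value read from the existing config.
    """
    return {
        "resource_type": "AdvertisementConfig",
        "enabled": True,
        "advertise_nsx_connected_routes": connected,
        "advertise_lb_vip": lb_vip,
        "_revision": revision,
    }


if __name__ == "__main__":
    # All T1 routers: connected networks only.
    print(advertisement_config(revision=0))
    # T1-ERT: connected networks plus LB VIPs.
    print(advertisement_config(revision=0, lb_vip=True))
```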
