ibm-mas / multicloud-bootstrap

Bootstrap code for MAS on AWS/Azure

License: Eclipse Public License 2.0

Shell 58.10% HCL 24.05% Python 14.08% HTML 0.13% Jinja 3.64%

multicloud-bootstrap's Introduction

MultiCloud Bootstrap Process

This folder contains the automation for the bootstrap process. The scripts here are not meant to be run manually except for troubleshooting; they are called in a specific order during the bootstrap process. The bootstrap process starts automatically on the bootnode when the bootnode is created. The bootnode is a virtual server (an EC2 instance in AWS, a virtual machine in Azure and Google Cloud) that is created in the buyer's account during the MAS instance deployment.

For example,

  • In AWS, the Marketplace product has an associated CloudFormation template, and the template creates the EC2 instance. The UserData section of the EC2 instance contains the commands that start the bootstrap process.
  • In Azure, the Marketplace product has an associated ARM template, and the template creates the virtual machine. The virtual machine defines a CustomScript extension that contains the commands that start the bootstrap process.

The cloud provider invokes the following steps automatically when the bootnode is created.

From the template associated with the Marketplace product:

  1. Clone the GitHub repo that contains the bootstrap code.
  2. Make the required scripts executable.
  3. Execute the init.sh script, which is the starting point of the bootstrap process.

From the init.sh:

  1. Call pre-validate.sh to perform pre-validation checks before starting the deployment.
  2. Call deploy.sh to perform the OpenShift cluster creation and application deployment.
    1. Run Terraform automation to create the OpenShift cluster.
    2. Run Terraform automation to create the bastion host.
    3. Upload the deployment context to storage.
    4. Call Ansible playbooks to deploy MAS and its prerequisites in the correct order.
  3. Call the Cloud-specific notify.sh to send the email notifications.
  4. Send stack create completion signal to CloudFormation (specific to AWS).
  5. Upload the log file to storage.
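The ordering above can be sketched as a small function. This is only an illustration in Python (the real flow lives in the shell scripts); the function name `bootstrap_steps`, the step labels, and the lowercase `"aws"` cloud type value are assumptions made for this example and are not taken from the repository.

```python
from typing import List

# Sub-steps performed by deploy.sh, in the order listed above.
DEPLOY_SUBSTEPS = [
    "terraform: create OpenShift cluster",
    "terraform: create bastion host",
    "upload deployment context to storage",
    "ansible: deploy MAS and prerequisites",
]


def bootstrap_steps(cloud_type: str) -> List[str]:
    """Return the ordered step labels for one bootstrap run (hypothetical helper)."""
    steps = ["pre-validate.sh"]
    steps += [f"deploy.sh > {sub}" for sub in DEPLOY_SUBSTEPS]
    steps.append(f"notify.sh ({cloud_type})")
    # The stack completion signal is described as specific to AWS.
    if cloud_type == "aws":
        steps.append("signal CloudFormation stack completion")
    steps.append("upload log file to storage")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(bootstrap_steps("aws"), start=1):
        print(number, step)
```

The only branch on the cloud type here is the CloudFormation signal, which mirrors the one step the list marks as AWS-specific.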

The bootstrap code is split into generic code, which is common to every Cloud (with conditional handling of the Cloud type inside the code), and code that is specific to one Cloud. The table below lists each file and folder with those details.

| File/Directory | Generic/Cloud-Specific | Cloud type | Details |
|---|---|---|---|
| init.sh (file) | Generic | | The entrypoint of the bootstrap process. It contains the common code that reads the parameters passed by the Cloud init process and initiates the bootstrap flow. It takes the Cloud type as a parameter, which can be used for any Cloud-specific processing within the script. |
| helper.sh (file) | Generic | | Helper functions used by the other scripts: logging, retrieving details from the OCP cluster, processing user inputs, and so on. It can also define Cloud-specific functions that other scripts call as needed. |
| pre-validate.sh (file) | Generic | | Performs the pre-validation before the cluster deployment starts. If any pre-deployment check fails, the bootstrap flow fails. Note: it can also run Cloud-specific checks by using the Cloud type global variable. |
| jdbc-prevalidate.py (file) | Generic | AWS | Python code that checks whether the provided database details are valid by attempting a connection to the database. |
| ansible (directory) | Generic | | Ansible automation for MAS and prerequisite component deployments. The Ansible playbooks and roles are documented separately. |
| aws (directory) | Cloud-Specific | AWS | Contains the code specific to the AWS implementation. |
| aws/bootnode-ami (directory) | Cloud-Specific | AWS | Contains the code to create and manage the Bootnode AMI. |
| aws/bootnode-ami/prepare-bootnode-ami.sh (file) | Cloud-Specific | AWS | Installs the required packages in the EC2 instance so that the image (AMI) can be created from it. See the developer documentation for the detailed steps on creating the Bootnode AMI. Normally the AMI should be created in the us-east-1 region. |
| aws/bootnode-ami/copy-ami-to-region.sh (file) | Cloud-Specific | AWS | Helper script that can be run manually to copy the AMI from us-east-1 to all supported regions. This is useful during development: once the AMI is created in us-east-1, it can be copied to other regions for testing there. Make sure to update the AMI IDs for those regions in the CloudFormation template. For the actual Marketplace product, AWS takes care of copying the AMI to all supported regions and updating the CloudFormation template. |
| aws/iam (directory) | Cloud-Specific | AWS | Files specific to the IAM configuration required for the deployment. |
| aws/iam/policy.json (file) | Cloud-Specific | AWS | A policy definition file used to create the policy with the AWS CLI. |
| aws/master-cft/cft-mas-core-dev.json (file) | Cloud-Specific | AWS | The CloudFormation template used for deployments during development testing. This template pulls the code from the dev branch of the repo where the bootstrap code resides. |
| aws/master-cft/cft-mas-core.json (file) | Cloud-Specific | AWS | The CloudFormation template that is shared with AWS. This template pulls the code from the main/master branch of the repo where the bootstrap code resides. Note that the CloudFormation template used for buyer deployments is taken from an S3 bucket that AWS has already populated. None of the templates kept in the GitHub repo are used in the actual product deployments done by buyers. |
| aws/notification/message-details.json (file) | Cloud-Specific | AWS | Email template containing the environment details, used by the SES raw email notification. |
| aws/notification/message-creds.json (file) | Cloud-Specific | AWS | Email template containing the credentials, used by the SES raw email notification. |
| aws/ocp-terraform (directory) | Cloud-Specific | AWS | Terraform automation code used to deploy the OpenShift cluster, configure OCS storage, etc. |
| aws/ocp-bastion-host (directory) | Cloud-Specific | AWS | Terraform automation code used to create the bastion host. |
| create-bastion-host.sh (file) | Cloud-Specific | AWS | Script that creates the bastion host using the Terraform code at aws/ocp-bastion-host. |
| deploy.sh (file) | Cloud-Specific | AWS | Script containing the actual deployment code that calls the underlying automation. |
| notify.sh (file) | Cloud-Specific | AWS | Sends the email notifications using the Amazon SES service. |
| cleanup-mas-deployment.sh (file) | Cloud-Specific | AWS | Uninstalls the product by deleting all the AWS resources for a particular MAS instance. |
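
The table describes jdbc-prevalidate.py as making a connection check against the provided database details. The contents of that script are not shown here, so the following is only a minimal sketch of one way such a check could look: it extracts the host and port from a JDBC URL of the form `jdbc:<subprotocol>://host:port...` and tests TCP reachability. The function names, the regular expression, and the 5-second timeout are assumptions for illustration, and a TCP check is weaker than a real database login.

```python
import re
import socket
from typing import Tuple

# Matches URLs such as jdbc:db2://host:50000/BLUDB or
# jdbc:sqlserver://host:1433;databaseName=maxdb (assumed formats).
_JDBC_HOST_PORT = re.compile(r"^jdbc:[a-z0-9]+://([^:/;]+):(\d+)", re.IGNORECASE)


def parse_jdbc_host_port(url: str) -> Tuple[str, int]:
    """Extract (host, port) from a JDBC URL, or raise ValueError."""
    match = _JDBC_HOST_PORT.match(url)
    if match is None:
        raise ValueError(f"unsupported JDBC URL: {url!r}")
    return match.group(1), int(match.group(2))


def can_connect(url: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the database host and port succeeds."""
    host, port = parse_jdbc_host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A pre-validation step could call `can_connect(...)` and fail the bootstrap flow early when it returns False, rather than discovering an unreachable database partway through the deployment.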

multicloud-bootstrap's Issues

Enhancement - Azure: MAS Core + Manage > New IPI > Without Jdbc connectionString > ByPassUpgradeVersioncheck should be True

Current result:

When a user deploys MAS Core + Manage without a JDBC connection string in IPI mode, the automation passes bypassUpgradeVersionCheck as FALSE.

Expected result:

For the scenario without a JDBC connection string, it should pass bypassUpgradeVersionCheck as TRUE.

Logs from boot node

TASK [ibm.mas_devops.suite_app_config : Debug information] *********************
ok: [localhost] => {
    "msg": [
        "Instance ID ............................ i2yfjz",
        "Application ID ......................... manage",
        "Workspace ID ........................... wsmasocp",
        "Application namespace .................. mas-i2yfjz-manage",

        "JDBC Binding ........................... workspace-application",
        "Templated workspace CR ................. 
---\napiVersion: \"apps.mas.ibm.com/v1\"\nkind: \"ManageWorkspace\"\nmetadata:\n  name: \"i2yfjz-wsmasocp\"\n  namespace: \"mas-i2yfjz-manage\"\n  labels:\n    mas.ibm.com/instanceId: \"i2yfjz\"\n    mas.ibm.com/workspaceId: \"wsmasocp\"\n    mas.ibm.com/applicationId: \"manage\"\nspec: {'bindings': {'jdbc': 'workspace-application'}, 'components': {'base': {'version': 'latest'}}, 'settings': {'deployment': {'persistentVolumes': [], 'serverBundles': [{'bundleType': 'all', 'isDefault': True, 'isMobileTarget': True, 'isUserSyncTarget': True, 'name': 'all', 'replica': 1, 'routeSubDomain': 'all'}]}, 'languages': {'baseLang': 'EN', 'secondaryLangs': []}, 'aio': {'install': True}, 'db': {'dbSchema': 'maximo', 'maxinst': {'demodata': False, 'db2Vargraphic': True, 'tableSpace': 'MAXDATA', 'indexSpace': 'MAXINDEX', 'bypassUpgradeVersionCheck': False}}}}\n"
    ]
}
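
The requested behaviour can be sketched as follows. This is a hypothetical Python helper, not the code of the Ansible role: the function name `maxinst_settings` is invented for illustration, the other keys are copied from the log above, and returning False when a JDBC connection string is supplied is an assumption, since the issue only specifies the case without one.

```python
from typing import Any, Dict, Optional


def maxinst_settings(jdbc_connection_string: Optional[str]) -> Dict[str, Any]:
    """Build the maxinst settings, deriving bypassUpgradeVersionCheck from the JDBC input."""
    # Treat None, an empty string, and a whitespace-only string as "no JDBC provided".
    has_jdbc = bool(jdbc_connection_string and jdbc_connection_string.strip())
    return {
        "demodata": False,
        "db2Vargraphic": True,
        "tableSpace": "MAXDATA",
        "indexSpace": "MAXINDEX",
        # Requested behaviour: True when no JDBC connection string is given.
        "bypassUpgradeVersionCheck": not has_jdbc,
    }
```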

Mas8 install on AWS fails for UPI with existing VPC CIDRs except 10.0.0.0/16 (default)

Hello,

The AWS MAS8 CloudFormation template for the User Provisioned Infrastructure (UPI) option fails to install for every existing VPC CIDR range except the default 10.0.0.0/16.

The template is at AWS Marketplace:
https://aws.amazon.com/marketplace/pp/prodview-aehjeun4gvcis?ref_=aws-mp-console-subscription-detail#pdp-usage

CloudFormation link:
https://aws.amazon.com/marketplace/pp/prodview-aehjeun4gvcis?ref_=aws-mp-console-subscription-detail#pdp-usage:~:text=Download%20CloudFormation%20Template

The bootnode log (mas-provisioning.log) has the following entries indicating the cause of the failure:

module.ocp[0].null_resource.install_openshift (local-exec): level=error msg=failed to fetch Metadata: failed to load asset "Install Config": failed to create install config: [platform.aws.subnets[0]: Invalid value: "subnet-xxx": subnet's CIDR range start 198.168.4.0 is outside of the specified machine networks, platform.aws.subnets[1]: Invalid value: "subnet-xxx": subnet's CIDR range start 198.168.4.128 is outside of the specified machine networks, platform.aws.subnets[2]: Invalid value: "subnet-0xxx": subnet's CIDR range start 198.168.5.0 is outside of the specified machine networks, platform.aws.subnets[3]: Invalid value: "subnet-xxx": subnet's CIDR range start 198.168.0.0 is outside of the specified machine networks, platform.aws.subnets[4]: Invalid value: "subnet-xxx": subnet's CIDR range start 198.168.0.128 is outside of the specified machine networks, platform.aws.subnets[5]: Invalid value: "subnet-xxxx": subnet's CIDR range start 198.168.1.0 is outside of the specified machine networks]
╷
│ Warning: Argument is deprecated
│
│  with module.network.aws_eip.eip1,
│  on network/main.tf line 75, in resource "aws_eip" "eip1":
│  75: vpc = true
│
│ use domain attribute instead
│
│ (and 2 more similar warnings elsewhere)
╵
╷
│ Error: local-exec provisioner error
│
│  with module.ocp[0].null_resource.install_openshift,
│  on ocp/main.tf line 48, in resource "null_resource" "install_openshift":
│  48: provisioner "local-exec" {
│
│ Error running command 'cd ./installer-files && ./openshift-install create
│ cluster --log-level=debug
│ ': exit status 3. Output: level=debug msg=OpenShift Installer 4.12.18
│ level=debug msg=Built from commit xxx
│ level=debug msg=Fetching Metadata...
│ level=debug msg=Loading Metadata...
│ level=debug msg= Loading Cluster ID...
│ level=debug msg= Loading Install Config...
│ level=debug msg= Loading SSH Key...
│ level=debug msg= Loading Base Domain...
│ level=debug msg= Loading Platform...
│ level=debug msg= Loading Cluster Name...
│ level=debug msg= Loading Base Domain...
│ level=debug msg= Loading Platform...
│ level=debug msg= Loading Networking...
│ level=debug msg= Loading Platform...
│ level=debug msg= Loading Pull Secret...
│ level=debug msg= Loading Platform...
│ level=info msg=Credentials loaded from default AWS environment variables
│ level=error msg=failed to fetch Metadata: failed to load asset "Install
│ Config": failed to create install config: [platform.aws.subnets[0]: Invalid
│ value: "subnet-xxx": subnet's CIDR range start 198.168.4.0 is
│ outside of the specified machine networks, platform.aws.subnets[1]: Invalid
│ value: "subnet-xxxx": subnet's CIDR range start 198.168.4.128
│ is outside of the specified machine networks, platform.aws.subnets[2]:
│ Invalid value: "subnet-0xxx": subnet's CIDR range start
│ 198.168.5.0 is outside of the specified machine networks,
│ platform.aws.subnets[3]: Invalid value: "subnet-xxx":
│ subnet's CIDR range start 198.168.0.0 is outside of the specified machine
│ networks, platform.aws.subnets[4]: Invalid value:
│ "subnet-xxx": subnet's CIDR range start 198.168.0.128 is
│ outside of the specified machine networks, platform.aws.subnets[5]: Invalid
│ value: "subnet-xxx: subnet's CIDR range start 198.168.1.0 is
│ outside of the specified machine networks]
│
╵
2024-01-25 23:20:21 ==== OCP cluster creation completed ====
2024-01-25 23:20:24 AWS_VPC_ID ===> xxx
error: dial tcp: lookup api.masocp-ql4u0o.xxx on 198.168.0.2:53: no such host - verify you have provided the correct host and port and that the server is currently running.
2024-01-25 23:20:30 Deployment return code is 1
2024-01-25 23:20:30 Deployment failed
2024-01-25 23:20:30 ===== PROVISIONING FAILED =====
2024-01-25 23:20:30 STATUS=FAILURE

Earlier in the log, the following network metadata appears when the install_config_yaml file is created:

      networking:
          clusterNetwork:
          - cidr: 10.128.0.0/14
            hostPrefix: 23
          machineNetwork:
          - cidr: 10.0.0.0/16
          networkType: OpenShiftSDN
          serviceNetwork:
          - 172.30.0.0/16

It appears that the default CIDR range is being used instead of the CIDR range of the existing VPC into which the UPI template installs the cluster. In this case the existing VPC CIDR range is 198.168.0.0/21. After the install into this existing VPC failed, a new VPC was created with CIDR 10.0.0.0/16, and the install completed the OCP components without issue. So there appears to be a bug in the UPI code that causes the install to fail for any CIDR other than 10.0.0.0/16.
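
The mismatch can be reproduced outside the installer with Python's standard `ipaddress` module. This is a minimal sketch of the containment check, not the installer's own validation code. The log only reports each subnet's range start, so the /25 prefix lengths below are an assumption made for illustration, and the function name is hypothetical.

```python
import ipaddress
from typing import List

# Subnet range starts taken from the log; the /25 prefix length is assumed.
SUBNETS = [
    "198.168.4.0/25",
    "198.168.4.128/25",
    "198.168.5.0/25",
    "198.168.0.0/25",
    "198.168.0.128/25",
    "198.168.1.0/25",
]


def subnets_outside_machine_network(machine_cidr: str, subnet_cidrs: List[str]) -> List[str]:
    """Return the subnets that are not contained in the machine network CIDR."""
    machine_net = ipaddress.ip_network(machine_cidr)
    return [
        cidr
        for cidr in subnet_cidrs
        if not ipaddress.ip_network(cidr).subnet_of(machine_net)
    ]


if __name__ == "__main__":
    # Default machineNetwork from the generated install config: every subnet is rejected.
    print(subnets_outside_machine_network("10.0.0.0/16", SUBNETS))
    # machineNetwork set to the existing VPC CIDR: no subnet is rejected.
    print(subnets_outside_machine_network("198.168.0.0/21", SUBNETS))
```

Under these assumptions, all six subnets fall outside 10.0.0.0/16 and inside 198.168.0.0/21, which is consistent with the machineNetwork needing to follow the existing VPC's CIDR range.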
