azure / sap-hana

Tools to create, monitor and maintain SAP landscapes in Azure.

License: MIT License



sap-hana's Issues

Please activate version control

Hi, can we have version control on this repository?
This would be extremely helpful for getting notified when there has been a change, and it would also make it possible to fall back to a previous version if the current one is broken.
Thanks!

Still NSG Error when deploying

Hi, I get the following error when deploying single_node_hana:

Error: Error creating/updating NSG "HN1-nsg" (Resource Group "SAPAutoDemo"): network.SecurityGroupsClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="SecurityRuleParametersMissing" Message="Required security rule parameters are missing for security rule with Id: /subscriptions/35b67b4c-4fd4-4f0b-997c-bbb82032d45d/resourceGroups/SAPAutoDemo/providers/Microsoft.Network/networkSecurityGroups/HN1-nsg/securityRules/open-hana-db-ports. Security rule must specify SourceAddressPrefixes, SourceAddressPrefix, or SourceApplicationSecurityGroups." Details=[]

on ../common_setup/nsg.tf line 1, in resource "azurerm_network_security_group" "sap_nsg":
1: resource "azurerm_network_security_group" "sap_nsg" {

Any ideas?
Thanks

Automate Unique Resource Group Names with Utility Scripts

Problem Statement

The util scripts provide a simplified interface for using and testing the codebase without requiring over-customization by the user (typically an engineer). This should allow multiple users to run the same deployments and get the same (or expectedly similar) results. Currently there is an issue when two or more users attempt to deploy into the same subscription, because the templated input JSON files contain hard-coded details - in particular the resource group name, which needs to be unique within an Azure subscription.

Enhancement

The util scripts should be enhanced to support editing the resource group name through a script rather than manual editing (in a similar manner to util/set_sap_download_credentials.sh). This could then either be run by an engineer or embedded as part of the terraform_v2.sh wrapper script. Through this, deployments can auto-generate a unique resource group name from, for example, properties of the local environment (e.g. environment variables). A minimal sketch of such a script is included after the Notes below. This follows the existing behaviour of the util/terraform_v2.sh script, such that:

When the user runs util/set_resource_group.sh <resource group name> <target template>
Then the script:

  • Sets the given resource group name in the given template
  • Fails with a suitable error when either the resource group name or the target template is missing, or the given template does not exist
  • When a non-valid set of inputs is provided, the script should provide a list of valid files based on those available in the templates dir. For example:
    Where <JSON template name> is one of the following:
    - clustered_hana
    - rti_only
    - single_node_hana
    

Notes

  1. The scripts only have to work for the V2 templates in the standard directory (e.g. single_node_hana, clustered_hana, etc.)
  2. To aid future refactoring the code should follow the patterns and conventions already laid out for similar functionality in terraform_v2.sh.
  3. Some valid sets of inputs are:
    • util/set_resource_group.sh "rg-dev-hana" single_node_hana
    • util/set_resource_group.sh "rg-prd-hana" clustered_hana
  4. Some invalid sets of inputs are:
    • util/set_resource_group.sh "" single_node_hana
    • util/set_resource_group.sh "rg"
  5. There's some existing code within the build pipeline setup that solves the same kind of problem to allow multiple pipelines to deploy in parallel (these solutions could be consolidated)
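As referenced in the Enhancement above, a minimal sketch of how util/set_resource_group.sh might validate its inputs and edit the template (the template directory, JSON key, and jq usage are assumptions, not the actual implementation):

#!/usr/bin/env bash
# Sketch only: set the resource group name in a V2 template (paths and JSON key are hypothetical)
set -euo pipefail

rg_name="${1:-}"
template="${2:-}"
template_dir="deploy/v2/template_samples"

usage() {
  echo "Usage: $0 <resource group name> <JSON template name>" >&2
  echo "Where <JSON template name> is one of the following:" >&2
  ls "${template_dir}" | sed -e 's/\.json$//' -e 's/^/  - /' >&2
  exit 1
}

# Fail with a usage message when either argument is missing or the template does not exist
[ -n "${rg_name}" ] && [ -n "${template}" ] || usage
[ -f "${template_dir}/${template}.json" ] || usage

# The JSON key below is a guess; adjust it to match the real template structure
jq --arg rg "${rg_name}" '.azure.resource_group_name = $rg' \
  "${template_dir}/${template}.json" > "${template_dir}/${template}.json.tmp" \
  && mv "${template_dir}/${template}.json.tmp" "${template_dir}/${template}.json"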

Dependencies

None, but relates to #340 Capability to specify target input JSON for set_sap_download_credentials.sh

Future Work

  1. Generalize the ability to programmatically edit the JSON templates for other use cases, such as:
    1. Switching between SLES and RHEL
    2. Setting HANA version
    3. Setting the SID/Instance Number
    4. Switching the ansible execution on/off
  2. Integrate the use of these scripts within the pipeline to create more readable pipeline test scenarios

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

None

Capability to provision Azure resources for clustered HANA systems on SLES

Problem Statement

The current V2 codebase only supports single node HANA instances, but customers would like to be able to provision 2-node clustered HANA systems that support automated failover when a node in the cluster fails. In order to support this, changes are required to the Azure resources being provisioned.

Enhancement

Ensure that the codebase can support provisioning of a HANA cluster on the SLES platform.
This will be demonstrated by introducing a new V2 templated input JSON file clustered_hana such that:

Given the user runs through the USAGE guide up to and including step 1 of Build/Update/Destroy Lifecycle
And the user runs util/terraform_v2.sh plan clustered_hana
When the user runs util/terraform_v2.sh apply clustered_hana
Then the script:

  • Deploys 2 VMs and associated resources (rather than 1 in single_node_hana)
  • Deploys both VMs running SLES 12 SP5 Gen 1 - SAP edition
  • Deploys both VMs into the given Availability Set
  • Deploys both VMs behind the Load Balancer
  • Ensures VMs have access to the HA clustering packages in the SLES repo (e.g. corosync).
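As a rough post-deployment verification sketch (the resource group and load balancer names below are hypothetical), the resulting topology could be checked with the Azure CLI:

# Sketch: confirm both VMs, the availability set association, and the load balancer exist
rg="demo-clustered-hana"
az vm list -g "$rg" --query "[].{name:name, availabilitySet:availabilitySet.id}" -o table
az network lb list -g "$rg" -o table
az network lb address-pool list -g "$rg" --lb-name "hana-lb" -o table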

Future Work

  • Ensure the underlying Terraform supports N-node clusters (N>2)

Notes

  1. This work is believed to be mainly Terraform-based changes, and does not cover changes made on the VM OS itself.
  2. Testing and acceptance of this feature should not require ansible to be run as part of the provisioning (i.e. disable in the input JSON template to reduce deployment time)

Dependencies

None

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

  1. High availability of SAP HANA on Azure VMs on SUSE Linux Enterprise Server (manual deployment)

Enable ansible in azure v2 pipeline

Problem Statement

We need to test PRs against master in sap-hana/deploy/v2 in the Azure pipeline.
Currently, ansible execution is switched off since the current workstream is Terraform-only.

Enhancement

Ensure ansible execution is switched back on for the v2 pipeline.

Notes

None

Dependencies

  • #342 Capability to configure HANA database replication

Future Work

None

Checklist

  • Ensure ansible execution runs without any problem for new PRs.

References

None

Regardless of sap_sid, HDB shared volume is always mounted to /hana/shared/PV1

Description: Regardless of the value of sap_sid, the HDB shared volume is mounted at /hana/shared/PV1, whereas /hana/shared/<SID> stays on the local HDD. This explains the (very) long provisioning times of hdblcm, since the local HDD is a lot slower.

How to reproduce: In terraform.tfvars, set sap_sid = "XYZ". After TF deployment, there is a directory /hana/shared/PV1, which is mounted to the large shared volume. The correct mountpoint (in this case, /hana/shared/XYZ) is not mounted and exists on the local HDD.

Details:

tniek@XYZ-db0:/hana/shared> ls
install  PV1  XYZ

tniek@XYZ-db0:/hana/shared> df -h /hana/shared/PV1
Filesystem                                     Size  Used Avail Use% Mounted on
/dev/mapper/vg_hana_shared_PV1-lv_hana_shared  512G   33M  512G   1% /hana/shared/PV1

tniek@XYZ-db0:/hana/shared> df -h /hana/shared/XYZ
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        29G   23G  4.8G  83% /
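A possible manual workaround until the templates honour sap_sid (a sketch only; the LV name is taken from the df output above and XYZ stands for the configured SID):

# Sketch: remount the shared LV at the SID-specific path, then fix /etc/fstab to match
sudo umount /hana/shared/PV1
sudo mkdir -p /hana/shared/XYZ
sudo mount /dev/mapper/vg_hana_shared_PV1-lv_hana_shared /hana/shared/XYZ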

Support real SAP applications

Do you plan to support major SAP applications like S/4HANA, BW on HANA (incl. BPC), BW/4HANA, Solution Manager, etc.? Having a toy install script for a HANA-based application isn't something we need on a regular basis.

Do you have any plans to add content from SAP transports to move through each environment e.g. via ABAPgit or similar so that you can have a spin up / down approach?

SBD device

Hello,

In this scenario there is only one iSCSI target for SBD. Have you run into a situation where this iSCSI target wasn't available, either because of a problem or because of maintenance being performed on that VM?

Is there a better way to do it besides having a three-device SBD?

Thanks
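For reference, a three-device SBD setup on SLES typically lists all devices in /etc/sysconfig/sbd (a sketch; the device paths are placeholders):

# /etc/sysconfig/sbd (sketch)
SBD_DEVICE="/dev/disk/by-id/scsi-sbd-dev-1;/dev/disk/by-id/scsi-sbd-dev-2;/dev/disk/by-id/scsi-sbd-dev-3"
SBD_PACEMAKER="yes"
SBD_STARTMODE="always"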

Capability to specify target input JSON for set_sap_download_credentials.sh

Problem Statement

When running deployments which progress to the ansible stage, the input JSON files require that the SAP download credentials are specified in order to authenticate with SAP download marketplace. This currently only works with the util scripts for the single_node_hana template, but needs to be more generalized with the introduction of a template for clustering. See the note in the Description section of #298.

Enhancement

Ensure the current util/set_sap_download_credentials.sh script can work with multiple templates by allowing the target template to be specified on the command line. This follows the existing behaviour of the util/terraform_v2.sh script, such that:

When the user runs util/set_sap_download_credentials.sh <SAP user> <SAP password> <target template>
Then the script:

  • Sets the given username and password as the SAP download credentials in the given template
  • Accepts empty strings for username and password provided as ""
  • Fails with a suitable error when no target template is provided or the given template does not exist
  • When a non-valid set of inputs is provided, the script should provide a list of valid files based on those available in the templates dir, filtered to those containing hana. For example:
    Where <JSON template name> is one of the following:
    - clustered_hana
    - single_node_hana
    

Notes

  1. To aid future refactoring the code should follow the patterns and conventions already laid out for similar functionality in terraform_v2.sh.
  2. Some valid sets of inputs are:
    • util/set_sap_download_credentials.sh "user" "pass" single_node_hana
    • util/set_sap_download_credentials.sh "" "" clustered_hana
  3. Some invalid sets of inputs are:
    • util/set_sap_download_credentials.sh single_node_hana
    • util/set_sap_download_credentials.sh "user" "pass" rti_only
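For illustration, the argument validation could look roughly like this (a sketch; the template directory path and the hana filter are assumptions based on the behaviour described above):

# Sketch only: validate the target template argument in set_sap_download_credentials.sh
template="${3:-}"
template_dir="deploy/v2/template_samples"
if [ -z "${template}" ] || [ ! -f "${template_dir}/${template}.json" ]; then
  echo "Usage: $0 <SAP user> <SAP password> <JSON template name>" >&2
  echo "Where <JSON template name> is one of the following:" >&2
  ls "${template_dir}" | grep hana | sed -e 's/\.json$//' -e 's/^/  - /' >&2
  exit 1
fi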

Dependencies

None

Future Work

None identified

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

Backlog: v2 -> v1 migration

The following folders will need to be part of the move:

  • sap-hana/deploy/v2
  • sap-hana/util

The following folders will become obsolete:

  • sap-hana/deploy/vm
  • sap-hana/monitor

The following folder stays:

  • sap-hana/tools

sap_instancenum not considered at provisioning

Description: During hdblcm install via Ansible, the specified instance number is not considered.

How to reproduce: In terraform.tfvars, set sap_instancenum = "01"; after deployment, the installation will use instance number 00 instead.
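A quick way to confirm which instance number was actually installed (a sketch; the SID directory depends on sap_sid):

# Sketch: list installed HANA instance directories and running instances
ls -d /usr/sap/*/HDB??
sudo /usr/sap/hostctrl/exe/saphostctrl -function ListInstances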

NSG Error when deploying

Hi,
when trying to execute the single node hana deployment we are facing the following issue. Kindly asking for some advice. Thanks a lot.

azurerm_network_security_group.sap_nsg: Error creating/updating NSG "HN1-nsg" (Resource Group "hanademo"): network.SecurityGroupsClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="SecurityRuleParametersMissing" Message="Required security rule parameters are missing for security rule with Id: /subscriptions/REMOVEDSUBSCRIPTION/resourceGroups/hanademo/providers/Microsoft.Network/networkSecurityGroups/HN1-nsg/securityRules/open-hana-db-ports. Security rule must specify SourceAddressPrefixes, SourceAddressPrefix, or SourceApplicationSecurityGroups." Details=[]

Kind Regards,
Darius

Backlog: when to use function in the code

Under the following conditions, wrapping the code in a function is overkill and reduces the readability of the code:

  • One-liners
  • String concatenation
  • Small amounts of code that are not reused anywhere else

Some examples that should be adjusted are listed in the comments on PR #298.
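A shell illustration of the point (all names are hypothetical):

# Overkill: a function wrapping a single string concatenation
get_template_path() {
  echo "${template_dir}/${1}.json"
}
template_path=$(get_template_path "${template_name}")

# Simpler, and just as readable, inline:
template_path="${template_dir}/${template_name}.json"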

Capability to configure OS clustering software on SLES

Problem Statement

The current V2 codebase only supports single node HANA instances, but customers would like to be able to provision clustered HANA systems that support automated failover when a node in the cluster fails. In order to support this, the OS must be configured as part of a corosync cluster.

Enhancement

Ensure that the codebase can support provisioning of a HANA cluster on the SLES platform.
This will be demonstrated by introducing a new V2 templated input JSON file clustered_hana such that:

Given the user runs through the USAGE guide up to and including step 1 of Build/Update/Destroy Lifecycle
And the user runs util/terraform_v2.sh plan clustered_hana
When the user runs util/terraform_v2.sh apply clustered_hana
Then the script:

  • Configures the HA clustering packages on both VMs in the 2-node cluster
  • Fails with a suitable error when the fencing agent service principal details are not available

Notes

  1. It should be possible to demonstrate node failure by running some ansible commands/playbooks that trigger a cluster failure, and illustrate the cluster behaviour.
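For illustration, a failover could also be exercised manually with crmsh once the cluster is configured (a sketch; the node name follows the HN1 examples used elsewhere in this repo):

# Sketch: drive a failover by putting the active node into standby, then check resource placement
sudo crm node standby hn1-hdb0
sudo crm_mon -r -1
# Bring the node back once the takeover has completed
sudo crm node online hn1-hdb0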

Dependencies

  • #337 Capability to provision Azure resources for clustered HANA systems on SLES
  • #339 Capability to manage Azure Fencing Agent

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

  1. Setting up Pacemaker on SUSE Linux Enterprise Server in Azure (uses SBD method, rather than Azure fencing agent)
  2. Setting up Pacemaker on Red Hat Enterprise Linux in Azure (uses Azure fencing agent)

Addition of Load Balancer to Single Node HANA Systems

Problem Statement

The current single node HANA systems built from V2 of the codebase do not have (or need) a load balancer. However, to support the scenario where a user wishes to upgrade a single node HANA system into a clustered system, we wish to add a load balancer to single node systems too.

This proposal originated from the Design Workshop, where it was stated that there is no Azure runtime cost in having a load balancer added to a single node system (or perhaps the cost is very minimal). However, adding a load balancer to the original build should support building out to a clustered system with negligible or no downtime.

Design Overview

This change is unlikely to warrant any change to the interface, but may require some changes to the input/output.json configuration.

The design should support both HANA 1 and HANA 2 (from design workshop notes).
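As a rough check after deployment (a sketch; the resource group and load balancer names are hypothetical), the configured rules and probes can be listed with the Azure CLI:

# Sketch: list the load balancer rules and health probes created for the HANA front end
rg="demo-single-node-hana"
lb="hana-lb"
az network lb rule list -g "$rg" --lb-name "$lb" -o table
az network lb probe list -g "$rg" --lb-name "$lb" -o table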

Checklist

  • V2 Documentation updated as necessary (now in #319)
  • V2 Pipeline testing updated as necessary
  • Differing load balancer rules for different HANA versions should be supported
  • New V2 Azure resource naming should be supported (i.e. prefixes for env details, and suffixes for resource type). Now out of scope
  • Ensure LB is configured with HANA VM (not RTI, which we used for testing)

HA Cluster Join fails (using Azure Shell)

Build fails with error when second node joins the cluster.
Tried in multiple regions, used a fresh clone of the project in Azure Shell.

TASK [ha-cluster-join : HA cluster join csync2] ********************************
changed: [ha1-hdb1]

TASK [ha-cluster-join : HA cluster join cluster] *******************************
fatal: [ha1-hdb1]: FAILED! => {"changed": true, "cmd": "ha-cluster-join -y cluster", "delta": "0:00:05.717640", "end": "2019-11-26 10:54:22.156122", "msg": "non-zero return code", "rc": 1, "start": "2019-11-26 10:54:16.438482", "stderr": "ERROR: cluster.join: No value for ring0", "stderr_lines": ["ERROR: cluster.join: No value for ring0"], "stdout": " Probing for new partitions...done\n No existing IP/hostname specified - skipping mountpoint detection/creation", "stdout_lines": [" Probing for new partitions...done", " No existing IP/hostname specified - skipping mountpoint detection/creation"]}

PLAY RECAP *********************************************************************
ha1-hdb0 : ok=63 changed=33 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
ha1-hdb1 : ok=60 changed=30 unreachable=0 failed=1 skipped=2 rescued=0 ignored=0
hanaonazuresm-iscsi : ok=42 changed=19 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
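A few diagnostic steps that may help narrow this down (a sketch; the IP below is a placeholder for the first node's address):

# On the first node: check that corosync has a ring0 address configured
grep -n "ring0_addr\|bindnetaddr" /etc/corosync/corosync.conf
# On the joining node: retry the join, naming the existing cluster node explicitly
sudo ha-cluster-join -y -c 10.0.0.6 csync2
sudo ha-cluster-join -y -c 10.0.0.6 cluster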

Failing to connect to hdb host 10.1.2.4 via ssh

Failing to connect to the hdb host via 10.1.2.4. I can see in the Portal that two NICs (10.1.1.4 and 10.1.2.4) are attached to the VM hdb1; however, I am having difficulty connecting from the Linux jumpbox to hdb1 via ssh. I wonder whether the second NIC was correctly configured/enabled on the OS side so that the OS recognizes all attached NICs. BTW, I didn't encounter this issue yesterday.
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/multiple-nics
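Some checks that might help pin this down (a sketch):

# From the jumpbox: is the second address reachable at all?
ping -c 2 10.1.1.4
ping -c 2 10.1.2.4
# On hdb1 (via the working address or the serial console): did the OS configure the second NIC?
ip addr show
ip route show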

Bug: Addition of Prometheus changes break existing deployments

Summary

The latest changes made to the master repo in #286 seem to have broken deployments that were working prior to that PR's merge.

Steps to Reproduce

  1. Checkout and use v2 code prior to #286 (e.g. git checkout 94fac2a53677ccf3e0f2680bf2e2311535e77a72)
  2. Do all the basic setup, like SP, Terraform Init, etc. (if not done already)
  3. Ensure you have no local changes/config in files, other than SAP media download credentials set in example template
  4. From the project root, run the deploy: terraform apply -auto-approve -var-file=deploy/v2/template_samples/single_node_hana.json deploy/v2/terraform/

Expected Behaviour

Deployment completes successfully without error.

Actual Behaviour

Deployment fails at the Ansible stage, on the following task:

TASK [enable-prometheus : Set OS version] ****************************************************************************************
fatal: [10.1.2.4]: FAILED! => {"msg": "The conditional check 'output.options.enable_prometheus == True' failed. The error was: error while evaluating conditional (output.options.enable_prometheus == True): 'dict object' has no attribute 'options'\n\nThe error appears to be in '/home/azureadm/sap-hana/deploy/v2/ansible/roles/enable-prometheus/tasks/main.yml': line 5, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Set OS version\n  ^ here\n"}

Relevant Resources (URLs, screenshots, logs, etc.)

The following conditional expects output.options to exist, but it doesn't if the customer is using a codebase version prior to #286 (at the time of writing, #286 is the very latest change, so it is unlikely that the customer has updated):

- output.options.enable_prometheus == True

Possible fixes

  1. A trivial fix might be to use something along these lines (⚠️code not tested ⚠️):
    when:
      - output.options is defined
      - (output.options.enable_prometheus is defined) and output.options.enable_prometheus
    
  2. A more robust best practice for avoiding similar issues would be to provide defaults for all role variables (i.e. don't expect the calling code to be smart enough to have defined everything - especially in the output.json)

SAP Download Center Doesn't include the versions of the packages listed in the readme

The readme (https://github.com/Azure/sap-hana#required-sap-downloads) lists several required packages. However, as of 2019-04-29, not all listed packages are available (only newer versions are available for some of them). The four packages marked with a red 'X' are affected (blue lines indicate packages I didn't use in my deployment and have no data for):
(screenshot of the package availability comparison omitted)

I'm getting a build deployment failure (Assigning Additional Roles to the Local Host failed) which may be caused by a version mismatch. According to SAP note 2564474 that error is caused because "HANA is installed on a Network File System (NFS) share with the "-nosuid" mount option". Since this script doesn't support NFS yet I'm not sure what could be causing this error aside from a version mismatch.

Capability to configure HANA database replication

Problem Statement

Clustering requires that both the active and passive nodes maintain a synced copy of the data. In order to achieve this, the HANA database must be configured for replication from the active to the passive node.

Enhancement

Ensure that the codebase can support configuration of HANA replication in a cluster of two nodes on the SLES platform.

Given the user runs through the USAGE guide up to and including step 3 of Build/Update/Destroy Lifecycle
When the user checks the HANA replication status on the primary node

  • Then the status reports OK

And when the user runs the takeover process

  • Then the standby node becomes the new active node

Notes

  1. Replication configuration requires initiation and completion of a full backup of HDB.
  2. It should be possible to demonstrate the takeover process on the standby node by running some ansible commands/playbooks that trigger a takeover, and illustrate the cluster behaviour.
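For reference, the replication status check and takeover can typically be exercised as the <sid>adm user (a sketch; hn1adm is a hypothetical SID admin user and the exact invocation depends on the instance environment):

# On the primary node: check system replication status
sudo -iu hn1adm hdbnsutil -sr_state
sudo -iu hn1adm HDBSettings.sh systemReplicationStatus.py
# On the secondary node: perform a takeover
sudo -iu hn1adm hdbnsutil -sr_takeover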

Dependencies

  • #337 Capability to provision Azure resources for clustered HANA systems on SLES

Future Work

  1. Scheduling/managing regular backups
  2. HANA DB log management

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

None

cluster resource monitor is not running

Right after the ha-pair deployment I checked the crm status and saw several resources not running:

Failed Actions:

  • rsc_ip_HN1_HDB01_monitor_10000 on hn1-hdb0 'not running' (7): call=60, status=complete, exitreason='',
    last-rc-change='Thu May 2 13:07:45 2019', queued=0ms, exec=0ms
  • rsc_SAPHana_HN1_HDB01_monitor_61000 on hn1-hdb0 'not running' (7): call=73, status=complete, exitreason='',
    last-rc-change='Thu May 2 13:09:41 2019', queued=4ms, exec=5926ms
  • rsc_ip_HN1_HDB01_monitor_10000 on hn1-hdb1 'not running' (7): call=57, status=complete, exitreason='',
    last-rc-change='Thu May 2 13:09:51 2019', queued=0ms, exec=0ms

After a while, main resources such as the HANA DB are also restarted and moved from the primary node to the other. The reason might be errors related to rsc_azure-events; see pacemaker.log:

ERROR:azure-events:Command '['crm_attribute', '--name', 'azure-events_curNodeState', '--query', '--quiet', '--node', 'hn1-hdb1']' returned non-zero exit status 6
Traceback (most recent call last):
  File "/usr/lib/ocf/resource.d/heartbeat/azure-events", line 171, in _exec
    ret = subprocess.check_output(command)
  File "/usr/lib64/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
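Some commands that may help when investigating (a sketch; the resource names are taken from the output above):

# Re-check the cluster state and clean up the failed monitor actions
sudo crm_mon -r -1
sudo crm resource cleanup rsc_ip_HN1_HDB01
sudo crm resource cleanup rsc_SAPHana_HN1_HDB01
# If rsc_azure-events keeps failing, query its node attribute manually
sudo crm_attribute --name azure-events_curNodeState --query --node hn1-hdb1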

HANA Installation fails due to missing OS Libraries


The required OS libraries libatomic1, libgcc_s1 and libstdc++6 need to be installed on the HANA servers as part of the code.
I need help on how to incorporate this into the code. Thanks.

"Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.042.00.1564994110) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_HA1_hdblcm_install_2019-09-03_09.50.09/hdblcm.log' on host 'ha1-hdb0'."]}
fatal: [ha1-hdb1]: FAILED! => {"changed": true, "cmd": "pwd=$(<../hdbserver_HA1_passwords.xml); rm ../hdbserver_HA1_passwords.xml; echo $pwd | ./hdblcm --batch --action=install --configfile='../hdbserver_HA1_install.cfg' --read_password_from_stdin=xml", "delta": "0:00:01.217988", "end": "2019-09-03 09:50:10.534197", "msg": "non-zero return code", "rc": 1, "start": "2019-09-03 09:50:09.316209", "stderr": "rpm package 'libatomic1' is not installed\nThe operating system is not ready to perform gcc 7 assemblies\nFor more information, see SAP Note 2593824.\nChecking system requirements failed", "stderr_lines": ["rpm package 'libatomic1' is not installed", "The operating system is not ready to perform gcc 7 assemblies", "For more information, see SAP Note 2593824.", "Checking system requirements failed"], "stdout": "\n\nSAP HANA Lifecycle Management - SAP HANA Database 2.00.042.00.1564994110\n************************************************************************\n\n\nScanning software locations...\nDetected components:\n SAP HANA Database (2.00.042.00.1564994110) in /hana/shared/install/SAP_HANA_DATABASE/server\nLog file written to '/var/tmp/hdb_HA1_hdblcm_install_2019-09-03_09.50.09/hdblcm.log' on host 'ha1-hdb1'.", "stdout_lines": ["", "", "SAP HANA Lifecycle Management - SAP HANA Database 2.00.042.00.1564994110", "************************************************************************", "", "", "Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.042.00.1564994110) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_HA1_hdblcm_install_2019-09-03_09.50.09/hdblcm.log' on host 'ha1-hdb1'."]}
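A possible pre-install step (a sketch; this mirrors the suggestion in the libatomic issue further down):

# Install the runtime libraries hdblcm expects before starting the installation (see SAP Note 2593824)
sudo zypper install -y libatomic1 libgcc_s1 libstdc++6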

Error: Error running command [...] '-i '../../ansible/azure_rm.py' [...]

Hi,
I have issues with single_node and ha_pair deployments. The ansible part fails with:

Error: Error running command ' OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
AZURE_RESOURCE_GROUPS="HADemo"
[...]
-i '../../ansible/azure_rm.py' ../../ansible/ha_pair_playbook.yml
': exit status 4. Output: [ERROR]: /usr/lib/python2.7/site-packages/requests/__init__.py:91:
RequestsDependencyWarning: urllib3 (1.25.6) or chardet (2.2.1) doesn't match a
supported version! RequestsDependencyWarning)
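One thing that may be worth trying (a sketch, not verified against this environment) is bringing the Python requests stack back in sync on the machine running ansible; note this only addresses the RequestsDependencyWarning shown above, and the exit status 4 may need separate investigation:

# Align requests with its urllib3/chardet dependencies
sudo pip install --upgrade requests urllib3 chardet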

Thanks for fixing.
Cheers,
Michael

Capability to manage Azure Fencing Agent

Problem Statement

Clustering requires an Azure fencing agent to be created to manage the VMs in a cluster, should one node need to notify Azure that a STONITH activity needs to take place. Azure fencing agents are implemented with a service principal in Azure AD, and therefore the process to provision/destroy a cluster, in turn, requires that a corresponding service principal is provisioned/destroyed in Azure AD. This might require a different (often higher) set of privileges to those typically required in provisioning/destroying Azure resources, and so bundling this process with the main Azure resource provisioning and configuration for clusters is not always appropriate.

Enhancement

Ensure that the codebase can support activities to provision/destroy an appropriate Azure fencing agent for HANA cluster management.
This will be demonstrated by introducing appropriate utility scripts such that:

Given the user has the appropriate permissions
When the user runs util/create_fencing_agent.sh <SID>
Then the script:

  • Creates the fencing agent service principal for the given SAP HANA SID with the appropriate permissions
  • Assigns the appropriate custom role to the service principal
  • Stores the service principal details in a local file fencing-agent-<sid>.sh (similar to #288 ) so they can be used when configuring the clustering software when the VMs are configured (Terraform will copy the file to the RTI for ansible to use in configuring the cluster)
  • Fails with a suitable error when the user does not have the correct permissions to create service principals
  • Fails with a suitable error when no SID or an existing SID (matching local file) is given

Notes

  1. At the current time, no reference architecture is available for HA clustering of SLES on Azure using the Azure fencing agent STONITH method
  2. See the Important note in Create SAP HANA cluster resources
  3. Assumption: The entity relationship between the clustering service principal and a clustered SAP system/input JSON is 1:1, rather than reusing the same fencing agent across all SAP systems in a subscription. However, it's technically possible to have a single fencing agent responsible for all SAP systems - even across subscriptions.
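A minimal sketch of what util/create_fencing_agent.sh might do (the output file layout, service principal name, and omission of the custom role assignment are all assumptions, not the actual implementation):

#!/usr/bin/env bash
# Sketch only: create a fencing agent service principal for a given SID
set -euo pipefail

sid="${1:-}"
[ -n "${sid}" ] || { echo "Usage: $0 <SID>" >&2; exit 1; }

out_file="fencing-agent-$(echo "${sid}" | tr '[:upper:]' '[:lower:]').sh"
[ -f "${out_file}" ] && { echo "Fencing agent details for ${sid} already exist: ${out_file}" >&2; exit 1; }

# Create the service principal; this requires rights to create applications in Azure AD
sp_json=$(az ad sp create-for-rbac --name "fencing-agent-${sid}" --skip-assignment)

# Assigning the appropriate custom role to the service principal is omitted from this sketch

# Persist the details for Terraform/ansible to consume later (file format is illustrative)
{
  echo "export FENCING_AGENT_APP_ID=$(echo "${sp_json}" | jq -r .appId)"
  echo "export FENCING_AGENT_PASSWORD=$(echo "${sp_json}" | jq -r .password)"
  echo "export FENCING_AGENT_TENANT=$(echo "${sp_json}" | jq -r .tenant)"
} > "${out_file}"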

Dependencies

  • #337 Capability to provision Azure resources for clustered HANA systems on SLES

Future Work

  1. Manage the clustering service principal through Terraform
  2. Use a Key Vault to store the clustering SP credentials

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

  1. Create Azure fence agent STONITH device

TASK [saphana-install : run hdblcm] Cannot resolve host name 'linux'

Hi, deployment fails during HANA install.

module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): TASK [saphana-install : run hdblcm] ********************************************
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): fatal: [www-hdb0]: FAILED! => {"changed": true, "cmd": "pwd=$(<../hdbserver_WWW_passwords.xml); rm ../hdbserver_WWW_passwords.xml; echo $pwd | ./hdblcm --batch --action=install --configfile='../hdbserver_WWW_install.cfg' --read_password_from_stdin=xml", "delta": "0:00:01.478872", "end": "2020-03-20 09:02:46.439284", "msg": "non-zero return code", "rc": 1, "start": "2020-03-20 09:02:44.960412", "stderr": "Running in batch mode\n Cannot resolve host name 'linux'", "stderr_lines": ["Running in batch mode", " Cannot resolve host name 'linux'"], "stdout": "\n\nSAP HANA Lifecycle Management - SAP HANA Database 2.00.037.04.1571818940\n************************************************************************\n\n\nScanning software locations...\nDetected components:\n SAP HANA Database (2.00.037.04.1571818940) in /hana/shared/install/SAP_HANA_DATABASE/server\nLog file written to '/var/tmp/hdb_WWW_hdblcm_install_2020-03-20_09.02.45/hdblcm.log' on host 'www-hdb0'.", "stdout_lines": ["", "", "SAP HANA Lifecycle Management - SAP HANA Database 2.00.037.04.1571818940", "************************************************************************", "", "", "Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.037.04.1571818940) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_WWW_hdblcm_install_2020-03-20_09.02.45/hdblcm.log' on host 'www-hdb0'."]}
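The "Cannot resolve host name 'linux'" message suggests hdblcm could not resolve the machine's own hostname ('linux' is the SLES default). Some checks that may help (a sketch; the hostname is taken from the log above):

# Check what the OS thinks the hostname is and whether it resolves
hostnamectl
getent hosts "$(hostname)"
# If the VM still carries the default hostname, set it to the expected name
sudo hostnamectl set-hostname www-hdb0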

Thanks for help!

Terraform doesn't remove disks all the time

* module.single_node_hana.azurerm_virtual_machine_data_disk_attachment.disk[0] (destroy): 1 error(s) occurred:

* azurerm_virtual_machine_data_disk_attachment.disk.0: Error removing Disk "db0-disk0" from Virtual Machine "PV1-db0" (Resource Group "<ResourceGroupName>"): compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidRequestContent" Message="The request content was invalid and could not be deserialized: 'Could not find member 'resources' on object of type 'ResourceDefinition'. Path 'resources', line 1, position 2316.'."

This is a known issue with Terraform, and they are fixing the bug here.

The workaround that I am currently using is
az group delete -n <ResourceGroupName>.

Hopefully this will be fixed soon

Capability to support Azure Availability Sets

Problem Statement

Prepare the Terraform code for building with High Availability in mind by building the Single Node HANA VM within an Azure Availability Set.

Design Overview

By updating the Terraform to deploy the single HANA VM into an availability set, the future clustered VMs can also be placed in the same availability set.
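A quick post-deployment check (a sketch; the resource group and VM names are hypothetical):

# Sketch: confirm the HANA VM landed in an availability set
az vm show -g "demo-single-node-hana" -n "hn1-db0" --query "availabilitySet.id" -o tsv
az vm availability-set list -g "demo-single-node-hana" -o table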

Checklist

RPM package libatomic needs to be installed for HANA 2 SPS04 Rev 040.00 - 046.00

As instructed in SAP Note 2593824, the RPM packages libgcc_s1, libstdc++6 and libatomic1 need to be installed before running hdblcm for the HANA installation. The SLES 12 image used in the scripts is missing libatomic1, hence hdblcm execution fails.

Suggestion: add zypper install libgcc_s1 libstdc++6 libatomic1 to https://github.com/Azure/sap-hana/blob/master/deploy/vm/ansible/roles/saphana-install/tasks/main.yml

Error message :
TASK [saphana-install : run hdblcm] ********************************************
fatal: [hn1-hdb0]: FAILED! => {"changed": true, "cmd": "pwd=$(<../hdbserver_HN1_passwords.xml); rm ../hdbserver_HN1_passwords.xml; echo $pwd | ./hdblcm --batch --action=install --configfile='../hdbserver_HN1_install.cfg' --read_password_from_stdin=xml", "delta": "0:00:01.239377", "end": "2020-03-09 22:09:45.450050", "msg": "non-zero return code", "rc": 1, "start": "2020-03-09 22:09:44.210673", "stderr": "rpm package 'libatomic1' is not installed\nThe operating system is not ready to perform gcc 7 assemblies\nFor more information, see SAP Note 2593824.\nChecking system requirements failed", "stderr_lines": ["rpm package 'libatomic1' is not installed", "The operating system is not ready to perform gcc 7 assemblies", "For more information, see SAP Note 2593824.", "Checking system requirements failed"], "stdout": "\n\nSAP HANA Lifecycle Management - SAP HANA Database 2.00.046.00.1581325702\n************************************************************************\n\n\nScanning software locations...\nDetected components:\n SAP HANA Database (2.00.046.00.1581325702) in /hana/shared/install/SAP_HANA_DATABASE/server\nLog file written to '/var/tmp/hdb_HN1_hdblcm_install_2020-03-09_22.09.44/hdblcm.log' on host 'hn1-hdb0'.", "stdout_lines": ["", "", "SAP HANA Lifecycle Management - SAP HANA Database 2.00.046.00.1581325702", "************************************************************************", "", "", "Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.046.00.1581325702) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_HN1_hdblcm_install_2020-03-09_22.09.44/hdblcm.log' on host 'hn1-hdb0'."]}

Backlog: Update Linux Jumpboxes to Ubuntu 18.04

In #284 some of the Ubuntu images were updated to 18.04, but others were missed (see #298 (comment)).

All the images should be consistent, and there should be some form of regression test to ensure they don't drift apart following any future updates.

Acceptance Criteria

  • All references are to the same image
  • Regression test added to find future config drift
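A regression check could be as small as a grep over the deployment sources (a sketch; the image SKU string and path are assumptions):

# Sketch: fail if any image reference still points at something other than Ubuntu 18.04
if grep -rn "UbuntuServer" deploy/ | grep -v "18.04"; then
  echo "Found Ubuntu image references that are not 18.04" >&2
  exit 1
fi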

Errors in module.create_hdb.module.nic_and_pip_setup

Hi,

Please find attached the issue I am getting all the time:
issue.txt

Error running plan: 3 error(s) occurred:

  • module.create_hdb.module.nic_and_pip_setup.azurerm_network_interface.nic: 1 error(s) occurred:
  • module.create_hdb.module.nic_and_pip_setup.azurerm_network_interface.nic: Resource 'azurerm_public_ip.pip' not found for variable 'azurerm_public_ip.pip.id'
  • module.create_hdb.module.nic_and_pip_setup.output.pip_name: Resource 'azurerm_public_ip.pip' not found for variable 'azurerm_public_ip.pip.name'
  • module.create_hdb.module.nic_and_pip_setup.output.fqdn: Resource 'azurerm_public_ip.pip' not found for variable 'azurerm_public_ip.pip.fqdn'

Thanks in advance for any hint how to solve that...
Jan

fatal: [localhost]: FAILED! => {"msg": "'dict object' has no attribute 'hdb0'"}

Hi,

I am getting an error like this:

module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): TASK [stonith-device-creation : Configure STONITH timeout] *********************
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): fatal: [localhost]: FAILED! => {"msg": "'dict object' has no attribute 'hdb0'"}
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): to retry, use: --limit @/home/saponazure/sap-hana/deploy/vm/ansible/ha_pair_playbook.retry

module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): PLAY RECAP *********************************************************************
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): localhost : ok=7 changed=0 unreachable=0 failed=1

Regards,
Sebastian
