azure / sap-hana

Tools to create, monitor and maintain SAP landscapes in Azure.

License: MIT License



sap-hana's Issues

Please activate version control

Hi, can we have version control on this repository?
This would be extremely helpful for getting notified when there has been a change, and it would also make it possible to fall back to a previous version if the current one is broken.
Thanks!

Still NSG Error when deploying

Hi, I get the following error when deploying single_node_hana:

Error: Error creating/updating NSG "HN1-nsg" (Resource Group "SAPAutoDemo"): network.SecurityGroupsClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="SecurityRuleParametersMissing" Message="Required security rule parameters are missing for security rule with Id: /subscriptions/35b67b4c-4fd4-4f0b-997c-bbb82032d45d/resourceGroups/SAPAutoDemo/providers/Microsoft.Network/networkSecurityGroups/HN1-nsg/securityRules/open-hana-db-ports. Security rule must specify SourceAddressPrefixes, SourceAddressPrefix, or SourceApplicationSecurityGroups." Details=[]

on ../common_setup/nsg.tf line 1, in resource "azurerm_network_security_group" "sap_nsg":
1: resource "azurerm_network_security_group" "sap_nsg" {

Any ideas?
Thanks

Automate Unique Resource Group Names with Utility Scripts

Problem Statement

The util scripts provide a simplified interface for using and testing the codebase without requiring over-customization by the user (typically an engineer). This should allow multiple users to run the same deployments and get the same (or expectedly similar) results. Currently there is an issue when two or more users attempt to deploy into the same subscription, because the templated input JSON files contain hard-coded details - in particular the resource group name, which needs to be unique within an Azure subscription.

Enhancement

The util scripts should be enhanced to support editing the resource group name through a script rather than manual editing (in a similar manner to util/set_sap_download_credentials.sh). This could then either be run by an engineer or embedded as part of the terraform_v2.sh wrapper script. Through this, deployments can auto-generate a unique resource group name from, for example, properties of the local environment (e.g. environment variables). A minimal sketch of such a script is included after the Notes below. This follows the existing behaviour of the util/terraform_v2.sh script, such that:

When the user runs util/set_resource_group.sh <resource group name> <target template>
Then the script:

  • Sets the given resource group name in the given template
  • Fails with a suitable error when either the resource group name or the target template is missing, or the given template does not exist
  • When a non-valid set of inputs is provided, the script should provide a list of valid files based on those available in the templates dir. For example:
    Where <JSON template name> is one of the following:
    - clustered_hana
    - rti_only
    - single_node_hana
    

Notes

  1. The scripts only have to work for the V2 templates in the standard directory (e.g. single_node_hana, clustered_hana, etc.)
  2. To aid future refactoring the code should follow the patterns and conventions already laid out for similar functionality in terraform_v2.sh.
  3. Some valid sets of inputs are:
    • util/set_resource_group.sh "rg-dev-hana" single_node_hana
    • util/set_resource_group.sh "rg-prd-hana" clustered_hana
  4. Some invalid sets of inputs are:
    • util/set_resource_group.sh "" single_node_hana
    • util/set_resource_group.sh "rg"
  5. There's some existing code within the build pipeline setup that solves the same kind of problem to allow multiple pipelines to deploy in parallel (these solutions could be consolidated)
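As referenced in the Enhancement above, a minimal sketch of how util/set_resource_group.sh might validate its inputs and edit the template (the template directory, JSON key, and jq usage are assumptions, not the actual implementation):

#!/usr/bin/env bash
# Sketch only: set the resource group name in a V2 template (paths and JSON key are hypothetical)
set -euo pipefail

rg_name="${1:-}"
template="${2:-}"
template_dir="deploy/v2/template_samples"

usage() {
  echo "Usage: $0 <resource group name> <JSON template name>" >&2
  echo "Where <JSON template name> is one of the following:" >&2
  ls "${template_dir}" | sed -e 's/\.json$//' -e 's/^/  - /' >&2
  exit 1
}

# Fail with a usage message when either argument is missing or the template does not exist
[ -n "${rg_name}" ] && [ -n "${template}" ] || usage
[ -f "${template_dir}/${template}.json" ] || usage

# The JSON key below is a guess; adjust it to match the real template structure
jq --arg rg "${rg_name}" '.azure.resource_group_name = $rg' \
  "${template_dir}/${template}.json" > "${template_dir}/${template}.json.tmp" \
  && mv "${template_dir}/${template}.json.tmp" "${template_dir}/${template}.json"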

Dependencies

None, but relates to #340 Capability to specify target input JSON for set_sap_download_credentials.sh

Future Work

  1. Generalize the ability to programmatically edit the JSON templates for other use cases, such as:
    1. Switching between SLES and RHEL
    2. Setting HANA version
    3. Setting the SID/Instance Number
    4. Switching the ansible execution on/off
  2. Integrate the use of these scripts within the pipeline to create more readable pipeline test scenarios

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

None

Capability to provision Azure resources for clustered HANA systems on SLES

Problem Statement

The current V2 codebase only supports single node HANA instances, but customers would like to be able to provision 2-node clustered HANA systems that support automated failover when a node in the cluster fails. In order to support this, changes are required to the Azure resources being provisioned.

Enhancement

Ensure that the codebase can support provisioning of a HANA cluster on the SLES platform.
This will be demonstrated by introducing a new V2 templated input JSON file clustered_hana such that:

Given the user runs through the USAGE guide up to and including step 1 of Build/Update/Destroy Lifecycle
And the user runs util/terraform_v2.sh plan clustered_hana
When the user runs util/terraform_v2.sh apply clustered_hana
Then the script:

  • Deploys 2 VMs and associated resources (rather than 1 in single_node_hana)
  • Deploys both VMs running SLES 12 SP5 Gen 1 - SAP edition
  • Deploys both VMs into the given Availability Set
  • Deploys both VMs behind the Load Balancer
  • Ensures VMs have access to the HA clustering packages in the SLES repo (e.g. corosync).
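As a rough post-deployment verification sketch (the resource group and load balancer names below are hypothetical), the resulting topology could be checked with the Azure CLI:

# Sketch: confirm both VMs, the availability set association, and the load balancer exist
rg="demo-clustered-hana"
az vm list -g "$rg" --query "[].{name:name, availabilitySet:availabilitySet.id}" -o table
az network lb list -g "$rg" -o table
az network lb address-pool list -g "$rg" --lb-name "hana-lb" -o table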

Future Work

  • Ensure the underlying Terraform supports N-node clusters (N>2)

Notes

  1. This work is believed to be mainly Terraform-based changes, and does not cover changes made on the VM OS itself.
  2. Testing and acceptance of this feature should not require ansible to be run as part of the provisioning (i.e. disable in the input JSON template to reduce deployment time)

Dependencies

None

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

  1. High availability of SAP HANA on Azure VMs on SUSE Linux Enterprise Server (manual deployment)

Enable ansible in azure v2 pipeline

Problem Statement

We need to test PRs against master in sap-hana/deploy/v2 in the Azure pipeline.
Currently, ansible execution is switched off since the current workstream is Terraform-only.

Enhancement

Ensure ansible execution is switched back on for the v2 pipeline.

Notes

None

Dependencies

  • #342 Capability to configure HANA database replication

Future Work

None

Checklist

  • Ensure ansible execution runs without any problem for new PRs.

References

None

Regardless of sap_sid, HDB shared volume is always mounted to /hana/shared/PV1

Description: Regardless of the value of sap_sid, the HDB shared volume is mounted at /hana/shared/PV1, whereas /hana/shared/<SID> stays on the local HDD. This explains the (very) long provisioning times of hdblcm, since the local HDD is a lot slower.

How to reproduce: In terraform.tfvars, set sap_sid = "XYZ". After TF deployment, there is a directory /hana/shared/PV1, which is mounted to the large shared volume. The correct mountpoint (in this case, /hana/shared/XYZ) is not mounted and exists on the local HDD.

Details:

tniek@XYZ-db0:/hana/shared> ls
install  PV1  XYZ

tniek@XYZ-db0:/hana/shared> df -h /hana/shared/PV1
Filesystem                                     Size  Used Avail Use% Mounted on
/dev/mapper/vg_hana_shared_PV1-lv_hana_shared  512G   33M  512G   1% /hana/shared/PV1

tniek@XYZ-db0:/hana/shared> df -h /hana/shared/XYZ
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        29G   23G  4.8G  83% /
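A possible manual workaround until the templates honour sap_sid (a sketch only; the LV name is taken from the df output above and XYZ stands for the configured SID):

# Sketch: remount the shared LV at the SID-specific path, then fix /etc/fstab to match
sudo umount /hana/shared/PV1
sudo mkdir -p /hana/shared/XYZ
sudo mount /dev/mapper/vg_hana_shared_PV1-lv_hana_shared /hana/shared/XYZ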

Support real SAP applications

Do you plan to support major SAP applications like S/4HANA, BW on HANA (incl. BPC), BW/4HANA, Solution Manager, etc.? Having a toy install script for a HANA-based application isn't something we need on a regular basis.

Do you have any plans to add content from SAP transports to move through each environment e.g. via ABAPgit or similar so that you can have a spin up / down approach?

SBD device

Hello,

In this scenario there is only one iSCSI target for SBD. Have you run into a situation where this iSCSI target wasn't available, either because of a problem or because of maintenance being performed on that VM?

Is there a better way to do it besides having a three-device SBD?

Thanks
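For reference, a three-device SBD setup on SLES typically lists all devices in /etc/sysconfig/sbd (a sketch; the device paths are placeholders):

# /etc/sysconfig/sbd (sketch)
SBD_DEVICE="/dev/disk/by-id/scsi-sbd-dev-1;/dev/disk/by-id/scsi-sbd-dev-2;/dev/disk/by-id/scsi-sbd-dev-3"
SBD_PACEMAKER="yes"
SBD_STARTMODE="always"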

Capability to specify target input JSON for set_sap_download_credentials.sh

Problem Statement

When running deployments which progress to the ansible stage, the input JSON files require that the SAP download credentials are specified in order to authenticate with SAP download marketplace. This currently only works with the util scripts for the single_node_hana template, but needs to be more generalized with the introduction of a template for clustering. See the note in the Description section of #298.

Enhancement

Ensure the current util/set_sap_download_credentials.sh script can work with multiple templates by allowing the target template to be specified on the command line. This follows the existing behaviour of the util/terraform_v2.sh script, such that:

When the user runs util/set_sap_download_credentials.sh <SAP user> <SAP password> <target template>
Then the script:

  • Sets the given username and password as the SAP download credentials in the given template
  • Accepts empty strings for username and password provided as ""
  • Fails with a suitable error when no target template is provided or the given template does not exist
  • When a non-valid set of inputs is provided, the script should provide a list of valid files based on those available in the templates dir, filtered to those containing hana. For example:
    Where <JSON template name> is one of the following:
    - clustered_hana
    - single_node_hana
    

Notes

  1. To aid future refactoring the code should follow the patterns and conventions already laid out for similar functionality in terraform_v2.sh.
  2. Some valid sets of inputs are:
    • util/set_sap_download_credentials.sh "user" "pass" single_node_hana
    • util/set_sap_download_credentials.sh "" "" clustered_hana
  3. Some invalid sets of inputs are:
    • util/set_sap_download_credentials.sh single_node_hana
    • util/set_sap_download_credentials.sh "user" "pass" rti_only
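For illustration, the argument validation could look roughly like this (a sketch; the template directory path and the hana filter are assumptions based on the behaviour described above):

# Sketch only: validate the target template argument in set_sap_download_credentials.sh
template="${3:-}"
template_dir="deploy/v2/template_samples"
if [ -z "${template}" ] || [ ! -f "${template_dir}/${template}.json" ]; then
  echo "Usage: $0 <SAP user> <SAP password> <JSON template name>" >&2
  echo "Where <JSON template name> is one of the following:" >&2
  ls "${template_dir}" | grep hana | sed -e 's/\.json$//' -e 's/^/  - /' >&2
  exit 1
fi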

Dependencies

None

Future Work

None identified

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

Backlog: v2 -> v1 migration

The following folders will need to be part of the move:

  • sap-hana/deploy/v2
  • sap-hana/util

The following folders will become obsolete:

  • sap-hana/deploy/vm
  • sap-hana/monitor

The following folder stays:

  • sap-hana/tools

sap_instancenum not considered at provisioning

Description: During hdblcm install via Ansible, the specified instance number is not considered.

How to reproduce: In terraform.tfvars, set sap_instancenum = "01"; after deployment, the installation will use instance number 00 instead.
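A quick way to confirm which instance number was actually installed (a sketch; the SID directory depends on sap_sid):

# Sketch: list installed HANA instance directories and running instances
ls -d /usr/sap/*/HDB??
sudo /usr/sap/hostctrl/exe/saphostctrl -function ListInstances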

NSG Error when deploying

Hi,
when trying to execute the single node hana deployment we are facing the following issue. Kindly asking for some advice. Thanks a lot.

azurerm_network_security_group.sap_nsg: Error creating/updating NSG "HN1-nsg" (Resource Group "hanademo"): network.SecurityGroupsClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="SecurityRuleParametersMissing" Message="Required security rule parameters are missing for security rule with Id: /subscriptions/REMOVEDSUBSCRIPTION/resourceGroups/hanademo/providers/Microsoft.Network/networkSecurityGroups/HN1-nsg/securityRules/open-hana-db-ports. Security rule must specify SourceAddressPrefixes, SourceAddressPrefix, or SourceApplicationSecurityGroups." Details=[]

Kind Regards,
Darius

Backlog: when to use function in the code

Under the following conditions, wrapping the code in a function is overkill and reduces the readability of the code:

  • One-liners
  • String concatenation
  • Small amounts of code that are not reused anywhere else

Some examples that should be adjusted are listed in the comments on PR #298.
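A shell illustration of the point (all names are hypothetical):

# Overkill: a function wrapping a single string concatenation
get_template_path() {
  echo "${template_dir}/${1}.json"
}
template_path=$(get_template_path "${template_name}")

# Simpler, and just as readable, inline:
template_path="${template_dir}/${template_name}.json"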

Capability to configure OS clustering software on SLES

Problem Statement

The current V2 codebase only supports single node HANA instances, but customers would like to be able to provision clustered HANA systems that support automated failover when a node in the cluster fails. In order to support this, the OS must be configured as part of a corosync cluster.

Enhancement

Ensure that the codebase can support provisioning of a HANA cluster on the SLES platform.
This will be demonstrated by introducing a new V2 templated input JSON file clustered_hana such that:

Given the user runs through the USAGE guide up to and including step 1 of Build/Update/Destroy Lifecycle
And the user runs util/terraform_v2.sh plan clustered_hana
When the user runs util/terraform_v2.sh apply clustered_hana
Then the script:

  • Configures the HA clustering packages on both VMs in the 2-node cluster
  • Fails with a suitable error when the fencing agent service principal details are not available

Notes

  1. It should be possible to demonstrate node failure by running some ansible commands/playbooks that trigger a cluster failure, and illustrate the cluster behaviour.
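For illustration, a failover could also be exercised manually with crmsh once the cluster is configured (a sketch; the node name follows the HN1 examples used elsewhere in this repo):

# Sketch: drive a failover by putting the active node into standby, then check resource placement
sudo crm node standby hn1-hdb0
sudo crm_mon -r -1
# Bring the node back once the takeover has completed
sudo crm node online hn1-hdb0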

Dependencies

  • #337 Capability to provision Azure resources for clustered HANA systems on SLES
  • #339 Capability to manage Azure Fencing Agent

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

  1. Setting up Pacemaker on SUSE Linux Enterprise Server in Azure (uses SBD method, rather than Azure fencing agent)
  2. Setting up Pacemaker on Red Hat Enterprise Linux in Azure (uses Azure fencing agent)

Addition of Load Balancer to Single Node HANA Systems

Problem Statement

The current single node HANA systems built from V2 of the codebase do not have (or need) a load balancer. However, to support the scenario where a user wishes to upgrade a single node HANA system into a clustered system, we wish to add a load balancer to single node systems too.

This proposal originated from the Design Workshop, where it was stated that there is no Azure runtime cost in having a load balancer added to a single node system (or perhaps the cost is very minimal). However, adding a load balancer to the original build should support building out to a clustered system with negligible or no downtime.

Design Overview

This change is unlikely to warrant any change to the interface, but may require some changes to the input/output.json configuration.

The design should support both HANA 1 and HANA 2 (from design workshop notes).
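As a rough check after deployment (a sketch; the resource group and load balancer names are hypothetical), the configured rules and probes can be listed with the Azure CLI:

# Sketch: list the load balancer rules and health probes created for the HANA front end
rg="demo-single-node-hana"
lb="hana-lb"
az network lb rule list -g "$rg" --lb-name "$lb" -o table
az network lb probe list -g "$rg" --lb-name "$lb" -o table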

Checklist

  • V2 Documentation updated as necessary (now in #319)
  • V2 Pipeline testing updated as necessary
  • Differing load balancer rules for different HANA versions should be supported
  • New V2 Azure resource naming should be supported (i.e. prefixes for env details, and suffixes for resource type). Now out of scope
  • Ensure LB is configured with HANA VM (not RTI, which we used for testing)

HA Cluster Join fails (using Azure Shell)

Build fails with error when second node joins the cluster.
Tried in multiple regions, used a fresh clone of the project in Azure Shell.

TASK [ha-cluster-join : HA cluster join csync2] ********************************
changed: [ha1-hdb1]

TASK [ha-cluster-join : HA cluster join cluster] *******************************
fatal: [ha1-hdb1]: FAILED! => {"changed": true, "cmd": "ha-cluster-join -y cluster", "delta": "0:00:05.717640", "end": "2019-11-26 10:54:22.156122", "msg": "non-zero return code", "rc": 1, "start": "2019-11-26 10:54:16.438482", "stderr": "ERROR: cluster.join: No value for ring0", "stderr_lines": ["ERROR: cluster.join: No value for ring0"], "stdout": " Probing for new partitions...done\n No existing IP/hostname specified - skipping mountpoint detection/creation", "stdout_lines": [" Probing for new partitions...done", " No existing IP/hostname specified - skipping mountpoint detection/creation"]}

PLAY RECAP *********************************************************************
ha1-hdb0 : ok=63 changed=33 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
ha1-hdb1 : ok=60 changed=30 unreachable=0 failed=1 skipped=2 rescued=0 ignored=0
hanaonazuresm-iscsi : ok=42 changed=19 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
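A few diagnostic steps that may help narrow this down (a sketch; the IP below is a placeholder for the first node's address):

# On the first node: check that corosync has a ring0 address configured
grep -n "ring0_addr\|bindnetaddr" /etc/corosync/corosync.conf
# On the joining node: retry the join, naming the existing cluster node explicitly
sudo ha-cluster-join -y -c 10.0.0.6 csync2
sudo ha-cluster-join -y -c 10.0.0.6 cluster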

Failing to connect to hdb host 10.1.2.4 via ssh

Failing to connect to the hdb host via 10.1.2.4. I can see in the Portal that two NICs (10.1.1.4 and 10.1.2.4) are attached to the VM hdb1; however, I am having difficulty connecting from the Linux jumpbox to hdb1 via ssh. I wonder whether the second NIC was correctly configured/enabled on the OS side so that the OS recognizes all attached NICs. BTW, I didn't encounter this issue yesterday.
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/multiple-nics
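Some checks that might help pin this down (a sketch):

# From the jumpbox: is the second address reachable at all?
ping -c 2 10.1.1.4
ping -c 2 10.1.2.4
# On hdb1 (via the working address or the serial console): did the OS configure the second NIC?
ip addr show
ip route show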

Bug: Addition of Prometheus changes break existing deployments

Summary

The latest changes made to the master repo in #286 seem to have broken deployments that were working prior to that PR's merge.

Steps to Reproduce

  1. Checkout and use v2 code prior to #286 (e.g. git checkout 94fac2a53677ccf3e0f2680bf2e2311535e77a72)
  2. Do all the basic setup, like SP, Terraform Init, etc. (if not done already)
  3. Ensure you have no local changes/config in files, other than SAP media download credentials set in example template
  4. From the project root, run the deploy: terraform apply -auto-approve -var-file=deploy/v2/template_samples/single_node_hana.json deploy/v2/terraform/

Expected Behaviour

Deployment completes successfully without error.

Actual Behaviour

Deployment fails at the Ansible stage, on the following task:

TASK [enable-prometheus : Set OS version] ****************************************************************************************
fatal: [10.1.2.4]: FAILED! => {"msg": "The conditional check 'output.options.enable_prometheus == True' failed. The error was: error while evaluating conditional (output.options.enable_prometheus == True): 'dict object' has no attribute 'options'\n\nThe error appears to be in '/home/azureadm/sap-hana/deploy/v2/ansible/roles/enable-prometheus/tasks/main.yml': line 5, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Set OS version\n  ^ here\n"}

Relevant Resources (URLs, screenshots, logs, etc.)

The following conditional expects output.options to exist, but it doesn't if the customer is using a codebase version prior to #286 (at the time of writing, #286 is the very latest change, so it is unlikely that the customer has updated):

- output.options.enable_prometheus == True

Possible fixes

  1. A trivial fix might be to use something along these lines (⚠️code not tested ⚠️):
    when:
      - output.options is defined
      - (output.options.enable_prometheus is defined) and output.options.enable_prometheus
    
  2. A more robust best practice for avoiding similar issues would be to provide defaults for all role variables (i.e. don't expect the calling code to be smart enough to have defined everything - especially in the output.json)

SAP Download Center Doesn't include the versions of the packages listed in the readme

The readme (https://github.com/Azure/sap-hana#required-sap-downloads) lists several required packages. However, as of 2019-04-29, not all listed packages are available (only newer versions are available for some of them). The four packages marked with a red 'X' are affected (blue lines indicate packages I didn't use in my deployment and have no data for):
(screenshot of the package availability comparison omitted)

I'm getting a build deployment failure (Assigning Additional Roles to the Local Host failed) which may be caused by a version mismatch. According to SAP note 2564474 that error is caused because "HANA is installed on a Network File System (NFS) share with the "-nosuid" mount option". Since this script doesn't support NFS yet I'm not sure what could be causing this error aside from a version mismatch.

Capability to configure HANA database replication

Problem Statement

Clustering requires that both the active and passive nodes maintain a synced copy of the data. In order to achieve this, the HANA database must be configured for replication from the active to the passive node.

Enhancement

Ensure that the codebase can support configuration of HANA replication in a cluster of two nodes on the SLES platform.

Given the user runs through the USAGE guide up to and including step 3 of Build/Update/Destroy Lifecycle
When the user checks the HANA replication status on the primary node

  • Then the status reports OK

And when the user runs the takeover process

  • Then the standby node becomes the new active node

Notes

  1. Replication configuration requires initiation and completion of a full backup of HDB.
  2. It should be possible to demonstrate the takeover process on the standby node by running some ansible commands/playbooks that trigger a takeover, and illustrate the cluster behaviour.
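For reference, the replication status check and takeover can typically be exercised as the <sid>adm user (a sketch; hn1adm is a hypothetical SID admin user and the exact invocation depends on the instance environment):

# On the primary node: check system replication status
sudo -iu hn1adm hdbnsutil -sr_state
sudo -iu hn1adm HDBSettings.sh systemReplicationStatus.py
# On the secondary node: perform a takeover
sudo -iu hn1adm hdbnsutil -sr_takeover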

Dependencies

  • #337 Capability to provision Azure resources for clustered HANA systems on SLES

Future Work

  1. Scheduling/managing regular backups
  2. HANA DB log management

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

None

cluster resource monitor is not running

Right after the ha-pair deployment I checked the crm status and saw several resources not running:

Failed Actions:

  • rsc_ip_HN1_HDB01_monitor_10000 on hn1-hdb0 'not running' (7): call=60, status=complete, exitreason='',
    last-rc-change='Thu May 2 13:07:45 2019', queued=0ms, exec=0ms
  • rsc_SAPHana_HN1_HDB01_monitor_61000 on hn1-hdb0 'not running' (7): call=73, status=complete, exitreason='',
    last-rc-change='Thu May 2 13:09:41 2019', queued=4ms, exec=5926ms
  • rsc_ip_HN1_HDB01_monitor_10000 on hn1-hdb1 'not running' (7): call=57, status=complete, exitreason='',
    last-rc-change='Thu May 2 13:09:51 2019', queued=0ms, exec=0ms

After a while, main resources such as the HANA DB are also restarted and moved from the primary node to the other. The reason might be errors related to rsc_azure-events; see pacemaker.log:

ERROR:azure-events:Command '['crm_attribute', '--name', 'azure-events_curNodeState', '--query', '--quiet', '--node', 'hn1-hdb1']' returned non-zero exit status 6
Traceback (most recent call last):
  File "/usr/lib/ocf/resource.d/heartbeat/azure-events", line 171, in _exec
    ret = subprocess.check_output(command)
  File "/usr/lib64/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
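Some commands that may help when investigating (a sketch; the resource names are taken from the output above):

# Re-check the cluster state and clean up the failed monitor actions
sudo crm_mon -r -1
sudo crm resource cleanup rsc_ip_HN1_HDB01
sudo crm resource cleanup rsc_SAPHana_HN1_HDB01
# If rsc_azure-events keeps failing, query its node attribute manually
sudo crm_attribute --name azure-events_curNodeState --query --node hn1-hdb1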

HANA Installation fails due to missing OS Libraries


The required OS libraries libatomic1, libgcc_s1 and libstdc++6 need to be installed on the HANA servers as part of the code.
I need help on how to incorporate this into the code. Thanks.

"Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.042.00.1564994110) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_HA1_hdblcm_install_2019-09-03_09.50.09/hdblcm.log' on host 'ha1-hdb0'."]}
fatal: [ha1-hdb1]: FAILED! => {"changed": true, "cmd": "pwd=$(<../hdbserver_HA1_passwords.xml); rm ../hdbserver_HA1_passwords.xml; echo $pwd | ./hdblcm --batch --action=install --configfile='../hdbserver_HA1_install.cfg' --read_password_from_stdin=xml", "delta": "0:00:01.217988", "end": "2019-09-03 09:50:10.534197", "msg": "non-zero return code", "rc": 1, "start": "2019-09-03 09:50:09.316209", "stderr": "rpm package 'libatomic1' is not installed\nThe operating system is not ready to perform gcc 7 assemblies\nFor more information, see SAP Note 2593824.\nChecking system requirements failed", "stderr_lines": ["rpm package 'libatomic1' is not installed", "The operating system is not ready to perform gcc 7 assemblies", "For more information, see SAP Note 2593824.", "Checking system requirements failed"], "stdout": "\n\nSAP HANA Lifecycle Management - SAP HANA Database 2.00.042.00.1564994110\n************************************************************************\n\n\nScanning software locations...\nDetected components:\n SAP HANA Database (2.00.042.00.1564994110) in /hana/shared/install/SAP_HANA_DATABASE/server\nLog file written to '/var/tmp/hdb_HA1_hdblcm_install_2019-09-03_09.50.09/hdblcm.log' on host 'ha1-hdb1'.", "stdout_lines": ["", "", "SAP HANA Lifecycle Management - SAP HANA Database 2.00.042.00.1564994110", "************************************************************************", "", "", "Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.042.00.1564994110) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_HA1_hdblcm_install_2019-09-03_09.50.09/hdblcm.log' on host 'ha1-hdb1'."]}
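A possible pre-install step (a sketch; this mirrors the suggestion in the libatomic issue further down):

# Install the runtime libraries hdblcm expects before starting the installation (see SAP Note 2593824)
sudo zypper install -y libatomic1 libgcc_s1 libstdc++6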

Error: Error running command [...] '-i '../../ansible/azure_rm.py' [...]

Hi,
I have issues with single_node and ha_pair deployments. The ansible part fails with:

Error: Error running command ' OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
AZURE_RESOURCE_GROUPS="HADemo"
[...]
-i '../../ansible/azure_rm.py' ../../ansible/ha_pair_playbook.yml
': exit status 4. Output: [ERROR]: /usr/lib/python2.7/site-packages/requests/__init__.py:91:
RequestsDependencyWarning: urllib3 (1.25.6) or chardet (2.2.1) doesn't match a
supported version! RequestsDependencyWarning)
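One thing that may be worth trying (a sketch, not verified against this environment) is bringing the Python requests stack back in sync on the machine running ansible; note this only addresses the RequestsDependencyWarning shown above, and the exit status 4 may need separate investigation:

# Align requests with its urllib3/chardet dependencies
sudo pip install --upgrade requests urllib3 chardet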

Thanks for fixing.
Cheers,
Michael

Capability to manage Azure Fencing Agent

Problem Statement

Clustering requires an Azure fencing agent to be created to manage the VMs in a cluster, should one node need to notify Azure that a STONITH activity needs to take place. Azure fencing agents are implemented with a service principal in Azure AD, and therefore the process to provision/destroy a cluster, in turn, requires that a corresponding service principal is provisioned/destroyed in Azure AD. This might require a different (often higher) set of privileges to those typically required in provisioning/destroying Azure resources, and so bundling this process with the main Azure resource provisioning and configuration for clusters is not always appropriate.

Enhancement

Ensure that the codebase can support activities to provision/destroy an appropriate Azure fencing agent for HANA cluster management.
This will be demonstrated by introducing appropriate utility scripts such that:

Given the user has the appropriate permissions
When the user runs util/create_fencing_agent.sh <SID>
Then the script:

  • Creates the fencing agent service principal for the given SAP HANA SID with the appropriate permissions
  • Assigns the appropriate custom role to the service principal
  • Stores the service principal details in a local file fencing-agent-<sid>.sh (similar to #288 ) so they can be used when configuring the clustering software when the VMs are configured (Terraform will copy the file to the RTI for ansible to use in configuring the cluster)
  • Fails with a suitable error when the user does not have the correct permissions to create service principals
  • Fails with a suitable error when no SID or an existing SID (matching local file) is given

Notes

  1. At the current time, no reference architecture is available for HA clustering of SLES on Azure using the Azure fencing agent STONITH method
  2. See the Important note in Create SAP HANA cluster resources
  3. Assumption: The entity relationship between the clustering service principal and a clustered SAP system/input JSON is 1:1, rather than reusing the same fencing agent across all SAP systems in a subscription. However, it's technically possible to have a single fencing agent responsible for all SAP systems - even across subscriptions.
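A minimal sketch of what util/create_fencing_agent.sh might do (the output file layout, service principal name, and omission of the custom role assignment are all assumptions, not the actual implementation):

#!/usr/bin/env bash
# Sketch only: create a fencing agent service principal for a given SID
set -euo pipefail

sid="${1:-}"
[ -n "${sid}" ] || { echo "Usage: $0 <SID>" >&2; exit 1; }

out_file="fencing-agent-$(echo "${sid}" | tr '[:upper:]' '[:lower:]').sh"
[ -f "${out_file}" ] && { echo "Fencing agent details for ${sid} already exist: ${out_file}" >&2; exit 1; }

# Create the service principal; this requires rights to create applications in Azure AD
sp_json=$(az ad sp create-for-rbac --name "fencing-agent-${sid}" --skip-assignment)

# Assigning the appropriate custom role to the service principal is omitted from this sketch

# Persist the details for Terraform/ansible to consume later (file format is illustrative)
{
  echo "export FENCING_AGENT_APP_ID=$(echo "${sp_json}" | jq -r .appId)"
  echo "export FENCING_AGENT_PASSWORD=$(echo "${sp_json}" | jq -r .password)"
  echo "export FENCING_AGENT_TENANT=$(echo "${sp_json}" | jq -r .tenant)"
} > "${out_file}"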

Dependencies

  • #337 Capability to provision Azure resources for clustered HANA systems on SLES

Future Work

  1. Manage the clustering service principal through Terraform
  2. Use a Key Vault to store the clustering SP credentials

Checklist

  • Usage documentation updated as necessary
  • Architecture documentation updated as necessary

References

  1. Create Azure fence agent STONITH device

TASK [saphana-install : run hdblcm] Cannot resolve host name 'linux'

Hi, deployment fails during HANA install.

module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): TASK [saphana-install : run hdblcm] ********************************************
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): fatal: [www-hdb0]: FAILED! => {"changed": true, "cmd": "pwd=$(<../hdbserver_WWW_passwords.xml); rm ../hdbserver_WWW_passwords.xml; echo $pwd | ./hdblcm --batch --action=install --configfile='../hdbserver_WWW_install.cfg' --read_password_from_stdin=xml", "delta": "0:00:01.478872", "end": "2020-03-20 09:02:46.439284", "msg": "non-zero return code", "rc": 1, "start": "2020-03-20 09:02:44.960412", "stderr": "Running in batch mode\n Cannot resolve host name 'linux'", "stderr_lines": ["Running in batch mode", " Cannot resolve host name 'linux'"], "stdout": "\n\nSAP HANA Lifecycle Management - SAP HANA Database 2.00.037.04.1571818940\n************************************************************************\n\n\nScanning software locations...\nDetected components:\n SAP HANA Database (2.00.037.04.1571818940) in /hana/shared/install/SAP_HANA_DATABASE/server\nLog file written to '/var/tmp/hdb_WWW_hdblcm_install_2020-03-20_09.02.45/hdblcm.log' on host 'www-hdb0'.", "stdout_lines": ["", "", "SAP HANA Lifecycle Management - SAP HANA Database 2.00.037.04.1571818940", "************************************************************************", "", "", "Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.037.04.1571818940) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_WWW_hdblcm_install_2020-03-20_09.02.45/hdblcm.log' on host 'www-hdb0'."]}
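The "Cannot resolve host name 'linux'" message suggests hdblcm could not resolve the machine's own hostname ('linux' is the SLES default). Some checks that may help (a sketch; the hostname is taken from the log above):

# Check what the OS thinks the hostname is and whether it resolves
hostnamectl
getent hosts "$(hostname)"
# If the VM still carries the default hostname, set it to the expected name
sudo hostnamectl set-hostname www-hdb0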

Thanks for help!

Terraform doesn't remove disks all the time

* module.single_node_hana.azurerm_virtual_machine_data_disk_attachment.disk[0] (destroy): 1 error(s) occurred:

* azurerm_virtual_machine_data_disk_attachment.disk.0: Error removing Disk "db0-disk0" from Virtual Machine "PV1-db0" (Resource Group "<ResourceGroupName>"): compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidRequestContent" Message="The request content was invalid and could not be deserialized: 'Could not find member 'resources' on object of type 'ResourceDefinition'. Path 'resources', line 1, position 2316.'."

This is a known issue with Terraform, and they are fixing the bug here.

The workaround that I am currently using is
az group delete -n <ResourceGroupName>.

Hopefully this will be fixed soon

Capability to support Azure Availability Sets

Problem Statement

Prepare the Terraform code for building with High Availability in mind by building the Single Node HANA VM within an Azure Availability Set.

Design Overview

By updating the Terraform to deploy the single HANA VM into an availability set, the future clustered VMs can also be placed in the same availability set.
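A quick post-deployment check (a sketch; the resource group and VM names are hypothetical):

# Sketch: confirm the HANA VM landed in an availability set
az vm show -g "demo-single-node-hana" -n "hn1-db0" --query "availabilitySet.id" -o tsv
az vm availability-set list -g "demo-single-node-hana" -o table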

Checklist

RPM package libatomic needs to be installed for HANA 2 SPS04 Rev 040.00 - 046.00

As instructed in SAP Note 2593824, the RPM packages libgcc_s1, libstdc++6 and libatomic1 need to be installed before running hdblcm for the HANA installation. The SLES 12 image used in the scripts is missing libatomic1, hence hdblcm execution fails.

Suggestion: add zypper install libgcc_s1 libstdc++6 libatomic1 to https://github.com/Azure/sap-hana/blob/master/deploy/vm/ansible/roles/saphana-install/tasks/main.yml

Error message :
TASK [saphana-install : run hdblcm] ********************************************
fatal: [hn1-hdb0]: FAILED! => {"changed": true, "cmd": "pwd=$(<../hdbserver_HN1_passwords.xml); rm ../hdbserver_HN1_passwords.xml; echo $pwd | ./hdblcm --batch --action=install --configfile='../hdbserver_HN1_install.cfg' --read_password_from_stdin=xml", "delta": "0:00:01.239377", "end": "2020-03-09 22:09:45.450050", "msg": "non-zero return code", "rc": 1, "start": "2020-03-09 22:09:44.210673", "stderr": "rpm package 'libatomic1' is not installed\nThe operating system is not ready to perform gcc 7 assemblies\nFor more information, see SAP Note 2593824.\nChecking system requirements failed", "stderr_lines": ["rpm package 'libatomic1' is not installed", "The operating system is not ready to perform gcc 7 assemblies", "For more information, see SAP Note 2593824.", "Checking system requirements failed"], "stdout": "\n\nSAP HANA Lifecycle Management - SAP HANA Database 2.00.046.00.1581325702\n************************************************************************\n\n\nScanning software locations...\nDetected components:\n SAP HANA Database (2.00.046.00.1581325702) in /hana/shared/install/SAP_HANA_DATABASE/server\nLog file written to '/var/tmp/hdb_HN1_hdblcm_install_2020-03-09_22.09.44/hdblcm.log' on host 'hn1-hdb0'.", "stdout_lines": ["", "", "SAP HANA Lifecycle Management - SAP HANA Database 2.00.046.00.1581325702", "************************************************************************", "", "", "Scanning software locations...", "Detected components:", " SAP HANA Database (2.00.046.00.1581325702) in /hana/shared/install/SAP_HANA_DATABASE/server", "Log file written to '/var/tmp/hdb_HN1_hdblcm_install_2020-03-09_22.09.44/hdblcm.log' on host 'hn1-hdb0'."]}

Backlog: Update Linux Jumpboxes to Ubuntu 18.04

In #284 some of the Ubuntu images were updated to 18.04, but others were missed (see #298 (comment)).

All the images should be consistent, and there should be some form of regression test to ensure they don't drift apart following any future updates.

Acceptance Criteria

  • All references are to the same image
  • Regression test added to find future config drift
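A regression check could be as small as a grep over the deployment sources (a sketch; the image SKU string and path are assumptions):

# Sketch: fail if any image reference still points at something other than Ubuntu 18.04
if grep -rn "UbuntuServer" deploy/ | grep -v "18.04"; then
  echo "Found Ubuntu image references that are not 18.04" >&2
  exit 1
fi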

Errors in module.create_hdb.module.nic_and_pip_setup

Hi,

Please find attached the issue I am getting all the time:
issue.txt

Error running plan: 3 error(s) occurred:

  • module.create_hdb.module.nic_and_pip_setup.azurerm_network_interface.nic: 1 error(s) occurred:
  • module.create_hdb.module.nic_and_pip_setup.azurerm_network_interface.nic: Resource 'azurerm_public_ip.pip' not found for variable 'azurerm_public_ip.pip.id'
  • module.create_hdb.module.nic_and_pip_setup.output.pip_name: Resource 'azurerm_public_ip.pip' not found for variable 'azurerm_public_ip.pip.name'
  • module.create_hdb.module.nic_and_pip_setup.output.fqdn: Resource 'azurerm_public_ip.pip' not found for variable 'azurerm_public_ip.pip.fqdn'

Thanks in advance for any hint how to solve that...
Jan

fatal: [localhost]: FAILED! => {"msg": "'dict object' has no attribute 'hdb0'"}

Hi,

I am getting an error like this:

module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): TASK [stonith-device-creation : Configure STONITH timeout] *********************
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): fatal: [localhost]: FAILED! => {"msg": "'dict object' has no attribute 'hdb0'"}
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): to retry, use: --limit @/home/saponazure/sap-hana/deploy/vm/ansible/ha_pair_playbook.retry

module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): PLAY RECAP *********************************************************************
module.configure_vm.null_resource.mount-disks-and-configure-hana (local-exec): localhost : ok=7 changed=0 unreachable=0 failed=1

Regards,
Sebastian
