jbenninghoff / cluster-validation
Scripts to validate that a cluster is ready for MapR Data Platform installation
The cluster-audit.sh script recommends setting vm.swappiness to 1. The documentation for 6.0 states that the MapR recommendation is 10. Should probably correct one or the other.
Reference: https://maprdocs.mapr.com/60/AdvancedInstallation/PreparingEachNode-memory.html?hl=swappiness
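One way to keep the script and the documentation in step is to make the recommended value a single parameter. The sketch below is only an illustration: the function name `check_swappiness` and its arguments are hypothetical and not taken from cluster-audit.sh.

```shell
#!/bin/bash
# Hypothetical helper, not part of cluster-audit.sh.
# Compares vm.swappiness against a recommended value and reports the result.
# $1: recommended value (1 per the script, 10 per the 6.0 documentation)
# $2: actual value; defaults to the running kernel's setting
check_swappiness() {
  local recommended=$1
  local actual=${2:-$(cat /proc/sys/vm/swappiness)}
  if [ "$actual" -eq "$recommended" ]; then
    echo "OK: vm.swappiness=$actual (recommended $recommended)"
  else
    echo "WARN: vm.swappiness=$actual (recommended $recommended)"
    return 1
  fi
}
```

Calling `check_swappiness 10` would then audit the live setting against the documented value, and changing that one argument switches the recommendation.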
Issue: The cluster audit reports that JDK is not installed, but it is. Here is the relevant error message from the audit logs:
ls: cannot access /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el6_8.x86_64/jre/bin/jps: No such file or directory
JDK not installed
clush -ab -o -qtt "sudo grep '^Defaults.*requiretty' /etc/sudoers"
if
else
fi
Each command that's run gets this error.
Some ideas on the YCSB Test:
Provide option to create a new volume with X settings to put the MapR DB Table in. This would allow one to script the performance tests and see what changing those settings may entail. (replication etc)
Provide table creation options (region size etc)
Provide option for inmem true and false to see differences.
Create a Dockerfile that includes everything (MapR clients, HBase client, YCSB, etc.). Provide an option to build the image and push it to a local Docker registry, and then run options that allow both local runs and remote runs that pull the Docker image. I may try to power through this; adding Dockerization would be very helpful for many of these tests to ensure consistent environment dependencies.
Add check in cluster-audit.sh and mapr-audit.sh for 'mapr' access to /etc/shadow
export MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket
sudo -u mapr maprcli config save -values {mapr.targetversion:"`cat /opt/mapr/MapRBuildVersion`"}
sudo -u mapr maprcli cluster feature enable -all
OR:
su - mapr -c 'MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket maprcli dashboard info -json'
OS Disk is /dev/sda.
Server has more than 27 disks.
So device name of last disk is /dev/sdaa.
Run disk-test.sh.
Command line used: /root/cluster-validation/pre-install/iozone -I -r 1M -s 4G -k 10 -+n -i 0 -i 1 -i 2 -f /dev/sdaa
Command line used: /root/cluster-validation/pre-install/iozone -I -r 1M -s 4G -k 10 -+n -i 0 -i 1 -i 2 -f a
clustershell is a dependency for many of the scripts. It should be added to the project, along with its dependencies (YAML and one other), so that a local copy of all of EPEL is not required.
/opt/mapr/server/mrconfig sp list -v
Check for any/all Java RPMs.
sudo yum list installed *jdk* *java*
Check for openjdk-devel
Check for java bin using readlink
javapath=$(dirname $(readlink -f /usr/bin/java))
if [ -f $javapath/jps ]
[[
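The readlink idea above could be sketched as follows. This assumes that jps sits either next to the resolved java binary or, when java resolves into a jre/bin directory as in the error message earlier, two levels up in the JDK's own bin directory. The function name `find_jps` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: locate jps relative to the resolved java binary.
# $1: path to the java launcher (defaults to /usr/bin/java)
find_jps() {
  local java_bin javapath candidate
  java_bin=$(readlink -f "${1:-/usr/bin/java}") || return 1
  javapath=$(dirname "$java_bin")
  # Assumption: jps is next to java in bin/, or, for a JRE-style path
  # (<jdk>/jre/bin/java), two levels up in <jdk>/bin/.
  for candidate in "$javapath/jps" "$javapath/../../bin/jps"; do
    if [[ -x $candidate ]]; then
      readlink -f "$candidate"
      return 0
    fi
  done
  return 1
}
```

A caller could then report "JDK not installed" only when `find_jps` returns non-zero, instead of testing a single fixed path.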
A URL can replace the RPM bundled in the package:
http://mirror.math.princeton.edu/pub/epel/6/x86_64/clustershell-1.7.2-1.el6.noarch.rpm
I'm having trouble setting up (as root) the cluster validation tools on Ubuntu x86_64. I have the same problem on MapR M3 on AWS EMR.
$ rpm -i clustershell-1.6-1.el6.noarch.rpm
rpm: RPM should not be used directly install RPM packages, use Alien instead!
rpm: However assuming you know what you are doing...
warning: clustershell-1.6-1.el6.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
error: Failed dependencies:
/usr/bin/python is needed by clustershell-1.6-1.el6.noarch
python(abi) = 2.6 is needed by clustershell-1.6-1.el6.noarch
When I use the flag --nodeps, the installation succeeds and the folder /etc/clustershell/ is created, but I still have a problem:
$ clush -a date
Traceback (most recent call last):
File "/usr/bin/clush", line 7, in <module>
from ClusterShell.CLI.Clush import main
Any idea what went wrong and how to fix it?
Hi,
Any chance you are available for a chat about a comparison I'm doing between running runTeraGenSort.sh in a bare-metal environment and in a virtualised environment?
thanks
Magnus Andersson
Outside the US, the install process needs to check that the locale is set to en_US.UTF-8; otherwise there are all kinds of post-install issues, such as not being able to log in to MCS, or non-English log messages that may cause problems for tech support.
It should be easy to add. I may send a PR later.
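A minimal sketch of such a check, assuming it is enough to inspect the LANG value. The function name `check_locale` is hypothetical, and accepting the `en_US.utf8` spelling as equivalent is an assumption of this example.

```shell
#!/bin/bash
# Hypothetical helper: report whether a locale value is en_US.UTF-8.
# $1: locale string to test; defaults to the current LANG
check_locale() {
  local current=${1:-${LANG:-unset}}
  case "$current" in
    en_US.UTF-8|en_US.utf8)   # second spelling accepted as an assumption
      echo "OK: locale is $current" ;;
    *)
      echo "WARN: locale is $current, expected en_US.UTF-8"
      return 1 ;;
  esac
}
```

Run without arguments, it checks the current shell's LANG; an audit script could run the same function on each node.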
timeout 9 mapred job -list
timeout 9 hadoop job -list
Please adjust network test for 80 Gbit bonded interfaces.
It seems that the tests are optimized for 10 GBit interfaces. I was not able to verify 40 GBit or 80 GBit interfaces, because there seems to be a boundary at 10 GBit.
lspci, dmidecode, and ethtool fail on a RHEL 6 install because they are not in the PATH.
I think runRWSpeedTest.sh has a minor bug. Line 25 defines dbug as false, line 31 sets dbg to true, and line 87 uses dbg.
nodeset -l
@ALL
@zk
@cldb
@rm
@hist
root@psnode40 zsh#0 cat /etc/clustershell/groups
all: psnode[40-44]
zk: psnode[40-42]
cldb: psnode[40-42]
rm: psnode43
hist: psnode44
hbm: psnode44
hbr: @ALL
root@psnode40 zsh#0 cat /etc/clustershell/groups.d/local.cfg
all: example[4-6,32-159]
Use similar approach as in network-test.sh near top of script:
ssh
Add timeout to curl in install_patch() in mapr-install.sh
Implement test for duplicate hostnames
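One possible shape for that test, assuming the hostnames have already been collected one per line (for example from the output of running `hostname` on every node). The function name `find_duplicate_hostnames` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print every hostname that appears more than once.
# Reads one hostname per line on stdin. Names are lowercased first,
# since hostnames are case-insensitive.
find_duplicate_hostnames() {
  tr '[:upper:]' '[:lower:]' | sort | uniq -d
}
```

Empty output means no duplicates were found; any printed name is reported by two or more nodes.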
/opt/mapr/bin/guts cpu:none rpc:none cache:none db:none cleaner:none time:all dsec:6
Ops, MB/sec
tmpfile=$(mktemp); trap "rm
iplist+=( $(ssh $host hostname -i) ) #TBD: check for more than 1 IP address
[ -n "$DBG" ] && read -p "$DBG: Press enter to continue or ctrl-c to abort"
The output produced provides information on the status of certain things, without any indication of whether it’s a problem. For example, on my cluster audit, I got this message, which is good (as it should be):
SElinux status: SELINUX=disabled
Disabled
And this message, which is bad (not as it should be):
Required RPMs:
package ntp is not installed
And this message, which is neutral:
mapr account for MapR Hadoop
mapr user NOT found!
(and the use of NOT in all caps could be taken to indicate this is a problem).
It would be nice if a setting that was problematic had a WARN: (or something) in front of it, so people could look for that keyword and make sure they’re not missing problems (or interpreting something as problematic when it’s not).
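The suggestion could be sketched as a small reporting function that every check calls with its exit status. The name `report` and the `OK:`/`WARN:` prefixes are assumptions for illustration, not something the audit scripts currently provide.

```shell
#!/bin/bash
# Hypothetical helper: prefix an audit line with OK: or WARN:
# based on a check's exit status.
# $1: exit status of the check (0 = as it should be)
# $2: message to print
report() {
  local status=$1 msg=$2
  if [ "$status" -eq 0 ]; then
    echo "OK: $msg"
  else
    echo "WARN: $msg"
  fi
}
```

For example, `rpm -q ntp >/dev/null 2>&1; report $? "package ntp installed"` would print a line starting with `WARN:` when ntp is missing, and a reader could then search the audit log for that keyword.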
clush -ab -o -qtt sudo su - mapr -c "bash -c "echo 'localhost:/mapr /mapr hard,intr,nolock,noatime' > /opt/mapr/conf/mapr_fstab""
The last update of disk-test.sh broke the script:
[root@ip-10-0-0-33 cluster-validation]# /root/cluster-validation/pre-install/disk-test.sh
/root/cluster-validation/pre-install/disk-test.sh: line 137: syntax error: unexpected end of file
awk 'FNR==1 {print FILENAME}; /^[ ]*"version":/; /^[ ]*"cluster":/,/},/' mapr-audit-*.log
export JAVA_HOME=...
I am trying to set up a 5-node cluster. However, I am having issues installing. The installer failed on the last nodes with (Unable to execute command: timeout -s HUP 2m hadoop fs -put -f /opt/mapr/hive/hive-2.1/lib/hive-orc-2.1.1-mapr-1710.jar /installer/hive-2.1/. ).
When I try to run the script "network-test.sh" I get the following error:
DNS lookup (host not found: 2(SERVFAIL))
Reverse DNS lookup (Host 115.10.27.172.in-addr.arpa. not found: 3(NXDOMAIN)
The installer works for a single-node cluster. Please let me know how to solve this network-related issue.
Thank you