mapr-emea / mapr-clustercheck
This is supposed to be used by PS
Implement MapR-Streams Benchmark
Documentation: How to set up the project / guidelines
Proof of concept: Implement CLI application generating human-readable text output
General: Handle authentication with Kerberos for benchmarks
benchmark-maprfs-dfsio should not calculate the number of files at runtime based on the number of disks.
When generate template is executed, the correct value should be set.
File should include:
Same stuff as done here: https://github.com/jbenninghoff/cluster-validation/blob/master/pre-install/cluster-audit.sh
Also add comments and advice
Implement Drill Benchmark on MapR-DB binary
Component Benchmark: YARN - Spark TeraSort
General: Keep a versioned history of all outputs
Goal: compare values with older results
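A minimal sketch of such a comparison, assuming each stored run can be reduced to a flat mapping from metric name to numeric value. The function name `compare_results` and the metric names are hypothetical and not taken from the project.

```python
from typing import Dict


def compare_results(old: Dict[str, float], new: Dict[str, float]) -> Dict[str, float]:
    """Return the percentage change per metric present in both runs.

    Assumption: results are flat {metric_name: value} mappings.
    Metrics whose old value is 0 are skipped to avoid dividing by zero.
    """
    changes = {}
    for metric in old.keys() & new.keys():
        if old[metric] != 0:
            changes[metric] = (new[metric] - old[metric]) * 100.0 / old[metric]
    return changes


if __name__ == "__main__":
    previous = {"throughput_mb_s": 100.0, "latency_ms": 20.0}  # hypothetical older run
    current = {"throughput_mb_s": 110.0, "latency_ms": 18.0}   # hypothetical newer run
    for name, change in sorted(compare_results(previous, current).items()):
        print(f"{name}: {change:+.1f}%")
```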
Component Benchmark: YARN - Spark PI
General: Handle authentication with Native Security for benchmarks
Implement Hive Benchmark
Specification: Define an interface for triggering modules in a generic way
Proof of concept: Implement a Java module to submit a Java application to cluster
Cluster health: Memory benchmark / check
https://github.com/jbenninghoff/cluster-validation/blob/master/pre-install/memory-test.sh
Proof of concept: Implement CLI application calling a module
Component Benchmark: YARN - TeraSort MapReduce
Proof of concept: Implement a Python/Bash module
Implement MapR-DB binary Benchmark
Core: Add a way to specify hosts via generic patterns
something like: host[1-123] or host[1,2,3,4,5]
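One possible way to expand such patterns is sketched below. It assumes a single bracket group per host name, containing comma-separated numbers and inclusive ranges; the function name `expand_hosts` is hypothetical, and the issue itself does not fix the exact syntax.

```python
import re
from typing import List

# Assumption: at most one [...] group per pattern, holding numbers and ranges.
_PATTERN = re.compile(r"^(?P<prefix>[^\[\]]*)\[(?P<body>[^\[\]]+)\](?P<suffix>[^\[\]]*)$")


def expand_hosts(pattern: str) -> List[str]:
    """Expand 'host[1-3]' or 'host[1,2,5]' into a list of host names."""
    match = _PATTERN.match(pattern)
    if match is None:
        return [pattern]  # no bracket group: treat as a plain host name
    hosts = []
    for part in match.group("body").split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            numbers = range(start, end + 1)  # ranges are inclusive
        else:
            numbers = [int(part)]
        hosts.extend(f"{match.group('prefix')}{n}{match.group('suffix')}" for n in numbers)
    return hosts


if __name__ == "__main__":
    print(expand_hosts("host[1-3]"))
    print(expand_hosts("host[1,2,5]"))
```

Mixed forms such as `host[1-3,7]` also expand under this sketch, since each comma-separated part is handled on its own.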
Cluster health: Check Memory Swap / Overcommit
Config Check: Verify Storage pools on all nodes and compare sizes of SPs and disks.
Implement MapR-DB JSON Benchmark
Config Check: YARN
Collect configs and compare, check important properties
Component Benchmark: YARN - Spark consuming from MapR-Streams
Specification: Define result exchange format between standalone CLI app and modules
Fields in output should be at least:
Make Spark TeraSort more fault tolerant: if it fails, the scripts should not abort
Proof of concept: Implement a module using Ansible
Check if MEP version is supported on MapR core version
Specification: Define configuration format
I would like to see something like YAML.
Config Check: MapR-Core MapR-SASL
Make MR TeraSort more fault tolerant: if it fails, the scripts should not abort
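The behavior asked for in the TeraSort fault-tolerance issues could look roughly like the sketch below: each benchmark runs in isolation, a failure is recorded, and the remaining benchmarks still run. The names `run_all` and the status fields are hypothetical, not the project's actual interface.

```python
from typing import Callable, Dict, List, Tuple


def run_all(benchmarks: List[Tuple[str, Callable[[], object]]]) -> Dict[str, dict]:
    """Run every benchmark; record failures instead of aborting the whole run."""
    results = {}
    for name, run in benchmarks:
        try:
            results[name] = {"status": "ok", "result": run()}
        except Exception as exc:  # deliberately broad: one failure must not stop the rest
            results[name] = {"status": "failed", "error": str(exc)}
    return results


if __name__ == "__main__":
    def failing_terasort():
        # Stand-in for a benchmark that breaks; purely illustrative.
        raise RuntimeError("job failed")

    report = run_all([("terasort", failing_terasort), ("pi", lambda: 3.14)])
    for name, entry in report.items():
        print(name, entry["status"])
```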
Config Check: MapR-Core unsecure
This should work with NFS, FUSE and Samba
Make roles overridable for all modules for single tests
Implement MapR-FS Benchmark
something similar:
https://github.com/jbenninghoff/cluster-validation/blob/master/post-install/runRWSpeedTest.sh
and
https://github.com/jbenninghoff/cluster-validation/blob/master/post-install/runDFSIO.sh
Implement HttpFS Benchmark
Implement command validateSsh, which validates node settings
Also check if /tmp/ is writable.
Should be executed before validate.
Config Check: MapR-Core Kerberos
General: Store all results in machine-readable JSON format
Cluster health: Check for dropped packets on NICs
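On Linux, one way to approach this check is to read the per-interface drop counters from `/proc/net/dev`. The sketch below parses that file's text and returns receive and transmit drop counts; the function name `parse_dropped_packets` is hypothetical, and how the project would collect the file from each node is left open.

```python
from typing import Dict, Tuple


def parse_dropped_packets(proc_net_dev: str) -> Dict[str, Tuple[int, int]]:
    """Map interface name -> (rx_dropped, tx_dropped) from /proc/net/dev text."""
    drops = {}
    for line in proc_net_dev.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        if len(fields) < 12:
            continue
        # Receive columns come first (drop is index 3); transmit drop is index 11.
        drops[name.strip()] = (int(fields[3]), int(fields[11]))
    return drops


if __name__ == "__main__":
    with open("/proc/net/dev") as handle:  # Linux only
        for nic, (rx, tx) in parse_dropped_packets(handle.read()).items():
            if rx or tx:
                print(f"{nic}: rx_dropped={rx} tx_dropped={tx}")
```

The counters are cumulative since boot, so a check that samples them twice and compares the difference would be more telling than a single reading.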
Module: benchmark-network-iperf - Implement MultiNIC support and tests
Implement Drill Benchmark on CSV files
Cluster Audit: Check OS / Server settings
Use this one as a template:
https://github.com/jbenninghoff/cluster-validation/blob/master/pre-install/cluster-audit.sh
Proof of concept: Implement CLI application generating JSON output
Implement Drill Benchmark on MapR-DB JSON