GithubHelp home page GithubHelp logo

gabrieleiannetti / lustre-scripts Goto Github PK

View Code? Open in Web Editor NEW

This project forked from gsi-hpc/lustre-scripts

0.0 0.0 0.0 176 KB

Scripts for the Lustre File System and Robinhood Policy Engine

Shell 1.04% Python 98.96%

lustre-scripts's Introduction

lustre-scripts

Useful scripts based on the Lustre file system (http://lustre.org/) and the Robinhood Policy Engine (https://github.com/cea-hpc/robinhood).

Lustre Job Analyser

Description:

Produce an overview of the Slurm jobs that have a higher write or read sample count based on the Lustre job statistics regarding specfied file server (OSS).

Requisites:

  • Enabled Lustre job stats.
  • ClusterShell - For executing remote commands in parallel from a workstation on the cluster.
  • show_high_jobstats.pl - Perl script to collect and filter queried jobstats from Lustre OSS (https://www.scc.kit.edu/scc/sw/lustre_tools/show_high_jobstats.tgz).
  • slurm-client - Squeue command to query Slurm information about jobs.

Script Parameter:

  • -D/--enable-debug: Enables debug log messages.
  • -u/--user: User for executing remote commands e.g. querying Lustre job statitics or squeue information from Slurm.
  • -s/--oss-nodes: Specification of OSS nodes by using ClusterShell NodeSet syntax.
  • -n/--client-node: Specification of Client Node.
  • -m/--min-samples: Minimum number of read or write Lustre jobstats sample count.

Script Execution:

lustre_job_analyser.py -s file_server[000-100] -n client_node0200 -u admin_user -m 10000

Output Schema:

SLURM_JOB_ID|USER|GROUP|EXECUTABLE|CLIENT_NODE_LIST|FILE_SERVER_LIST ... SLURM_JOB_ID|USER|GROUP|EXECUTABLE|CLIENT_NODE_LIST|FILE_SERVER_LIST

Lustre Collect Changelog Indexes

Description:

This script collects changelog indexes from a Lustre MDT. It can be executed continuously or just in a capture mode.

for displaying the delta between the Lustre changelog index and the specified changelog reader index.

Script Parameter:

  • -i/--interval: Specifies the collect interval in seconds.
  • -d/--delimiter: Defines the delimiter for unloaded indexes.
  • -m/--mdt-name: Specifies the MDT name where to read the current index from.
  • -r/--changelog-reader: Specifies an additional changelog reader to read the index from.
  • -f/--unload-file: Specifies the unload file where the collected information is saved (changelog_indexes.unl).
  • --direct-flush: If enabled after each collection interval a disk write flush is done of the collected values.
  • --capture-delta: Prints the delta between the changelog consumer and the MDT index after one interval on the stdout and quits.

Script Execution:

Continuous Collect Mode:

lustre_collect_changelog_indexes.py -m fs-MDT0000 -r cl1

Capture Delta Mode:

lustre_collect_changelog_indexes.py -m fs-MDT0000 -r cl1 --capture-delta

Output Schema:

In Continuous Collect Mode:

  • TIMESTAMP;
  • MDT CHANGELOG INDEX PRODUCER COUNT;
  • CHANGELOG READER INDEX CONSUMER COUNT;
  • DELTA BETWEEN CHANGELOG READER INDEX AND MDT CHANGELOG INDEX

In Capture Delta Mode: DELTA BETWEEN CHANGELOG READER INDEX AND MDT CHANGELOG INDEX

Lustre Large File Notifier

Description:

This program queries the Robinhood Entries database table for large files and keeps track of that files in separate notification table. Based on the notification table user can be notified of large files saved on the Lustre file system and/or an overview of large files can be send to administrators of the file system.

Requisites:

python-mysqldb - Python interface to MySQL LDAP service running for determining users email adress on user notification.

Script Parameter:

  • -f/--config-file: Path of the config file.
  • -D/-enable-debug: Enables logging of debug messages.
  • --create-table: If set the notifiers table is created.
  • --no-mail: Disables mail send.

Structure of the Configuration File:

[mysqld]
host         =
database     =
user         =
password     =

[check]
file_size            = 500GB
file_system          =
check_interval_days  = 7

[notify]
table    = NOTIFIES
database =

[mail]
server             =
sender             =
overview_recipient =
subject            = Large File Report
send_user_mail     = off

[ldap]
server =
dc     =

Script Execution:

Executing the rbh-large-file-notifier with debug messages saved into a proper log file:

./rbh-large-file-notifier.py -f rbh-large-file-notifier.conf -D >> rbh-large-file-notifier.log 2>&1

Schema of the Notifier Table:

+---------------+----------------------+------+-----+---------+-------+
| Field         | Type                 | Null | Key | Default | Extra |
+---------------+----------------------+------+-----+---------+-------+
| fid           | varbinary(64)        | NO   | PRI | NULL    |       |
| uid           | varbinary(127)       | NO   |     | NULL    |       |
| size          | bigint(20)           | NO   |     | NULL    |       |
| path          | varchar(1000)        | NO   |     | NULL    |       |
| last_check    | datetime             | NO   |     | NULL    |       |
| last_notify   | datetime             | YES  |     | NULL    |       |
| ignore_notify | enum('FALSE','TRUE') | YES  |     | FALSE   |       |
+---------------+----------------------+------+-----+---------+-------+
7 rows in set (0.00 sec)

Example Mail Text for Administrators:

Dear All,

this is the automated report of stored large files on '/lustre/fs' that are equal or larger than 500GB.

The following information is provided in CSV format: uid;size;path;last_notify

...

Example Mail Text for Users:

Dear user *uid*,

this is an automated e-mail that contains a list of your stored files on '/lustre/fs' that are equal or larger than 500GB.

Please check if you really need those files stored on the file system.

...

Robinhood Unload Processing Tool

Script: rbh-ost-file-map-creator.py

Description:

Creates mapping files between start OST index and filepath based on Robinhood unloads generated with rbh-report.

This tool is used to create input files for file migration on Lustre with the Cyclone Framework.

Script Parameter:

options:
  -h, --help            show this help message and exit
  -e FILENAME_EXT, --filename-ext FILENAME_EXT
                        Default: .unl
  -f FILENAME_PATTERN, --filename-pattern FILENAME_PATTERN
                        For instance: file_class_ost{INDEX}, where {INDEX} is a placeholder for the OST index.
  -i OST_INDEXES, --ost-indexes OST_INDEXES
                        Defines a RangeSet for the OST indexes e.g. 0-30,75,87-103
  -s SPLIT_INDEX, --split-index SPLIT_INDEX
                        Default: 1
  -w WORK_DIR, --work-dir WORK_DIR
                        Default: '.'
  -x EXACT_FILENAME, --exact-filename EXACT_FILENAME
                        Explicit filename to process.
  -l LOG_FILE, --log-file LOG_FILE
                        Specifies logging file.
  -D, --enable-debug    Enables logging of debug messages.

Usage of rbh-report:

The following creates the unload files with rbh-report:

FILE_CLASS=XXX

for i in {280..310}
do
    rbh-report --dump-ost ${i} --filter-class=${FILE_CLASS} --csv > ${FILE_CLASS}_ost${i}.unl &
done

The unload files must be created in CSV format and with header.

lustre-scripts's People

Contributors

gabrieleiannetti avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.