ansible-role-slurm's Introduction

SLURM cluster Role

Installs and configures a SLURM cluster (front-end and working nodes).

Role Variables

The role accepts the following variables, shown here with their default values and a brief description of each.

# SLURM version to install (in case of RH systems)
slurm_version: 20.02.7
# List of servers to download the slurm code
slurm_mirrors: [ "http://ftpgrycap.i3m.upv.es/src/", "https://download.schedmd.com/slurm/" ]
# Type of node to install: front or wn
slurm_type_of_node: front
# Name of the SLURM server
slurm_server_name: slurmserver
# IP address of the SLURM server
slurm_server_ip: 127.0.0.1
# Prefix to set to the SLURM working nodes
slurm_vnode_prefix: vnode-
# List of the names of the WNs
slurm_wn_nodenames: []
# Number of CPUs of the WNs
slurm_wn_cpus: 1
# Amount of memory of the WNs (in MB, see RealMemory). If 0 it is not set
slurm_wn_mem: 0
# GRES specification for the WN
slurm_wn_gres: ""
# GRES types specification for the WN
slurm_wn_gres_tpes: ""
# GRES conf data file
slurm_wn_gres_conf: "AutoDetect=nvml"
# Default user for ssh and slurm management
# Default ssh user
user: user1
# Install DRMAA library
drmaa_lib_install: false
drmaa_lib_version: 1.0.7
# SLURM default configuration options
slurm_default_conf_options:
  AuthType: auth/munge
  CryptoType: crypto/munge
  FirstJobId: 1
  JobRequeue: 0
  JobSubmitPlugins: all_partitions
  ProctrackType: proctrack/pgid
  ReturnToService: 2
  SlurmctldPidFile: /var/run/slurmctld.pid
  SlurmctldPort: 6817
  SlurmdPidFile: /var/run/slurmctld.pid
  SlurmdPort: 6818
  SlurmdSpoolDir: /var/spool/slurm
  SlurmUser: slurm
  StateSaveLocation: /var/slurm/checkpoint
  SwitchType: switch/none
  TaskPlugin: task/none
  InactiveLimit: 0
  KillWait: 30
  MessageTimeout: 30
  MinJobAge: 300
  SlurmctldTimeout: 30
  SlurmdTimeout: 40
  Waittime: 0
  FastSchedule: 1
  SchedulerType: sched/backfill
  SelectType: select/linear
  AccountingStorageType: accounting_storage/none
  ClusterName: cluster
  JobCompType: jobcomp/none
  JobAcctGatherFrequency: 30
  JobAcctGatherType: jobacct_gather/none
  SlurmctldDebug: debug5
  SlurmctldLogFile: /var/log/slurm/slurmctld.log
  SlurmdDebug: debug5
  SlurmdLogFile: /var/log/slurm/slurmd.log
# SLURM user configuration options
slurm_conf_options: {}
# SLURM configuration options for cgroup
slurm_cgroup_conf_options:
  CgroupPlugin: cgroup/v1
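
As an illustration, some of these defaults could be overridden from a variables file. This is only a minimal sketch and is not taken from the role's documentation: the file name `group_vars/all.yml` and the chosen values are hypothetical, and it assumes that entries in `slurm_conf_options` take precedence over those in `slurm_default_conf_options`.

```yaml
# group_vars/all.yml (hypothetical file name and values)
slurm_server_name: slurmserver
slurm_wn_cpus: 4
# Memory per working node in MB (see RealMemory); 0 would leave it unset
slurm_wn_mem: 8192
# Assumption: these options are applied on top of slurm_default_conf_options
slurm_conf_options:
  SelectType: select/cons_res
  SelectTypeParameters: CR_Core
  SlurmctldDebug: info
```

Keeping site-specific settings in `slurm_conf_options` leaves the role's default dictionary untouched, so only the keys that differ need to be listed.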

Example Playbook

This is an example of how to install a SLURM cluster:

  - hosts: server
    roles:
    - { role: 'grycap.slurm', slurm_type_of_node: 'front', slurm_server_ip: '{{ ansible_default_ipv4.address }}', slurm_wn_nodenames: "{{ groups['wns']|map('extract', hostvars, 'ansible_hostname')|list }}" }
  - hosts: wns
    roles:
    - { role: 'grycap.slurm', slurm_type_of_node: 'wn', slurm_server_ip: "{{ hostvars['server']['ansible_default_ipv4']['address'] }}" }
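
The playbook above refers to a host named `server` and a group named `wns`. An inventory matching those names might look like the following sketch. The file name, the node names, and the addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical and not part of the role.

```yaml
# inventory.yml (hypothetical): one front-end node and two working nodes
all:
  hosts:
    server:                      # matched by "hosts: server" and hostvars['server']
      ansible_host: 192.0.2.10
  children:
    wns:                         # matched by "hosts: wns" and groups['wns']
      hosts:
        vnode-1:
          ansible_host: 192.0.2.11
        vnode-2:
          ansible_host: 192.0.2.12
```

With such a file, the playbook could be run with `ansible-playbook -i inventory.yml` followed by the playbook file name.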

Contributing to the role

To keep the code clean, direct pushes to the master branch are disabled. To contribute, create a branch, push your changes to it, and then open a pull request.
Thanks!

ansible-role-slurm's People

Contributors

amcaar, micafer

ansible-role-slurm's Issues

No package ntp available.

I noted that on CentOS 8 the following happens, probably because ntp has been replaced by chronyd:

TASK [grycap.ntp : Install the required packages in Redhat derivatives] **************************************************************************************************************************************************************************************
fatal: [hostname.removed]: FAILED! => {"changed": false, "failures": ["No package ntp available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}

I masked the hostname
