
nayohan / gpu-monitoring-slack


This project forked from mharrend/gpu-monitoring-slack-mattermost


Monitors a GPU system and sends Slack messages via webhooks

License: MIT License

Shell 25.28% Python 74.72%


GPU-Monitoring via Slack or Mattermost

Screenshot showing output in Slack

In a nutshell

Monitors a GPU system and sends notifications to Slack or Mattermost via incoming webhooks.

Requirements

  • An NVIDIA® GPU, since the script uses nvidia-smi to monitor GPU jobs.
  • Linux.
  • Python 2 (porting to Python 3 should be straightforward).
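To illustrate how GPU jobs can be read from nvidia-smi, here is a minimal sketch. It is not the code from gpumonitor.py: the function names, the choice of query fields, and the use of Python 3 are assumptions made for illustration. The --query-compute-apps and --format options are standard nvidia-smi flags.

```python
import subprocess

# Standard nvidia-smi query flags; the selected fields are an assumption,
# not necessarily what gpumonitor.py requests.
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, name, memory' CSV lines into a dict keyed by PID."""
    jobs = {}
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Skip blank or malformed lines instead of failing.
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        pid, name, memory = parts
        # Memory is kept as a string because nvidia-smi may report "[N/A]".
        jobs[int(pid)] = {"name": name, "used_memory_mib": memory}
    return jobs


def query_gpu_jobs():
    """Run nvidia-smi (hypothetical helper) and return the parsed jobs."""
    output = subprocess.check_output(NVIDIA_SMI_CMD, universal_newlines=True)
    return parse_compute_apps(output)
```

Keeping the parser separate from the subprocess call makes it possible to check the parsing logic on a machine without a GPU.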

Usage

  1. Create an incoming webhook in Slack or Mattermost and save the web address of the webhook.
  2. Open the file gpumonitor.py and assign the web address of the webhook to the mattermostIncomingWebhook variable.
  3. Start the gpumonitor.py script.

Optionally, you can also install gpumonitor as an init.d service:

  1. Copy the gpumonitor.py script to /usr/local/bin/gpumonitor: cp gpumonitor.py /usr/local/bin/gpumonitor
  2. Make sure that the file is executable: chmod +x /usr/local/bin/gpumonitor
  3. Copy the file gpumonitor-init.d to /etc/init.d: cp gpumonitor-init.d /etc/init.d/gpumonitor
  4. Start the init service via /etc/init.d/gpumonitor start
  5. As usual you can monitor the service status via /etc/init.d/gpumonitor status

Options for configuration

  • mattermostIncomingWebhook: Web address of incoming Slack or Mattermost webhook
  • nvidiaLogoLink: Link to the logo used for the bot in Mattermost
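As a rough sketch of how these two options might be used, the following Python 3 snippet builds a JSON payload and posts it to an incoming webhook. Only the variable names mattermostIncomingWebhook and nvidiaLogoLink come from this README; the URLs are placeholders, and build_payload, send_message, and the "GPU monitor" username are hypothetical. The text, username, and icon_url fields follow the common incoming-webhook payload format, though whether the icon and username overrides are honored depends on the server configuration.

```python
import json
import urllib.request

# Placeholder values; replace them with your own webhook and logo addresses.
mattermostIncomingWebhook = "https://example.invalid/hooks/REPLACE_ME"
nvidiaLogoLink = "https://example.invalid/nvidia-logo.png"


def build_payload(text, icon_url=nvidiaLogoLink, username="GPU monitor"):
    """Build the JSON-serializable body of a webhook message."""
    payload = {"text": text, "username": username}
    if icon_url:
        payload["icon_url"] = icon_url
    return payload


def send_message(text, webhook_url=mattermostIncomingWebhook):
    """POST the message to the webhook and return the HTTP status code."""
    data = json.dumps(build_payload(text)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling send_message("GPU job started") would then deliver one message to the configured channel, provided the webhook address is valid.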

Screenshot

The screenshot, taken from a Mattermost group, contains:

  • A message sent after a new job was started.
  • A status message, sent because a new job was created.
  • A message sent after the job has finished.

Screenshot showing output in Mattermost group
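One way to produce "job started" and "job finished" messages like those in the screenshot is to compare the set of GPU process IDs between two polls. The sketch below shows that idea in Python 3; it is an assumption about the approach rather than the logic in gpumonitor.py, and diff_jobs, describe_changes, and the message wording are hypothetical.

```python
def diff_jobs(previous, current):
    """Return (started, finished) PID lists between two snapshots.

    Both arguments are dicts mapping PID -> process name.
    """
    started = sorted(set(current) - set(previous))
    finished = sorted(set(previous) - set(current))
    return started, finished


def describe_changes(previous, current):
    """Turn the differences between two snapshots into message strings."""
    started, finished = diff_jobs(previous, current)
    messages = [
        "New job started: PID %d (%s)" % (pid, current[pid]) for pid in started
    ]
    messages += [
        "Job finished: PID %d (%s)" % (pid, previous[pid]) for pid in finished
    ]
    return messages
```

A monitoring loop would keep the previous snapshot, take a new one at each interval, and send every string returned by describe_changes to the webhook.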

Legal notice

  • This is a private project.
  • I am not in any way affiliated with Mattermost, NVIDIA or Slack.
  • NVIDIA, the NVIDIA logo, and all other trademarks mentioned in this document are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.
