
boavizta / ai-power-measures-sharing


This project forked from orange-opensource/ai-power-measures-sharing


This project defines a JSON ontology standard describing a power consumption measure in a given software/hardware context, notably in machine learning tasks. It also provides tooling for conversion to tabular datasets.

License: Other

Python 100.00%

ai-power-measures-sharing's Introduction

This energy report format is published under Creative Commons 4.0. https://creativecommons.org/licenses/by/4.0/

CO2 Reporting

1. Goal

The goal of this document is to set up a simple and resilient digital ecosystem for gathering homogeneous, well-formatted measures of the energy consumed by an atomic software task in general, and by Machine Learning / Deep Learning / AI / GenAI tasks in particular.

The underlying purpose is to build a large, open database of the energy consumption of IT / AI tasks as a function of data nature, algorithms, hardware, etc., so that energy efficiency approaches can be improved on the basis of empirical knowledge.

More concretely, this empirical knowledge may be used in applied research to make grid search over AI models more frugal and to avoid energy-intensive tasks.

2. Energy Measurement

It is assumed that an atomic task can be measured by one or several of the following means.

Software-based

  • CodeCarbon
  • Carbon AI
  • PyJoules
  • PowerGadget
  • ...

Hardware-based

Direct physical measurement with a watt-meter.

3. Knowledge Elements

The elements recorded as context in each knowledge item must be both meaningful and minimal: meaningful, so that valuable patterns can be learned from the gathered data; minimal, so that the monitoring task stays as light as possible.

  • Data type (mandatory): basically, the nature of the data (text/csv, audio, image, etc.)
  • Data dimensions (mandatory): the shape of the dataset, the first figure being the number of items
  • Task type (mandatory): notably in machine learning, the nature of the process being carried out (clustering, classification, reinforcement, etc.)
  • Measurement method (mandatory): software-based or hardware-based
  • Measurement solution (mandatory): for software-based measurement, the library used; for hardware-based measurement, the watt-meter manufacturer
  • Measurement unit (mandatory)
  • The measure itself (mandatory)
  • Algorithm(s) (conditional): if the task is a learning task, what kind of algorithm is used
  • Hyperparameters (conditional and optional): if the task is a learning task, what hyperparameters were used, with which values
  • Hardware environment (recommended): what kind of host (container, VM, dedicated server) and which electronic components (GPU, CPU, RAM) are used for the measure
  • System environment (recommended): which OS, kernel version, etc.
  • Energy source (recommended): depending on the location or on private power plants, makes it possible to extrapolate the carbon emissions induced by the energy consumption
  • Publisher (recommended): information about the identity of the publisher, with various levels of anonymization
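
As an illustration of the list above, the following minimal Python sketch shows what a knowledge item could look like and how the mandatory elements could be checked. The field names (`data_type`, `measure`, etc.), the helper `missing_mandatory_fields` and the example values are assumptions made for this sketch; this section does not define the actual JSON labels.

```python
# Hypothetical field names: this section lists the elements, not their JSON labels.
MANDATORY_FIELDS = (
    "data_type",
    "data_dimensions",
    "task_type",
    "measurement_method",
    "measurement_solution",
    "measurement_unit",
    "measure",
)


def missing_mandatory_fields(record: dict) -> list[str]:
    """Return the mandatory field names that are absent or empty in `record`."""
    return [name for name in MANDATORY_FIELDS if record.get(name) in (None, "", [])]


# Illustrative values only, not a real measurement.
example_record = {
    "data_type": "image",
    "data_dimensions": [60000, 28, 28],  # first figure: number of items
    "task_type": "classification",
    "measurement_method": "software-based",
    "measurement_solution": "CodeCarbon",
    "measurement_unit": "kWh",
    "measure": 0.042,
    "algorithms": ["Random Forest"],  # conditional: the task is a learning task
}

print(missing_mandatory_fields(example_record))
print(missing_mandatory_fields({"data_type": "text/csv"}))
```

Recommended and conditional elements are deliberately left out of the check, so that a record stays valid with only the mandatory context.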

4. State-of-the-art

On ML tasks categorization

On ML description frameworks

These could typically serve as groundwork for a reporting format.

5. Format principles

JSON is proposed as the structure for the sake of clarity for human users and of field extensibility. Flattening an object that contains arrays leads to a naming issue: array items have no label, and an item's array index is not guaranteed to be the same between instances. A specific flattening scheme is therefore proposed for arrays, based on the reserved property label "$$key". If an array is present, every item in it must contain a "$$key" property; otherwise, a flattening exception is raised.
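
The flattening rule can be sketched in Python as follows. This is not the project's own converter: the function `flatten`, the exception `FlatteningError`, the "." path separator and the sample report are all assumptions made for illustration. Only the reserved label "$$key" and the exception on a missing key come from the text above.

```python
from typing import Any

KEY_LABEL = "$$key"  # reserved property label named in the format principles
SEPARATOR = "."      # assumption: the path separator is not specified here


class FlatteningError(ValueError):
    """Raised when an array item cannot be labelled by "$$key"."""


def flatten(node: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested JSON-like data into a single-level dict of path -> value."""
    flat: dict[str, Any] = {}
    if isinstance(node, dict):
        for name, value in node.items():
            if name == KEY_LABEL:
                continue  # already used as the path segment of this item
            path = f"{prefix}{SEPARATOR}{name}" if prefix else name
            flat.update(flatten(value, path))
    elif isinstance(node, list):
        for item in node:
            # Every array item must be an object carrying the reserved label.
            if not isinstance(item, dict) or KEY_LABEL not in item:
                raise FlatteningError(f'array item under "{prefix}" has no "{KEY_LABEL}"')
            label = str(item[KEY_LABEL])
            path = f"{prefix}{SEPARATOR}{label}" if prefix else label
            flat.update(flatten(item, path))
    else:
        flat[prefix] = node
    return flat


# Hypothetical report fragment used only to exercise the sketch.
report = {
    "task": {"type": "classification"},
    "hardware": [
        {"$$key": "gpu0", "model": "A", "count": 2},
        {"$$key": "cpu0", "model": "B"},
    ],
}
print(flatten(report))
```

Because each array item is addressed by its "$$key" value rather than by its position, two reports that list the same items in a different order flatten to the same column names.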

6. Architecture Scenarios

The differences among possible architecture scenarios are driven by several considerations.

  • Simplicity: The architecture must be compatible with the way measures are produced. For instance, in the case of software libraries such as CodeCarbon, some contextual information can be added to the output produced. Should such a field be used for an arbitrarily long payload? Probably not.
  • Integrity: To obtain large public datasets of in-the-field energy monitoring figures, the individual data of corporations / projects / teams must describe things in a unique, unambiguous terminology. For instance, the ‘Random Forest’ algorithm should not be described as ‘randomforest’ in one case, ‘Random_Forest’ in another, and ‘rf’ in a third. Homogeneity of descriptors must be enforced.
  • Privacy: Publishers of energy monitoring data must be able to keep the desired level of privacy for the data they publish.
  • Trust: The data published, especially on a public database, must carry an appropriate level of trust.
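
One possible way to enforce the homogeneity of descriptors mentioned under Integrity is to map free-form labels onto a controlled vocabulary and reject anything unknown. The sketch below is only an illustration: the vocabulary `CANONICAL_ALGORITHMS` and the function `canonical_algorithm` are hypothetical and are not defined by this document.

```python
import re

# Hypothetical controlled vocabulary: normalized spelling -> canonical label.
CANONICAL_ALGORITHMS = {
    "randomforest": "Random Forest",
    "rf": "Random Forest",
}


def canonical_algorithm(label: str) -> str:
    """Map a free-form algorithm label to its canonical name, or raise ValueError."""
    # Ignore case, spaces, underscores and other punctuation when comparing.
    key = re.sub(r"[^a-z0-9]", "", label.lower())
    try:
        return CANONICAL_ALGORITHMS[key]
    except KeyError:
        raise ValueError(f"unknown algorithm label: {label!r}") from None


for raw in ("randomforest", "Random_Forest", "rf"):
    print(raw, "->", canonical_algorithm(raw))
```

Rejecting unknown labels, rather than storing them as-is, keeps ambiguous descriptors out of the shared dataset at the cost of maintaining the vocabulary.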

ai-power-measures-sharing's People

Contributors

romotchka
