CATS: the Climate-Aware Task Scheduler :cat2: :tiger2: :leopard:
Home Page: https://greenscheduler.github.io/cats/
License: MIT License
Currently we have three failing tests on main. Two of them are easy to fix (see #82).
The test that still fails comes from line 218 of __init__.py, where we expect four values to be returned by get_runtime_config (in configure.py), but that function returns six values. I think we just need to plumb jobinfo and PU into __init__, but I'm not totally sure which way around the fix should go. Looks like some merge conflict resolution gone wrong to me. I think this is the cause of the error reported in #81.
While we're at it, I think the type hints in get_runtime_config are out of sync with the values that actually get returned (four types listed rather than six). It's not the only issue that mypy finds:
cats/check_clean_arguments.py:31: error: Need type annotation for "info" (hint: "info: Dict[<type>, <type>] = ...") [var-annotated]
cats/check_clean_arguments.py:31: error: Argument 1 to "dict" has incompatible type "list[tuple[str | Any, ...]]"; expected "Iterable[tuple[Never, Never]]" [arg-type]
cats/check_clean_arguments.py:56: error: Expected keyword arguments, {...}, or dict(...) in TypedDict constructor [misc]
cats/configure.py:50: error: Module has no attribute "eror"; maybe "error"? [attr-defined]
cats/configure.py:67: error: Incompatible return value type (got "tuple[Mapping[str, Any], APIInterface, str, int, list[tuple[int, float]] | None, Any]", expected "tuple[dict[Any, Any], APIInterface, str, int]") [return-value]
cats/configure.py:87: error: Missing return statement [return]
cats/__init__.py:199: error: Incompatible types in assignment (expression has type "bytes", variable has type "CATSOutput") [assignment]
cats/__init__.py:209: error: Missing return statement [return]
Found 8 errors in 3 files (checked 10 source files)
Worth adding a mypy check to our CI? Worth setting GitHub up to prevent merging or pushing onto main when the tests don't pass?
A follow-on from #47: add a GitHub Actions workflow to generate the built Sphinx documentation pages from the source .rst content and host them via GitHub Pages.
As part of this Issue, the following should be completed:
If anyone wants to take this on, please feel free, and if so, 'assign' yourself here so we know you are working on it. I can do it, but it might be some weeks before I get round to it, so until the point when I assign myself, I won't be working on it. (And of course I'm happy to provide any guidance relating to the content and infrastructure I added in #47 towards getting this done.)
The documentation source, including configuration and makefile (etc.), is all contained under the docs/ directory of the repository. The README document in that directory explains the build process via the core command make html, which is what we need to automate via the Actions workflow.
It would aid greatly if (e.g.) "cats -h" gave a full list of each parameter and its meaning.
(e.g. what is -c COMMAND for? how would I find that out from the command line?)
NB I cannot see in https://greenscheduler.github.io/cats/quickstart.html#basic-usage where "-c COMMAND" is discussed
Although there is useful info in https://github.com/GreenScheduler/cats/blob/main/cats/__init__.py I do not get this displayed:
mkb@deb12:~/src/mmu-linux/mmu-bin$ ~/.local/bin/cats
usage: cats [-h] -d DURATION [-s {at}] [-a API] [-c COMMAND] [--dateformat DATEFORMAT] [-l LOCATION] [--config CONFIG] [--profile PROFILE] [--format {json}] [-f] [--cpu CPU] [--gpu GPU]
[--memory MEMORY]
cats: error: the following arguments are required: -d/--duration
In the current design we use cats to generate some output (on standard out) that can be used as an argument to at to set the runtime. All other output goes to standard error. This makes handling our output a bit complicated, and cats does not really look like a standalone scheduler.
One option would be to rebuild the command line interface to cats such that it looks like a scheduler itself and, under the hood, calls at having done the calculation for the start time. This would probably use subprocess from the standard library. It could also open the door to letting us ship more than one command line programme.
Anyhow, this came up in #52 and it seems like it needs thinking about.
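To make that option concrete, here is a minimal sketch under stated assumptions: schedule_with_at is a hypothetical name, and the "HH:MM YYYY-MM-DD" timespec is only accepted by reasonably recent at(1) implementations.

```python
import subprocess
from datetime import datetime

# Hypothetical sketch: cats computes the start time itself and, under the
# hood, hands the user's command to `at` via subprocess.
def schedule_with_at(command: str, start: datetime, dry_run: bool = False) -> list:
    # "HH:MM YYYY-MM-DD" is one timespec accepted by recent at(1) builds
    argv = ["at", start.strftime("%H:%M %Y-%m-%d")]
    if not dry_run:
        subprocess.run(argv, input=command.encode(), check=True)
    return argv

print(schedule_with_at("sbatch myjob.sh", datetime(2024, 5, 1, 2, 30), dry_run=True))
# ['at', '02:30 2024-05-01']
```

With this shape, standard out is free for normal scheduler-style messages, since nothing downstream needs to parse it.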
Get coverage as close to 100% before v1 release
We need some documentation for the project. Both a framework to create / host the documentation (e.g. github pages or read the docs via sphinx) and a first pass at some content. This may involve trimming down the readme file.
Suggested in #29
Hi, a quick one to note that the README command quoted to utilise the output of cats with the at scheduling command, namely:
command | at `python -m cats -d <job_duration> --loc
is not working. Clearly there is a missing or spurious backtick in there, but even when one is added in a logical place, namely "<trivial job command to test> | at `python -m cats -d <job_duration> --loc <postcode>`" (or I even try it without, just to test to be sure, though logically that wouldn't work to my knowledge), the command won't work, e.g.:
$ mkdir mydir | at `python -m cats -d 120 --loc "RG6 6ES"`
{'timestamp': datetime.datetime(2023, 5, 13, 12, 30), 'carbon_intensity': 102.0, 'est_total_carbon': 102.0}
syntax error. Last token seen: B
Garbled time
I am not sure of the context, but I imagine this command might have worked before the latest changes to include the estimated carbon intensity, because with the output now being a Python-like dict it is not a valid at timestamp input (unless processed with further commands, e.g. by pipe, to extract it), whereas if it was just the timestamp before, that could have worked assuming the extra backtick.
Until the CLI input and output format is tidied and we can provide a means to grab the timestamp only, to pass to at etc., we could either:
- add an awk pipe command or similar to parse the timestamp out, to get a working command for hooking up to at; or
- hold off until we properly support at and other schedulers, etc.
Suggested tools black and isort in #29; could also use ruff.
How do we take this forward?
From a Twitter thread relating to cats, some folk were enquiring as to whether it works or could work for locations outside of the UK (see original quotes given below for the context, if useful). I agree that it would be nice to provide support for countries not in Britain, assuming of course we can find and use APIs for other national electricity system grids comparable to the National Grid ESO API we have made use of so far.
The first step would be to research whether there are other such APIs we could make use of. Then we can get a feel for how multi-national in scope cats could be. Alternatively, we could decide to limit our scope solely to the UK, to avoid the complications. What do people think? It would be especially useful to hear from those with more knowledge of electricity systems than I (I have very little!).
Either way, we should clarify the location scope of cats in the documentation. Since our README doesn't mention explicitly that it only works for places in GB, we should add some brief text to clarify that straight away (to be updated if we eventually widen our location scope in line with this Issue). (I'll do that shortly in a commit.)
(See also the link above for original source.)
Does it work outside the U.S. ? I mean because of the grid data it depends on ?
It uses the UK @NationalGridESO API, but presumably other countries have equivalent ones?
Sorry for assuming U.S. was the default. Probably, there's something equivalent elsewhere, too. Would be cool to add some resources to the http://README.md
Yeah, people from other countries should definitely raise issues with info for their grid.
The carbon intensity data we are given is in UTC, but system time could be in a different timezone. We need to translate this when producing the output from cats; currently we don't. We should also write unit tests to check this, and to check what happens when the clocks change, as this is a common way systems like this break!
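A minimal sketch of the translation, assuming naive UTC timestamps from the API and Python 3.9+ zoneinfo (Europe/London is just an example zone; the real code would use the system timezone):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# The API reports forecast timestamps in UTC; attach UTC explicitly and
# convert to the system's (or a configured) timezone before printing.
def to_local(utc_naive: datetime, tz_name: str = "Europe/London") -> datetime:
    return utc_naive.replace(tzinfo=timezone.utc).astimezone(ZoneInfo(tz_name))

print(to_local(datetime(2023, 6, 1, 12, 0)))   # 13:00 local (BST, UTC+1)
print(to_local(datetime(2023, 12, 1, 12, 0)))  # 12:00 local (GMT, UTC+0)
```

A unit test pinned to timestamps straddling a clock change (e.g. 2023-10-29 in the UK) would exercise the DST edge case mentioned above.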
At our next catch up we should have a discussion about a contributor code of conduct. One approach would be to adopt the Contributor Covenant: https://www.contributor-covenant.org/version/2/1/code_of_conduct/code_of_conduct.md
We would need to include a reporting method (probably two people in case one of us manages to mess up).
- config.yml to the main branch
- --jobinfo from documentation
- Implement the best strategy identified in #40
At some point it would be nice to use carbon intensity to help schedule tasks on HPC clusters. In principle the 'backend' of cats could help with this and the obvious approach is to somehow plug into SLURM. For example, on an under used cluster, it may be best to run user jobs only during low carbon intensity times and let the queue build up when carbon intensity is high. We would presumably need to build a SLURM plugin (https://slurm.schedmd.com/plugins.html) and work with a team managing a cluster. This issue is to keep track of ideas around this.
To avoid calling the API more than once every 30min (as data doesn't change in between).
Options include:
If the API is carbonintensity.org.uk, then we should not try to optimise the start time for a task with duration > 48 hours.
We probably need to make this optional (assuming we don't need it in all cases) or create it on install (is that something we can do?).
I'll have a bash at step 1.
As suggested in #63, allow customisation of the date format. Introduce a --dateformat option that takes strftime(3) syntax and outputs the date accordingly.
This is intended to let users customise output for their existing workflows. We expect most usage through the supported --scheduler options, which will automatically set appropriate formatting options.
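For example (a hypothetical invocation; the format string below happens to match the [CC]YYMMDDhhmm form that at -t expects):

```python
from datetime import datetime

# --dateformat would pass its value straight to strftime(3); here a user
# requests the timestamp form understood by `at -t`.
start = datetime(2024, 5, 1, 14, 30)
print(start.strftime("%Y%m%d%H%M"))  # 202405011430
```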
calculate the average carbon intensity over the duration of the job. For now this can be with the forecast data when the job starts, but we could later switch it to looking up the real data at the end of the job.
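Under the current assumption of regular half-hourly forecast points, a first pass could be as simple as the following sketch (names hypothetical):

```python
# Average the forecast carbon intensity over the job's duration, assuming
# one forecast value per 30-minute interval starting at the job start.
def average_intensity(forecast: list, duration_minutes: int) -> float:
    n = max(duration_minutes // 30, 1)  # whole intervals covered by the job
    window = forecast[:n]
    return sum(window) / len(window)

print(average_intensity([100.0, 80.0, 60.0, 120.0], 90))  # 80.0
```

Switching to real (rather than forecast) data at the end of the job would only change where the list of values comes from, not this calculation.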
Here and within the next meeting we should discuss versioning and how we want to approach it, in particular:
- cats, including alpha and/or beta candidates towards that.
So it would be a good idea to try to think a bit about this in advance.
Falls over on first try:
INSTALL via
pip install git+https://github.com/GreenScheduler/cats
but...
$ ~/.local/bin/cats -d 10
WARNING:root:config file not found
WARNING:root:Unspecified carbon intensity forecast service, using carbonintensity.org.uk
WARNING:root:location not provided. Estimating location from IP address: M3.
Traceback (most recent call last):
File "/home/staff/banem/.local/bin/cats", line 8, in
sys.exit(main())
File "/home/staff/banem/.local/lib/python3.9/site-packages/cats/__init__.py", line 218, in main
config, CI_API_interface, location, duration = get_runtime_config(args)
ValueError: too many values to unpack (expected 4)
which doesn't mean much to me! There are WARNINGs (not ERRORs) but then some unclear "ValueError" failure...
Currently a new request to carbonintensity.org.uk is made each time cats is run. In cats/__init__.py:

def findtime(postcode, duration):
    tuples = get_tuple(postcode)         # API request
    result = writecsv(tuples, duration)  # write intensity data to disk
                                         # as csv timeseries
    # ...
Although the carbon intensity data obtained from the API is written to disk, this is not taken advantage of. Instead, if the relevant carbon intensity data is already on disk, we'd like to reuse it rather than make a new request each time.
The local carbon intensity forecast data is reusable if the last forecast datetime is beyond the expected finish datetime of the application, i.e. forecast_end > now() + runtime.
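That condition translates almost directly into code (a sketch; cache_is_fresh is an illustrative name, not an existing function):

```python
from datetime import datetime, timedelta

# Cached forecast data is reusable only if it extends past the job's
# expected finish time.
def cache_is_fresh(forecast_end: datetime, runtime: timedelta) -> bool:
    return forecast_end > datetime.now() + runtime

print(cache_is_fresh(datetime.now() + timedelta(hours=48), timedelta(hours=2)))  # True
```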
A possible approach is to reshuffle the responsibilities of the two top-level functions api_query.get_tuple and parsedata.writecsv:
- get_tuple could be responsible for ensuring that the right data is present on disk, and downloading it if not.
- writecsv only cares about computing the best job start time, assuming correct intensity data is available.

For instance,

# cats/__init__.py
def findtime(postcode, duration):
    tuples = get_tuple(postcode)
    result = writecsv(tuples, duration)
then becomes
# cats/__init__.py
def findtime(postcode, duration):
    # Check if cached carbon intensity data goes beyond
    # now() + duration, download new forecast if not
    # (formerly `get_tuple()`)
    ensure_cached_intensity_data(postcode, duration)

    # Then -- assuming data is available on disk -- compute
    # the best time to start the job.
    # (formerly `writecsv()`)
    result = get_best_start_time(duration)
This approach has the benefit of maintaining a good separation between talking to the API (and caching intensity data) and the calculation of the start time. We almost have this currently, except that the function returning the start time is also responsible for writing the intensity data to disk.
Another possible approach is to push the API query and data caching down to the current writecsv function:

def writecsv(data_path: str, duration=None) -> dict[str, int]:
    try:
        return cat_converter(data_path, method, duration)
    except MissingIntensityDataError:
        cache_latest_intensity_forecast(postcode)
        return cat_converter(data_path, method, duration)
Deploy CATSv2 on a real cluster(s) that was/were offered to us at the scoping workshop.
In order to provide carbon savings estimates, the GreenAlgorithmsCalculator needs to know about:
(2) is currently returned by parsedata.writecsv. If I'm not mistaken, (1) is currently not computed, so we'd have to do it additionally, but all the ingredients are there.
I think this is the last remaining piece to allow cats to display carbon emissions savings? (see #20 )
Once cats is in PyPI, we could look into packaging for distributions such as Fedora, Debian and other channels such as Homebrew and conda-forge. Not a priority until after 1.0. Suggested in #29
As suggested in the #63 discussion, introduce a --format option that will support machine readable output in a specified JSON schema.
(just saving it as a small job for later if anyone has time)
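A sketch of what that output might look like (the field names are illustrative, not an agreed schema):

```python
import json
from datetime import datetime

# Machine-readable output for a hypothetical --format=json option.
output = {
    "start_time": datetime(2023, 5, 13, 12, 30).isoformat(),
    "carbon_intensity": 102.0,
}
print(json.dumps(output))
# {"start_time": "2023-05-13T12:30:00", "carbon_intensity": 102.0}
```

Unlike the current Python-dict repr on stdout, this is directly consumable by jq or any JSON parser.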
For now, the tool assumes the carbon intensity forecasts are at regular intervals throughout (e.g. every 30min). It would be good to either keep that assumption but check it by testing the forecasts, or make the integration method work for any kind of intervals (which would be cleaner).
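Testing the assumption could be as light as checking that consecutive forecast timestamps are all the same distance apart (a sketch; is_regular is an illustrative name):

```python
from datetime import datetime, timedelta

# True iff all gaps between consecutive timestamps are equal.
def is_regular(timestamps: list) -> bool:
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) <= 1

half_hourly = [datetime(2024, 1, 1) + timedelta(minutes=30 * i) for i in range(4)]
print(is_regular(half_hourly))                                  # True
print(is_regular(half_hourly[:2] + [datetime(2024, 1, 1, 2)]))  # False
```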
(discussed here briefly)
We want to have a cats executable, at the minimum taking:
cats myprog --dur 00:08 --loc E14
Submit CATS to the Journal of Open Source Software
We need the repository to act as an advertisement for the overall problem as well as the 'normal' stuff (contribution guide, code of conduct, getting started documentation, reference documentation, etc.), so this needs particular thought.
This should make it easy to install
TestPyPI: https://test.pypi.org/project/climate-aware-task-scheduler/0.1.0/
We need to pull the next 48 hours of data for the current location.
We have been discussing with @tlestang what the data collected by carbonintensity.org.uk represents, and therefore how best to calculate the average CI over a long period of time.
The API sends data for 30min periods, each value having a from and a to parameter (e.g. 50 gCO2e from 7:00am to 7:30am).
There are at least two ways these values can have been obtained (figure below).
Assuming the blue line is the real (continuous) CI forecast:
Probably best to have both options implemented for when/if we add new APIs, but also good to have a good understanding of what's best.
I'm emailing the people in charge to ask about that, but good to hear everyone's thoughts about that! (as it's quite an important part of the tool!)
Useful to have CATS say when CI is lowest (over the 48h forecast), but it would also be useful to have CATS state what that value is. Then, for example, I can take my known energy to solution and directly calculate the CO2eq (rather than having to set up a config file and use an average power, which is very approximate). Ta, M
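The arithmetic the reporter wants is just energy times intensity, e.g. (both numbers below are made-up illustrations):

```python
# With energy to solution known, CO2eq follows directly from the reported
# carbon intensity; no config file or average-power estimate is needed.
energy_kwh = 12.0               # measured energy to solution (assumed)
intensity_gco2_per_kwh = 56.0   # lowest CI over the 48h forecast (assumed)
print(energy_kwh * intensity_gco2_per_kwh, "gCO2e")  # 672.0 gCO2e
```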
Once we have a test or two committed we should turn on testing. We may want to be clever about how we do this to minimise the carbon use of our tests, and use this as an example for The Turing Way book chapter (see GreenScheduler/env-impact-of-open-research-chapter#1).
Currently, cats works fine if run from the git checkout, as it finds config.yml and fixed_parameters.yml. When run out of tree, for example from a pipx install, cats fails as it does not find the parameter files.
Steps to reproduce:
$ cd <directory where cats is cloned>
$ pipx install .
$ cd <any other directory>
$ cats -d 5 --loc OX1 --jobinfo=cpus=2,gpus=0,memory=8,partition=CPU_partition
WARNING:root:config file not found
Traceback (most recent call last):
File "/Users/abhidg/.local/bin/cats", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Users/abhidg/.local/pipx/venvs/climate-aware-task-scheduler/lib/python3.12/site-packages/cats/__init__.py", line 285, in main
args.jobinfo, expected_partition_names=config["partitions"].keys()
~~~~~~^^^^^^^^^^^^^^
KeyError: 'partitions'
Expected output:
cats should show a user-friendly error in this case and suggest a config file.
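One way to achieve that, sketched with the names from the traceback above (the exact wording and helper name are illustrative):

```python
import sys

# Fail with an actionable message instead of a bare KeyError when the
# configuration has no "partitions" section.
def get_partition_names(config: dict) -> list:
    try:
        return list(config["partitions"].keys())
    except KeyError:
        sys.exit(
            "cats: no 'partitions' section found in the configuration.\n"
            "Add one to config.yml (or pass --config); see the documentation."
        )

print(get_partition_names({"partitions": {"CPU_partition": {}}}))  # ['CPU_partition']
```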
Other points:
If the fixed_parameters.yaml file is fixed and non-configurable by the user, then it makes sense to inline it in the carbonFootprint module. Data files can be installed through pyproject.toml, but this may not be needed for this use case.
See #51 (comment).
I can't find information for this in the README. The code seems to imply the accepted format is an integer representing seconds, but possibly some datetime formats are accepted too (I don't have much time to investigate). Please can someone specify this in more detail in the README.
An example command introduced with a sentence explaining what it does might be really instructive to illustrate the use of both present command-line options (e.g. postcode as a string).
Looking at the code, I think it would be worth refactoring it to clean up a bit from the hack day rush and keep each module separate: in particular it would make it easier to address communication between functions, as raised by #25, and facilitate future expansion to new countries by replacing the API part for #22. It would also facilitate asynchronous tasks when some modules need to be bypassed.
An example of what can be confusing at the moment: tracking back where the optimal start time is being computed. In __init__.py, the optimal time to run the job is provided by writecsv from the parsedata.py file (not immediately clear from the name). writecsv itself calls cat_converter from timeseries_conversion, which in turn calls the function get_lowest_carbon_intensity. The last function makes sense, but the steps in between would be really difficult to guess!
My suggestion is that each component is a separate class in a separate file; for now these are:
And each component is called directly from the main function, something like:
def main(arguments=None):
    parser = parse_arguments()
    args = parser.parse_args(arguments)
    # ... some stuff about config file
    CI = APIcarbonIntensity_UK(args).getCI()
    optimal_time = runtimeOptimiser(args).findRuntime(CI)
    print(f"Best job start time: {optimal_time['start_time']}")
    if carbonFootprint:
        estim = greenAlgorithmsCalculator(...).get_footprint()
        print(f"carbon footprint: {estim}")
I'm keen to start working on a branch in this direction, but would like to hear people's thoughts on that as I've probably forgotten some aspects!
Find a suitable system for testing Slurm on. This could be:
Given a timeseries in the form of a CSV file with columns timestamp, carbon_intensity and a runtime, write a function that returns a timestamp that will minimize total carbon intensity.
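A sketch of such a function, assuming half-hourly rows, ISO 8601 timestamps, and "minimize total carbon intensity" meaning the window with the lowest summed intensity (best_start is an illustrative name):

```python
import csv
import io
from datetime import datetime, timedelta

# Return the start timestamp whose job window has the lowest total
# carbon intensity, given half-hourly (timestamp, carbon_intensity) rows.
def best_start(csv_file, runtime: timedelta) -> datetime:
    rows = [(datetime.fromisoformat(r["timestamp"]), float(r["carbon_intensity"]))
            for r in csv.DictReader(csv_file)]
    n = max(int(runtime / timedelta(minutes=30)), 1)  # intervals the job spans
    totals = [(sum(ci for _, ci in rows[i:i + n]), ts)
              for i, (ts, _) in enumerate(rows[:len(rows) - n + 1])]
    return min(totals)[1]

data = io.StringIO(
    "timestamp,carbon_intensity\n"
    "2024-01-01T00:00,100\n2024-01-01T00:30,50\n"
    "2024-01-01T01:00,60\n2024-01-01T01:30,200\n"
)
print(best_start(data, timedelta(hours=1)))  # 2024-01-01 00:30:00
```

Example CSV files like the inline one above would also serve as the edge-case fixtures discussed below.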
I've just done a demo of this and it turned out that the lowest carbon intensity predicted for Oxford was in 48 hrs time (the last half hour returned by the API). Currently we choose to schedule the start of the task then. Is this what we want to happen?
Probably worth thinking about some of this kind of edge case, and cooking up some example csv files so we can test them. But deciding what to do isn't obvious to me.
Files like fixed_parameters.yaml are not being used anymore and should be removed.
This is to continue the discussion @tlestang started with PR #43
From my comments there:
I see two different use cases here:
We or other contributors will want to add other CI APIs (e.g. for other countries), and we ideally want to make them part of CATS so that these new APIs are available to the whole community. In this case, it would be good to have all the URL/parsing code in the same place, and api_interface is a good place for that (it also makes it easier to add things by copy-pasting). And in terms of how much hassle it is to add it, it's equivalent now and with the new code (api_interface needs to be modified either way, and the current code requires messing with __init__.py as well), but the existing code doesn't allow users to easily pick an API; this is what the new argument --api-carbonintensity introduces.
Second use case is if users want to pass their own API wrapper directly to CATS without having to modify the code. And in this case I agree, it would be good to make it possible in an easier way. But how would that work in practice? It would be good to have an idea of how the user would do it if we want to implement it.
This issue is to discuss whether we want to implement (2) and how it would work in practice from the user's point of view.