graphcore / examples-utils
Utils and common code for Graphcore's example applications
License: Other
Associated with https://graphcore.atlassian.net/browse/ACS-13
This is for defining exactly what is needed for this JIRA ticket to be completed and tracking distribution of sub-issues/tasks among assignees.
Link all relevant PRs to this issue to track.
Currently the code under platform assessment has three issues:
This task is for making the code more organised and neatly integrated into examples-utils benchmarking (or splitting it off into something else entirely) and converting it to use Docker containers.
The previous round of testing and feedback from SysOps and PSE was very useful and led to the improvement and promotion of examples-utils benchmarking.
Round 2 will be performed with the AI-Engineering cloud SDK team.
Currently, the examples-utils benchmarking sub-module has no documentation aside from a brief README that is provided to users. This task will cover all efforts up to a V1.0 of documentation.
The current implementation of finding checkpoints for upload to wandb/S3 assumes that --checkpoint-output-dir contains subdirectories. However, it may be the case that the output directory itself holds the checkpoint files. We need to identify all the expected scenarios across all apps for checkpoint outputs:
Could it be easier to control the format of the output directories in each application instead?
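A minimal sketch of how both layouts described above could be handled, assuming a checkpoint is simply a directory that directly contains at least one file. The helper name find_checkpoint_dirs and that definition of a checkpoint are assumptions for illustration, not the current examples-utils implementation.

```python
from pathlib import Path
from typing import List


def find_checkpoint_dirs(output_dir: str) -> List[Path]:
    """Return the directories under output_dir that look like checkpoints.

    Hypothetical helper: a "checkpoint" here is any directory that
    directly contains at least one file.
    """
    root = Path(output_dir)
    if not root.is_dir():
        return []

    # Scenario 1: each checkpoint lives in its own subdirectory.
    subdirs = sorted(
        d for d in root.iterdir()
        if d.is_dir() and any(f.is_file() for f in d.iterdir())
    )
    if subdirs:
        return subdirs

    # Scenario 2: the output directory itself holds the checkpoint files.
    if any(f.is_file() for f in root.iterdir()):
        return [root]

    # Nothing that looks like a checkpoint was found.
    return []
```

When the output directory holds both loose files and populated subdirectories, this sketch prefers the subdirectories. Whether that is the right choice depends on the scenarios identified across the apps.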
I've extracted the functionality into requirements_utils.py. I can move it further into its own folder if you think that's valuable, but figured I'd get some feedback on this first. There are a few things that I want to improve before merging:
environment_setup.log
pip freeze after installing each requirements
--help for platform assessment
Ideally I don't plan to merge with the platform_assessment script yet, but I can open a follow-up issue for that.
Originally posted by @payoto in #46 (comment)
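As an illustration of the pip freeze item above, here is a rough sketch that installs each requirements file and appends a pip freeze snapshot to a log afterwards. The function name install_and_snapshot, the log format, and the injectable run parameter are assumptions made for this example; they are not taken from requirements_utils.py.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Iterable, List


def _run(cmd: List[str]) -> str:
    """Run a command and return its stdout, raising on a non-zero exit."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def install_and_snapshot(
    requirements_files: Iterable[str],
    log_path: str = "environment_setup.log",
    run: Callable[[List[str]], str] = _run,
) -> None:
    """Install each requirements file, then log a pip freeze snapshot.

    Hypothetical helper: `run` is injectable so the logic can be
    exercised without touching the real environment.
    """
    with Path(log_path).open("a") as log:
        for req in requirements_files:
            # Install with the same interpreter that runs this script.
            run([sys.executable, "-m", "pip", "install", "-r", req])
            # Record the resulting environment right after this install.
            frozen = run([sys.executable, "-m", "pip", "freeze"])
            log.write(f"# pip freeze after installing {req}\n{frozen}\n")
```

Taking one snapshot per requirements file makes it possible to see which install step changed a given package version.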
@joshlk mentioned that cppimport introduced a fix for the parallel compilation issue in version 22.07.17.
Is it possible to upgrade the cppimport dependency to 22.07.17 and remove custom workarounds?
Some CI tests were still failing this week, although not every time (for example the attention test in BERT: https://jenkins.sourcevertex.net/job/public_examples/job/public_examples_ci_ubuntu_18_04_hw_pod_mk2/316/testReport/junit/(root)/(empty)/tests_integration_layer_test_attention/). We could possibly solve the issue by upgrading the popxl-addons dependency to use cppimport 22.07.17. However, if examples-utils depends on an earlier version, there will be a dependency conflict.
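To see which cppimport version an environment actually resolved, a check along these lines could help. It is only a sketch: the helper names are hypothetical, and it assumes purely numeric, dot-separated version strings such as 22.07.17 (a suffix like "rc1" would raise a ValueError).

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as '22.07.17' into (22, 7, 17).

    Assumes every dot-separated part is a plain integer.
    """
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

For example, calling installed_version("cppimport") and passing the result to meets_minimum with "22.07.17" would indicate whether the custom workarounds are still needed in that environment.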
There are already some upgrades that we can make, even before feedback round 2: