This project forked from fuzzylabs/dobble-jetson-zenml-competition

License: Apache License 2.0

dobble-jetson-zenml-competition's Introduction

The Dobblegängers Month of MLOps Submission

This repository contains The Dobblegängers' (a.k.a. Fuzzy Labs) submission to the ZenML Month of MLOps competition.

The Dobblegängers

  1. Misha Iakovlev
  2. Shubham Gandhi
  3. Oscar Wong
  4. Christopher Norman
  5. Jon Carlton

What have we done?

At Fuzzy Labs we're trying to become Dobble world champions. So, we came up with a plan: we've trained an ML model to recognise the common symbol between two cards, and what better way to build it than with a ZenML pipeline?

If you're reading this and wondering what on earth Dobble is, let us explain. It's a game of speed and observation where the aim is to be the quickest to identify the common symbol between two cards. If you're the first to find it and name it, then you win the card. Simple, right? In essence, it's a more sophisticated version of snap.

Now that you're all caught up, let's go into a little more detail about what we've done. As we want to win the world championships, we need a concealable device. So, to provide an extra challenge as well, we decided to deploy our model to an NVIDIA Jetson Nano.

Code & Repository Structure

This repository contains all the code and resources to set up and run a data pipeline, training pipeline, and inference on the Jetson. It's structured as follows:

.
├── LICENSE
├── pyproject.toml
├── README.md
├── requirements.txt                        # dependencies required for the project
├── docs                                    # detailed documentation for the project
├── pipelines                               # all pipelines inside this folder
│   └── training_pipeline
│       ├── training_pipeline.py
│       └── config_training_pipeline.yaml   # each pipeline has one config file containing step and other configuration
├── run.py                                  # main file where all pipelines can be run
├── steps                                   # all steps inside this folder
│   ├── data_preprocess                     # each step is in its own folder (as per ZenML best practices)
│   │   └── data_preprocess_step.py
│   └── src                                 # extra utilities required by the steps
└── zenml_stack_recipes                     # contains the modified aws-minimal stack recipe

As we've also used some cloud resources to store data and host experiment tracking, we used one of the ZenML stack recipes. There's more information on this here.

Project Overview

To give an overview of our solution (see here for an in-depth description), we've broken this challenge down into three stages, with two pipelines:

The data pipeline downloads the labelled data, processes it into the correct format for training, and uploads it to an S3 bucket.

The training pipeline downloads the data, validates it, trains and evaluates a model, and exports the model to the correct format for deployment.
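As a rough illustration of the ordering described above, the two pipelines can be pictured as lists of steps run one after another. This is a plain-Python sketch, not the ZenML code in this repository: the step names, `make_step`, and `run_pipeline` are all hypothetical, and each placeholder step only records that it ran.

```python
from typing import Callable, Dict, List, Optional

Step = Callable[[Dict], Dict]


def make_step(name: str) -> Step:
    """Build a placeholder step that only records its own name (hypothetical helper)."""

    def step(state: Dict) -> Dict:
        # Return a new state with this step's name appended to the history.
        return {**state, "history": state.get("history", []) + [name]}

    step.__name__ = name
    return step


# Step names mirror the prose above; they are illustrative, not the repository's identifiers.
DATA_PIPELINE: List[Step] = [
    make_step(n) for n in ("download_labelled_data", "preprocess", "upload_to_s3")
]
TRAINING_PIPELINE: List[Step] = [
    make_step(n)
    for n in ("download_data", "validate_data", "train_model", "evaluate_model", "export_model")
]


def run_pipeline(steps: List[Step], state: Optional[Dict] = None) -> Dict:
    """Run each step in order, passing the state from one step to the next."""
    state = dict(state or {})
    for step in steps:
        state = step(state)
    return state


if __name__ == "__main__":
    print(run_pipeline(DATA_PIPELINE)["history"])
    print(run_pipeline(TRAINING_PIPELINE)["history"])
```

In the real project each step would do actual work (and be wired together by ZenML); the sketch only shows which stage feeds which.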

Deployment Stage

Here, the trained model is loaded onto the device and inference is performed in real time.

Setup

The first step is to create a virtual environment and install the project requirements. We've used conda, but feel free to use whatever you prefer (as long as you can install a set of requirements):

conda create -n dobble_venv python=3.8 -y
conda activate dobble_venv
pip install -r requirements.txt

The next step is to set up ZenML, starting with installing the required integrations:

zenml integrations install -y pytorch mlflow

Initialise the ZenML repository:

zenml init

Start the ZenServer:

zenml up

Note: The ZenML dashboard is available at 'http://127.0.0.1:8237'. You can connect to it using the 'default' username and an empty password. If there's a TCP error about the port not being available, run fuser -k port_no/tcp to free the port and then run the zenml up command again. On macOS, run kill $(lsof -t -i:8237) instead.
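If you'd rather check whether the dashboard port is already taken before starting the server, something like the following would do. It's a minimal sketch using only the Python standard library; `port_is_free` is a hypothetical helper and not part of ZenML or this repository, and 8237 is the dashboard port mentioned above.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding fails if another process already holds the port.
            return False
        return True


if __name__ == "__main__":
    port = 8237  # port used by the ZenML dashboard in the note above
    if port_is_free(port):
        print(f"Port {port} is free")
    else:
        print(f"Port {port} is in use; free it before starting the server")
```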

By default, ZenML comes with a stack which runs locally. Next, we add MLflow as an experiment tracker to this local stack, which is where we'll run the pipelines:

zenml experiment-tracker register mlflow_tracker --flavor=mlflow
zenml stack register fuzzy_stack \
    -a default \
    -o default \
    -e mlflow_tracker \
    --set

You're now in a position where you can run the pipelines locally.

Running the Pipelines

We have a couple of options for running the pipelines, specified by flags:

python run.py -dp       # run the data pipeline only
python run.py -tp       # run the training pipeline only
python run.py -dp -tp   # run both the data and training pipelines
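For readers curious how flags like these can be wired up, here is a minimal sketch using argparse from the standard library. It is an assumption about how such an entry point could look, not the contents of this repository's run.py: `build_parser`, `selected_pipelines`, and the `dest` names are hypothetical, and only the `-dp` and `-tp` flags come from the commands above.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Create a parser exposing the two flags shown above."""
    parser = argparse.ArgumentParser(description="Run the Dobble pipelines")
    parser.add_argument("-dp", dest="data_pipeline", action="store_true",
                        help="run the data pipeline")
    parser.add_argument("-tp", dest="training_pipeline", action="store_true",
                        help="run the training pipeline")
    return parser


def selected_pipelines(argv: Optional[List[str]] = None) -> List[str]:
    """Return the pipelines requested on the command line, data first."""
    args = build_parser().parse_args(argv)
    selected = []
    if args.data_pipeline:
        selected.append("data")
    if args.training_pipeline:
        selected.append("training")
    return selected


if __name__ == "__main__":
    # A real entry point would call the pipeline objects here instead of printing.
    for name in selected_pipelines():
        print(f"would run the {name} pipeline")
```

Keeping the two flags independent means they can be combined freely, with the data pipeline always checked before the training pipeline.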

Setup using the Stack Recipe

Please see here for a detailed guide on what we've modified in the aws-minimal stack recipe and how to run it.

Blog Posts & Demo

As part of our submission, we've written a series of blogs on our website. Each of the blogs has an accompanying video.

Introduction

https://www.youtube.com/watch?v=j9TAVpM5NRQ

About the Edge

https://www.youtube.com/watch?v=djliB4QnuoQ

The Data Science

Video: https://www.youtube.com/watch?v=gCAzpyE0Zr8

Blog: https://www.fuzzylabs.ai/blog-post/zenmls-month-of-mlops-data-science-edition

Pipelines on the Edge

Blog: https://www.fuzzylabs.ai/blog-post/mlops-pipeline-on-the-edge
