MLOps Zoomcamp Project — Mushroom Classification

My final project for MLOps Zoomcamp

Objective

The goal of this project is to build an MLOps pipeline that trains and deploys a predictive model for determining the edibility of mushrooms from their characteristics.

Dataset

The dataset used in this project was downloaded from Kaggle. It includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota family, drawn from The Audubon Society Field Guide to North American Mushrooms (1981). Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This last class was combined with the poisonous one.
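As a sketch of the resulting binary target: this assumes the label is stored in a `class` column with the values `e` (edible) and `p` (poisonous). The helper names `encode_label` and `load_targets` and the sample rows are illustrative and are not taken from the project's code.

```python
import csv
import io

# Illustrative rows only; the real file has 22 feature columns.
SAMPLE_CSV = """class,odor,habitat
e,a,g
p,f,u
e,l,m
"""


def encode_label(value: str) -> int:
    """Map the class letter to a binary target: 1 = poisonous, 0 = edible."""
    mapping = {"e": 0, "p": 1}
    if value not in mapping:
        raise ValueError(f"unexpected class label: {value!r}")
    return mapping[value]


def load_targets(text: str) -> list[int]:
    """Read CSV text and return the encoded target for every row."""
    reader = csv.DictReader(io.StringIO(text))
    return [encode_label(row["class"]) for row in reader]


if __name__ == "__main__":
    print(load_targets(SAMPLE_CSV))
```

Raising on an unexpected label, rather than silently defaulting to edible, is the safer choice for this kind of target.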

Tools used

  • Poetry - Python dependency manager
  • Pyenv - Python version manager
  • Prefect - Workflow orchestrator
  • MLflow - Experiment tracker and model registry
  • FastAPI - Web API
  • dotenv - Environment variable loader
  • pre-commit - Git pre-commit hooks
  • AWS - Cloud service
  • Docker - Containerization
  • htmx - Better HTML interactivity

Pre-requisites

Credentials

To change the default behaviour or use a cloud server, copy .env.example to .env:

cp .env.example .env

Then change the default values to suit your needs.
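To illustrate how values from `.env` typically interact with defaults, here is a minimal sketch. The project lists dotenv for this job; the hand-rolled parser below is used only so the example runs without extra dependencies. The function names and the `EXAMPLE_SETTING` key are hypothetical, since the actual keys in `.env.example` are not shown here.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_setting(name, file_values, default):
    """Real environment variables win, then the .env file, then the default."""
    return os.environ.get(name, file_values.get(name, default))


if __name__ == "__main__":
    env_path = Path(".env")
    file_values = read_env_file(env_path) if env_path.exists() else {}
    # EXAMPLE_SETTING is a hypothetical key used for illustration only.
    print(get_setting("EXAMPLE_SETTING", file_values, "built-in default"))
```

The precedence shown (existing environment variables are not overridden by the file) matches the default behaviour of python-dotenv's `load_dotenv`.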

Build Docker Image

You can build the image with either Docker Compose or docker build.

Docker Compose

To build the image and start the container, run:

docker compose up

Docker Build

To build the Docker image, run:

make build

To launch the application, run:

docker run -it --rm -p 8000:8000 mushroom-classification
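After the container starts, it can take a moment before the service accepts connections. The following sketch polls the published port (8000, as in the command above) until a TCP connection succeeds. The function name `wait_for_port` and the timeout values are illustrative choices, not part of the project.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection succeeds, or False after the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # Not ready yet; retry shortly.
    return False


if __name__ == "__main__":
    print(wait_for_port("127.0.0.1", 8000))
```

This only confirms that the port is open; it does not check that the model has been loaded or that predictions succeed.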

Using the API

The application accepts POST requests. To send a request with curl:

curl -X 'POST' \
  'http://127.0.0.1:8000/api/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "cap_shape": "x",
  "cap_surface": "s",
  "cap_color": "n",
  "bruises": "t",
  "odor": "a",
  "gill_attachment": "f",
  "gill_spacing": "c",
  "gill_size": "n",
  "gill_color": "b",
  "stalk_shape": "e",
  "stalk_root": "e",
  "stalk_surface_above_ring": "f",
  "stalk_surface_below_ring": "f",
  "stalk_color_above_ring": "b",
  "stalk_color_below_ring": "b",
  "veil_type": "p",
  "veil_color": "n",
  "ring_number": "n",
  "ring_type": "e",
  "spore_print_color": "k",
  "population": "a",
  "habitat": "g"
}
'

The features accepted by the API, and their possible values, are listed in docs/data.md.

The response is a JSON object containing the probability that the mushroom is poisonous. For the request above, the response is:

{"poisonous-probability":0.0}
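The same request can also be sent from Python. This is a minimal sketch using only the standard library, assuming the service is running locally at the address used in the curl example. The helper names `build_request` and `predict` are illustrative and are not part of the project.

```python
import json
import urllib.request

# Feature values copied from the curl example above.
EXAMPLE_MUSHROOM = {
    "cap_shape": "x",
    "cap_surface": "s",
    "cap_color": "n",
    "bruises": "t",
    "odor": "a",
    "gill_attachment": "f",
    "gill_spacing": "c",
    "gill_size": "n",
    "gill_color": "b",
    "stalk_shape": "e",
    "stalk_root": "e",
    "stalk_surface_above_ring": "f",
    "stalk_surface_below_ring": "f",
    "stalk_color_above_ring": "b",
    "stalk_color_below_ring": "b",
    "veil_type": "p",
    "veil_color": "n",
    "ring_number": "n",
    "ring_type": "e",
    "spore_print_color": "k",
    "population": "a",
    "habitat": "g",
}


def build_request(features, base_url="http://127.0.0.1:8000"):
    """Build the POST request for /api/predict without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/api/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(features, base_url="http://127.0.0.1:8000"):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_request(features, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the service to be running locally.
    result = predict(EXAMPLE_MUSHROOM)
    print(result["poisonous-probability"])
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running server.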

You can also open http://127.0.0.1:8000 in a browser and select the mushroom characteristics.

The page has a submit button, which returns the probability that a mushroom with the given characteristics is poisonous.

Build locally

Activate environment:

# if using poetry
poetry shell

# if using venv
source venv/bin/activate

Install the dependencies:

  • With Poetry:

    poetry install

  • With pip, after activating the environment:

    pip install .

Set the Prefect API URL to the local server:

prefect config set PREFECT_API_URL="http://127.0.0.1:4200/api"

Start the Prefect server:

prefect server start

Start the MLflow server in another terminal window (activate the Python environment there as well):

mlflow server --backend-store-uri sqlite:///mlflow.db

Train the model:

python src/train.py --input-path data/mushrooms.csv

Start the web service:

uvicorn src.api:app --reload

Further improvements

  • Add a monitoring service
  • Create a Frontend for the API
  • Implement IaC
  • Use CI/CD
  • Create tests

Disclaimer

The prediction model was created solely to demonstrate an MLOps pipeline. Using the deployed model to judge the edibility of unknown mushrooms is not advisable.

Contributors

alvaro-kothe
