This project forked from kozodoi/kaggle_leaf_disease_classification

Cassava leaf disease classification with CNNs and Vision Transformers (top-1% solution)

Cassava Leaf Disease Classification

The top-1% solution to the Cassava Leaf Disease Classification Kaggle competition.

[Image: sample]

Summary

Cassava is one of the key food crops grown in Africa, and plant diseases are a major cause of poor yields. To diagnose them, farmers rely on agricultural experts who inspect the plants visually, which is labor-intensive and costly. Deep learning can help automate this process.

This project works with a dataset of 21,367 cassava images. The pictures were taken by farmers on mobile phones and are labeled either as healthy or as showing one of four common disease types. The main data-related challenges are poor image quality, inconsistent background conditions and label noise.

We develop a stacking ensemble of CNNs and Vision Transformers implemented in PyTorch. Our solution reaches a test accuracy of 91.06% and places 14th out of 3,900 competing teams. The diagram below gives an overview of the ensemble. A detailed summary of our solution is provided in this writeup.

[Figure: overview diagram of the ensemble]

Project structure

The project has the following structure:

  • functions/: .py scripts with training, inference and data processing functions
  • notebooks/: .ipynb notebooks performing training of CNN/ViT models and ensembling
  • data/: input data (images are not included due to size constraints and can be downloaded here)
  • output/: model configurations, weights and diagrams exported from notebooks
  • pretraining/: model configurations and weights pretrained on external datasets
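Before running any notebook, it can help to confirm that a local checkout matches this layout. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and it only assumes the five folder names listed above.

```python
from pathlib import Path
from typing import List

# Folder names taken from the project structure list above.
EXPECTED_DIRS = ("functions", "notebooks", "data", "output", "pretraining")


def missing_dirs(repo_root: str) -> List[str]:
    """Return the expected folders that do not exist under repo_root.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    root = Path(repo_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders are present.")
```

Run from the repository root, the script lists any folder that still has to be created, for example data/ before the competition images have been downloaded.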

Working with the repo

Our solution can be reproduced in the following steps:

  1. Download the competition data and add it to the data/ folder.
  2. Run all pytorch-model training notebooks to obtain the weights of the 33+2 base models for the ensemble.
  3. Run the lightgbm-stacking ensembling notebook to obtain the final prediction.

All pytorch-model notebooks have the same structure and differ only in their model/data parameters. Different versions are included to ensure reproducibility. If you only wish to get familiar with our solution, it is enough to inspect one of the pytorch-model notebooks and go through the functions/ folder to understand the training process. The stacking ensemble reproducing our submission is also provided in this Kaggle notebook.
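To illustrate what the stacking step consumes, here is a minimal sketch of how base-model predictions might be assembled into a feature matrix for a meta-learner. It is an assumption-laden illustration rather than the code from the lightgbm-stacking notebook: the function names and toy probabilities are hypothetical, and it assumes each base model outputs one probability per class (healthy plus four disease types). A plain probability average is included only as a baseline for comparison; in the actual solution a LightGBM model is trained on top of the base models.

```python
from typing import List, Sequence

NUM_CLASSES = 5  # healthy + four disease types, as described in the summary

# model_probs[m][i] is the class-probability vector of model m for image i.
ModelProbs = Sequence[Sequence[Sequence[float]]]


def build_meta_features(model_probs: ModelProbs) -> List[List[float]]:
    """Concatenate the per-model probabilities into one feature row per image."""
    if not model_probs:
        raise ValueError("need at least one base model")
    n_images = len(model_probs[0])
    rows: List[List[float]] = []
    for i in range(n_images):
        row: List[float] = []
        for probs in model_probs:
            if len(probs) != n_images:
                raise ValueError("all models must score the same images")
            if len(probs[i]) != NUM_CLASSES:
                raise ValueError("each prediction needs NUM_CLASSES values")
            row.extend(probs[i])
        rows.append(row)
    return rows


def average_blend(model_probs: ModelProbs) -> List[int]:
    """Baseline: average the probabilities over models and pick the top class."""
    n_models = len(model_probs)
    labels: List[int] = []
    for i in range(len(model_probs[0])):
        mean = [
            sum(probs[i][c] for probs in model_probs) / n_models
            for c in range(NUM_CLASSES)
        ]
        labels.append(max(range(NUM_CLASSES), key=mean.__getitem__))
    return labels


# Toy predictions from two hypothetical base models on two images.
model_a = [[0.7, 0.1, 0.1, 0.05, 0.05], [0.1, 0.2, 0.5, 0.1, 0.1]]
model_b = [[0.6, 0.2, 0.1, 0.05, 0.05], [0.2, 0.1, 0.3, 0.3, 0.1]]

features = build_meta_features([model_a, model_b])  # 2 rows, 10 columns
baseline = average_blend([model_a, model_b])
```

In a stacking setup, rows like `features` would typically be built from out-of-fold predictions on the training images, so that the meta-learner is fitted on predictions the base models made for data they did not see during training.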

The notebooks are designed to run on Google Colab. More details are provided in the documentation within the notebooks.