hdr2030 / plant_pathology_2021-fgvc8_kaggle_challenge

This project forked from antonindurieux/plant_pathology_2021-fgvc8_kaggle_challenge

Kaggle competition: identify the category of foliar diseases in apple trees. Implementation of a CNN with Keras and TensorFlow, on TPU hardware.

Jupyter Notebook 100.00%

Plant Pathology 2021 - FGVC8

This repository contains the code for the solution I submitted to the Kaggle Plant Pathology 2021 challenge, which ran from March 15, 2021 to May 27, 2021. This competition was part of the Fine-Grained Visual Categorization (FGVC8) workshop at the Computer Vision and Pattern Recognition Conference (CVPR 2021).

A corresponding article can be found on my website here.

This competition was a good opportunity to explore several technical topics related to Convolutional Neural Networks and computer vision, such as:

  • How to implement a CNN that takes advantage of TPUs to speed up computation;
  • How to build an efficient TensorFlow input pipeline with the tf.data API;
  • Which loss function could be suitable for optimizing the F1-score;
  • What Vision Transformer neural networks are.
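
On the question of a loss suited to the F1-score: the F1-score itself is computed from hard 0/1 decisions and is therefore not differentiable. One common workaround is a "soft" F1 loss that replaces the counts of true positives, false positives and false negatives with sums of predicted probabilities. The sketch below is a plain-Python illustration of that idea and is not necessarily the loss used in this repository; the function name `soft_f1_loss` and the `eps` parameter are hypothetical.

```python
def soft_f1_loss(y_true, y_prob, eps=1e-8):
    """Return 1 - soft F1, averaged over labels (illustrative sketch).

    y_true: list of rows of 0/1 targets, shape (batch, n_labels).
    y_prob: list of rows of predicted probabilities, same shape.
    """
    n_labels = len(y_true[0])
    total = 0.0
    for j in range(n_labels):
        # Probabilities stand in for hard decisions when counting.
        tp = sum(t[j] * p[j] for t, p in zip(y_true, y_prob))
        fp = sum((1 - t[j]) * p[j] for t, p in zip(y_true, y_prob))
        fn = sum(t[j] * (1 - p[j]) for t, p in zip(y_true, y_prob))
        soft_f1 = 2 * tp / (2 * tp + fp + fn + eps)
        total += 1.0 - soft_f1
    return total / n_labels


if __name__ == "__main__":
    targets = [[1, 0], [0, 1]]
    print(soft_f1_loss(targets, [[1.0, 0.0], [0.0, 1.0]]))  # close to 0
    print(soft_f1_loss(targets, [[0.5, 0.5], [0.5, 0.5]]))  # about 0.5
```

In a real training loop the same arithmetic would be written with tensor operations so that gradients can flow through it. Note that a label with no positives and no predicted mass in the batch scores a soft F1 of 0 here, which is a simplification.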

My solution ranked 11th out of 626 teams on the public leaderboard, and 36th on the private leaderboard (top 6%).

Task

As stated on the competition description page:

"Apples are one of the most important temperate fruit crops in the world. Foliar (leaf) diseases pose a major threat to the overall productivity and quality of apple orchards. The current process for disease diagnosis in apple orchards is based on manual scouting by humans, which is time-consuming and expensive."

The task of this challenge was thus to develop a machine learning-based model to identify diseases on images of apple tree leaves.

Each leaf could be healthy or show a combination of several diseases. Because each image could carry several labels (when multiple diseases are present), this was a multi-label classification task.
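
In a multi-label setting, each image's target is usually encoded as a multi-hot vector with one slot per class. The sketch below assumes that labels arrive as a space-separated string per image and that the class vocabulary is the one listed in `LABELS`; both are assumptions for illustration and should be checked against the training CSV.

```python
# Assumed label vocabulary (hypothetical here); adjust to the actual classes.
LABELS = ["complex", "frog_eye_leaf_spot", "healthy",
          "powdery_mildew", "rust", "scab"]


def encode_labels(label_string, labels=LABELS):
    """Turn a space-separated label string into a multi-hot list."""
    present = set(label_string.split())
    unknown = present - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in present else 0 for name in labels]


if __name__ == "__main__":
    print(encode_labels("scab frog_eye_leaf_spot"))  # two slots set to 1
    print(encode_labels("healthy"))                  # one slot set to 1
```

With this encoding, the model ends in one sigmoid output per class rather than a single softmax, since several classes can be active at once.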

Data

For the purpose of the competition, a dataset of 18,632 labeled apple tree leaf images was provided.

The test set used to evaluate the participants' submissions consisted of roughly 2,700 images.

The pictures were provided in JPEG format at relatively high resolution. Many of them were 2676 x 4000 pixels, although the resolution and aspect ratio varied for some images.

Performance metric

The evaluation metric for this competition was the Mean F1-score.
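
As a rough illustration of the metric, the sketch below computes an F1-score per image from the sets of true and predicted labels, then averages over images. That per-sample averaging is my assumption about how "Mean F1-score" is defined; the competition's evaluation page is the authoritative source. The function names are hypothetical.

```python
def sample_f1(true_labels, pred_labels):
    """F1-score between two label collections for a single image."""
    true_set, pred_set = set(true_labels), set(pred_labels)
    if not true_set and not pred_set:
        return 1.0  # convention chosen here: nothing to find, nothing predicted
    tp = len(true_set & pred_set)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(true_set)
    return 2 * precision * recall / (precision + recall)


def mean_f1(all_true, all_pred):
    """Average the per-image F1-scores over the dataset."""
    scores = [sample_f1(t, p) for t, p in zip(all_true, all_pred)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    truth = [["scab"], ["rust", "complex"]]
    preds = [["scab"], ["rust"]]
    print(mean_f1(truth, preds))
```

Under this definition, predicting only one of two diseases on an image still earns partial credit (precision 1, recall 0.5), which is why the decision threshold matters so much.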

General approach

My best score was reached by averaging the outputs of 3 different models:

On top of the training process optimizations, significant improvements in the results came from:

  • Suitable image augmentation,
  • Handling the cases where no label is predicted by the model (the probability of every label is below the chosen threshold),
  • Test Time Augmentation (TTA) (see this article for a brief explanation on how it works).
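
The averaging and thresholding steps above can be sketched in a few lines. The same element-wise mean applies both to combining several models and to TTA, where one model predicts on several augmented copies of an image. The fallback shown here (keep the single most probable label when nothing reaches the threshold) is one plausible strategy, assumed for illustration; the function names and the 0.5 threshold are hypothetical.

```python
def average_predictions(prob_vectors):
    """Element-wise mean of several probability vectors.

    Works for ensembling (one vector per model) and for TTA
    (one vector per augmented copy of the same image).
    """
    n = len(prob_vectors)
    return [sum(column) / n for column in zip(*prob_vectors)]


def decode_labels(probs, labels, threshold=0.5):
    """Keep labels at or above the threshold; never return an empty list."""
    chosen = [name for name, p in zip(labels, probs) if p >= threshold]
    if not chosen:
        # Assumed fallback: pick the most probable label.
        best = max(range(len(probs)), key=probs.__getitem__)
        chosen = [labels[best]]
    return chosen


if __name__ == "__main__":
    names = ["healthy", "rust", "scab"]
    averaged = average_predictions([[0.1, 0.4, 0.3], [0.1, 0.2, 0.3]])
    print(decode_labels(averaged, names))  # nothing reaches 0.5, fallback applies
```

Without such a fallback, an image where every probability falls below the threshold would be submitted with no label at all and would score zero under a per-image F1.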

Usage

4 notebooks are available in this repository:
