bolaik / invasivespeciesmonitoring

Apply transfer learning to predict whether an input image contains an invasive species

InvasiveSpeciesMonitoring

Introduction

The motivation of this Kaggle competition is to apply machine learning and deep learning techniques to automatically detect the presence or absence of invasive species, which can have damaging effects on the local environment and economy.

In the Kaggle Invasive Species Monitoring competition, the data set contains pictures taken in a Brazilian national forest. Some of the pictures contain Hydrangea, a beautiful invasive species native to Asia. The rest are plain background images, such as jungle, houses, or even local people and animals like horses. Based on the training pictures and the labels provided, participants predict the presence of the invasive species in the test pictures. The dataset contains 2295 training images and 1531 test images. All are color images of 866 x 1154 pixels. Typical positive and negative examples are shown below:

Main method

Transfer Learning

We use VGG-16 pretrained on ImageNet as the base model and replace the last fc-layers of the original net with a new fc-layer followed by a single output node with sigmoid activation. The score of this node gives the predicted probability that the input image contains the invasive species.

How to train?

  • Train only the added layers
  • Train the added layers first, then fine-tune a few top layers of the pretrained VGG-16 model
  • Train the entire model
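
The model and the three training strategies above can be sketched as follows. This is a minimal illustration and not necessarily the code used in the notebooks: it assumes Keras from TensorFlow, and the names `build_model`, `set_trainable`, `hidden_units`, and `n_top`, as well as their default values, are hypothetical. The source states neither the size of the new fc-layer nor how many top layers were fine-tuned.

```python
from types import SimpleNamespace


def build_model(weights="imagenet", hidden_units=256):
    """VGG-16 convolutional base + one new fc-layer + a sigmoid output node.

    hidden_units is an assumed value; the source does not give the layer size.
    """
    # Imported lazily so set_trainable below can be tried without TensorFlow.
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    base = VGG16(weights=weights, include_top=False, input_shape=(224, 224, 3))
    x = layers.Flatten()(base.output)
    x = layers.Dense(hidden_units, activation="relu")(x)
    out = layers.Dense(1, activation="sigmoid")(x)
    return base, models.Model(inputs=base.input, outputs=out)


def set_trainable(base_layers, strategy, n_top=4):
    """Freeze or unfreeze the pretrained layers according to the strategy.

    "added_only"    -> all pretrained layers frozen
    "fine_tune_top" -> only the last n_top pretrained layers trainable
    "whole"         -> every pretrained layer trainable
    Returns the resulting list of trainable flags.
    """
    base_layers = list(base_layers)
    if strategy == "added_only":
        cut = len(base_layers)
    elif strategy == "fine_tune_top":
        cut = max(0, len(base_layers) - n_top)
    elif strategy == "whole":
        cut = 0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    for i, layer in enumerate(base_layers):
        layer.trainable = i >= cut
    return [layer.trainable for layer in base_layers]


# Stand-in layers so the freezing logic can be checked without Keras.
fake_layers = [SimpleNamespace(name=f"layer{i}", trainable=True) for i in range(6)]
flags = set_trainable(fake_layers, "fine_tune_top", n_top=2)
print(flags)  # [False, False, False, False, True, True]
```

With a real model, the call would be `base, model = build_model()` followed by `set_trainable(base.layers, "added_only")` and `model.compile(...)`. Keras requires the model to be compiled again after the `trainable` flags change, so the second strategy amounts to two compile-and-fit rounds.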

How to preprocess the training images?

It may be important to keep as much of the original input as possible, since in a number of the training images the invasive plant takes up only a small part of the frame. If we take a random crop before resizing, we may accidentally leave out the important region.

We apply the preprocessing technique from this Kaggle post [1]. In short, each image is downsized to 256 x 256 pixels, with the short edge padded with zeros. The resized image is then cropped to 224 x 224 before being fed into the VGG-16 model.
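
A minimal sketch of this pad, resize, and crop pipeline is given below, using NumPy only. Several details are assumptions rather than facts from the source: the padding is centered, the image is padded before it is downsized, the crop defaults to the center, and nearest-neighbour sampling stands in for the `cv2` interpolation used in the experiments. The function names `pad_resize` and `crop` are hypothetical.

```python
import numpy as np


def pad_resize(img, size=256):
    """Zero-pad the short edge to a square, then downsample to size x size.

    Nearest-neighbour sampling is a simple stand-in for cv2.resize.
    """
    h, w = img.shape[:2]
    side = max(h, w)
    top = (side - h) // 2    # centered padding (assumption)
    left = (side - w) // 2
    canvas = np.zeros((side, side) + img.shape[2:], dtype=img.dtype)
    canvas[top:top + h, left:left + w] = img
    # Source row/column index for each output pixel.
    idx = (np.arange(size) * side) // size
    return canvas[idx][:, idx]


def crop(img, out=224, top=None, left=None):
    """Crop an out x out window; defaults to the center crop."""
    h, w = img.shape[:2]
    top = (h - out) // 2 if top is None else top
    left = (w - out) // 2 if left is None else left
    return img[top:top + out, left:left + out]


# A dummy all-ones image with the competition's 866 x 1154 size.
image = np.ones((866, 1154, 3), dtype=np.uint8)
resized = pad_resize(image)
patch = crop(resized)
print(resized.shape, patch.shape)  # (256, 256, 3) (224, 224, 3)
```

Passing explicit `top` and `left` offsets to `crop` gives a random crop during training. Because the whole image is resized first, any 224 x 224 window still covers most of the original scene.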

Downsampling method

We compare three interpolation methods for downsizing: area, bilinear, and Lanczos.

How to preprocess images at the test phase?

This is motivated by the fact that some of the hard-to-predict images contain invasive plants hidden far in the background. It may therefore help to crop the test images at different positions and scales and to improve the prediction by ensembling the results.

  • predict once with the entire image, using the same preprocessing as for the training images
  • crop 3x3 blocks, ensemble the 9 predictions with mean or max
  • crop 2x2 blocks plus a center crop, ensemble the 5 predictions with mean or max
  • center crops at various scales, [0, 0.1, 0.2, 0.3] (portion cropped from each of the 4 edges), ensemble with mean of max-2 or mean of all 4
  • 3 crops of half-width, [left, center, right], plus 1 prediction with the entire image, take mean or max
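
The cropping schemes above can be expressed as crop boxes plus a small ensembling rule, as in the sketch below. It uses only the standard library and reflects one reading of the list rather than the notebook code: the center crop in the 2x2 scheme is assumed to have the same size as one block, and "mean of max-2" is assumed to mean the mean of the two highest scores. All function and scheme names are hypothetical.

```python
from statistics import mean


def grid_boxes(h, w, n):
    """n x n non-overlapping blocks as (top, left, bottom, right)."""
    return [(r * h // n, c * w // n, (r + 1) * h // n, (c + 1) * w // n)
            for r in range(n) for c in range(n)]


def center_box(h, w, frac):
    """Box left after cropping a fraction `frac` from each of the 4 edges."""
    dy, dx = int(h * frac), int(w * frac)
    return (dy, dx, h - dy, w - dx)


def lmr_boxes(h, w):
    """Left, center, and right crops of half the image width, full height."""
    half = w // 2
    return [(0, x, h, x + half) for x in (0, (w - half) // 2, w - half)]


def crop_boxes(h, w, scheme):
    """Return the list of crop boxes for one test-time scheme."""
    entire = (0, 0, h, w)
    if scheme == "entire":
        return [entire]
    if scheme == "3x3":
        return grid_boxes(h, w, 3)
    if scheme == "2x2+center":
        # Center crop assumed to be the size of one 2x2 block.
        return grid_boxes(h, w, 2) + [center_box(h, w, 0.25)]
    if scheme == "centered":
        return [center_box(h, w, f) for f in (0, 0.1, 0.2, 0.3)]
    if scheme == "lmr+entire":
        return lmr_boxes(h, w) + [entire]
    raise ValueError(f"unknown scheme: {scheme}")


def ensemble(scores, how="mean"):
    """Combine per-crop probabilities into one prediction."""
    if how == "mean":
        return mean(scores)
    if how == "max":
        return max(scores)
    if how == "max-2":
        # Mean of the two highest scores (assumed meaning of "max-2").
        return mean(sorted(scores)[-2:])
    raise ValueError(f"unknown rule: {how}")


boxes = crop_boxes(866, 1154, "lmr+entire")
print(boxes[:3])  # [(0, 0, 866, 577), (0, 288, 866, 865), (0, 577, 866, 1154)]
```

Each box would then be cut out of the test image, run through the same preprocessing as the training images, and scored by the model. `ensemble` turns the resulting list of scores into the final prediction.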

Results

Saliency map

Downsampling method

| Downsize method | Val Acc | Val AUC | LB Score (AUC) |
|---|---|---|---|
| cv2.INTER_LINEAR (default) | 0.984749 | 0.998239 | 0.98536 |
| cv2.INTER_AREA | 0.989107 * | 0.998742 * | 0.98457 |
| cv2.INTER_LANCZOS4 | 0.986928 | 0.998065 | 0.98813 * |

Different cropping methods at test phase (Bilinear)

| Cropping method | Val Acc | Val AUC | LB AUC |
|---|---|---|---|
| entire | 0.984749 * | 0.99824 * | 0.98536 |
| centeredx4 mean | 0.967320 | 0.99445 | 0.98402 |
| centeredx4 max-2 | 0.958606 | 0.99540 | 0.98442 |
| lmr+entire mean | 0.976035 | 0.99644 | 0.98666 * |
| lmr+entire max | 0.943355 | 0.99576 | |

Different training strategy (Lanczos)

| Training Strategy | Val Acc | Val AUC | LB AUC |
|---|---|---|---|
| train whole | 0.986928 * | 0.998065 | 0.98813 * |
| train added layers | 0.982571 | 0.998452 | |
| train added layers + fine tune top layers | 0.978214 | 0.998220 | |
| train added layers + fine tune all | 0.986928 * | 0.998742 * | 0.98578 |

Conclusion

  • Simply train the entire model
  • Use Lanczos downsampling
  • Use 3 + 1 crops (left, center, right, plus the whole image) when predicting on test images
  • Try bagging and ensembling
