yurayli / shopee-product-categorization

Shopee Code League 2020 image competition 7th place solution

Jupyter Notebook 100.00%
e-commerce shopee computer-vision efficientnet noisy-labels tpu-acceleration tensorflow

Shopee Code League - Product Categorization

This is the 7th place solution to the Shopee Code League 2020 image competition on product categorization.

Overview

The final submission is an ensemble of EfficientNet-B4 and EfficientNet-B5 models, several checkpoints of each, and test-time augmentation (TTA).
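
The README does not say how the individual predictions are combined. A minimal sketch, assuming a plain unweighted mean of class probabilities over every (model, checkpoint, TTA view) combination; the function names and the toy numbers are hypothetical:

```python
def average_predictions(prob_sets):
    """Average class probabilities over several prediction sets.

    prob_sets: one entry per (model, checkpoint, TTA view); each entry is a
    list of per-image probability vectors with the same shape.
    """
    n_sets = len(prob_sets)
    n_images = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    averaged = []
    for i in range(n_images):
        mean = [sum(ps[i][c] for ps in prob_sets) / n_sets
                for c in range(n_classes)]
        averaged.append(mean)
    return averaged


def predict_labels(prob_sets):
    """Return the arg-max class index per image after averaging."""
    return [max(range(len(p)), key=p.__getitem__)
            for p in average_predictions(prob_sets)]


# Toy data (assumption): 2 images, 3 classes, 3 prediction sets.
b4 = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
b5 = [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]]
b5_flipped = [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]]  # same model, flipped input

labels = predict_labels([b4, b5, b5_flipped])
```

Averaging probabilities rather than taking a majority vote keeps each model's confidence in the result, which matters when individual models disagree on ambiguous images.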

TPU/XLA data pipeline

Building on our earlier experience in the Kaggle flower competition, we decided to train the networks on TPU with XLA in this competition. The data preprocessing tools differ somewhat from those used for ordinary GPU training, but fortunately the tf.data API is still available.

  • To train on a large number of images, it is better to first convert them to TFRecord files (filename.tfrecords).
  • Common pre-processing packages such as albumentations cannot be used in this pipeline, so we implemented the image transformations ourselves. We use image shift/rotate/scale and Cutout for augmentation.
  • For a ResNet50: ~990 sec/epoch on a Tesla P100 versus ~70 sec/epoch on a TPU v3-8, roughly a 14x speed-up.
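
As an illustration of the Cutout augmentation mentioned above, here is a framework-free sketch of the masking logic. It is not the repository's implementation: in a TPU pipeline the same steps would have to be written with TensorFlow ops inside the tf.data map function. The function name, the single-channel nested-list image, and the mask size are assumptions made for brevity:

```python
import random


def cutout(image, mask_size, rng=random, fill=0):
    """Return a copy of `image` with one square region set to `fill`.

    image: H x W nested list of pixel values (channels omitted for brevity).
    The square is centred on a random pixel and clipped at the borders,
    so the masked area can be smaller than mask_size x mask_size.
    """
    height, width = len(image), len(image[0])
    cy = rng.randrange(height)
    cx = rng.randrange(width)
    half = mask_size // 2
    top, bottom = max(0, cy - half), min(height, cy - half + mask_size)
    left, right = max(0, cx - half), min(width, cx - half + mask_size)

    out = [row[:] for row in image]  # copy so the input stays untouched
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = fill
    return out


# Toy 8x8 image of ones; a seeded generator makes the example repeatable.
image = [[1] * 8 for _ in range(8)]
augmented = cutout(image, mask_size=4, rng=random.Random(0))
```

Because the mask centre may fall near an edge, part of the square can lie outside the image; clipping it is the usual behaviour for Cutout and means some samples are occluded less than others.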

Noisy labels

After inspecting the data and the results of the baseline model, we found that the labels are noisy, which may be why high accuracy is hard to reach on this dataset. There is also real ambiguity between some categories: some images are hard to categorize even when we check them by eye. We therefore adopt label smoothing to address both problems.
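
For reference, label smoothing replaces the one-hot target with a mixture of the one-hot vector and a uniform distribution, so the loss never pushes the network toward full confidence in a possibly wrong label. A minimal sketch in plain Python; the smoothing factor 0.1 and the 4-class toy example are assumptions, not values taken from this solution:

```python
import math


def smooth_labels(label, num_classes, epsilon=0.1):
    """One-hot target mixed with a uniform distribution.

    target[c] = (1 - epsilon) * onehot[c] + epsilon / num_classes
    """
    off = epsilon / num_classes
    return [(1.0 - epsilon) + off if c == label else off
            for c in range(num_classes)]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


# Toy example (assumption): true class 2 out of 4 classes.
target = smooth_labels(2, num_classes=4, epsilon=0.1)
confident = [0.001, 0.001, 0.997, 0.001]
moderate = [0.03, 0.03, 0.91, 0.03]
loss_confident = cross_entropy(target, confident)
loss_moderate = cross_entropy(target, moderate)
```

With the smoothed target, the near-certain prediction is penalised more than the moderately confident one, which is the behaviour that helps with noisy labels. In Keras the same mixing is available through the `label_smoothing` argument of `tf.keras.losses.CategoricalCrossentropy`.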

Model

We started with ResNet and then switched to EfficientNet as our model architecture. A better pretrained architecture can also be more robust to noisy labels [1]. Choosing an input image size that suits each network is important as well.

Training

The warmup-annealing learning rate schedule makes training more stable and speeds up convergence [2]. We further extended the annealing phase to cyclic annealing, which gives the network more chances to find better optima. The checkpoints saved at these optima during training can also be used in the final ensemble [3, 4].
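
A schedule of this shape can be sketched as a linear warmup followed by repeated cosine cycles. The epoch counts and learning rate bounds below are illustrative assumptions, not the values used for this solution, and `lr_at_epoch` is a hypothetical helper:

```python
import math

WARMUP_EPOCHS = 3   # assumption
CYCLE_EPOCHS = 5    # assumption
LR_MAX = 1e-3       # assumption
LR_MIN = 1e-5       # assumption


def lr_at_epoch(epoch):
    """Linear warmup, then cosine annealing that restarts every cycle."""
    if epoch < WARMUP_EPOCHS:
        # Ramp linearly from LR_MIN up to LR_MAX.
        return LR_MIN + (LR_MAX - LR_MIN) * epoch / WARMUP_EPOCHS
    # Position inside the current cycle, in [0, 1).
    t = ((epoch - WARMUP_EPOCHS) % CYCLE_EPOCHS) / CYCLE_EPOCHS
    return LR_MIN + 0.5 * (LR_MAX - LR_MIN) * (1.0 + math.cos(math.pi * t))


total_epochs = 13
schedule = [lr_at_epoch(e) for e in range(total_epochs)]

# The last epoch of each cycle has the lowest learning rate; these are the
# natural points to save a checkpoint for snapshot ensembling.
snapshot_epochs = [e for e in range(WARMUP_EPOCHS, total_epochs)
                   if (e - WARMUP_EPOCHS) % CYCLE_EPOCHS == CYCLE_EPOCHS - 1]
```

A function with this signature can be passed to `tf.keras.callbacks.LearningRateScheduler`, which calls it with the epoch index at the start of every epoch.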

Dependencies

tensorflow 2.2.0
tensorflow-addons 0.9.1
image-classifiers 1.0.0
efficientnet 1.1.0

References

[1] Understanding Deep Learning on Controlled Noisy Labels
[2] Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
[3] Snapshot Ensembles: Train 1, get M for free
[4] Averaging Weights Leads to Wider Optima and Better Generalization
