Deep-Prior-PP-PyTorch

Deep Prior PP implementation in PyTorch

Many util functions based on: https://github.com/moberweger/deep-prior-pp

Many general (and util) functions based on: https://github.com/dragonbook/V2V-PoseNet-pytorch

Setup

$> conda create -n deep-prior python=3.7.1
$> conda activate deep-prior
$> conda install -c anaconda opencv=3.4.2 numpy=1.15.4 matplotlib=3.0.1
# choose only one of the two installs below (the first is the CPU-only build)
$> conda install -c pytorch pytorch-cpu=1.0.0 torchvision-cpu
$> conda install -c pytorch pytorch=1.0.0 torchvision

Instructions

Run all scripts from the repository root folder.

# for debugging/testing model with random data
python src/dp_model.py

# for actual training using MSRA dataset
python src/dp_main.py

Status

Implemented

Features present in this code:

  • MSRA loading + transformer util func
  • ResNet-50 based model
  • refined CoM support (untested)
  • training for 'P1' gesture
  • training for all gestures (untested)
  • validation -- (based on test_set_id only)
  • testing -- based on test_set_id
  • avg 3D error calculation on test_set (using absolute distance between target and output keypoints)
  • Data Augmentation for training
    • Using Rotation
    • Using Translation
    • Using Scaling
  • Data augmentation for PCA
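
The rotation, translation and scaling augmentations listed above can be illustrated on keypoints alone. The following is a minimal sketch, not the code in this repository: it assumes one mode is drawn at random per sample, applies it to an (N, 3) keypoint array only (the real pipeline would also have to transform the depth crop), and uses hypothetical function names, mode names and parameter defaults.

```python
import numpy as np

# Hypothetical mode names, mirroring the 'AUG_ROT' / 'AUG_TRANS' / 'AUG_NONE'
# strings in the config shown later in this README; 'AUG_SCALE' is assumed.
AUG_MODES = ("AUG_ROT", "AUG_TRANS", "AUG_SCALE", "AUG_NONE")


def augment_keypoints(keypoints, mode, rng,
                      max_rot_deg=180.0, trans_sigma=5.0, scale_sigma=0.02):
    """Apply one augmentation mode to an (N, 3) array of keypoints.

    Rotation is in-plane (about the z axis through the keypoint centroid),
    translation shifts all points by one random offset, and scaling is
    about the centroid. All defaults are illustrative, not tuned values.
    """
    pts = np.asarray(keypoints, dtype=np.float64)
    center = pts.mean(axis=0)
    if mode == "AUG_ROT":
        theta = np.deg2rad(rng.uniform(-max_rot_deg, max_rot_deg))
        c, s = np.cos(theta), np.sin(theta)
        rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        return (pts - center) @ rot_z.T + center
    if mode == "AUG_TRANS":
        return pts + rng.normal(0.0, trans_sigma, size=3)
    if mode == "AUG_SCALE":
        factor = abs(1.0 + rng.normal(0.0, scale_sigma))
        return (pts - center) * factor + center
    if mode == "AUG_NONE":
        return pts.copy()
    raise ValueError(f"unknown augmentation mode: {mode}")


def random_augment(keypoints, modes, rng):
    """Draw one mode uniformly from `modes` and apply it."""
    mode = modes[rng.integers(len(modes))]
    return mode, augment_keypoints(keypoints, mode, rng)
```

Under this reading, a setting such as Rot+Trans+None corresponds to calling `random_augment(keypoints, ("AUG_ROT", "AUG_TRANS", "AUG_NONE"), rng)` for each training sample.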

Not Implemented

These features are present in the original source code but not yet implemented here:

  • CoM detection aka Hand Detector + RefineNet as pipeline
  • ScaleNet (aka multi-scale training)
  • % Error frames vs max 3D error
  • NYU, ICVL datasets

Results

  • 30 Epochs unless otherwise specified
  • PCA ~70k unless otherwise specified
  • Everything else as in the original paper
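
The errors reported in this section are average 3D errors in millimetres. As a rough sketch of such a metric, assuming the "absolute distance" is the Euclidean distance between each target keypoint and the corresponding output keypoint, averaged over all joints and frames (the function name and array layout are hypothetical, not taken from the repository):

```python
import numpy as np


def avg_3d_error(targets, outputs):
    """Mean per-joint Euclidean distance over all frames.

    Both inputs are assumed to have shape (num_frames, num_joints, 3),
    with coordinates in millimetres.
    """
    targets = np.asarray(targets, dtype=np.float64)
    outputs = np.asarray(outputs, dtype=np.float64)
    if targets.shape != outputs.shape:
        raise ValueError("targets and outputs must have the same shape")
    # Distance per keypoint, shape (num_frames, num_joints).
    per_joint = np.linalg.norm(targets - outputs, axis=-1)
    return float(per_joint.mean())
```

For example, if every predicted keypoint is offset from its target by (3, 4, 0) mm, each per-joint distance is 5 mm and so is the average.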

Experimented

CoM       | PCA Aug              | Train Aug            | Error     | Notes
----------|----------------------|----------------------|-----------|-------------------
RefineNet | None                 | None                 | 14.6952mm |
RefineNet | None                 | Rot+None             | 13.1496mm |
RefineNet | None                 | Scale+None           | 13.4824mm |
RefineNet | None                 | Rot+Trans+None       | 13.4938mm |
RefineNet | None                 | Rot+Scale+None       | 13.9754mm |
RefineNet | Rot+Scale+Trans+None | Rot+Scale+Trans+None | 13.2108mm |
RefineNet | Rot+Scale+Trans+None | Rot+Scale+Trans+None | 12.64mm   | 50 epoch training
RefineNet | Rot+Scale+Trans+None | Rot+Scale+Trans+None | 13.4766mm | pca-200k_ep-30
RefineNet | Rot+Scale+Trans+None | Rot+Scale+Trans+None | 11.9229mm | pca-200k_ep-100
RefineNet | Rot+Scale+Trans+None | Rot+None             | 13.3798mm | pca-200k_ep-30
RefineNet | Rot+Scale+Trans+None | Rot                  | 17.1169mm | pca-200k_ep-30
RefineNet | Rot+Scale+Trans+None | Rot+None             | ~11.5mm   | pca-200k_ep-100
RefineNet | Rot+Scale+Trans+None | Rot+None             | ~12.6mm   | pca-1M_ep-100
RefineNet | Rot+Trans+None       | Rot+Trans+None       | ??mm      | pca-1M_ep-100

Note: PCA sampling is not repeatable, so results show some inconsistencies between runs.
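
One common way to make this kind of sampling repeatable is to draw the sample indices from a seeded random generator. The sketch below is an illustration only: it assumes the PCA samples are picked by random index (with replacement, so that the sample count can exceed the number of frames), and the function name and parameters are hypothetical.

```python
import numpy as np


def sample_indices(num_available, num_samples, seed=None):
    """Draw `num_samples` indices with replacement from range(num_available).

    With a fixed integer `seed` the same indices are returned on every call;
    with seed=None the draw differs from run to run.
    """
    rng = np.random.default_rng(seed)
    return rng.integers(0, num_available, size=num_samples)
```

Calling `sample_indices(n, 200_000, seed=0)` twice gives identical index arrays, so a PCA fitted on those samples would see the same inputs in every run.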

Target

~9mm on the MSRA dataset, using PCA augmentation (rot+trans+none; 1e6 samples) together with rot+trans+none training augmentation

Dataset

See datasets/README.md for details on the required datasets.

Other

See doc/notes.md for more details (currently rough; needs cleanup).

Progress Doc

Eval

See eval/README.md for more details.

New

We will evaluate results without the RefineNet pre-computed CoM. The original authors used a slightly different dataset for their evaluation, so their CoM values cannot be used directly: the two sets of samples are neither identical nor one a subset of the other (some samples are missing from one dataset or the other).

Using GT_CoM:

  • FINAL_AVG_3D_ERROR: 10.4478mm
  • PCA_AUG_sz = 200k
  • With Config: {GT_CoM: True, Aug: ['AUG_ROT', 'AUG_TRANS', 'AUG_NONE'], Full_Dataset: True}
  • Test id 5
  • Same config for PCA augmentation
