This project forked from littlegnal/single-person-pose-estimation-for-mobile

Single-person pose estimation for smartphone (Android and IOS).

License: Apache License 2.0

poseestimationformobile's Introduction

This repository currently implements the Convolutional Pose Machine (CPM) in TensorFlow. Instead of ordinary convolutions, the CPM uses inverted residual modules (the building block of MobileNet V2) for faster inference. More experimental models will be released over time.

The repository contains:

  • Code for training the model
  • Code for converting the model to TensorFlow Lite
  • Android Demo
  • IOS Demo (TODO)

The GIF below was captured on a Mi Mix2s (5 FPS).

image

Download the APK of the demo.

You can buy me a coke if you think my work is helpful for you.
ETH address: 0x8fcF32D797968B64428ab2d8d09ce2f74143398E

Training


Dependencies:

  • Python3
  • TensorFlow >= 1.4

Dataset:

The training dataset is available through Google Drive.

Unzipping it yields the following file structure:

# root @ ubuntu in ~/hdd/ai_challenger
$ tree -L 1 .
.
├── ai_challenger_train.json
├── ai_challenger_valid.json
├── train
└── valid

The training dataset contains only single-person images and comes from the AI Challenger competition. I converted the annotations into COCO format so that the data augmentation code from the tf-pose-estimation repository can be used.
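As a rough illustration of what a COCO-style keypoint annotation looks like, the sketch below splits a flat `keypoints` list into (x, y, visibility) triples. The sample record, its values, and the helper name `split_keypoints` are made up for illustration; only the flat x, y, v layout follows the COCO keypoint convention, and the actual contents of ai_challenger_train.json may differ.

```python
# Hypothetical COCO-style annotation; all values are invented for illustration.
flat_keypoints = []
for i in range(14):  # 14 keypoints, matching n_kpoints in the cfg
    flat_keypoints += [10 * i, 20 * i, 2]  # x, y, visibility flag

sample_annotation = {
    "image_id": 1,
    "num_keypoints": 14,
    "keypoints": flat_keypoints,  # flat list: x1, y1, v1, x2, y2, v2, ...
}


def split_keypoints(annotation):
    """Group the flat COCO keypoints list into (x, y, visibility) triples."""
    flat = annotation["keypoints"]
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


triples = split_keypoints(sample_annotation)
print(len(triples))  # number of keypoints in the record
print(triples[1])    # second (x, y, visibility) triple
```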

Hyper-parameter

During training, hyper-parameters are passed in through a cfg file in the experiments folder.

Below is the content of mv2_cpm.cfg.

[Train]
model: 'mv2_cpm'
checkpoint: False
datapath: '/root/hdd/ai_challenger'
imgpath: '/root/hdd/'
visible_devices: '0, 1, 2'
multiprocessing_num: 8
max_epoch: 1000
lr: '0.001'
batchsize: 5
decay_rate: 0.95
input_width: 224
input_height: 224
n_kpoints: 14
scale: 2
modelpath: '/root/hdd/trained/mv2_cpm/models'
logpath: '/root/hdd/trained/mv2_cpm/log'
num_train_samples: 20000
per_update_tensorboard_step: 500
per_saved_model_step: 2000
pred_image_on_tensorboard: True

The cfg file does not cover all of the model's parameters; some remain in network_mv2_cpm.py.
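A file in this format can be read with Python's standard configparser module. The sketch below is one possible way to do it, not necessarily how src/train.py does it: the helper name `load_params` and the use of `ast.literal_eval` are assumptions, and the cfg text is a shortened copy of the file shown above.

```python
import ast
import configparser

# Shortened copy of the [Train] section of mv2_cpm.cfg.
CFG_TEXT = """
[Train]
model: 'mv2_cpm'
checkpoint: False
visible_devices: '0, 1, 2'
lr: '0.001'
batchsize: 5
input_width: 224
input_height: 224
n_kpoints: 14
scale: 2
"""


def load_params(text, section="Train"):
    """Parse cfg text and convert each raw value to a Python literal."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # literal_eval turns "'mv2_cpm'" into 'mv2_cpm', "5" into 5, "False" into False.
    return {key: ast.literal_eval(value) for key, value in parser[section].items()}


params = load_params(CFG_TEXT)

# lr is quoted in the cfg, so it arrives as a string and needs an explicit cast.
learning_rate = float(params["lr"])
# visible_devices is a comma-separated string of GPU indices.
gpu_ids = [int(d) for d in params["visible_devices"].split(",")]

print(params["model"], params["batchsize"], learning_rate, gpu_ids)
```

Using `ast.literal_eval` rather than `eval` keeps the parsing limited to plain literals, which is a safer default for configuration files.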

Train by nvidia-docker

Build the docker by the following command:

cd training/docker
docker build -t single-pose .

or

docker pull edvardhua/single-pose

Then run the following command to train the model:

nvidia-docker run -it -d \
-v <dataset_path>:/data5 -v <training_code_path>/training:/workspace \
-p 6006:6006 -e LOG_PATH=/root/hdd/trained/mv2_cpm/log \
-e PARAMETERS_FILE=experiments/mv2_cpm.cfg edvardhua/single-pose

This also starts TensorBoard on port 6006. Make sure nvidia-docker is installed beforehand.

Train the ordinary way

  1. Install the dependencies.
cd training
pip3 install -r requirements.txt

You also need to install cocoapi.

  2. Edit the parameter files in the experiments folder; they contain almost all the hyper-parameters and other configuration needed for training. Then pass a parameter file to start training:
cd training
python3 src/train.py experiments/mv2_cpm.cfg

Training the model takes 12 hours on 3 Nvidia 1080Ti graphics cards. The corresponding TensorBoard plot is shown below.

image

Pretrained model

It can be downloaded here.

Android Demo


After training the model, the following commands convert it to tflite.

# Convert to frozen pb.
cd training
python3 src/gen_frozen_pb.py \
--checkpoint=<you_training_model_path>/model-xxx --output_graph=<you_output_model_path>/model-xxx.pb \
--size=256 --model=mv2_cpm_2

# Convert to tflite.
# See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/docs_src/mobile/tflite/devguide.md for more information.
bazel-bin/tensorflow/contrib/lite/toco/toco \
--input_file=<you_output_model_path>/model-xxx.pb \
--output_file=<you_output_tflite_model_path>/mv2-cpm.tflite \
--input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE \
--inference_type=FLOAT \
--input_shape="1,224,224,3" \
--input_array='image' \
--output_array='Convolutional_Pose_Machine/stage_5_out'

Then, place the tflite file in android_demo/app/src/main/assets and modify the parameters in ImageClassifierFloatInception.kt.

......
......
    // Parameters to modify in ImageClassifierFloatInception.kt
    /**
     * Create ImageClassifierFloatInception instance
     *
     * @param imageSizeX Get the image size along the x axis.
     * @param imageSizeY Get the image size along the y axis.
     * @param outputW The output width of model
     * @param outputH The output height of model
     * @param modelPath Get the name of the model file stored in Assets.
     * @param numBytesPerChannel Get the number of bytes that is used to store a single
     * color channel value.
     */
    fun create(
      activity: Activity,
      imageSizeX: Int = 224,
      imageSizeY: Int = 224,
      outputW: Int = 112,
      outputH: Int = 112,
      modelPath: String = "mv2-cpm.tflite",
      numBytesPerChannel: Int = 4
    ): ImageClassifierFloatInception =
      ImageClassifierFloatInception(
          activity,
          imageSizeX,
          imageSizeY,
          outputW,
          outputH,
          modelPath,
          numBytesPerChannel)
......
......
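When changing these parameters it can help to check the tensor sizes they imply. The sketch below is plain arithmetic on the defaults of create(); the channel counts are assumptions not stated in the Kotlin snippet (3 input channels, taken from --input_shape="1,224,224,3", and one output heatmap per keypoint, taken from n_kpoints: 14 in mv2_cpm.cfg), and `tensor_bytes` is a hypothetical helper.

```python
# Values taken from the defaults of create() above.
IMAGE_SIZE_X = 224
IMAGE_SIZE_Y = 224
OUTPUT_W = 112
OUTPUT_H = 112
NUM_BYTES_PER_CHANNEL = 4  # one 32-bit float per value

# Assumptions (not stated in the Kotlin snippet): RGB input and
# one output heatmap per keypoint.
INPUT_CHANNELS = 3
N_KPOINTS = 14


def tensor_bytes(*dims, bytes_per_value=NUM_BYTES_PER_CHANNEL):
    """Return the size in bytes of a dense tensor with the given dimensions."""
    size = bytes_per_value
    for d in dims:
        size *= d
    return size


input_bytes = tensor_bytes(1, IMAGE_SIZE_Y, IMAGE_SIZE_X, INPUT_CHANNELS)
output_bytes = tensor_bytes(1, OUTPUT_H, OUTPUT_W, N_KPOINTS)
downscale = IMAGE_SIZE_X // OUTPUT_W

print(input_bytes, output_bytes, downscale)
```

The ratio 224 / 112 = 2 matches scale: 2 in the cfg, which suggests (the source does not confirm it) that scale is the input-to-output size ratio, so outputW and outputH would need to change together with the input size.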

Finally, import the project into Android Studio and run it on your smartphone.

IOS Demo (TODO)


If you are an iOS enthusiast who is interested in this project and would like to port it to iOS, pull requests are welcome.

Reference


[1] Paper of Convolutional Pose Machines
[2] Paper of MobileNet V2
[3] Repository of tf-pose-estimation
[4] Developer guide of TensorFlow Lite

License


Apache License 2.0
