aliprf / asmnet

a Lightweight Deep Neural Network for Face Alignment and Pose Estimation

License: MIT License

Python 86.51% HTML 13.49%
deep-learning computer-vision pose-estimation cnn face-alignment facial-landmarks cvpr2021 face-points-detection face-pose-estimators

asmnet's Introduction

Please STAR the repo if you like it.

a Lightweight Deep Neural Network for Face Alignment and Pose Estimation

Link to the paper:

https://scholar.google.com/scholar?oi=bibs&cluster=3428857185978099736&btnI=1&hl=en

Link to the paperswithcode.com:

https://paperswithcode.com/paper/asmnet-a-lightweight-deep-neural-network-for

Link to the article on Towardsdatascience.com:

https://aliprf.medium.com/asmnet-a-lightweight-deep-neural-network-for-face-alignment-and-pose-estimation-9e9dfac07094

Please cite this work as:

      @inproceedings{fard2021asmnet,
            title={ASMNet: A Lightweight Deep Neural Network for Face Alignment and Pose Estimation},
            author={Fard, Ali Pourramezan and Abdollahi, Hojjat and Mahoor, Mohammad},
            booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
            pages={1521--1530},
            year={2021}
      }

Introduction

ASMNet is a lightweight Convolutional Neural Network (CNN) designed to perform face alignment and pose estimation efficiently while maintaining acceptable accuracy. ASMNet is inspired by MobileNetV2 and modified to suit face alignment and pose estimation, while being about 2 times smaller in terms of the number of parameters. Moreover, inspired by the Active Shape Model (ASM), an ASM-assisted loss function is proposed to improve the accuracy of facial landmark point detection and pose estimation.

ASMNet Architecture

Features in a CNN are distributed hierarchically. In other words, the lower layers hold features such as edges and corners, which are more suitable for tasks like landmark localization and pose estimation, while deeper layers contain more abstract features that are more suitable for tasks like image classification and image detection. Furthermore, training a network on correlated tasks simultaneously builds a synergy that can improve the performance of each task.

Having said that, we designed ASMNet by fusing the features that are available in different layers of the model. Furthermore, by concatenating the features that are collected after each global average pooling layer, the network can evaluate the effect of each shortcut path during back-propagation. Following is the ASMNet architecture:

[Figure: ASMNet architecture]

The implementation of ASMNet in TensorFlow is provided in the following path: https://github.com/aliprf/ASMNet/blob/master/cnn_model.py
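The fusion idea described above can be illustrated with a small NumPy sketch. This is not the code in cnn_model.py: the function names, the number of tapped layers, and the feature-map shapes are all hypothetical and chosen only to show how pooled features from different depths end up in one vector.

```python
import numpy as np

def global_average_pool(feature_map: np.ndarray) -> np.ndarray:
    """Average an (H, W, C) feature map over its spatial axes, giving a (C,) vector."""
    return feature_map.mean(axis=(0, 1))

def fuse_features(feature_maps):
    """Pool each intermediate feature map and concatenate the pooled vectors."""
    return np.concatenate([global_average_pool(fm) for fm in feature_maps])

# Hypothetical feature maps taken from a shallow, a middle, and a deep block.
rng = np.random.default_rng(0)
shallow = rng.random((56, 56, 24))
middle = rng.random((28, 28, 32))
deep = rng.random((7, 7, 1280))

fused = fuse_features([shallow, middle, deep])
print(fused.shape)  # (1336,) = 24 + 32 + 1280 channels
```

Because every shortcut contributes its own slice of the fused vector, the layers that follow can weight shallow and deep features independently.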

ASM Loss

We proposed a new loss function called ASM-LOSS, which utilizes ASM to improve the accuracy of the network. In other words, during the training process, the loss function compares the predicted facial landmark points with their corresponding ground truth as well as with the smoothed version of the ground truth, which is generated using the ASM operator. Accordingly, ASM-LOSS guides the network to first learn the smoothed distribution of the facial landmark points. Then, it leads the network to learn the original landmark points. For more detail, please refer to the paper. Following is the ASM Loss diagram:

[Figure: ASM Loss diagram]
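As a rough illustration of the idea, the sketch below smooths a ground-truth landmark vector by projecting it onto a truncated PCA basis and then combines two mean-squared-error terms. It is only a sketch under stated assumptions: the function names, the simple sum of the two terms, and the `asm_weight` parameter are hypothetical, and the exact formulation used by ASMNet is the one given in the paper.

```python
import numpy as np

def asm_smooth(landmarks, mean_vector, eigenvectors):
    """Project a flattened landmark vector onto a PCA basis and reconstruct it.

    eigenvectors has shape (D, k) with orthonormal columns, so keeping only
    k < D components discards the fine detail of the shape.
    """
    coeffs = eigenvectors.T @ (landmarks - mean_vector)
    return mean_vector + eigenvectors @ coeffs

def asm_assisted_loss(pred, truth, mean_vector, eigenvectors, asm_weight=1.0):
    """MSE against the ground truth plus a weighted MSE against its smoothed version.

    asm_weight is a hypothetical knob, not a parameter taken from the repository.
    """
    smoothed = asm_smooth(truth, mean_vector, eigenvectors)
    mse_truth = np.mean((pred - truth) ** 2)
    mse_asm = np.mean((pred - smoothed) ** 2)
    return mse_truth + asm_weight * mse_asm

# Toy example: a 4-dimensional "shape" and a basis that keeps the first two axes.
mean_vector = np.zeros(4)
eigenvectors = np.eye(4)[:, :2]
truth = np.array([1.0, 2.0, 3.0, 4.0])

smoothed = asm_smooth(truth, mean_vector, eigenvectors)
loss = asm_assisted_loss(truth, truth, mean_vector, eigenvectors)
print(smoothed)  # [1. 2. 0. 0.]
print(loss)      # 6.25
```

In the toy example a perfect prediction still has a non-zero ASM term, because the smoothed target differs from the original ground truth.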

Evaluation

As the following tables show, ASMNet has only 1.4 M parameters, the smallest among comparable facial landmark point detection models. Moreover, ASMNet is designed to perform face alignment as well as pose estimation with a very small CNN while maintaining acceptable accuracy.

[Table: number of parameters]

Although ASMNet is much smaller than the state-of-the-art methods for face alignment, its performance is also good and acceptable for many real-world applications:

[Table: 300W evaluation]

[Table: WFLW evaluation]

As shown in the following table, ASMNet performs much better than the state-of-the-art models on the 300W dataset on the pose estimation task:

[Table: pose evaluation]

The following samples show the visual performance of ASMNet on the 300W and WFLW datasets:

[Figure: 300W visual samples]

[Figure: WFLW visual samples]

The visual results of the pose estimation task using ASMNet are very accurate, and they are also much better than the state-of-the-art pose estimation results on the 300W dataset:

[Figure: pose estimation samples]


Installing the requirements

To run the code, you need Python >= 3.5. The required libraries can be installed using the following command:

  pip install -r requirements.txt

Using the pre-trained models

You can test and use the pre-trained models with the following code, which is available in this file: https://github.com/aliprf/ASMNet/blob/master/main.py

  # Test and DatasetName are defined in the repository; see main.py for the imports.
  tester = Test()
  tester.test_model(ds_name=DatasetName.w300,
                    pretrained_model_path='./pre_trained_models/ASMNet/ASM_loss/ASMNet_300W_ASMLoss.h5')

Training Network from scratch

Preparing Data

Data needs to be normalized and saved in npy format.
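The README does not spell out the normalization scheme, so the following is only a minimal sketch of one common convention: each landmark is centred on the image centre and divided by the image size, then the flattened vector is saved as an .npy file. The function name, the file name, and the convention itself are assumptions; check the repository code for the scheme the pre-trained models actually expect.

```python
import os
import tempfile
import numpy as np

def normalize_landmarks(points, image_width, image_height):
    """Map pixel (x, y) points to roughly [-0.5, 0.5] around the image centre.

    Assumed convention, not confirmed by the README.
    """
    points = np.asarray(points, dtype=np.float64)
    size = np.array([image_width, image_height], dtype=np.float64)
    # Flatten to x1, y1, x2, y2, ...
    return ((points - size / 2.0) / size).reshape(-1)

landmarks_px = [[0.0, 0.0], [112.0, 112.0], [224.0, 224.0]]
normalized = normalize_landmarks(landmarks_px, 224, 224)

# Save to a temporary directory; the file name is hypothetical.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "sample_0001.npy")
np.save(out_path, normalized)

restored = np.load(out_path)
print(restored)  # [-0.5 -0.5  0.   0.   0.5  0.5]
```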

PCA creation

You can use the PCAUtility class in pca_utility.py to create the eigenvalues, eigenvectors, and the mean vector:

  # PCAUtility and DatasetName are defined in the repository; see main.py for the imports.
  pca_calc = PCAUtility()
  pca_calc.create_pca_from_npy(dataset_name=DatasetName.w300,
                               labels_npy_path='./data/w300/normalized_labels/',
                               pca_percentages=90)
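To make the three outputs concrete, here is a minimal NumPy sketch of a PCA that keeps enough components to explain a given percentage of the variance. It is not the implementation in pca_utility.py; the function name `create_pca` and the way the percentage is applied are assumptions made for illustration.

```python
import numpy as np

def create_pca(labels, variance_percentage=90):
    """Return (eigenvalues, eigenvectors, mean_vector) for flattened landmark labels.

    labels has shape (num_samples, num_coordinates). Components are kept until
    their cumulative share of the variance reaches variance_percentage.
    """
    labels = np.asarray(labels, dtype=np.float64)
    mean_vector = labels.mean(axis=0)
    covariance = np.cov(labels - mean_vector, rowvar=False)

    # eigh returns eigenvalues in ascending order, so sort them descending.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    k = int(np.searchsorted(cumulative, variance_percentage / 100.0)) + 1
    k = min(k, len(eigenvalues))
    return eigenvalues[:k], eigenvectors[:, :k], mean_vector

# Synthetic labels whose variance is dominated by the first coordinate.
rng = np.random.default_rng(0)
labels = rng.normal(size=(200, 6)) * np.array([10.0, 1.0, 0.1, 0.1, 0.1, 0.1])

eigenvalues, eigenvectors, mean_vector = create_pca(labels, variance_percentage=90)
print(eigenvalues.shape, eigenvectors.shape, mean_vector.shape)
```

A lower percentage keeps fewer components and therefore produces a smoother reconstructed shape.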

Training

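The training implementation is located in the Train class in train.py. You can use the following code to start the training:

  # Train, ModelArch, and DatasetName are defined in the repository; see main.py for the imports.
  trainer = Train(arch=ModelArch.ASMNet,
                  dataset_name=DatasetName.w300,
                  save_path='./',
                  asm_accuracy=90)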

asmnet's Issues

Using landmarks to crop the images

From looking through your code it seems like you have to know the landmarks before you can even crop the images appropriately. For a new image not found in the training set, how are you going about getting the landmarks to use to crop new images? Are you using dlib? Or am I totally missing something?

This is the part I am talking about inside image_utility.py:

class ImageUtility:

    def crop_and_save(self, _image, _label, file_name, num_of_landmarks, dataset_name):
        try:
            '''crop data: we add a small margin to the images'''

            xy_points, x_points, y_points = self.create_landmarks(landmarks=_label,
                                                                      scale_factor_x=1, scale_factor_y=1)

Data normalization tutorial

Hi creator,

Could you please provide a tutorial or a script for data normalization? I have been struggling for a while but couldn't figure out how to do it.

Details about getting pose data?

Hey! Thanks for your great work!
How do I use HopeNet to calculate the 300W pose data? Could you give some guidance or code? The yaw angles in the pose data I calculated with HopeNet are very large.
Looking forward to your reply~

Similarity transformation for ASM Loss?

Dear Ali,
Thanks for sharing your interesting work here.
I am a little bit confused about the way you compute ASM loss.

For computation of landmarks using ASM in other work (or manuscript like Tim Cootes), we are required to do similarity transformation.

However, in your ASM loss calculation, such transformation is neglected. Am I missing something here?
Thanks

ASMLoss is empty

Thank you very much for making your code public, but there is no specific implementation in ASMLoss, when will this part be public?

Normalized Dataset

Dear Mr Ali,
I am a student, can you please share the normalized dataset with the structure below?
Thank you very much!
image

How to get real (x,y) coordinates from the normalized output?

I have done this, but some of the landmark coordinates are negative. How do I get the real (x, y) coordinates?

import numpy as np
from tensorflow.keras.models import load_model

model = load_model("ASMNet_300W_MESLoss.h5")
X = np.random.random((1, 224, 224, 3))  # input image
l, p = model.predict(X)
l = l.reshape(68, 2)
l, p = l * 224, p * 224

print(l.min())
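One possible explanation, offered only as an assumption since the page does not document the normalization: if the labels were centred on the image centre before being divided by the image size, the outputs lie roughly in [-0.5, 0.5], and scaling alone leaves negative values. Under that assumption, the inverse mapping adds the centre offset back. The function name below is hypothetical, and the sketch should be checked against the normalization used to build the training labels.

```python
import numpy as np

def denormalize_landmarks(normalized, image_width, image_height):
    """Map centre-relative landmarks back to pixel coordinates.

    Assumes the labels were computed as (pixel - size / 2) / size.
    """
    points = np.asarray(normalized, dtype=np.float64).reshape(-1, 2)
    size = np.array([image_width, image_height], dtype=np.float64)
    return points * size + size / 2.0

# Three example points in the assumed normalized space.
output = np.array([-0.5, -0.5, 0.0, 0.0, 0.25, -0.25])
pixels = denormalize_landmarks(output, 224, 224)
print(pixels)
# [[  0.   0.]
#  [112. 112.]
#  [168.  56.]]
```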
