lsibabnikz / diffiqa

Official repository of the paper "DifFIQA: Face Image Quality Assessment Using Denoising Diffusion Probabilistic Models" in proceedings of IEEE International Joint Conference on Biometrics (IJCB) 2023.

1. Paper Overview

Abstract

Modern face recognition (FR) models excel in constrained scenarios, but often suffer from decreased performance when deployed in unconstrained (real-world) environments due to uncertainties surrounding the quality of the captured facial data. Face image quality assessment (FIQA) techniques aim to mitigate these performance degradations by providing FR models with sample-quality predictions that can be used to reject low-quality samples and reduce false match errors. However, despite steady improvements, ensuring reliable quality estimates across facial images with diverse characteristics remains challenging. In this paper, we present a powerful new FIQA approach, named DifFIQA, which relies on denoising diffusion probabilistic models (DDPM) and ensures highly competitive results. The main idea behind the approach is to utilize the forward and backward processes of DDPMs to perturb facial images and quantify the impact of these perturbations on the corresponding image embeddings for quality prediction. Because the diffusion-based perturbations are computationally expensive, we also distill the knowledge encoded in DifFIQA into a regression-based quality predictor, called DifFIQA(R), that balances performance and execution time. We evaluate both models in comprehensive experiments on 7 diverse datasets, with 4 target FR models and against 10 state-of-the-art FIQA techniques with highly encouraging results.

Idea and Overview

DifFIQA teaser image

High-level idea behind the proposed DifFIQA face image quality assessment (FIQA) approach. The quality of face images corresponds to a considerable degree to the stability of the respective representations in the embedding space of a given face recognition (FR) model. DifFIQA utilizes a diffusion framework to explore the embedding stability through image perturbations caused by the noising and denoising processes. The intuition behind this approach is that the forward (noising) $\mathcal{F}_d$ and backward (denoising) $\mathcal{B}_d$ diffusion processes lead to larger embedding perturbations for lower-quality images ($x^l$) compared to facial images of higher quality ($x^h$). By analyzing the impact of both the forward and backward processes on the representation of a given image, DifFIQA is able to infer the corresponding quality and/or generate (FR model specific) quality rankings, as shown on the right.
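
The intuition can be written compactly. As a sketch only, assume $e(\cdot)$ denotes the embedding produced by the FR model and $d(\cdot,\cdot)$ a distance between two embeddings; both symbols are introduced here for illustration and are not taken from the paper. The expected embedding perturbation caused by noising and then denoising is larger for the lower-quality image:

$$
\mathbb{E}\left[ d\big(e(x^l),\ e(\mathcal{B}_d(\mathcal{F}_d(x^l)))\big) \right] > \mathbb{E}\left[ d\big(e(x^h),\ e(\mathcal{B}_d(\mathcal{F}_d(x^h)))\big) \right]
$$

A quality score can then be taken as any quantity that decreases as this expected perturbation grows.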

DifFIQA overview image

Overview of DifFIQA. The proposed quality assessment pipeline consists of two main parts: the Diffusion Process and the Quality-Score Calculation. The diffusion process uses an encoder-decoder UNet model ($D$), trained using an extended DDPM training scheme that helps to generate higher-quality (restored) images. The custom DDPM model is used in the Diffusion Process, which generates noisy $x_t$ and reconstructed $\hat{x}$ images using the forward and backward diffusion processes, respectively. To capture the effect of facial pose on the quality estimation procedure, the process is repeated with a horizontally flipped image $x^f$. The Quality-Score Calculation part then produces and compares the embeddings of the original images and the images generated by the diffusion part.
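
The comparison step described above can be illustrated with a minimal sketch. It assumes cosine similarity as the comparison measure and a plain average over all generated images; the function names `cosine_similarity` and `stability_score` are hypothetical and are not the repository's API. The embeddings would come from the FR model; here they are just lists of floats.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def stability_score(reference, generated):
    """Average similarity between the embedding of the original image
    and the embeddings of the generated images (assumed: noisy,
    reconstructed, and their flipped counterparts)."""
    sims = [cosine_similarity(reference, g) for g in generated]
    return sum(sims) / len(sims)


# Toy embeddings, made up for illustration only.
original = [1.0, 0.0]
generated = [[1.0, 0.0], [0.0, 1.0]]
score = stability_score(original, generated)
```

Under these assumptions, an image whose embedding barely moves under the perturbations receives a score close to 1, while an unstable embedding pulls the average down.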

Results

DifFIQA overview image

Comparison to the state-of-the-art in the form of (non-interpolated) EDC curves. Results are presented for seven diverse datasets, four FR models and in comparison to ten recent FIQA competitors. Observe how the distilled model performs comparably to the non-distilled version, especially at low discard rates. DifFIQA and DifFIQA(R) most convincingly outperform the competitors on the most challenging IJB-C and XQLFW datasets.

DifFIQA overview image

Average performance over all seven test datasets and four FR models at a drop rate of $\mathbf{0.2}$. The results are reported in terms of average pAUC score at the FMR of $10^{-3}$. The proposed DifFIQA(R) approach is overall the best performer. The best result is colored green, the second-best blue and the third-best red.

DifFIQA overview image

Average performance over all seven test datasets and four FR models at a drop rate of $\mathbf{0.3}$. The results are reported in terms of average pAUC score at the FMR of $10^{-3}$. The proposed DifFIQA(R) approach is overall the best performer. The best result is colored green, the second-best blue and the third-best red.

DifFIQA overview image

Illustration of the quality scores produced by the proposed FIQA techniques. The scores at the top show results for DifFIQA and the scores at the bottom show results for DifFIQA(R). While the concrete scores differ, both models generate similar rankings.
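
For readers unfamiliar with the EDC curves and pAUC scores used in the results above, the following is a simplified sketch of how such values can be computed. It is not the evaluation code used for the paper. It assumes a list of genuine (mated) comparison scores, one quality value per comparison pair, and a decision threshold that has already been fixed at the desired FMR; the names `edc_curve` and `partial_auc` are hypothetical.

```python
def edc_curve(genuine_scores, pair_qualities, threshold, discard_rates):
    """False non-match rate of the remaining genuine pairs after
    discarding the given fraction of lowest-quality pairs.
    Assumes every discard rate is below 1 so some pairs remain."""
    # Indices of the pairs, ordered from lowest to highest quality.
    order = sorted(range(len(genuine_scores)), key=lambda i: pair_qualities[i])
    curve = []
    for rate in discard_rates:
        n_discard = int(round(rate * len(order)))
        kept = order[n_discard:]
        misses = sum(1 for i in kept if genuine_scores[i] < threshold)
        curve.append(misses / len(kept))
    return curve


def partial_auc(discard_rates, fnmr_values):
    """Trapezoidal area under the EDC curve over the given discard rates."""
    area = 0.0
    for k in range(1, len(discard_rates)):
        width = discard_rates[k] - discard_rates[k - 1]
        area += width * (fnmr_values[k] + fnmr_values[k - 1]) / 2
    return area


# Toy data, made up for illustration only.
scores = [0.2, 0.9, 0.4, 0.8, 0.7]
qualities = [0.1, 0.9, 0.3, 0.8, 0.6]
rates = [0.0, 0.2, 0.4]
curve = edc_curve(scores, qualities, threshold=0.5, discard_rates=rates)
pauc = partial_auc(rates, curve)
```

A lower area indicates that discarding low-quality samples reduces the error faster. Published pAUC values are often normalized, which this sketch does not do.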


2. Use of repository

  • This repository consists of two main parts:

    • DifFIQA codebase: Contains training and inference code for the base DifFIQA method as well as the pretrained model weights.
    • DifFIQA(R) codebase: Contains training and inference code for the extended DifFIQA regressor as well as the pretrained model weights.
  • Detailed instructions on how to set up and use both methods are included in the README file of each part.

2.1. Environment setup

We recommend using conda to set up the environment.

  • Create and activate a new conda environment:

    conda create -n diffiqa_r python=3.10

    conda activate diffiqa_r

  • Install PyTorch (use the CUDA version appropriate for your system):

    conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

  • Install the necessary Python libraries:

    accelerate, numpy, scipy, pillow, tqdm, wandb, einops

  • Also install the following with pip:

    pip install opencv-python

    pip install ema-pytorch
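
After installation, a quick check that everything listed above can be found may save time. The snippet below is a small convenience sketch and not part of the repository; the helper name `missing_packages` is hypothetical. It maps each package name to the module name it is imported under (for example, pillow is imported as `PIL` and opencv-python as `cv2`).

```python
import importlib.util

# Package name (as installed) -> module name (as imported).
REQUIRED_MODULES = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "accelerate": "accelerate",
    "numpy": "numpy",
    "scipy": "scipy",
    "pillow": "PIL",
    "tqdm": "tqdm",
    "wandb": "wandb",
    "einops": "einops",
    "opencv-python": "cv2",
    "ema-pytorch": "ema_pytorch",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the package names whose modules cannot be located."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated conda environment; any name it prints still needs to be installed.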

3. Citation

If you find this repository useful, please cite the following paper:

@inproceedings{babnikIJCB2023,
  title={{DifFIQA: Face Image Quality Assessment Using Denoising Diffusion Probabilistic Models}},
  author={Babnik, {\v{Z}}iga and Peer, Peter and {\v{S}}truc, Vitomir},
  booktitle={Proceedings of the IEEE International Joint Conference on Biometrics (IJCB)},
  year={2023},
}

diffiqa's Issues

Occurrence of quality scores of 1.0 or more

Hello, I have found that after re-training the diffusion model on a different high-quality face dataset and then running inference with the regression model, around 10 of roughly 5000 images receive a quality score greater than 1.0. This is most likely due to the nature of the regression model and the initial high resolution of the images, although in the vast majority of cases it does not happen. What should we do when this happens, and is it reasonable to treat images with quality scores greater than 1.0 as perfect images by default?
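
One possible way to handle such values, shown here only as a sketch and not as a recommendation from the authors, is to clip the regressor outputs to the range [0, 1] after inference. The helper name `clip_quality_scores` is hypothetical.

```python
def clip_quality_scores(scores, low=0.0, high=1.0):
    """Clamp every score into the interval [low, high]."""
    return [min(max(score, low), high) for score in scores]


# Toy values, made up for illustration only.
raw_scores = [0.2, 1.03, -0.1]
clipped = clip_quality_scores(raw_scores)
```

Note that clipping makes all scores above 1.0 equal, so the relative ranking among those images is lost. If only the ranking matters, the raw scores can be used as they are.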

About the label of image quality

Hello, thank you for your great work! I am a new student in the field of image quality analysis. I think this is a very meaningful work.

I am curious whether DifFIQA operates under a supervised or unsupervised learning paradigm. Does the original dataset used for training the model contain quality labels for the images?

Thank you very much for your help.
