
Comments (19)

davidsandberg avatar davidsandberg commented on April 30, 2024 5

@jrabary The MTCNN model was originally a Caffe model that was imported into TensorFlow. Since Caffe uses a different ordering of the dimensions, the model inputs and outputs had to be reshaped.

from facenet.

scotthong avatar scotthong commented on April 30, 2024 2

Hi David,

I finally got a chance to play with MTCNN again. The following code fixes the image stretching and improves how the margin (padding) is added to the image.

Change the following lines of align_dataset_mtcnn.py
From:

                            bb[0] = np.maximum(det[0]-args.margin/2, 0)
                            bb[1] = np.maximum(det[1]-args.margin/2, 0)
                            bb[2] = np.minimum(det[2]+args.margin/2, img_size[1])
                            bb[3] = np.minimum(det[3]+args.margin/2, img_size[0])

To:

                            # To prevent the resized image from being skewed, the bounding box needs to be
                            # adjusted to become a square. Also, the size of the margin needs to be mapped
                            # from the target scaled image space to the size in the original image space.

                            dx = det[2] - det[0]
                            dy = det[3] - det[1]
                            margin = 0
                            if dy >= dx:
                                # Taller than wide: widen the box in x by (dy-dx) to make it square.
                                margin = dy * args.margin / (args.image_size - args.margin)
                                bb[0] = np.maximum(det[0] - (dy-dx)/2 - margin/2, 0)
                                bb[2] = np.minimum(det[2] + (dy-dx)/2 + margin/2, img_size[1])
                                bb[1] = np.maximum(det[1] - margin/2, 0)
                                bb[3] = np.minimum(det[3] + margin/2, img_size[0])
                            else:
                                # Wider than tall: grow the box in y by (dx-dy) to make it square.
                                margin = dx * args.margin / (args.image_size - args.margin)
                                bb[0] = np.maximum(det[0] - margin/2, 0)
                                bb[2] = np.minimum(det[2] + margin/2, img_size[1])
                                bb[1] = np.maximum(det[1] - (dx-dy)/2 - margin/2, 0)
                                bb[3] = np.minimum(det[3] + (dx-dy)/2 + margin/2, img_size[0])

After this adjustment, the cropped image is no longer stretched along the x or y direction (except for boxes that get clamped at the image border by the min/max).
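The same adjustment can be written as a small standalone function, which makes it easier to check on a known box. This is only a sketch: the function name `square_box_with_margin` and its parameters are made up for illustration, `margin_px` and `image_size` stand in for `args.margin` and `args.image_size`, and `img_size` is assumed to be `(height, width)` as in the script.

```python
import numpy as np

def square_box_with_margin(det, img_size, margin_px, image_size):
    """Return a square, margin-padded box [x1, y1, x2, y2].

    det        -- detected box [x1, y1, x2, y2] in original image pixels
    img_size   -- (height, width) of the original image (assumed ordering)
    margin_px  -- margin in the *output* image, e.g. 32
    image_size -- side of the *output* image, e.g. 160
    """
    dx = det[2] - det[0]
    dy = det[3] - det[1]
    side = max(dx, dy)  # the longer edge decides the square size

    # Map the margin from output-image space back to original-image space.
    margin = side * margin_px / (image_size - margin_px)

    # Grow the box symmetrically around its centre.
    cx = (det[0] + det[2]) / 2.0
    cy = (det[1] + det[3]) / 2.0
    half = side / 2.0 + margin / 2.0

    bb = np.zeros(4, dtype=np.int32)
    bb[0] = np.maximum(cx - half, 0)
    bb[1] = np.maximum(cy - half, 0)
    bb[2] = np.minimum(cx + half, img_size[1])
    bb[3] = np.minimum(cy + half, img_size[0])
    return bb

# A 100x200 detection well inside a 400x400 image.
bb = square_box_with_margin([100, 50, 200, 250], (400, 400), 32, 160)
print(bb, bb[2] - bb[0], bb[3] - bb[1])
```

Centring on the box midpoint gives the same coordinates as the two-branch version above; a box near the image border is still clamped and can therefore end up non-square.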

--Scott Hong


astorfi avatar astorfi commented on April 30, 2024 1

Thanks for your great implementation. I had some issues running it.
Assume the following:
1- The FaceNet repository is cloned to /home/username/facenet
2- The LFW dataset is downloaded to /home/username/datasets/lfw/raw
* As required, each identity has its own sub-folder of images.

To run align_dataset_mtcnn.py I have to do the following:
1- Modify line 54 of "align_dataset_mtcnn.py":
old: pnet, rnet, onet = align.detect_face.create_mtcnn(sess, '../../data/')
new: pnet, rnet, onet = align.detect_face.create_mtcnn(sess, '../data/')
2- Go to the folder /home/username/facenet/src/.
3- Run the following:
for N in {1..4}; do python align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done

Now I can run it!!

Did I do something wrong from the beginning? I cannot simply run the following:
for N in {1..4}; do python align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done

Can I do this in a simpler way, without modifying "align_dataset_mtcnn.py" in the /home/username/facenet/src/align folder?

Thanks


melgor avatar melgor commented on April 30, 2024

Hi. Thanks for the great implementation of MTCNN in TF.
I have a question about the second part of the issue:
How did you test this case? I mean that the second issue may be a follow-up of the first (a different resize function causes different scores).
I think you should feed the same artificial input image to both implementations (MATLAB and TF) and then compare the activations.

As for the first case, as I understand it, their model learned their way of resampling the image, which is why the score is lower with OpenCV. It would be nice to compute the FDDB score for both implementations and see what the difference is.


scotthong avatar scotthong commented on April 30, 2024

Hi David,
I tried to run align_dataset_mtcnn.py on the LFW dataset and the casia.samples dataset to take a look at the MTCNN face alignment. Here is what I found:

  1. After comparing the images in input_dir and output_dir, the images are not aligned (casia samples dataset)?
  2. The images are squeezed vertically.
  3. On the LFW dataset, only 1 image fails (excellent result).


jrabary avatar jrabary commented on April 30, 2024

Hi @davidsandberg. Thanks for sharing this implementation of MTCNN. I'm trying to play with it and I'm wondering why you transpose the images before feeding them into each network. For example, in the first stage of the detect function you do the following:

   im_data = imresample(img, (hs, ws))
   im_data = (im_data-127.5)*0.0078125
   img_x = np.expand_dims(im_data, 0)
   img_y = np.transpose(img_x, (0,2,1,3))
   out = pnet(img_y)

which swaps the width and the height of the image.
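To make the effect of that transpose concrete, here is a small NumPy sketch on a made-up 3x5 RGB array (the array and its values are hypothetical; `imresample` and `pnet` are left out). It only shows what `np.transpose(..., (0,2,1,3))` does to the axes, not why the network expects that layout.

```python
import numpy as np

# Hypothetical image: height=3, width=5, channels=3.
img = np.arange(3 * 5 * 3, dtype=np.float32).reshape(3, 5, 3)

im_data = (img - 127.5) * 0.0078125          # same normalisation as in the snippet
img_x = np.expand_dims(im_data, 0)           # shape (1, H, W, C) = (1, 3, 5, 3)
img_y = np.transpose(img_x, (0, 2, 1, 3))    # shape (1, W, H, C) = (1, 5, 3, 3)

print(img_x.shape, img_y.shape)

# The pixel at (row=2, col=4) in img_x sits at (4, 2) in img_y.
print(np.array_equal(img_y[0, 4, 2], img_x[0, 2, 4]))
```

Only the two spatial axes are exchanged; the batch and channel axes stay where they are, so any output map has to be transposed back in the same way before box coordinates are read from it.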


AlvinZhu avatar AlvinZhu commented on April 30, 2024

Hi @davidsandberg, thank you for sharing your implementation of MTCNN.
I wrote a Python wrapper for imResample from Piotr's Computer Vision Matlab Toolbox, which MTCNN uses.
It gives the same results as the MATLAB implementation.
Maybe you need it.
https://github.com/AlvinZhu/matlabtb


davidsandberg avatar davidsandberg commented on April 30, 2024

@AlvinZhu Thanks a lot!! I think this will be very helpful when ironing out the last differences between the implementations.


StevenLOL avatar StevenLOL commented on April 30, 2024

Hi @davidsandberg, why is there an N in {1..4} in https://github.com/davidsandberg/facenet/wiki/Validate-on-LFW ?


hadikazemi avatar hadikazemi commented on April 30, 2024

The parameter N is for parallel processing: the loop starts N copies of the script in the background. N is arbitrary; just make sure it doesn't exceed the number of available cores (physical or virtual). For example, N can be 2 to avoid memory problems!
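For anyone who prefers to start the workers from Python instead of the shell loop, a rough sketch could look like the following. The helper names `build_worker_commands` and `launch` are invented for this example, the script path assumes the command is run from the repository root, and the flag values are simply copied from the commands quoted in this thread.

```python
import os
import subprocess
import sys

def build_worker_commands(requested_workers, input_dir, output_dir):
    """Build one argv list per worker, capped at the number of available cores."""
    cores = os.cpu_count() or 1
    n = max(1, min(requested_workers, cores))
    cmd = [
        sys.executable, "src/align/align_dataset_mtcnn.py",  # assumed path
        input_dir, output_dir,
        "--image_size", "160",
        "--margin", "32",
        "--random_order",
        "--gpu_memory_fraction", "0.25",
    ]
    return [list(cmd) for _ in range(n)]

def launch(commands):
    """Start all workers, then block until every one has exited."""
    procs = [subprocess.Popen(c) for c in commands]
    return [p.wait() for p in procs]

commands = build_worker_commands(4, "datasets/lfw/raw", "datasets/lfw/lfw_mtcnnpy_160")
print(len(commands), "worker(s) would be started")
# launch(commands)  # uncomment to actually run the alignment
```

Unlike the shell loop with `&`, `launch` waits for all workers before returning, so a following step does not start while alignment is still running.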


pribadihcr avatar pribadihcr commented on April 30, 2024

Hi @AlvinZhu ,

I tried using imResample from matlabtb.
imResample is slower than OpenCV.
Do you see the same?


AlvinZhu avatar AlvinZhu commented on April 30, 2024

Hi @pribadihcr
I haven't tested the speed.


 avatar commented on April 30, 2024

@davidsandberg Do you have a script to reproduce the difference? The file mtcnn_test_pnet_dbg.py seems to contain code for this, but testing with (py)caffe suggests that the logits feature map differs only on the order of 10^-4.


davidsandberg avatar davidsandberg commented on April 30, 2024

Hi @akssri,
It has been a while since I ran these tests, but as I remember it the difference was quite small. Even a small difference in the probability could, however, cause different bounding boxes to be selected in the two implementations.
I guess this mismatch could be quite difficult to fix, so the best thing is probably not to require identical results but instead to make sure that the performance is equivalent for the two implementations.
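A toy illustration of that point, with invented numbers (the scores and the 0.7 threshold are hypothetical, not measured from either implementation): two score vectors that differ by about 10^-4 can still disagree on which candidates pass a fixed threshold.

```python
import numpy as np

threshold = 0.7  # hypothetical detection threshold

# Made-up face probabilities from two implementations of the same network.
scores_a = np.array([0.95, 0.70005, 0.30])
scores_b = scores_a + np.array([0.0, -1e-4, 1e-4])

max_abs_diff = np.max(np.abs(scores_a - scores_b))
keep_a = scores_a >= threshold
keep_b = scores_b >= threshold

print("max abs diff:", max_abs_diff)
print("kept by A:", keep_a)
print("kept by B:", keep_b)
print("same selection:", np.array_equal(keep_a, keep_b))
```

The second candidate sits right at the threshold, so it is kept by one implementation and dropped by the other even though the raw outputs agree closely. Comparing detection quality on a benchmark is therefore more informative than demanding identical boxes.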


 avatar commented on April 30, 2024

@davidsandberg That makes sense. Thanks for the clarification! I guess additional training (starting from the weights imported from Caffe) should make up for any performance difference.


gudandenanhaier avatar gudandenanhaier commented on April 30, 2024

Hi @hadikazemi. I followed the steps you described, but got this error:
import facenet
ImportError: No module named facenet
What is the reason?


davidsandberg avatar davidsandberg commented on April 30, 2024

Closing this since the Python/Tensorflow implementation performs as well as the Matlab/Caffe implementation.


Victoria2333 avatar Victoria2333 commented on April 30, 2024

Hi David, when I execute:
for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done
it ends like:
......
/home/han/Desktop/haq_project/facenet-face-cluster-chinese-whispers--master/datasets/lfw/raw/Debra_Shank/Debra_Shank_0001.jpg
/home/han/Desktop/haq_project/facenet-face-cluster-chinese-whispers--master/datasets/lfw/raw/David_Bisbal/David_Bisbal_0001.jpg
Total number of images: 13233
Number of successfully aligned images: 3332

but it does not move on to the next command, and when I open another console to try the next step, it doesn't work...
What is the reason?


ReemAlsafty avatar ReemAlsafty commented on April 30, 2024

Hi David,
Can I run the test case using TensorFlow under Windows? Would it work properly?

