Comments (19)
@jrabary The MTCNN model was originally a Caffe model that was imported into TensorFlow. Since Caffe uses a different ordering of the dimensions, it required some reshaping of the model inputs/outputs.
from facenet.
Hi David,
I finally got a chance to play with MTCNN again. The following code fixes the image stretching and improves how the margin (padding) is added to the image.
Change the following lines of align_dataset_mtcnn.py
From:
bb[0] = np.maximum(det[0]-args.margin/2, 0)
bb[1] = np.maximum(det[1]-args.margin/2, 0)
bb[2] = np.minimum(det[2]+args.margin/2, img_size[1])
bb[3] = np.minimum(det[3]+args.margin/2, img_size[0])
To:
# To prevent the resized image from being skewed, the bounding box needs to be
# adjusted to become a square. Also, the size of the margin needs to be mapped
# from the target scaled image space to the size in the original image space.
dx = det[2] - det[0]
dy = det[3] - det[1]
margin = 0
if dy >= dx:
    # Taller than wide: widen the box by (dy-dx) so that it becomes square.
    margin = dy * args.margin / (args.image_size - args.margin)
    bb[0] = np.maximum(det[0] - (dy-dx)/2 - margin/2, 0)
    bb[2] = np.minimum(det[2] + (dy-dx)/2 + margin/2, img_size[1])
    bb[1] = np.maximum(det[1] - margin/2, 0)
    bb[3] = np.minimum(det[3] + margin/2, img_size[0])
else:
    # Wider than tall: heighten the box by (dx-dy) so that it becomes square.
    margin = dx * args.margin / (args.image_size - args.margin)
    bb[0] = np.maximum(det[0] - margin/2, 0)
    bb[2] = np.minimum(det[2] + margin/2, img_size[1])
    bb[1] = np.maximum(det[1] - (dx-dy)/2 - margin/2, 0)
    bb[3] = np.minimum(det[3] + (dx-dy)/2 + margin/2, img_size[0])
After this adjustment, the original image won't be stretched along the x or y direction (except for boxes clamped by min/max).
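For readers who want to try the idea outside the script, here is a minimal standalone sketch of the same adjustment. The function name square_bbox_with_margin and its argument names are hypothetical (they are not part of facenet); det is assumed to be [x1, y1, x2, y2] and img_size is assumed to be (height, width), as in the snippet above.

```python
import numpy as np


def square_bbox_with_margin(det, img_size, margin, image_size):
    """Return a square, margin-padded box clamped to the image.

    det        : [x1, y1, x2, y2] detection in source-image pixels (assumed layout)
    img_size   : (height, width) of the source image (assumed layout)
    margin     : margin in pixels of the final resized crop
    image_size : side length in pixels of the final resized crop
    """
    dx = det[2] - det[0]
    dy = det[3] - det[1]
    side = max(dx, dy)

    # Map the margin from the resized crop back to source-image pixels.
    m = side * margin / (image_size - margin)

    # Pad the shorter side so the box becomes square, then add half the margin.
    pad_x = (side - dx) / 2 + m / 2
    pad_y = (side - dy) / 2 + m / 2

    bb = np.zeros(4, dtype=np.int32)
    bb[0] = np.maximum(det[0] - pad_x, 0)
    bb[1] = np.maximum(det[1] - pad_y, 0)
    bb[2] = np.minimum(det[2] + pad_x, img_size[1])
    bb[3] = np.minimum(det[3] + pad_y, img_size[0])
    return bb


# A 64x128 detection well inside a 400x400 image.
bb = square_bbox_with_margin([100, 100, 164, 228], (400, 400), margin=32, image_size=160)
print(bb, bb[2] - bb[0], bb[3] - bb[1])
```

With these inputs the box is far from the image border, so no clamping occurs and the resulting width and height are equal. Near the border the clamping can still make the box non-square, which is the exception mentioned above.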
--Scott Hong
from facenet.
Thanks for your great implementation. I had some issues running it.
Assume the following things:
1- FaceNet repository is cloned and located in /home/username/facenet
2- The dataset of LFW is downloaded and located in /home/username/datasets/lfw/raw
* As required a sub-folder is dedicated to the images of each ID.
To run the file align_dataset_mtcnn.py I have to do the following:
1- Modify the file "align_dataset_mtcnn.py" at line 54 as follows:
old: pnet, rnet, onet = align.detect_face.create_mtcnn(sess, '../../data/')
new: pnet, rnet, onet = align.detect_face.create_mtcnn(sess, '../data/')
2- Go to the folder /home/username/facenet/src/.
3- Run the following:
for N in {1..4}; do python align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done
Now I can run it!
Did I do something wrong from the beginning? I cannot simply run the following:
for N in {1..4}; do python align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done
Can I do it in a simpler way, without modifying the file "align_dataset_mtcnn.py" in the /home/username/facenet/src/align folder?
Thanks
from facenet.
Hi. Thanks for the great implementation of MTCNN in TF.
I have a question about the second part of the issue:
How did you test this case? I mean that the second issue may be a follow-up of the first (a different resize function would cause different scores).
I think you should create an artificial input image for both implementations (Matlab and TF) and then compare the activations.
About the first case, as I understand it, their model learned their way of resampling the image, and this is why the score is lower when using OpenCV. It would be nice to compute the FDDB score for both implementations and see what the difference is.
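A comparison like the one suggested here could be sketched as follows. This is only an illustration: compare_activations is a hypothetical helper, and the two arrays stand in for feature maps dumped from the two implementations on the same artificial input (how they are exported is left open).

```python
import numpy as np


def compare_activations(a, b, atol=1e-3):
    """Summarize the element-wise difference between two activation maps."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("shape mismatch: %s vs %s" % (a.shape, b.shape))
    diff = np.abs(a - b)
    return {
        "max_abs_diff": float(diff.max()),
        "mean_abs_diff": float(diff.mean()),
        "allclose": bool(np.allclose(a, b, rtol=0.0, atol=atol)),
    }


# Stand-in data: the second map is the first one shifted by a tiny constant.
ref = np.zeros((4, 4, 2))
other = ref + 1e-4
report = compare_activations(ref, other)
print(report)
```

Feeding the same synthetic image to both pipelines and comparing the maps layer by layer separates differences caused by the resize function from differences inside the networks.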
from facenet.
Hi David,
I tried to run align_dataset_mtcnn.py on the LFW dataset and the casia.samples dataset to take a look at MTCNN face alignment. Here is what I've found:
- After comparing the images in the input_dir and output_dir, the images are not aligned (casia samples dataset)?
- The images are squeezed vertically
- On LFW dataset, only 1 image is not successful (excellent result)
from facenet.
Hi @davidsandberg. Thanks for sharing this implementation of MTCNN. I'm trying to play with it, and I'm wondering why you transpose the images before feeding them into each network. For example, in the detect function, the first stage does the following:
im_data = imresample(img, (hs, ws))
im_data = (im_data-127.5)*0.0078125
img_x = np.expand_dims(im_data, 0)
img_y = np.transpose(img_x, (0,2,1,3))
out = pnet(img_y)
which swaps the width and the height of the image.
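To make the effect of that transpose concrete, here is a small sketch on a dummy array. The sizes are arbitrary and the array merely stands in for the resampled image; nothing here calls the actual networks.

```python
import numpy as np

hs, ws = 3, 5                                   # arbitrary height and width
im_data = np.zeros((hs, ws, 3), dtype=np.float32)
im_data[1, 4, 0] = 1.0                          # mark the pixel at row 1, column 4

img_x = np.expand_dims(im_data, 0)              # shape (1, hs, ws, 3)
img_y = np.transpose(img_x, (0, 2, 1, 3))       # shape (1, ws, hs, 3)

print(img_x.shape, img_y.shape)
print(img_y[0, 4, 1, 0])                        # the marked pixel, now at [4, 1]
```

Axes 1 and 2 are exchanged, so every pixel at (row, column) ends up at (column, row), while the batch and channel axes stay in place.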
from facenet.
Hi @davidsandberg, thank you for sharing your implementation of MTCNN.
I wrote a Python wrapper for imResample from Piotr's Computer Vision Matlab Toolbox, which MTCNN uses.
It gives the same result as the Matlab implementation.
Maybe you need it.
https://github.com/AlvinZhu/matlabtb
from facenet.
@AlvinZhu Thanks a lot! I think this will be very helpful when ironing out the last differences between the implementations.
from facenet.
Hi @davidsandberg, why is there an N in {1..4} in https://github.com/davidsandberg/facenet/wiki/Validate-on-LFW ?
from facenet.
The parameter N is for parallel processing. N is arbitrary; just make sure it doesn't exceed the maximum number of supported cores (physical or virtual). For example, N can be 2 to prevent memory problems.
from facenet.
Hi @AlvinZhu,
I tried using imResample from matlabtb.
imResample is slower than OpenCV for me.
Do you see the same?
from facenet.
Hi @pribadihcr,
I haven't tested the speed.
from facenet.
@davidsandberg Do you have a script to reproduce the difference? The file mtcnn_test_pnet_dbg.py seems to contain code that does this, but testing with (py)caffe shows that the logits feature map differs only on the order of 10^-4.
from facenet.
Hi @akssri,
It has been a while since I ran these tests, but as I remember it the difference was quite small. But even a small difference in the probability metric could cause different bounding boxes to be selected in the two implementations.
I guess this mismatch could be quite difficult to fix, so the best thing is probably not to require identical results but instead to make sure that the performance is equivalent for the two implementations.
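A toy illustration of why such a small difference matters: when a candidate's score sits close to a stage threshold, a shift of about 10^-4 is enough to change whether the box is kept. The threshold of 0.7 and the scores below are assumed values for illustration only, not measurements from either implementation.

```python
def passes(score, threshold=0.7):
    """Return True if a candidate box survives a score threshold."""
    return score >= threshold


score_a = 0.70004            # hypothetical score from one implementation
score_b = score_a - 1e-4     # same candidate, shifted by roughly 10^-4

print(passes(score_a), passes(score_b))
```

Because later stages only see the boxes that survive earlier ones, a single flipped decision can lead to a different final set of bounding boxes.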
from facenet.
@davidsandberg That makes sense. Thanks for the clarification! I guess additional training (starting from the weights imported from Caffe) should make up for any performance difference.
from facenet.
Hi @hadikazemi. I followed the steps you described but ran into an error:
import facenet
ImportError: No module named facenet
What is the reason?
from facenet.
Closing this since the Python/Tensorflow implementation performs as well as the Matlab/Caffe implementation.
from facenet.
Hi David, when I execute:
for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done
the output ends like this:
......
/home/han/Desktop/haq_project/facenet-face-cluster-chinese-whispers--master/datasets/lfw/raw/Debra_Shank/Debra_Shank_0001.jpg
/home/han/Desktop/haq_project/facenet-face-cluster-chinese-whispers--master/datasets/lfw/raw/David_Bisbal/David_Bisbal_0001.jpg
Total number of images: 13233
Number of successfully aligned images: 3332
But it does not move on to the next command, and when I opened another console to try the next step, it didn't work.
What is the reason?
from facenet.
Hi David,
Can I run the test case using TensorFlow on Windows? Would it work properly?
from facenet.