patrikhuber / superviseddescent
C++11 implementation of the supervised descent optimisation method
Home Page: http://patrikhuber.github.io/superviseddescent/
License: Apache License 2.0
Hi Patrik,
I'm trying to run and test your project under Code::Blocks. Is that possible?
I've run the example .cpp files in the terminal, but I don't know how to run the whole project in Code::Blocks.
Can you help me on that?
Hi Patrick,
Awesome project, thanks for sharing it. I cloned the repository, and compiled it with cmake (with all the required libraries already installed). The compilation went fine, but when I try to run it, I get the following error:
Error reading the RCR model "data/rcr/face_landmarks_model_rcr_22.bin": Failed to read 8 bytes from input stream! Read 0
I am using Ubuntu 14.04. Also, I am testing it on OpenCV 3.0, but that does not seem to be the problem here, as far as I can see.
Thanks,
Avi
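For readers hitting the same "Failed to read 8 bytes" message: it usually just means the model file could not be found or is empty, typically because the program is run from a directory where the relative path data/rcr/... does not resolve. A minimal sketch (not part of the library; the function name is mine) that checks the file before handing it to the deserialiser:

```cpp
#include <fstream>
#include <string>

// Check that a model file exists and is non-empty before handing the
// stream to the deserialiser; a missing file otherwise surfaces as the
// cryptic "Failed to read 8 bytes from input stream! Read 0".
bool check_model_file(const std::string& path, std::string& error)
{
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        error = "Cannot open '" + path + "' - check the working directory.";
        return false;
    }
    if (file.tellg() == 0) {
        error = "'" + path + "' is empty.";
        return false;
    }
    return true;
}
```

Running the examples from the repository root (where data/ lives), or passing an absolute path, avoids the problem entirely.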
I'd like to support the popular me17 landmark set that's used in a lot of papers (e.g. Zhou et al. and ours). However, some landmarks are different from the ibug-68 set and above papers also use the original LFPW set (or a subset of it), and not the ibug release of LFPW.
We should thus support training in the original LFPW as well.
vl-hog has really good performance and it's hard to replace. But I don't like to include C headers and I would really like to replace it with a header-only, modern C++ solution. It's just hard to find one, and rewriting vl-hog would be a lot of work.
I managed to include vl-hog as "header-only" by #include'ing vl-hog's .c file, and I had to make some casts in its code explicit in the process to get it to compile. I hope what I did is not problematic.
Hi Patrick,
I'm trying to estimate the tracking error of different algorithms. For that, I'm using two of our own databases: one with real faces (maybe you know it) and another with synthetic faces (generated with the 3D Basel Face Model) that mimic the movement of the real one (this one is not published).
Using the supervised descent model on the synthetic database, I get tracking errors in 3 videos with high roll angles (I attach the videos with the landmarks obtained by SDM here). I think it's interesting to point out that SDM's tracking error without these problematic videos is similar to IntraFace's tracking error.
I think this tracking error may be due to the training. I have thought about doing a new training with videos with high roll angles. I also think it would be interesting to detect face landmarks using the previous frame's landmarks.
What do you think about training with videos?
If we train with synthetic faces, we have ground-truth knowledge of the landmarks' positions in every frame. But the BFM's synthetic faces don't change their expression and don't close their eyes. So, maybe train with a mix of synthetic faces generated by your 3DMM (with different expressions) and real videos?
Best regards,
Andoni.
Hi
I ran your code on VS.
I used 28 images (with 5 landmarks) for training. During testing, if the test images are included in the training images, the accuracy is very good; otherwise, the accuracy is very bad.
I then used 3283 images to train the algorithm. The accuracy is good for some images, but not for others.
The author's paper mentions b0/bk, but I cannot find the implementation of b0/bk in your code.
Did you ever test your algorithm with more images? How is the performance?
Could you please help me with the b0/bk question?
The original SDM demo was released in Intraface project (Matlab and C++ versions)
Compared to Intraface, your implementation is not as robust and stable. Of course, the reasons may be (1) descriptor type (Intraface uses a modified SIFT), (2) training database (Intraface uses movies for tracking), or (3) implementation details.
Have you compared your SDM implementation with Intraface?
At the moment, it prints "Failed to read 8 bytes from input stream! Read 0", which is not that intuitive.
Hi huber,
Thanks for the reply! I have looked at your updated code. Actually, I solved my problems by rewriting the code in many places; please forgive my changes. I also ran into a new problem: I use the Helen database with 2000 labelled pictures from the iBug website, scaled 15 times and rotated 5 times, for a total of about 30000 training images. I train on a server with 64 GB of RAM, but training consumes all of it, i.e. the training step needs a very large amount of memory. Can you optimise this? Please forgive my poor English!
Hi Patrik,
I built the code and generated executable files for landmarkdetection, poseestimation and simplefunction.
Can you please tell me the usage of poseestimation.cpp and simplefunction.cpp and how to execute the files?
Landmark Detection:
I gave the inputs and got the output with some points drawn on the image.
Pose Estimation:
When I run it, I only get the residual values.
For the simple function, I have no idea what it does.
Hi
I have a question regarding rcr-train: can I run this code on my own data, i.e. with different images and feature points?
Also, how do you calculate the mean? Is it the mean of the feature points over the images of the training set?
Regards
Hello patrikhuber,
I have a question about your code.
Firstly: Mat landmarks = (cv::Mat_<float>(1, 20) << 498.0f, 504.0f, 479.0f, 498.0f, 529.0f, 553.0f, 489.0f, 503.0f, 527.0f, 503.0f, 502.0f, 513.0f, 457.0f, 465.0f, 471.0f, 471.0f, 522.0f, 522.0f, 530.0f, 536.0f);
How are these landmarks obtained? I get landmarks another way, but cannot get the right pose estimation.
Secondly: Mat facemodel; // The point numbers are from the iBug landmarking scheme
Do I need to change this when testing on other image files?
Thanks!
Hi
Can I use superviseddescent to save facial landmark points from a video file ?
Hi,
Patrik, thank you for your SDM code. I have a problem when using landmark_detection.cpp: I trained on about 3000 images from iBug, but when I track a face using the trained model, the result looks like the mean shape, and when I close my eyes, the landmarks don't follow; the shape always stays the same. I don't understand the reason.
Waiting for your answer.
Yours sincerely
Anto
Hi Patrikhuber,
I would like to run only landmark detection with your SDM code, and I would like to make the prediction run faster, as long as the landmark accuracy remains good enough for me.
As you might know, most of the time in SDM landmark detection is spent on HOG calculation, and that time mostly depends on the size of the ROI image computed in adaptive_vlhog.hpp.
my questions are:
in your rcr-train.cpp file, you set hog_params as follows:
std::vector<rcr::HoGParam> hog_params{
{ VlHogVariant::VlHogVariantUoctti, 5, 11, 4, 1.0f },
{ VlHogVariant::VlHogVariantUoctti, 5, 10, 4, 0.7f },
{ VlHogVariant::VlHogVariantUoctti, 5, 8, 4, 0.4f },
{ VlHogVariant::VlHogVariantUoctti, 5, 6, 4, 0.25f } };
Why is num_cells 5? Shouldn't it be an even number?
To keep the accuracy while improving speed, what are the best hog_params in your experience?
In rcr-train.cpp, the regressor level is 4. Should it be smaller? What is the best regressor level in your experience?
Hi Patrik,
I have run landmark detection with your pre-trained model "face_landmarks_model_rcr_68.bin" on a synthetically generated still-face video, and I have seen that the landmarks' positions differ in each frame. I want to minimise the difference between frames.
I think detecting the landmarks several times per frame and averaging should reduce the jitter.
So, what do you think about this solution? Do you think there is a better way to minimise it?
Is it possible to train a model with less jitter?
Regards,
Andoni.
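One simple way to reduce frame-to-frame jitter on a (nearly) still face, sketched here as a suggestion rather than anything the library provides, is to exponentially smooth the detected positions over time instead of re-detecting several times per frame:

```cpp
#include <vector>

// Exponentially smooth landmark positions across frames to reduce
// jitter: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.
// Lower alpha = smoother but laggier; the value is a tuning assumption.
class LandmarkSmoother
{
public:
    explicit LandmarkSmoother(double alpha) : alpha_(alpha) {}

    // 'detected' holds interleaved (x, y) coordinates for one frame.
    std::vector<double> update(const std::vector<double>& detected)
    {
        if (state_.empty()) {
            state_ = detected; // first frame: nothing to smooth against
        } else {
            for (std::size_t i = 0; i < state_.size(); ++i)
                state_[i] = alpha_ * detected[i] + (1.0 - alpha_) * state_[i];
        }
        return state_;
    }

private:
    double alpha_;
    std::vector<double> state_;
};
```

Compared with detecting x times and averaging, this costs one detection per frame and trades a small amount of lag for stability.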
Hi Patrik!
Is there any way of measuring confidence in landmark detection, without ground-truth data?
A naive approach I see so far is using the ratio of the face-detection rectangle's area to the area of a bounding box around the detected landmarks, but are there better ways?
Hello patrikhuber,
I have a question about your code. I am pretty new to this field and I am quite amazed by your implementation of SDM.
I have read articles about SDM and I am wondering how you generated the "mean_ibug_lfpw_68.txt" file. I read the question "How to calculate the mean-face?" in the closed issues, but I didn't understand the answer very well.
Can you explain it in more detail, or give me some pseudo-code or the code that generated that file?
Can you tell me the steps? For example, if I have 100 pictures and 100 files with landmark positions, what must I do to generate "mean_ibug_lfpw_68.txt"? The article I read mentions this step but doesn't explain how to do it, so if you could provide pseudo-code or the code you used, I would be very thankful :)
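As a rough sketch of one plausible procedure (my reconstruction, not necessarily the exact code that produced mean_ibug_lfpw_68.txt): normalise each training shape for translation and scale, then average. A full Procrustes mean would also remove rotation and iterate; both refinements are omitted here for brevity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One shape = interleaved (x, y) landmark coordinates.
using Shape = std::vector<double>;

// Normalise a shape: move its centroid to the origin and scale it to
// unit norm, so translation and scale differences between training
// images do not bias the mean.
Shape normalise(Shape s)
{
    const std::size_t n = s.size() / 2;
    double cx = 0.0, cy = 0.0;
    for (std::size_t i = 0; i < n; ++i) { cx += s[2 * i]; cy += s[2 * i + 1]; }
    cx /= n; cy /= n;
    double norm = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        s[2 * i] -= cx; s[2 * i + 1] -= cy;
        norm += s[2 * i] * s[2 * i] + s[2 * i + 1] * s[2 * i + 1];
    }
    norm = std::sqrt(norm);
    for (double& v : s) v /= norm;
    return s;
}

// Mean shape = average of the normalised training shapes.
Shape mean_shape(const std::vector<Shape>& shapes)
{
    Shape mean(shapes.front().size(), 0.0);
    for (const Shape& s : shapes) {
        Shape ns = normalise(s);
        for (std::size_t i = 0; i < mean.size(); ++i) mean[i] += ns[i];
    }
    for (double& v : mean) v /= shapes.size();
    return mean;
}
```

With 100 pictures, you would load the 100 landmark files into Shape objects and write the result of mean_shape() to the text file.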
Dear Patrik
I want to use facial landmark detection in a web-based application. How can I convert the code to client-side web code? face_landmarks_model_rcr_68.bin is about 85 MB, which is heavy to load in a web application. How can I use it?
Regards.
Morteza
Honestly, I'm not deeply into research papers, but it seems to me that "Fitting 3D Morphable Models using Local Features" describes an approach where landmarks are detected through 3DMM fitting: "In essence, the proposed method can also be seen as a 3D model based landmark detection, steering the 3D model parameters to converge to the ground truth location."
At the same time, robust cascaded-regression landmark detection is described in another paper: "Random Cascaded-Regression Copse for Robust Facial Landmark Detection".
I ask you to move the implementations (bodies) of non-template functions from .hpp files to .cpp files.
Currently, building your code produces errors, for example: multiple definition of `rcr::align_mean(cv::Mat, cv::Rect_, float, float, float, float)'
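For context, this linker error appears whenever a non-template function is *defined* (not just declared) in a header that several .cpp files include: the linker then sees one definition per translation unit. Besides moving the bodies to a .cpp file, marking them inline also fixes it. A minimal illustration (the function and file name are hypothetical, not from the library):

```cpp
// rcr_align.hpp (sketch): in a header included from several translation
// units, a non-template function definition must be marked inline,
// otherwise the linker reports "multiple definition of ...". The inline
// keyword permits identical definitions in every translation unit.
#include <algorithm>

inline double clamp_to_box(double value, double lo, double hi)
{
    return std::min(std::max(value, lo), hi);
}
```

Declaring the function in the header and defining it once in a .cpp file achieves the same result without inline, at the cost of a second file.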
Hi,
I successfully adapted pose_estimation for one of our 3D models.
Now I would like to be able to train on more than one model but I am having trouble modifying the code.
How should I proceed to adapt the ModelProjection variable for training and testing?
Cheers
Hi Patrick,
I built your code successfully and tested the following:
Hello Patrik,
first I want to thank you for making available such a great piece of software.
I did a few tests on multiple images and the position of the found landmarks is usually quite good. I have however encountered a problem when using the OpenCV face detector.
Sometimes, the OpenCV V&J face detector detects things that aren't actually faces. The supervised descent code still runs and things that aren't landmarks on a face are detected.
My question is:
By looking at the way the optimisation progresses after each step, or by some other method, is there a way to tell that the function is not being optimised properly? For a face, that could be used to decide whether the landmarks were found successfully or not. In other words, when doing the supervised descent, can we look at the update step (or something else) and tell whether things are going the right way? The problem with faces is, as you mentioned, that we have no ground truth to compare our results with, because every face is different. Is there a way to tell whether we are close to or very far from the best solution?
Sorry, I am just an engineer and not a computer vision scientist but I would really appreciate if you could give me some useful pointers.
Regards,
Guillaume
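One heuristic along the lines the question suggests (an assumption to validate on real data, not an established part of SDM) is to monitor the norm of the update step at each cascade level: a healthy descent shrinks the steps level by level, so a step that grows again flags a suspect fit, e.g. a V&J false positive:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Norm of one update step (a vector of landmark displacements).
double step_norm(const std::vector<double>& update)
{
    double s = 0.0;
    for (double v : update) s += v * v;
    return std::sqrt(s);
}

// Heuristic fitting-quality check: require every step to be at most
// shrink_factor times the previous one. The factor is a tuning
// assumption to be calibrated on validation images.
bool looks_converged(const std::vector<std::vector<double>>& update_steps,
                     double shrink_factor = 0.9)
{
    for (std::size_t i = 1; i < update_steps.size(); ++i) {
        if (step_norm(update_steps[i]) >
            shrink_factor * step_norm(update_steps[i - 1]))
            return false;
    }
    return true;
}
```

This does not prove the landmarks are correct (a confident fit on a non-face is still possible), but it cheaply rejects many runs where the regressors clearly never settled.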
Hi Patrikhuber,
I would like to run pose estimation with your code. Should I run the project under examples/pose_estimation or some other one?
Thanks!
Hi Patrikhuber,
When we run the SDM test (prediction) code, the CPU time needed per image is quite high, and it varies a lot with image size.
My questions are:
Many thanks for your answer!
Hi,
Does this approach allow for real-time landmark tracking in videos? Does the program have functions for both facial landmark detection and tracking? I want to track the landmarks. What should I do? Do I need to write the tracking function myself?
Hi Patrik,
I want to ask about rcr-train and rcr-detect files.
The algorithm calculates the inter-eye distance; I understand that it's needed to better handle the eye landmarks.
But is it really required to use the inter-eye distance to train a model with rcr-train? For example, I only want to detect the nose or mouth for better speed, but I don't know whether I can easily rebuild rcr-train without the inter-eye-distance functions.
To sum up, I want to ask:
Is the inter-eye distance needed ONLY for the eye landmarks, or does the whole algorithm depend on it in some way?
Thank you in advance for your time.
I think it would be better to separate the optimisation from the actual learned model (the regressors).
Reason: What's not so nice right now is that we store the SupervisedDescentOptimiser to disk, with all its type information about the solver (e.g. LinearRegressor<PartialPivLUSolver>), and we store the regularisation as well.
Both the solver and the regularisation are only relevant at training time, and there's neither need to store them nor should we need to know the type of the solver when we load the model.
Also, it would mean a user that just wants to use the landmark detection (and not train a model) wouldn't need Eigen, because Eigen is only needed in LinearRegressor::learn() and not in predict().
Regarding the regulariser, we could just choose to exclude it from the serialisation, but I don't think it's very intuitive to only serialise half of a class. I think there must be a better solution that solves the other shortcomings as well. Maybe we can even just make SupervisedDescentOptimiser::train() a free function and get rid of the class.
A related project, tiny-cnn, doesn't separate the model from the optimisation, but I kind of feel like we should.
Hi patrikhuber,
I noticed that you used C++11 lib cereal to serialize the trained model and save it to binary file.
you use the following code to save the model:
void save_detection_model(detection_model model, std::string filename)
{
std::ofstream file(filename, std::ios::binary);
cereal::BinaryOutputArchive output_archive(file);
output_archive(model);
};
My question is:
Do you have any substitute solution for saving/loading the model to/from files that doesn't use C++11 syntax?
Can you share non-C++11 code for saving/loading the model?
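If cereal (and C++11) is off the table, one can fall back to writing the raw dimensions and data of each matrix with plain fstream calls. A sketch under that assumption, for a single float matrix (a real model would serialise every regressor this way, and the format is not portable across endianness or struct layouts):

```cpp
#include <fstream>
#include <string>
#include <vector>

// A minimal C++98-style alternative to cereal: write the matrix
// dimensions followed by the raw float data.
struct Matrix
{
    int rows, cols;
    std::vector<float> data; // row-major, rows * cols entries
};

bool save_matrix(const Matrix& m, const std::string& filename)
{
    std::ofstream file(filename.c_str(), std::ios::binary);
    if (!file) return false;
    file.write(reinterpret_cast<const char*>(&m.rows), sizeof(int));
    file.write(reinterpret_cast<const char*>(&m.cols), sizeof(int));
    file.write(reinterpret_cast<const char*>(&m.data[0]),
               m.data.size() * sizeof(float));
    return file.good();
}

bool load_matrix(Matrix& m, const std::string& filename)
{
    std::ifstream file(filename.c_str(), std::ios::binary);
    if (!file) return false;
    file.read(reinterpret_cast<char*>(&m.rows), sizeof(int));
    file.read(reinterpret_cast<char*>(&m.cols), sizeof(int));
    m.data.resize(static_cast<std::size_t>(m.rows) * m.cols);
    file.read(reinterpret_cast<char*>(&m.data[0]),
              m.data.size() * sizeof(float));
    return file.good();
}
```

This trades cereal's versioning and type safety for compiler compatibility; add a magic number and version field if the format may ever change.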
Hi Patrik,
I want to ask data/rcr/face_landmarks_model_rcr_22.bin.
The algorithm uses only 22 key points for the regression and predicts 68 key points. I have tried training on the 68 feature points directly, and the result is not very good. Can you tell me why? Is it because of the regressor's regularisation parameter lambda?
Thanks a lot.
superviseddescent-master\include\superviseddescent/utils/ThreadPool.h(49): error C2332: 'class': missing tag name
superviseddescent-master\include\superviseddescent/utils/ThreadPool.h(49): error C2011: '': 'enum' type redefinition
1> D:\Microsoft Visual Studio 11.0\VC\include\thr/xthreads.h(41): see declaration of ''
superviseddescent-master\include\superviseddescent/utils/ThreadPool.h(49): error C2143: syntax error: missing ',' before '...'
Hi Patrik,
It seems that include/verbose_solver.hpp file is missing. I am not sure whether this file is needed in the current version, but basically I cannot compile your project as it is.
Many thanks,
Doxygen has a @tparam command to document template parameters (https://www.stack.nl/~dimitri/doxygen/manual/commands.html#cmdtparam). We should probably use it in the library documentation.
Hi, Patrik!
I'm fascinated by your work!
Did you see confidence score in Intraface face alignment? Is it possible to implement it here?
I'll appreciate your answer.
Thanks.
I have Visual Studio 2015 RC on Windows 10. I downloaded Eigen 3.3, OpenCV 3.2 (since 2.4.3 does not support Visual Studio 2015 RC), Boost 1.63, and the supervised descent project. I included the necessary directories in Visual Studio.
When trying to build rcr-detect.cpp I got the following:
C4244, 'initializing': conversion from 'double' to 'int', possible loss of data.
Finally, what are the proper versions of Eigen, OpenCV, and Boost?
I ported your program to a phone and found that the facial landmark detection is very slow. On a 640*480 image, the processing time is about 450 ms (not including the time to find the face). How can I speed it up? Where is the problem?
Hi Patrik,
I've tried to run rcr-train to prepare a model for tests.
Unfortunately, it's not working. It prints the residual but also gives this message:
boost::filesystem::directory_iterator::construct: No such file or directory: "home/user/superviseddescent/examples/data/ibug_lfpw_trainset/"
Given arguments are below:
--data "/home/user/superviseddescent/examples/data/ibug_lfpw_trainset/"
--mean "/home/user/superviseddescent/examples/data/mean_ibug_lfpw_68.txt"
--facedetector "/home/user/opencv_files/haarcascade_frontalface_alt.xml"
--config "/home/user/superviseddescent/apps/rcr/data/rcr_training_22.cfg"
--evaluation "/home/user/superviseddescent/apps/rcr/data/rcr_eval.cfg"
--output "/home/user/superviseddescent/apps/rcr/trained_output_model.bin"
--test-data "home/user/superviseddescent/examples/data/ibug_lfpw_trainset/"
if I choose one pic (as below), the result is the same
--test-data "home/user/superviseddescent/examples/data/ibug_lfpw_trainset/image_0002.png"
Do you know where I can find the source of this issue?
Thanks in advance for your time!
hi patrikhuber
Thank you for your code. I'm following your code and work; would you mind sending me a Visual Studio 2013 version of this project?
Hi patrikhuber, could you please release the code for calculating "mean_ibug_lfpw_68.txt"? The results are exciting! I'm just a beginner in this area.
Hello Patrik,
I am trying to use pose_estimation.cpp example code to detect yaw, pitch and roll for an input image.
What is the size of the image from which you have taken the landmarks (Mat landmarks)? Can I take an image of any size, identify the landmark points using OpenCV/dlib, and use them for pose estimation?
What do you mean by image origin (500)? How do we get the approximate focal length (1800)?
Any help is really appreciated. Thanks.
examples/landmark_detection is a first-run example and it should run fairly quickly to give the user some results. I should change it to train only 5 or 10 landmarks, so the training will be quicker than with 68 landmarks.
Hi Prof.,
I ran your code to train the model.
However, when I run the solve function in the VerbosePartialPivLUSolver class, the following line runs very slowly:
RowMajorMatrixXf AtA_Eigen = A_Eigen.transpose() * A_Eigen;
The A_Eigen matrix is 20*8801.
Computing this product takes 76588 ms, which is a very large value.
My system configuration is:
Windows 10, Intel i5-6200U CPU, 8 GB RAM.
The QR decomposition in the solve function cannot be finished at all.
Do you have any idea about this issue?
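A possible explanation, offered as an analysis rather than a confirmed diagnosis: Eigen's .transpose() is a lazy view, so the time is not spent transposing but forming the 8801 x 8801 product A^T A itself, which needs on the order of d^2 * n multiplications and roughly 310 MB of storage for the result. When there are far fewer samples (n = 20) than features (d = 8801), working with the n x n Gram matrix A A^T instead is dramatically cheaper. The cost asymmetry can be made concrete:

```cpp
#include <cstdint>

// For A with n rows (training samples) and d columns (features), forming
// A^T A costs roughly d*d*n multiplications and stores a d x d matrix,
// while A A^T costs n*n*d and stores n x n. With n = 20, d = 8801 the
// first is ~440x more work, which is why A.transpose() * A appears
// "slow": the transpose is lazy, the d x d product is not.
std::uint64_t gram_flops(std::uint64_t rows, std::uint64_t cols)
{
    return cols * cols * rows; // cost of (rows x cols)^T * (rows x cols)
}
```

Concretely, ridge regression w = (A^T A + lambda I)^{-1} A^T b can equivalently be solved in its dual form w = A^T (A A^T + lambda I)^{-1} b when n << d, which only ever factors a 20 x 20 system.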
Hi Patrik,
I have been testing your code for some time now for facial landmark training and tracking.
I modified rcr-track to use the landmark points from the previous frame as initialization, instead of the mean. Face detection is performed in the first frame. After that, I use the landmark points from the previous frame. But on doing this, the landmarks get scattered after just a few frames. This happens even if the face remains more or less stationary. I am pasting the code snippet that I used below. Am I doing something wrong?
(current_landmarks holds the landmarks from the previous frame, as also mentioned in your comments in the code)
if(!have_face) { ...
...
have_face=true;
}
else{
cv::Mat prev_landmarks = to_row(current_landmarks);
current_landmarks = rcr_model.detect(image, prev_landmarks);
rcr::draw_landmarks(image, current_landmarks);
have_face=true; // simply setting it to true for now for testing purposes.
}
I would greatly appreciate any help or insight from you.
Cheers!
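One drift guard worth trying here (my suggestion, not the library's documented behaviour): rather than feeding the raw previous-frame landmarks back in, compute their enclosing box and re-initialise from the model's mean shape placed in that box. Since the regressors were trained on mean-shape initialisations, this keeps the input in-distribution and stops scale errors from compounding across frames. The box computation might look like:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Box { double x, y, width, height; };

// Enclosing axis-aligned box of a set of landmarks given as
// interleaved (x, y) coordinates. The mean shape can then be aligned
// into this box to re-initialise the next frame's detection.
Box enclosing_box(const std::vector<double>& xy)
{
    double min_x = xy[0], max_x = xy[0], min_y = xy[1], max_y = xy[1];
    for (std::size_t i = 0; i < xy.size(); i += 2) {
        min_x = std::min(min_x, xy[i]);     max_x = std::max(max_x, xy[i]);
        min_y = std::min(min_y, xy[i + 1]); max_y = std::max(max_y, xy[i + 1]);
    }
    Box b = { min_x, min_y, max_x - min_x, max_y - min_y };
    return b;
}
```

The scattering described above is consistent with the regressors receiving initialisations they never saw during training, which this normalisation step is meant to prevent.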
Dear all,
I built the code from this GitHub repository and got CMake build errors on both Linux and Windows. Could you help me fix them? Thanks very much!
PS: I used to build the previous version successfully.
Linux Error:
Centos +Cmake 3.3.0
Windows Error:
Windows 8.1+ VS2013+Cmake3.3.0
Cmake Error
I would be very grateful for help getting through this bug.
Thanks very Much!
I am referring to issue #31 about speeding up the process.
One point is that the HOG feature extraction can be parallelised. I made a simple time measurement of that function: it took about 60 ms for 4 regressors. I am thinking of improving it using GCD on iOS.
__block std::vector<int> hogDescriptorsIdx;
dispatch_apply(LandmarkIndexs.size(), c_queue, ^(size_t k) {
int i = (int)k;
....
hogDescriptor = hogDescriptor.t(); // now a row-vector
dispatch_sync(s_queue, ^{
hogDescriptors.push_back(hogDescriptor);
hogDescriptorsIdx.push_back(i);
});
});
cv::Mat sortedHog;
for( int i = 0; i < hogDescriptorsIdx.size(); i++) {
sortedHog.push_back(hogDescriptors.row(hogDescriptorsIdx.at(i)));
}
hogDescriptors = sortedHog.reshape(0, sortedHog.cols * sortedHog.rows).t();
I only parallelise the outer loop, as I think the body of the loop is thread-safe. However, the result is wrong and I don't know where the problem is. May I have any advice?
Hi patrikhuber,
I built tracking-without-fd; it usually gives good tracking results. But sometimes the tracking rectangle becomes small and the tracking result is wrong. I also tested Intraface (http://www.humansensing.cs.cmu.edu/intraface/download_functions_cpp.html); its tracking results are more robust. How can a program detect when superviseddescent's tracking has gone wrong? Thanks a lot!
@patrikhuber, hi, I want to know how you obtained the 3D face coordinates of "Mat facemodel;" (in pose_estimation.cpp). Also, are the three angles (pitch, yaw, roll) in "Mat predicted_params" relative to the given 3D face? Is that right?