
zisianw / FaceBoxes.PyTorch


A PyTorch Implementation of FaceBoxes

License: MIT License

Languages: Python 92.03%, Shell 0.16%, C++ 0.22%, CUDA 7.60%
Topics: faceboxes, face-detection, pytorch

faceboxes.pytorch's Introduction

FaceBoxes in PyTorch


By Zisian Wong, Shifeng Zhang

A PyTorch implementation of FaceBoxes: A CPU Real-time Face Detector with High Accuracy. The official code in Caffe can be found here.

Performance

Dataset   Original Caffe   PyTorch Implementation
AFW       98.98%           98.55%
PASCAL    96.77%           97.05%
FDDB      95.90%           96.00%

Citation

Please cite the paper in your publications if it helps your research:

@inproceedings{zhang2017faceboxes,
  title = {Faceboxes: A CPU Real-time Face Detector with High Accuracy},
  author = {Zhang, Shifeng and Zhu, Xiangyu and Lei, Zhen and Shi, Hailin and Wang, Xiaobo and Li, Stan Z.},
  booktitle = {IJCB},
  year = {2017}
}

Contents

  • Installation
  • Training
  • Evaluation
  • References

Installation

  1. Install PyTorch >= v1.0.0 following the official instructions.

  2. Clone this repository. We will refer to the cloned directory as $FaceBoxes_ROOT.

git clone https://github.com/zisianw/FaceBoxes.PyTorch.git
  3. Compile the NMS:
./make.sh

Note: The code is based on Python 3+.

Training

  1. Download the WIDER FACE dataset and place the images under this directory:
$FaceBoxes_ROOT/data/WIDER_FACE/images
  2. Convert the WIDER FACE annotations to VOC format or download our converted annotations, and place them under this directory:
$FaceBoxes_ROOT/data/WIDER_FACE/annotations
  3. Train the model on WIDER FACE:
cd $FaceBoxes_ROOT/
python3 train.py

If you do not wish to train the model, you can download our pre-trained model and save it in $FaceBoxes_ROOT/weights.

Evaluation

  1. Download the images of AFW, PASCAL Face and FDDB to:
$FaceBoxes_ROOT/data/AFW/images/
$FaceBoxes_ROOT/data/PASCAL/images/
$FaceBoxes_ROOT/data/FDDB/images/
  2. Evaluate the trained model using:
# dataset choices = ['AFW', 'PASCAL', 'FDDB']
python3 test.py --dataset FDDB
# evaluate using cpu
python3 test.py --cpu
# visualize detection results
python3 test.py -s --vis_thres 0.3
  3. Download eval_tool to evaluate the performance.

References

  • Official release (Caffe)

  • A huge thank you to the SSD ports in PyTorch that have been helpful.

Note: If you cannot download the converted annotations, the provided images, or the trained model through the above links, you can download them through BaiduYun.

faceboxes.pytorch's People

Contributors

hkzhang95, sfzhang15, xsacha, zisianw


faceboxes.pytorch's Issues

Error in ./make.sh

Traceback (most recent call last):
File "build.py", line 59, in
CUDA = locate_cuda()
File "build.py", line 45, in locate_cuda
raise EnvironmentError('The nvcc binary could not be '
OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOM

Do you know why? Thanks.

num_workers

@zisianw Hi,

In DataLoader there is a parameter num_workers, and I found that it severely affects data loading speed.

Epoch:1 || epochiter: 0.0/402.0|| Totel iter 0 || L: 9.9270 C: 23.2169||Batch time: 13.3025 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 1.0/402.0|| Totel iter 1 || L: 9.5204 C: 20.6763||Batch time: 0.8111 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 2.0/402.0|| Totel iter 2 || L: 9.7094 C: 14.3963||Batch time: 0.8572 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 3.0/402.0|| Totel iter 3 || L: 9.2358 C: 12.4466||Batch time: 0.7305 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 4.0/402.0|| Totel iter 4 || L: 8.7333 C: 8.5753||Batch time: 0.6582 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 5.0/402.0|| Totel iter 5 || L: 9.3984 C: 7.7155||Batch time: 0.6882 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 6.0/402.0|| Totel iter 6 || L: 8.3960 C: 6.0018||Batch time: 0.7805 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 7.0/402.0|| Totel iter 7 || L: 9.2684 C: 6.2540||Batch time: 0.6133 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 8.0/402.0|| Totel iter 8 || L: 9.3614 C: 5.6496||Batch time: 6.2326 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 9.0/402.0|| Totel iter 9 || L: 8.6283 C: 5.4357||Batch time: 0.7247 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 10.0/402.0|| Totel iter 10 || L: 8.4319 C: 4.9669||Batch time: 1.4879 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 11.0/402.0|| Totel iter 11 || L: 8.4364 C: 4.1630||Batch time: 0.6911 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 12.0/402.0|| Totel iter 12 || L: 8.4593 C: 4.6689||Batch time: 0.5316 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 13.0/402.0|| Totel iter 13 || L: 8.7163 C: 4.6978||Batch time: 0.6517 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 14.0/402.0|| Totel iter 14 || L: 8.6776 C: 4.4363||Batch time: 0.7663 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 15.0/402.0|| Totel iter 15 || L: 8.2896 C: 4.4251||Batch time: 0.6142 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 16.0/402.0|| Totel iter 16 || L: 8.6511 C: 4.2516||Batch time: 8.1611 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 17.0/402.0|| Totel iter 17 || L: 8.6909 C: 4.2532||Batch time: 0.5894 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 18.0/402.0|| Totel iter 18 || L: 8.6228 C: 4.1899||Batch time: 0.5962 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 19.0/402.0|| Totel iter 19 || L: 8.3660 C: 5.0475||Batch time: 0.7253 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 20.0/402.0|| Totel iter 20 || L: 8.3578 C: 3.9863||Batch time: 0.6487 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 21.0/402.0|| Totel iter 21 || L: 8.0489 C: 3.8586||Batch time: 0.6625 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 22.0/402.0|| Totel iter 22 || L: 7.7491 C: 4.0214||Batch time: 0.6410 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 23.0/402.0|| Totel iter 23 || L: 8.2842 C: 3.7695||Batch time: 0.8866 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 24.0/402.0|| Totel iter 24 || L: 8.5159 C: 3.8839||Batch time: 6.0781 sec. ||LR: 0.00100000
Epoch:1 || epochiter: 25.0/402.0|| Totel iter 25 || L: 8.3251 C: 3.8723||Batch time: 0.6119 sec. ||LR: 0.00100000

Setup: single GPU (1080 Ti), batch_size=32, num_workers=8.
Observation: at every iteration that is a multiple of 8, the batch time jumps by nearly 10x.
My understanding is that with num_workers=8, eight workers process data at once, so every 8 iterations a new round of loading has to start, which increases the time. Is this understanding correct?

Changing num_workers to other values shows the same pattern: whenever the current iteration count is a multiple of num_workers, the time increases by nearly 10x.
Is there a good way to solve this?
Also, changing num_workers changes the order in which data is read, so in theory it should affect the final result too, right?
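
For what it's worth, a minimal sketch (not from this repo; the dataset below is a dummy stand-in) of DataLoader settings that often smooth out such periodic stalls by buffering more batches per worker; persistent_workers and prefetch_factor require PyTorch >= 1.7:

import torch
from torch.utils.data import DataLoader, Dataset

class DummyFaces(Dataset):
    # hypothetical stand-in for the repo's WIDER FACE dataset
    def __len__(self):
        return 12864  # ~402 iterations at batch_size 32, as in the log above
    def __getitem__(self, idx):
        # small stand-in image plus one (x1, y1, x2, y2, label) row
        return torch.rand(3, 128, 128), torch.tensor([0.1, 0.1, 0.2, 0.2, 1.0])

loader = DataLoader(
    DummyFaces(),
    batch_size=32,
    shuffle=True,
    num_workers=8,            # workers prepare batches in parallel
    pin_memory=True,          # faster host-to-GPU copies
    persistent_workers=True,  # PyTorch >= 1.7: keep workers alive across epochs
    prefetch_factor=4,        # PyTorch >= 1.7: batches buffered per worker
)

for images, targets in loader:
    pass  # training step goes here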

Why is a single point predicted (xmin,ymin = xmax,ymax) on another picture?

Hi,
I am trying to run inference on my own picture (not from a dataset); the image preprocessing (original-size input, no resizing), model loading, and result post-processing basically follow your test.py.
But the predicted point is funny, as shown below: there is just one point, with xmin, ymin equal to xmax, ymax.

dets [[2.1120000e+03 1.3440000e+03 2.1120000e+03 1.3440000e+03 9.8217058e-01]]

Just curious, any idea why this happens?

So far, there are two things I have noticed:

1. The predicted point is close to the centre of the box that MTCNN predicts, which is very strange:
facebox: [2240,960,2240,960]
mtcnn: [1626,503,2737,1882]

2. The original prediction is just one point. I traced every step and found that in the decode step, before boxes[:, :2] -= boxes[:, 2:] / 2, boxes looked like [0.x, 0.x, 0, 0]; the last two values were zero.

def decode(loc, priors, variances):
    # priors are (cx, cy, w, h); loc holds the regression offsets
    boxes = torch.cat((
        priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
        priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), 1)
    boxes[:, :2] -= boxes[:, 2:] / 2   # (cx, cy) -> (xmin, ymin)
    boxes[:, 2:] += boxes[:, :2]       # (w, h)  -> (xmax, ymax)
    return boxes
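
For what it's worth, a minimal sketch (not from the repo; the numbers are contrived) of how a zero-size box can come out of decode: in float32, torch.exp underflows to exactly 0 once its argument is sufficiently negative, so an extreme width/height regression output collapses the decoded box to its centre point:

import torch

def decode(loc, priors, variances):
    # decode as quoted above
    boxes = torch.cat((
        priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:],
        priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), 1)
    boxes[:, :2] -= boxes[:, 2:] / 2
    boxes[:, 2:] += boxes[:, :2]
    return boxes

priors = torch.tensor([[0.5, 0.5, 0.1, 0.1]])  # one prior: (cx, cy, w, h), normalized
variances = [0.1, 0.2]

loc_ok = torch.tensor([[0.0, 0.0, 0.0, 0.0]])
loc_bad = torch.tensor([[0.0, 0.0, -600.0, -600.0]])  # exp(-120) underflows to 0 in fp32

print(decode(loc_ok, priors, variances))   # a proper box around (0.5, 0.5)
print(decode(loc_bad, priors, variances))  # xmin == xmax and ymin == ymax: a single point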

How to draw face boxes correctly on original images?

I ran test.py and got the output txt file. But I wanted to see the face boxes by drawing rectangles to mark the faces on the original image. Here comes the problem: no matter which dataset I use, the boxes are always drawn on the wrong part of the image, and the error rate is high. By the way, I cannot use a numpy.ndarray (the original image) instead of the tensor type.

[example images omitted]

It is the PASCAL dataset.
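
A minimal sketch (not from this repo; the function and defaults are illustrative) of drawing detection rows with OpenCV. One common cause of misplaced boxes is a scale mismatch: test.py upscales FDDB/PASCAL inputs at test time (see the testing-scale snippet in the "About testing procedure" issue below), so check whether your saved coordinates refer to the resized image or the original:

import cv2

def draw_dets(image_path, dets, resize=1.0, score_thresh=0.5, out_path='out.jpg'):
    # dets: rows of [xmin, ymin, xmax, ymax, score]; if the coordinates were
    # produced on an image upscaled by `resize`, divide them back first
    img = cv2.imread(image_path)  # BGR uint8 ndarray, as OpenCV loads it
    for xmin, ymin, xmax, ymax, score in dets:
        if score < score_thresh:
            continue
        p1 = (int(xmin / resize), int(ymin / resize))
        p2 = (int(xmax / resize), int(ymax / resize))
        cv2.rectangle(img, p1, p2, (0, 255, 0), 2)
    cv2.imwrite(out_path, img)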

Train on Another Data Set

@zisianw,
Thank you for your nice work. I have two questions:

  1. Can one train your model (with the provided training code) on another data set, with different aspect ratios for the anchor boxes? (If yes, please explain how.)
  2. Have you ever trained this model for pedestrian detection (for example, on the Caltech data set)? If not, do you have future plans for it?

Thank you very much

Default Confidence

parser.add_argument('--confidence_threshold', default=0.05, type=float, help='confidence_threshold')

The default confidence threshold in PyTorch and Caffe is different.

Is it too low in PyTorch?

How to convert txt annotations to VOC format ?

Thanks for the great work.
I want to train a face detector on my local dataset.

However, my data annotations look like this: path/image.png num_face x11 y11 x12 y12 x21 y21 x22 y22 ...

# path/name num_face x_top left, y_top left, x_bottom right, y_bottom right...
data/image01.jpg 3 599 438 605 447 789 435 801 455 698 437 703 443 
data/image02.jpg 1 9 39 83 100 

Could you tell me how I can convert this annotation format to VOC format?
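
A minimal sketch of such a conversion, assuming the boxes are absolute pixel corner coordinates as in the sample above and that the class name 'face' is what the VOC loader expects (check the repo's annotation loader for the exact fields it reads):

import os
import xml.etree.ElementTree as ET

def line_to_voc(line, out_dir, img_w, img_h):
    # Convert one 'path num_face x1 y1 x2 y2 ...' line to a VOC XML file.
    # Image width/height must be supplied (e.g. read with cv2 or PIL).
    parts = line.split()
    path, num_face = parts[0], int(parts[1])
    coords = list(map(int, parts[2:2 + 4 * num_face]))

    root = ET.Element('annotation')
    ET.SubElement(root, 'filename').text = os.path.basename(path)
    size = ET.SubElement(root, 'size')
    ET.SubElement(size, 'width').text = str(img_w)
    ET.SubElement(size, 'height').text = str(img_h)
    ET.SubElement(size, 'depth').text = '3'

    for i in range(num_face):
        x1, y1, x2, y2 = coords[4 * i:4 * i + 4]
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = 'face'
        ET.SubElement(obj, 'difficult').text = '0'
        box = ET.SubElement(obj, 'bndbox')
        ET.SubElement(box, 'xmin').text = str(x1)
        ET.SubElement(box, 'ymin').text = str(y1)
        ET.SubElement(box, 'xmax').text = str(x2)
        ET.SubElement(box, 'ymax').text = str(y2)

    name = os.path.splitext(os.path.basename(path))[0] + '.xml'
    ET.ElementTree(root).write(os.path.join(out_dir, name))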

ImportError: __cudaPopCallConfiguration

After compiling the NMS with ./make.sh and then running python3 test.py:
Traceback (most recent call last):
File "mytest.py", line 9, in
from utils.nms_wrapper import nms
File "/home/nnir712/project/FaceBoxes.PyTorch-master/utils/nms_wrapper.py", line 9, in
from .nms.gpu_nms import gpu_nms
ImportError: /home/nnir712/project/FaceBoxes.PyTorch-master/utils/nms/gpu_nms.cpython-36m-x86_64-linux-gnu.so: undefined symbol: __cudaPopCallConfiguration
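
For context: an undefined __cudaPopCallConfiguration symbol usually means the extension was compiled against a different CUDA toolkit than the one PyTorch itself was built with (this is an assumption about this particular report, not a confirmed diagnosis). A quick sanity check in Python:

import subprocess
import torch

# Compare the CUDA version PyTorch was built with against the nvcc on PATH;
# the two should match for a locally compiled extension to load cleanly.
print('PyTorch built with CUDA:', torch.version.cuda)
print(subprocess.run(['nvcc', '--version'], capture_output=True, text=True).stdout)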

Weights for loc and clf during training

@zisianw Hi author! I am trying to train the model, and I have some questions about the loss weights for loc and clf during training.
The weight for loc is 2 and for clf is 1 in both the official Caffe and PyTorch code. Do these weights need to be changed? In other words, are there any tricks to setting them during training?
Thank you!

data_argument.py speed optimization

In data_argument.py, distort is applied before _crop, which makes things many times slower. The fix is simple; please change it when you get a chance.

Evaluate on WiderFace val

The results do not seem good:

Easy Val AP: 0.8404888178724867
Medium Val AP: 0.7661765951105204
Hard Val AP: 0.3942927175197917

Strange loss values

While training the model, the training loss looks strange:

Epoch:1 || epochiter: 0/316|| Total iter 316 || L: 5.69 C: 0.17 ||Btime: 26.588 s. ||LR: 0.001085
Epoch:1 || epochiter: 1/316|| Total iter 317 || L: 6.12 C: 0.23 ||Btime: 1.360 s. ||LR: 0.001085
Epoch:1 || epochiter: 2/316|| Total iter 318 || L: 6.23 C: 0.44 ||Btime: 3.178 s. ||LR: 0.001086
Epoch:1 || epochiter: 3/316|| Total iter 319 || L: 6.27 C: 0.35 ||Btime: 1.331 s. ||LR: 0.001086
Epoch:1 || epochiter: 4/316|| Total iter 320 || L: 5.74 C: 0.19 ||Btime: 1.145 s. ||LR: 0.001086
Epoch:1 || epochiter: 5/316|| Total iter 321 || L: 6.13 C: 0.29 ||Btime: 2.172 s. ||LR: 0.001086
Epoch:1 || epochiter: 6/316|| Total iter 322 || L: 5.08 C: 0.20 ||Btime: 1.199 s. ||LR: 0.001087
Epoch:1 || epochiter: 7/316|| Total iter 323 || L: 5.68 C: 0.24 ||Btime: 0.900 s. ||LR: 0.001087
Epoch:1 || epochiter: 8/316|| Total iter 324 || L: 5.54 C: 0.62 ||Btime: 1.470 s. ||LR: 0.001087
Epoch:1 || epochiter: 9/316|| Total iter 325 || L: 5.91 C: 0.31 ||Btime: 0.917 s. ||LR: 0.001088

while the validation loss looks like this:

Epoch:1 || L: 4.5863 C: 45.4624||Batch time: 14.5824 sec. ||
Epoch:1 || L: 4.7383 C: 44.9612||Batch time: 0.9316 sec. ||
Epoch:1 || L: 4.6935 C: 44.3159||Batch time: 1.3099 sec. ||
Epoch:1 || L: 4.4057 C: 45.0097||Batch time: 0.8248 sec. ||
Epoch:1 || L: 4.2893 C: 45.4343||Batch time: 0.7287 sec. ||

I know the model is kind of collapsing, so I ran inference on an image to see its behavior. I observe that it predicts almost all anchors as faces (thousands of boxes on one image), on images from either the training or the validation dataset. This is consistent with the validation loss above, but my question is: why is the training classification loss so low while the model predicts thousands of boxes per image? Below is an inference run on a training image:
[detections on training image 13_Interview_Interview_On_Location_13_870 omitted]

Extension compiling error

I get the following issue when compiling the CPU extension:

 % ./make.sh 
running build_ext
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
building 'nms.cpu_nms' extension
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/local/lib/python3.7/dist-packages/numpy/core/include -I/usr/include/python3.7m -c nms/cpu_nms.c -o build/temp.linux-x86_64-3.7/nms/cpu_nms.o -Wno-cpp -Wno-unused-function
nms/cpu_nms.c: In function ‘__Pyx_PyCFunction_FastCall’:
nms/cpu_nms.c:8431:13: error: too many arguments to function ‘(PyObject * (*)(PyObject *, PyObject * const*, Py_ssize_t))meth’
     return (*((__Pyx_PyCFunctionFast)meth)) (self, args, nargs, NULL);
            ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nms/cpu_nms.c: In function ‘__Pyx__ExceptionSave’:
nms/cpu_nms.c:8892:21: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
     *type = tstate->exc_type;
                     ^~~~~~~~
                     curexc_type
nms/cpu_nms.c:8893:22: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
     *value = tstate->exc_value;
                      ^~~~~~~~~
                      curexc_value
nms/cpu_nms.c:8894:19: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
     *tb = tstate->exc_traceback;
                   ^~~~~~~~~~~~~
                   curexc_traceback
nms/cpu_nms.c: In function ‘__Pyx__ExceptionReset’:
nms/cpu_nms.c:8901:24: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
     tmp_type = tstate->exc_type;
                        ^~~~~~~~
                        curexc_type
nms/cpu_nms.c:8902:25: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
     tmp_value = tstate->exc_value;
                         ^~~~~~~~~
                         curexc_value
nms/cpu_nms.c:8903:22: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
     tmp_tb = tstate->exc_traceback;
                      ^~~~~~~~~~~~~
                      curexc_traceback
nms/cpu_nms.c:8904:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
     tstate->exc_type = type;
             ^~~~~~~~
             curexc_type
nms/cpu_nms.c:8905:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
     tstate->exc_value = value;
             ^~~~~~~~~
             curexc_value
nms/cpu_nms.c:8906:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
     tstate->exc_traceback = tb;
             ^~~~~~~~~~~~~
             curexc_traceback
nms/cpu_nms.c: In function ‘__Pyx__GetException’:
nms/cpu_nms.c:8961:24: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
     tmp_type = tstate->exc_type;
                        ^~~~~~~~
                        curexc_type
nms/cpu_nms.c:8962:25: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
     tmp_value = tstate->exc_value;
                         ^~~~~~~~~
                         curexc_value
nms/cpu_nms.c:8963:22: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
     tmp_tb = tstate->exc_traceback;
                      ^~~~~~~~~~~~~
                      curexc_traceback
nms/cpu_nms.c:8964:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_type’; did you mean ‘curexc_type’?
     tstate->exc_type = local_type;
             ^~~~~~~~
             curexc_type
nms/cpu_nms.c:8965:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_value’; did you mean ‘curexc_value’?
     tstate->exc_value = local_value;
             ^~~~~~~~~
             curexc_value
nms/cpu_nms.c:8966:13: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
     tstate->exc_traceback = local_tb;
             ^~~~~~~~~~~~~
             curexc_traceback
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Python version: 3.7
Cython version: 0.29.2
GCC version: 8.3.0

scale

@zisianw @sfzhang15 Hi,

Regarding the C.ReLU module: in Fig. 2(a) there is a scale layer, but this layer does not appear in the code.
How should this be understood?

training loss curve

I tried to implement FaceBoxes from scratch, taking this repo as a reference, and I wondered what the training loss curve should look like. My curve has many spikes after some epochs and then decreases to 5 or 4 again. Could you give me some suggestions?

[training loss curve image omitted]

Face detection failure cases

I used your weights to detect faces on FDDB images and visualized the results. I found failure cases. Example:
[screenshot omitted]

run the code on Windows 10

Hi,

The code cannot be run on Windows 10.
Do you have any solution for running it on Windows 10?

Thanks and Best Regards,
Ardeal

about Face-box filter

@zisianw @sfzhang15 Hi,

According to the paper's description:
Face-box filter: We keep the overlapped part of the face box if its center is in the above processed image, then filter out these face boxes whose height or width is less than 20 pixels.

For the step that filters out boxes whose height/width is less than 20, the code does:

b_w_t = (boxes_t[:, 2] - boxes_t[:, 0] + 1) / w * img_dim
b_h_t = (boxes_t[:, 3] - boxes_t[:, 1] + 1) / h * img_dim
mask_b = np.minimum(b_w_t, b_h_t) > 16.0

At this point the coordinates in boxes_t are still relative to the original image; the code has only excluded boxes whose centre is not inside the cropped square patch.
It seems to me that filtering directly would be enough, so why divide by w or h and then multiply by img_dim?

Also, why isn't the filtering done as the final step, i.e. after

boxes_t[:, :2] = np.maximum(boxes_t[:, :2], roi[:2])
boxes_t[:, :2] -= roi[:2]
boxes_t[:, 2:] = np.minimum(boxes_t[:, 2:], roi[2:])
boxes_t[:, 2:] -= roi[:2]

At that point, the boxes_t coordinates are relative to the cropped square patch. Why not do the filtering there?
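
One way to read that code (my assumption, not the author's answer): dividing the box width by w and multiplying by img_dim expresses the box size at the network's input resolution, so with img_dim = 1024 a box 10 px wide in a 640 px wide patch becomes 10 / 640 * 1024 = 16 px, and the > 16.0 mask then thresholds box size in input pixels rather than in original-image pixels.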

bbox processing (data_augment.py)

You ignore bboxes less than 16 pixels in height or width, but why don't you mask the corresponding face region in the image?

Using FPN to upsample later feature maps and add them to the earlier branches makes the classification loss very small during training. Why is that?

Epoch:5 || epochiter: 229/403|| Totel iter 1841 || L: 1.4869 C: 0.7733||Batch time: 0.5901 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 230/403|| Totel iter 1842 || L: 1.5193 C: 0.2210||Batch time: 0.6800 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 231/403|| Totel iter 1843 || L: 1.3612 C: 0.3711||Batch time: 0.4681 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 232/403|| Totel iter 1844 || L: 1.4300 C: 0.2709||Batch time: 0.6333 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 233/403|| Totel iter 1845 || L: 1.9756 C: 0.5358||Batch time: 0.4813 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 234/403|| Totel iter 1846 || L: 1.5387 C: 1.5844||Batch time: 0.5340 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 235/403|| Totel iter 1847 || L: 1.5266 C: 0.8917||Batch time: 0.5414 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 236/403|| Totel iter 1848 || L: 1.2795 C: 0.7289||Batch time: 0.6268 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 237/403|| Totel iter 1849 || L: 1.5269 C: 0.3984||Batch time: 0.6899 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 238/403|| Totel iter 1850 || L: 1.3545 C: 0.2246||Batch time: 0.6136 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 239/403|| Totel iter 1851 || L: 1.3029 C: 0.2058||Batch time: 0.6137 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 240/403|| Totel iter 1852 || L: 1.3967 C: 0.2361||Batch time: 0.6172 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 241/403|| Totel iter 1853 || L: 1.3786 C: 1.1413||Batch time: 0.6570 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 242/403|| Totel iter 1854 || L: 1.5217 C: 0.1562||Batch time: 0.6213 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 243/403|| Totel iter 1855 || L: 1.2962 C: 0.3875||Batch time: 0.6527 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 244/403|| Totel iter 1856 || L: 1.4585 C: 0.1666||Batch time: 0.6850 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 245/403|| Totel iter 1857 || L: 1.4246 C: 0.6494||Batch time: 0.6752 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 246/403|| Totel iter 1858 || L: 1.4275 C: 0.4858||Batch time: 0.7416 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 247/403|| Totel iter 1859 || L: 1.3946 C: 0.2901||Batch time: 0.5193 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 248/403|| Totel iter 1860 || L: 1.5584 C: 0.1369||Batch time: 0.5752 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 249/403|| Totel iter 1861 || L: 1.5241 C: 0.3476||Batch time: 0.5399 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 250/403|| Totel iter 1862 || L: 1.3083 C: 0.3837||Batch time: 0.4910 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 251/403|| Totel iter 1863 || L: 1.5277 C: 0.1997||Batch time: 0.5678 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 252/403|| Totel iter 1864 || L: 1.2095 C: 0.1155||Batch time: 0.6526 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 253/403|| Totel iter 1865 || L: 1.5689 C: 0.4727||Batch time: 0.6525 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 254/403|| Totel iter 1866 || L: 1.4297 C: 0.2737||Batch time: 0.6253 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 255/403|| Totel iter 1867 || L: 1.3079 C: 0.0714||Batch time: 0.7641 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 256/403|| Totel iter 1868 || L: 1.6251 C: 0.2636||Batch time: 0.6445 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 257/403|| Totel iter 1869 || L: 1.9823 C: 0.5266||Batch time: 0.6920 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 258/403|| Totel iter 1870 || L: 1.4698 C: 0.6581||Batch time: 0.6837 sec. ||LR: 0.00100000
Epoch:5 || epochiter: 259/403|| Totel iter 1871 || L: 1.7391 C: 0.4285||Batch time: 0.7007 sec. ||LR: 0.00100000

Testing on CPU

Hi, can the model be tested on CPU? What needs to be changed in the compilation?

Evaluation on FDDB

Did you implement the elliptical regressor for transforming the predicted rectangular bounding boxes into elliptical bounding boxes?
Best Regards.

training loss

What does the training loss look like? How many epochs did you train, and what was the final loss value? I trained for about one day on a 1080 Ti machine, multiplying the learning rate by 0.1 halfway through, but the loss for L is 2.5 and the loss for C is 1.5. Is the loss too big?

Some confusion about assigning ground truth to priors

In the match function in box_utils.py, there is an "ignore hard ground truth" step when assigning ground truth to priors: when the best IoU with a ground truth is below 0.2, that ground truth is discarded. But if some ground truths are discarded, the ground-truth indices change; doesn't that make the later step that ensures every gt matches its prior of max overlap incorrect? I don't quite understand this part.

Could you upload your annotation file here?

When I train:
File "train.py", line 152, in
train()
File "train.py", line 126, in train
loss_l, loss_c = criterion(out, priors, targets)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/research/pytorch-faceboxes/FaceBoxes.PyTorch/layers/modules/multibox_loss.py", line 107, in forward
loss_c = F.cross_entropy(conf_p, targets_weighted, size_average=False)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1970, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1790, in nll_loss
ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: invalid argument 2: non-empty vector or matrix expected at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:32

RGB mean values applied in the wrong channel order

Hi author, in data_augment.py at line 167, image -= mean: is this wrong? mean is the per-channel mean in RGB order, but when execution reaches this point the image is still in the BGR order that OpenCV reads. I'd like to confirm whether this is a mistake.

how to import gpu_nms.pyx?

I don't know how to import the .pyx files; the error is No module named gpu_nms / cpu_nms:

C:\Users\jungho\FaceBoxes.PyTorch>python test.py
Traceback (most recent call last):
File "test.py", line 9, in
from utils.nms_wrapper import nms
File "C:\Users\jungho\FaceBoxes.PyTorch\utils\nms_wrapper.py", line 10, in
from cpu_nms import cpu_nms, cpu_soft_nms
ImportError: No module named 'cpu_nms'

Training speed

Hi, while training with this code I found that it stalls every few iterations. Why does this happen? Did you run into anything similar during training? My PyTorch version is 0.4.1.

Image Pyramid or Anchor Boxes Pyramid?

Correct me if I'm wrong. Based on the SSD paper, SSD uses an anchor-box pyramid to detect objects at various scales.

FaceBoxes likewise uses only an anchor-box pyramid to detect objects at various scales, without an image pyramid, right?

Faster R-CNN uses an RPN (region proposal network) to predict regions of interest. How do FaceBoxes and SSD detect objects?

Are the outputs (feature maps) of each inception and conv layer's in-channels and out-channels (default anchor boxes) the loc, confidence, and box dimensions in faceboxes.py?

How are the default anchor boxes used in the code (faceboxes.py)? I can't find any anchor initialization in faceboxes.py.
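
For reference, a minimal sketch (not the repo's actual code; the feature-map and anchor sizes below are illustrative) of how SSD-style detectors typically generate default anchors outside the network definition. In this repo the anchors come from the prior_box module mentioned in other issues on this page, not from faceboxes.py; each feature-map cell is tiled with fixed-size boxes:

import itertools
import torch

def make_priors(img_dim=1024, feature_maps=(32, 16, 8), anchor_sizes=(32, 256, 512)):
    # One square anchor per feature-map cell, as (cx, cy, w, h) normalized to [0, 1].
    # FaceBoxes additionally densifies the smallest anchors; that is omitted here.
    priors = []
    for fmap, size in zip(feature_maps, anchor_sizes):
        for i, j in itertools.product(range(fmap), repeat=2):
            cx = (j + 0.5) / fmap   # cell centre, normalized
            cy = (i + 0.5) / fmap
            s = size / img_dim      # anchor side length, normalized
            priors.append([cx, cy, s, s])
    return torch.tensor(priors)

priors = make_priors()
print(priors.shape)  # torch.Size([1344, 4]) for the sizes above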

ymin += 0.2 * (ymax - ymin + 1)

Hello, can you tell me why you use the following code when calculating ymin in test.py:
ymin += 0.2 * (ymax - ymin + 1)
Thank you!
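
For reference (my reading, not the author's): since only ymin changes, this line moves the top edge of the box down by 20% of the box height, shrinking the box vertically. The "Accuracy higher when training myself" issue below refers to this same line as the y-offset.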

How to get the mAP?

Do you have the tool code for computing your accuracy numbers? Could you provide it? Thanks!

Accuracy higher when training myself

I followed the instructions in the README and trained FaceBoxes on WIDER FACE; after 300 epochs (14 hrs, Titan V) I end up with 98.55% vs. the pretrained model's 98.47%.

For PASCAL, I get 97.05%, which is higher than both your pretrained model and the original paper.

I also noticed that the y-offset you use (20% of the box height) affects the accuracy a fair bit. Changing it to 30% of the box height increased PASCAL to 97.10%. I notice other variants of FaceBoxes use + 4, which doesn't work as well. Why is this offset there?

Is this explained solely by a newer version of PyTorch (v1.0.1), or did you not train to 300 epochs?

Model testing

Hi, why does your model need the original image upscaled by 3x at test time before the detection rate improves? That is a real challenge for 1080p input. Do you have any suggestions for improvement?

make error

I get the following errors when running ./make.sh; any suggestions?
^
nms/cpu_nms.c: In function ‘__Pyx__GetException’:
nms/cpu_nms.c:8961:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
tmp_type = tstate->exc_type;
^
nms/cpu_nms.c:8962:23: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
tmp_value = tstate->exc_value;
^
nms/cpu_nms.c:8963:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
tmp_tb = tstate->exc_traceback;
^
nms/cpu_nms.c:8964:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
tstate->exc_type = local_type;
^
nms/cpu_nms.c:8965:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
tstate->exc_value = local_value;
^
nms/cpu_nms.c:8966:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
tstate->exc_traceback = local_tb;
^
error: command 'gcc' failed with exit status 1

Set CPU-only mode

Hi,

I tried to run the demo.py file in Python, but I got the following error:

F0311 15:58:18.148229  6000 common.cpp:76] Cannot use GPU in CPU-only Caffe: check mode.

However, I have already modified the mode to CPU-only. Should I modify some other code to disable GPU mode and enable CPU-only mode?

By the way, the FaceBoxes model was downloaded from the page where you published your source code:
https://github.com/sfzhang15/FaceBoxes

os.chdir(caffe_root)
sys.path.insert(0, 'python')
import caffe

caffe.set_device(0)
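# note (my reading, not confirmed): set_device(0) above is itself a GPU call;
# in a CPU-only Caffe build it can trigger the 'check mode' error even
# though set_mode_cpu() is called afterwards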
# caffe.set_mode_gpu()
caffe.set_mode_cpu()

model_def = 'D:/code_cv/face_detection/FaceBoxes/models/faceboxes/deploy.prototxt'
model_weights = 'D:/code_cv/face_detection/FaceBoxes/models/faceboxes/faceboxes.caffemodel'
net = caffe.Net(model_def, model_weights, caffe.TEST)

image = caffe.io.load_image('examples/images/1.jpg')

Thanks and Best Regards,
Ardeal

Train on 512x512 images

Hi, is it possible to train FaceBoxes on 512x512 images? If so, what would be the drawbacks?

detection_dimension in the model is redundant

The extra output detection_dimension is redundant, as it can already be calculated in prior_box from the width and height.

It also fails to trace if you attempt to use the model through JIT, as the tracer cannot follow the logic flow.

Can it be used for vehicle detection?

Hello author! I am a beginner, so my question may be quite basic; I hope I can get an answer. Thank you very much.

Can FaceBoxes be used for vehicle detection? I tried using YOLO for vehicle detection, but it was very slow; YOLO is a multi-class detector with a large network structure. If a single-class detector, such as the FaceBoxes face detector, is trained on vehicle data, is it possible to get a vehicle detector?

FDDB test result

@zisianw @sfzhang15 Hi,

Setup: PyTorch 0.4 + Python 2.
Since the code you provide is based on Python 3, I added from __future__ import division in the relevant places.

Using the provided pre-trained detection model and the test code, I obtained detection results for the 2845 images and fed them into the official evaluation tool.
For the labels I used FDDB_annotation_ellipseList_new.txt as provided with S3FD.
The results are in rectangle form, not converted to ellipses.
The detection results have been sent to both of your email addresses.

The final result: at False Positives = 1000, the DiscROC AP is only 93.9915.

Positive/negative sample matching strategy

Hi author, the paper uses a negative-to-positive ratio of 3, but the code uses 7. Do you have experimental results with a ratio of 3? Does the detection accuracy differ much?

Why does the function preproc not resize the boxes?

In the function preproc:

height, width, _ = image_t.shape
image_t = preproc_for_test(image_t, self.img_dim, self.rgb_means)
boxes_t[:, 0::2] /= width
boxes_t[:, 1::2] /= height

preproc_for_test resizes the image, so why is there no corresponding handling of the boxes?
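
(One possible reading, not the author's answer: dividing by the original width and height converts the boxes to normalized [0, 1] coordinates, which are invariant to the image resize; for example, a box spanning x = 100..300 in a 1000 px wide image is x = 0.1..0.3 at any resized resolution.)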

About testing procedure

Why do you upscale images in test.py?

# testing scale
if args.dataset == "FDDB":
    resize = 3
elif args.dataset == "PASCAL":
    resize = 2.5
elif args.dataset == "AFW":
    resize = 1

if resize != 1:
    img = cv2.resize(img, None, None, fx=resize, fy=resize, interpolation=cv2.INTER_LINEAR)

It looks to me that resizing images to the training scale (1024x1024) would be more logical, but the performance on the benchmarks is worse (-2.3%) for some reason.

Height values are bogus

Height values returned by the model are always wrong at test time, possibly because the image aspect ratio is not 1:1.
Regardless, the width values appear to be correct, and you can fix the height values by setting them equal to the widths.
By removing the height scores you still get the same result on AFW and PASCAL, because these tests ignore width and height (using only the centre positions). Hence [width, height] should probably be replaced by a single 'size' value, which would allow faster training/convergence.

I did a test where I changed prior_box to (x, y, size) and ignored the fourth loc value returned by the model. AFW and PASCAL results are unaffected.
