
drryanhuang / mcnn_paddlepaddle


A PaddlePaddle reproduction of the crowd-counting paper MCNN

License: GNU General Public License v3.0

Jupyter Notebook 97.34% Python 2.66%

mcnn_paddlepaddle's Introduction

- MCNN Paper Reproduction -






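MCNN is a CVPR 2016 paper. The authors propose a simple yet effective Multi-column Convolutional Neural Network (MCNN) architecture that uses convolution kernels of different sizes to cope with variation in person/head size and maps an image to a crowd density map. The paper uses geometry-adaptive Gaussian kernels to compute the ground-truth crowd density maps. The authors also collected and labeled a large new dataset, the ShanghaiTech dataset, which contains 1198 images. The dataset can be downloaded from AI Studio: https://aistudio.baidu.com/aistudio/datasetdetail/10675 .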



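Inspired by MDNNs, MCNN consists of three parallel CNN columns, each with a different convolution kernel size. For simplicity, all columns share the same network structure (conv-pooling-conv-pooling). Every pooling step is a 2*2 max pooling, and ReLU is used as the activation function throughout. The output feature maps of the three columns are stacked and mapped to a density map by a 1*1 convolution. The overall architecture of MCNN is shown in Figure 1: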



Figure 1: The structure of the proposed multi-column convolutional neural network (MCNN) for crowd density map estimation.
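The pooling and fusion steps described above can be sketched with NumPy alone. This is only an illustration, not the repository's PaddlePaddle code: the helper names `max_pool_2x2` and `fuse_columns`, the channel counts, and the random weights are assumptions chosen for the example.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a (C, H, W) array (H and W even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def fuse_columns(column_features, weights, bias=0.0):
    """Stack column outputs along channels, then apply a 1x1 convolution."""
    stacked = np.concatenate(column_features, axis=0)  # (C_total, H, W)
    # A 1x1 convolution is a per-pixel weighted sum over the channel axis.
    return np.tensordot(weights, stacked, axes=([0], [0])) + bias  # (H, W)


rng = np.random.default_rng(0)
image_features = rng.random((1, 64, 96))  # stand-in for one input image

# Two pooling stages, as in each column, shrink each side by a factor of 4.
pooled = max_pool_2x2(max_pool_2x2(image_features))

# Hypothetical per-column outputs with different channel counts.
columns = [rng.random((c, 16, 24)) for c in (8, 10, 12)]
density = fuse_columns(columns, weights=rng.random(30))
```

Here `pooled` has shape (1, 16, 24) and `density` has shape (16, 24), so each side is a quarter of the 64 x 96 input.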



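Training MCNN suffers from two problems: few training samples and vanishing gradients. Inspired by RBM pre-training, the authors pre-train each of the three CNN columns separately, use the pre-trained parameters to initialize the corresponding parameters of MCNN, and then fine-tune the whole network. Note also that MCNN uses the simplest possible loss function, the mean squared error.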
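A mean squared error over density maps can be sketched as follows. The function name `density_mse_loss` is hypothetical, and the 1/(2N) scaling is an assumption that follows the usual Euclidean-loss convention; the text above only states that the loss is a mean squared error.

```python
import numpy as np


def density_mse_loss(pred, target):
    """Squared L2 distance per density map, averaged over the batch and halved.

    pred, target: arrays of shape (N, H, W).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    n = pred.shape[0]
    squared_dist = ((pred - target) ** 2).reshape(n, -1).sum(axis=1)
    return squared_dist.sum() / (2.0 * n)
```

The loss compares whole maps pixel by pixel, so the predicted and ground-truth maps must have the same resolution.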


The paper computes the ground truth of each training image with geometry-adaptive Gaussian kernels:

[formula image]
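For reference, the geometry-adaptive kernel density map is commonly written as below. This is a reconstruction based on the MCNN paper, not a copy of the repository's formula image: $x_i$ are the $N$ annotated head positions, $G_{\sigma_i}$ is a Gaussian kernel with standard deviation $\sigma_i$, $\bar{d}_i$ is the mean distance from head $i$ to its nearest neighbouring heads, and $\beta$ is a scaling constant.

```latex
F(x) = \sum_{i=1}^{N} \delta(x - x_i) * G_{\sigma_i}(x),
\qquad \sigma_i = \beta \, \bar{d}_i
```

Because each Gaussian integrates to 1, the density map integrates to the number of annotated heads.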

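Figure 2 shows the crowd density maps of two images. Note that because the network downsamples twice, the predicted density map has 1/4 of the input resolution along each side.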


Figure 2: Original images and corresponding crowd density maps obtained by convolving geometry-adaptive Gaussian kernels.
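A ground-truth density map of this kind can be sketched in plain NumPy. Everything below is an assumption for illustration rather than the repository's code: the function names, `k=3` neighbours, the fallback and minimum sigma values, and the sum pooling used to match the 1/4-resolution output. `beta=0.3` follows the value commonly quoted from the paper.

```python
import numpy as np


def geometry_adaptive_density(points, shape, k=3, beta=0.3,
                              default_sigma=4.0, min_sigma=1.0):
    """Build a density map from (row, col) head positions inside an (H, W) image."""
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    h, w = shape
    rows, cols = np.mgrid[0:h, 0:w]
    density = np.zeros((h, w), dtype=np.float64)
    if len(pts) == 0:
        return density

    # Pairwise distances between heads (brute force; fine for a sketch).
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    dists.sort(axis=1)  # column 0 is each head's distance to itself

    for i, (r, c) in enumerate(pts):
        neighbours = dists[i, 1:k + 1]
        # Kernel width scales with the mean distance to the nearest heads.
        sigma = beta * neighbours.mean() if neighbours.size else default_sigma
        sigma = max(sigma, min_sigma)  # guard against coincident points
        kernel = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        density += kernel / kernel.sum()  # each head contributes exactly 1
    return density


def sum_pool_by_4(density):
    """Downsample by 4 per side with sum pooling, which preserves the count."""
    h, w = density.shape
    cropped = density[:h - h % 4, :w - w % 4]
    return cropped.reshape(h // 4, 4, w // 4, 4).sum(axis=(1, 3))
```

Normalizing each kernel on the image grid keeps the sum of the map equal to the number of annotated heads, so the crowd count can be read back with `density.sum()`.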

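MCNN can accurately estimate the crowd count in a single image taken from almost any viewing angle, and in 2016 it achieved state-of-the-art results in crowd counting. The authors also point out that fine-tuning only the last few layers is enough to transfer the model to a target problem, which demonstrates the robustness of the model.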

The paper contains many more details that are not repeated here; see the original MCNN paper.

1. Reproducing MCNN with the PaddlePaddle open-source framework

Dependencies:

paddlepaddle >= 1.7.0
numpy
matplotlib
PIL 
opencv-python
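A quick way to check that these dependencies are importable is sketched below, using only the standard library. The helper name `find_missing` is hypothetical. The import names differ from the package names: `paddlepaddle` is imported as `paddle`, `opencv-python` as `cv2`, and `PIL` is usually provided by the Pillow package. The sketch does not check the PaddlePaddle version.

```python
import importlib.util

# Import names corresponding to the packages listed above.
REQUIRED_MODULES = ["paddle", "numpy", "matplotlib", "PIL", "cv2"]


def find_missing(modules=REQUIRED_MODULES):
    """Return the modules that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    print("missing:", ", ".join(missing) if missing else "none")
```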
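2. Training strategy
Inspired by pre-trained models, the three CNN columns are first trained separately. Their parameters are then used to initialize the corresponding parameters of MCNN, and the whole network is trained jointly. In the original paper, training uses mini-batch stochastic gradient descent.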
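The initialization step can be sketched with plain dictionaries standing in for parameter stores. The function name, the `"<column>.<param>"` key layout, and the example values are all hypothetical; a real PaddlePaddle model would use its own state-dict keys.

```python
def init_from_pretrained_columns(mcnn_state, column_states):
    """Copy separately pre-trained column parameters into a full-model state.

    mcnn_state:    {"<column>.<param>": value, ...} for the whole network.
    column_states: {"<column>": {"<param>": value, ...}, ...}
    Both layouts are hypothetical and only illustrate the idea.
    """
    merged = dict(mcnn_state)  # leave the input untouched
    for column, state in column_states.items():
        for param, value in state.items():
            key = f"{column}.{param}"
            if key not in merged:
                raise KeyError(f"unexpected parameter: {key}")
            merged[key] = value
    return merged


# Example: two pre-trained columns overwrite their slots; the fusion
# layer keeps its own initial value and is learned during joint training.
mcnn_state = {"col1.conv1.weight": 0.0, "col2.conv1.weight": 0.0, "fuse.weight": 0.5}
pretrained = {"col1": {"conv1.weight": 1.5}, "col2": {"conv1.weight": -2.0}}
merged = init_from_pretrained_columns(mcnn_state, pretrained)
```

Parameters that belong to no column, such as the final fusion layer in this example, keep their initial values until the joint training phase.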
3. Model reproduction results

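We can compare the results obtained with PaddlePaddle against those of the original paper. The visual comparison suggests that, with the computing power of the AI Studio platform, the PaddlePaddle results are more precise and show more detail.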


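Figure 3: Comparison of results on two test-set images from the original paper. From left to right: the original image, the ground truth, the result from the original paper, and the PaddlePaddle reproduction.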

mcnn_paddlepaddle's People

Contributors

drryanhuang, gt-acerzhang
