
paddlepaddle / paddle


PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』 (PaddlePaddle) core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)

Home Page: http://www.paddlepaddle.org/

License: Apache License 2.0

CMake 1.06% Shell 0.58% Python 41.46% C++ 49.80% C 0.27% Cuda 6.63% Dockerfile 0.01% Batchfile 0.06% R 0.01% Go 0.05% Java 0.02% Jinja 0.08%
deep-learning distributed-training efficiency machine-learning neural-network paddlepaddle python scalability

paddle's People

Contributors

aurelius84, chengduozh, chenwhql, co63oc, dzhwinter, gangliao, gongweibao, hedaoyuan, jacquesqiao, jiabinyang, jiayifeng, lcy-seso, luotao1, nhzlx, panyx0718, phlrain, qijune, qingqing01, reyoung, seiriosplus, sneaxiy, tensor-tang, tianshuo78520a, typhoonzero, velconia, wanghaoshuang, wanghuancoder, wangkuiyi, xreki, zhiqiu

paddle's Issues

Semantic role labeling demo

Is the setup in demo/semantic_role_labeling/train.sh a full replication of the ACL 2015 paper End-to-end Learning of Semantic Role Labeling Using Recurrent Neural Networks? Thanks!

./train.sh: line 21: paddle: command not found

I am trying to run the quick_start demo.
get_data.sh and preprocess.sh work, but train.sh does not:
./train.sh: line 21: paddle: command not found

How can I run train.sh?

The paddle*.whl packages are installed at
./paddle/build/python/dist/paddle-0.8.0b0-py2-none-any.whl
./paddle/opt/paddle/share/wheels/paddle-0.8.0b0-py2-none-any.whl
and my Python version is 2.7.12.

Illegal instruction on centos7.0

I built Paddle from source on CentOS 7.0 successfully.
But when I try to run paddle_trainer or the other binaries, it raises:
[root@iZ28zy4uk59Z bin]# ./paddle_trainer
Illegal instruction
What is wrong?
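
One common cause of a SIGILL from paddle_trainer, judging by the with_avx: ON/OFF build flags reported in other issues in this list, is running an AVX-enabled build on a CPU without AVX. As a minimal check (my own illustration, not part of the original report, Linux-only):

    # Assumption: the illegal instruction comes from AVX code on a non-AVX CPU.
    with open("/proc/cpuinfo") as f:
        flag_lines = [line for line in f if line.startswith("flags")]

    has_avx = bool(flag_lines) and "avx" in flag_lines[0].split()
    print("CPU reports AVX support: %s" % has_avx)

If this prints False, rebuilding with the AVX option disabled (reported as with_avx in the version strings elsewhere in this list) is the usual workaround.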

A small issue in the installation guide

During installation I created a build directory with mkdir, but running make inside that directory fails, while running make in the top-level directory succeeds (in my experience). Please add a clarifying note to the guide.

Compatibility issue with CUDA 8.0

Hi, there,

I can successfully build Paddle on my machine, which runs Ubuntu 14.04 LTS with CUDA 8.0, following the official guide. And, for sure, the CPU version runs well, except for the speed...

When I ran the image classification demo with the script train.sh in GPU mode (see the appended train.sh for details), it unfortunately failed and printed the following:

I0901 19:18:02.916951 31272 Util.cpp:144] commandline: ../../build/paddle/trainer/paddle_trainer --config=vgg_16_cifar.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=1 --gpu_id=0 --trainer_count=1 --num_passes=200 --save_dir=./cifar_vgg_model
I0901 19:18:09.213749 31272 Util.cpp:113] Calling runInitFunctions
I0901 19:18:09.428228 31272 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-01 19:18:09,580 layers.py:1430] channels=3 size=3072
[INFO 2016-09-01 19:18:09,580 layers.py:1430] output size for __conv_0__ is 32
[INFO 2016-09-01 19:18:09,583 layers.py:1430] channels=64 size=65536
[INFO 2016-09-01 19:18:09,583 layers.py:1430] output size for __conv_1__ is 32
[INFO 2016-09-01 19:18:09,586 layers.py:1490] output size for __pool_0__ is 16*16
[INFO 2016-09-01 19:18:09,587 layers.py:1430] channels=64 size=16384
[INFO 2016-09-01 19:18:09,587 layers.py:1430] output size for __conv_2__ is 16
[INFO 2016-09-01 19:18:09,590 layers.py:1430] channels=128 size=32768
[INFO 2016-09-01 19:18:09,590 layers.py:1430] output size for __conv_3__ is 16
[INFO 2016-09-01 19:18:09,592 layers.py:1490] output size for __pool_1__ is 8*8
[INFO 2016-09-01 19:18:09,593 layers.py:1430] channels=128 size=8192
[INFO 2016-09-01 19:18:09,594 layers.py:1430] output size for __conv_4__ is 8
[INFO 2016-09-01 19:18:09,596 layers.py:1430] channels=256 size=16384
[INFO 2016-09-01 19:18:09,597 layers.py:1430] output size for __conv_5__ is 8
[INFO 2016-09-01 19:18:09,599 layers.py:1430] channels=256 size=16384
[INFO 2016-09-01 19:18:09,599 layers.py:1430] output size for __conv_6__ is 8
[INFO 2016-09-01 19:18:09,601 layers.py:1490] output size for __pool_2__ is 4*4
[INFO 2016-09-01 19:18:09,602 layers.py:1430] channels=256 size=4096
[INFO 2016-09-01 19:18:09,603 layers.py:1430] output size for __conv_7__ is 4
[INFO 2016-09-01 19:18:09,605 layers.py:1430] channels=512 size=8192
[INFO 2016-09-01 19:18:09,605 layers.py:1430] output size for __conv_8__ is 4
[INFO 2016-09-01 19:18:09,608 layers.py:1430] channels=512 size=8192
[INFO 2016-09-01 19:18:09,608 layers.py:1430] output size for __conv_9__ is 4
[INFO 2016-09-01 19:18:09,610 layers.py:1490] output size for __pool_3__ is 2*2
[INFO 2016-09-01 19:18:09,611 layers.py:1490] output size for __pool_4__ is 1*1
[INFO 2016-09-01 19:18:09,615 networks.py:960] The input order is [image, label]
[INFO 2016-09-01 19:18:09,615 networks.py:963] The output order is [__cost_0__]
I0901 19:18:09.653937 31272 Trainer.cpp:169] trainer mode: Normal
F0901 19:18:09.658243 31272 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***
    @     0x7efd1b172daa  (unknown)
    @     0x7efd1b172ce4  (unknown)
    @     0x7efd1b1726e6  (unknown)
    @     0x7efd1b175687  (unknown)
    @           0x78b159  hl_gpu_apply_unary_op<>()
    @           0x753edf  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x753ac9  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x73e04f  paddle::BaseMatrixT<>::zero()
    @           0x62af8e  paddle::Parameter::enableType()
    @           0x6272ec  paddle::parameterInitNN()
    @           0x62975b  paddle::NeuralNetwork::init()
    @           0x62eda3  paddle::GradientMachine::create()
    @           0x6a84e5  paddle::TrainerInternal::init()
    @           0x6a4907  paddle::Trainer::init()
    @           0x543935  main
    @     0x7efd1a37ef45  (unknown)
    @           0x54efd5  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)
No data to plot. Exiting!

It seems that Paddle does not yet support the latest version of CUDA...

I have appended my train.sh as a clue:

(omitted the original copyright info)
#!/bin/bash

set -e
config=vgg_16_cifar.py
output=./cifar_vgg_model
log=train.log

../../build/paddle/trainer/paddle_trainer \
--config=$config \
--dot_period=10 \
--log_period=100 \
--test_all_data_in_one_period=1 \
--use_gpu=1 \
--gpu_id=0 \
--trainer_count=1 \
--num_passes=200 \
--save_dir=$output \
2>&1 | tee $log

python -m paddle.utils.plotcurve -i $log > plot.png

sxi_sock.h is missing during RDMA build

[ 86%] Building CXX object paddle/pserver/CMakeFiles/paddle_network.dir/LightNetwork.cpp.o
In file included from /usr/local/src/Paddle/paddle/pserver/LightNetwork.cpp:32:0:
/usr/local/src/Paddle/paddle/pserver/RDMANetwork.h:18:22: fatal error: sxi_sock.h: No such file or directory
#include "sxi_sock.h"
^
compilation terminated.
make[2]: *** [paddle/pserver/CMakeFiles/paddle_network.dir/LightNetwork.cpp.o] Error 1
make[1]: *** [paddle/pserver/CMakeFiles/paddle_network.dir/all] Error 2
make: *** [all] Error 2

Error when testing vgg_16_cifar.py

Installed successfully on Ubuntu 14.04 with CUDA 7.5 and cuDNN 5.1.5,
but running demo/image_classification/train.sh fails with the following error:

[INFO 2016-08-31 17:20:21,497 layers.py:1430] channels=512 size=8192
[INFO 2016-08-31 17:20:21,497 layers.py:1430] output size for conv_8 is 4
[INFO 2016-08-31 17:20:21,498 layers.py:1430] channels=512 size=8192
[INFO 2016-08-31 17:20:21,499 layers.py:1430] output size for conv_9 is 4
[INFO 2016-08-31 17:20:21,501 layers.py:1490] output size for pool_3 is 2_2
[INFO 2016-08-31 17:20:21,502 layers.py:1490] output size for pool_4 is 1_1
[INFO 2016-08-31 17:20:21,507 networks.py:960] The input order is [image, label]
[INFO 2016-08-31 17:20:21,507 networks.py:963] The output order is [cost_0]
I0831 17:20:21.523936 13974 Trainer.cpp:169] trainer mode: Normal
I0831 17:20:21.546594 13974 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-08-31 17:20:21,682 image_provider.py:52] Image size: 32
[INFO 2016-08-31 17:20:21,682 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-08-31 17:20:21,682 image_provider.py:58] DataProvider Initialization finished
I0831 17:20:21.682675 13974 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-08-31 17:20:21,682 image_provider.py:52] Image size: 32
[INFO 2016-08-31 17:20:21,682 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-08-31 17:20:21,682 image_provider.py:58] DataProvider Initialization finished
I0831 17:20:21.683006 13974 GradientMachine.cpp:134] Initing parameters..
I0831 17:20:22.312453 13974 GradientMachine.cpp:141] Init parameters done.
.........
I0831 17:20:52.894659 13974 TrainerInternal.cpp:162] Batch=100 samples=12800 AvgCost=2.35864 CurrentCost=2.35864 Eval: classification_error_evaluator=0.833906 CurrentEval: classification_error_evaluator=0.833906
.........
I0831 17:21:00.884374 13974 TrainerInternal.cpp:162] Batch=200 samples=25600 AvgCost=2.15774 CurrentCost=1.95684 Eval: classification_error_evaluator=0.792148 CurrentEval: classification_error_evaluator=0.750391
.........
I0831 17:21:08.731333 13974 TrainerInternal.cpp:162] Batch=300 samples=38400 AvgCost=2.01417 CurrentCost=1.72705 Eval: classification_error_evaluator=0.753672 CurrentEval: classification_error_evaluator=0.676719
.........I0831 17:21:15.873359 13974 TrainerInternal.cpp:179] Pass=0 Batch=391 samples=50048 AvgCost=1.90795 Eval: classification_error_evaluator=0.71814
F0831 17:21:18.497601 13974 hl_cuda_cudnn.cc:779] Check failed: CUDNN_STATUS_SUCCESS == cudnnStat (0 vs. 5) Cudnn Error: CUDNN_STATUS_INVALID_VALUE
*** Check failure stack trace: ***
@ 0x7f609f255daa (unknown)
@ 0x7f609f255ce4 (unknown)
@ 0x7f609f2556e6 (unknown)
@ 0x7f609f258687 (unknown)
@ 0x8a98d4 hl_convolution_forward()
@ 0x5c66fc paddle::CudnnConvLayer::forward()
@ 0x62305c paddle::NeuralNetwork::forward()
@ 0x6b54af paddle::Tester::testOneBatch()
@ 0x6b5dc2 paddle::Tester::testOnePeriod()
@ 0x69a28c paddle::Trainer::trainOnePass()
@ 0x69d687 paddle::Trainer::train()
@ 0x53b0b3 main
@ 0x7f609e461ec5 (unknown)
@ 0x546695 (unknown)
@ (nil) (unknown)

Changing the cuDNN version (5.0.5, 4.0.4) gives the same error.
Any help would be appreciated!

building error

Hi,
When building Paddle with gcc 4.8.2, there is an error:
Building CXX object paddle/utils/CMakeFiles/paddle_utils.dir/PythonUtil.cpp.o
/usr/local/include/python2.7/pyport.h:871:76: error: 'PyArg_ParseTuple' is an unrecognized format function type [-Werror=format=]

Thanks.

read data from hdfs

"Different node should owns different parts of all Train data. This simple script did not do this job, so you should prepare it at last. " I saw this in cluster training wiki. So, could paddle read data from hdfs and distribute data to each node automatically?

Floating point exception (overflow)

I used the same config as in issue #44 on CPU (only batch_size=45 was changed).
It looks like a floating point overflow, but I can't figure out what is causing it. Maybe an incorrect layer connection?

I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/utils/Util.cpp:144] commandline: /opt/paddle/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --save_dir=./model_output --job=train --use_gpu=false --trainer_count=4 --num_passes=100000 --log_period=10 --dot_period=1 --show_parameter_stats_period=1000 --test_all_data_in_one_period=1 --saving_period=100 
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/utils/Util.cpp:113] Calling runInitFunctions
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/utils/Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-08 02:48:21,778 networks.py:1122] The input order is [input, label]
[INFO 2016-09-08 02:48:21,778 networks.py:1129] The output order is [__cost_0__]
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Trainer.cpp:169] trainer mode: Normal
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/dataproviders/PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/dataproviders/PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/GradientMachine.cpp:134] Initing parameters..
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/GradientMachine.cpp:141] Init parameters done.
....I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/TrainerInternal.cpp:179]  Pass=0 Batch=4 samples=178 AvgCost=15884.9 Eval: classification_error_evaluator=0.993309 
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Tester.cpp:111]  Test samples=2 cost=172604 Eval: classification_error_evaluator=0.995207 
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/GradientMachine.cpp:112] Saving parameters to ./model_output/pass-00000
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/utils/Util.cpp:219] copy trainer_config.py to ./model_output/pass-00000
....I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/TrainerInternal.cpp:179]  Pass=1 Batch=4 samples=178 AvgCost=14954.9 Eval: classification_error_evaluator=0.980111 
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Tester.cpp:111]  Test samples=2 cost=159166 Eval: classification_error_evaluator=0.975115 
....I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/TrainerInternal.cpp:179]  Pass=2 Batch=4 samples=178 AvgCost=14009.3 Eval: classification_error_evaluator=0.935489 
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Tester.cpp:111]  Test samples=2 cost=135530 Eval: classification_error_evaluator=0.871007 
... some steps ....

I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Tester.cpp:111]  Test samples=2 cost=97979 Eval: classification_error_evaluator=0.676567 
....I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/TrainerInternal.cpp:179]  Pass=46 Batch=4 samples=178 AvgCost=8838.92 Eval: classification_error_evaluator=0.705236 
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Tester.cpp:111]  Test samples=2 cost=102650 Eval: classification_error_evaluator=0.730931 
....I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/TrainerInternal.cpp:179]  Pass=47 Batch=4 samples=178 AvgCost=8806.03 Eval: classification_error_evaluator=0.701997 
I /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Tester.cpp:111]  Test samples=2 cost=91264.2 Eval: classification_error_evaluator=0.659892 
/opt/paddle/bin/paddle: line 46:  4547 Floating point exception(core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

The error repeats after ~40 passes each time I run training.
Backtrace:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/opt/paddle/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --s'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0  0x00007f53d32d8a15 in __ieee754_exp_avx (x=<optimized out>) at ../sysdeps/ieee754/dbl-64/e_exp.c:214
214     ../sysdeps/ieee754/dbl-64/e_exp.c: No such file or directory.
[Current thread is 1 (Thread 0x7f53cec29700 (LWP 4548))]
(gdb) bt
#0  0x00007f53d32d8a15 in __ieee754_exp_avx (x=<optimized out>) at ../sysdeps/ieee754/dbl-64/e_exp.c:214
#1  0x00007f53d329847f in __GI___exp (x=711.2794189453125) at ../sysdeps/ieee754/dbl-64/w_exp.c:26
#2  0x0000000000e2c4dd in hppl::tanh (a=-355.639709) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/src/hl_cpu_functions.cc:33
#3  0x0000000000a3dd22 in hppl::forward::lstm::operator() (this=0x7f53cec28050, valueIn=@0x7f53cec28014: -0.940858305, valueIg=@0x7f53cec28010: 0.999997735, valueFg=@0x7f53cec2800c: 0.999997735, valueOg=@0x7f53cec28008: 0.999997735, prevState=@0x7f53cec27ff4: -354.699646, 
    state=@0x7f53cec27ff8: -355.639709, stateAtv=@0x7f53cec27ff0: 0.368853271, output=@0x7f53cec27fec: 0.165656254, checkI=@0x7f53cec28004: -0.0588896535, checkF=@0x7f53cec28000: -0.0764867961, checkO=@0x7f53cec27ffc: -0.0473404899, actInput=0xe2c4ba <hppl::tanh(float)>, 
    actGate=0xe2c431 <hppl::sigmoid(float)>, actState=0xe2c4ba <hppl::tanh(float)>) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/include/hl_lstm_ops.cuh:65
#4  0x0000000000a3ec6c in hl_naive_lstm_forward_one_sequence<hppl::forward::lstm> (op=..., value=..., frameSize=6, active_node=HL_ACTIVATION_TANH, active_gate=HL_ACTIVATION_SIGMOID, active_state=HL_ACTIVATION_TANH)
    at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/include/hl_cpu_lstm.cuh:60
#5  0x0000000000a3e662 in hl_cpu_lstm_forward<hppl::forward::lstm> (op=..., value=..., frameSize=6, active_node=HL_ACTIVATION_TANH, active_gate=HL_ACTIVATION_SIGMOID, active_state=HL_ACTIVATION_TANH) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/include/hl_cpu_lstm.cuh:348
#6  0x0000000000a3d94f in paddle::LstmCompute::forwardOneSequence<false> (this=0x2e423a8, value=..., frameSize=6) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/layers/LstmCompute.cpp:32
#7  0x0000000000a3da0f in paddle::LstmCompute::forwardBatch<false> (this=0x2e423a8, value=..., frameSize=6, batchSize=10) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/layers/LstmCompute.cpp:47
#8  0x0000000000a3b75d in paddle::LstmLayer::forwardBatch (this=0x2e42010, batchSize=37105, numSequences=11, starts=0x7f53b805bb40, inputValue=std::shared_ptr (count 2, weak 0) 0x7f53b80d1a10) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/layers/LstmLayer.cpp:501
#9  0x0000000000a38c8c in paddle::LstmLayer::forward (this=0x2e42010, passType=paddle::enumeration_wrapper::PASS_TRAIN) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/layers/LstmLayer.cpp:172
#10 0x0000000000ac2334 in paddle::NeuralNetwork::forward (this=0x2e1a3e0, inArgs=std::vector of length 2, capacity 2 = {...}, outArgs=0x2e10d08, passType=paddle::enumeration_wrapper::PASS_TRAIN)
    at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/NeuralNetwork.cpp:242
#11 0x0000000000ad620c in paddle::TrainerThread::forward (this=0x2e10be0) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/MultiGradientMachine.cpp:581
#12 0x0000000000ad5ef2 in paddle::TrainerThread::computeThread (this=0x2e10be0) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/MultiGradientMachine.cpp:519
#13 0x0000000000ad5abd in paddle::TrainerThread::<lambda()>::operator()(void) const (__closure=0x2ef45f8) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/MultiGradientMachine.cpp:465
#14 0x0000000000adb9b2 in std::_Bind_simple<paddle::TrainerThread::start()::<lambda()>()>::_M_invoke<>(std::_Index_tuple<>) (this=0x2ef45f8) at /opt/gcc/include/c++/4.9.4/functional:1700
#15 0x0000000000adb6ed in std::_Bind_simple<paddle::TrainerThread::start()::<lambda()>()>::operator()(void) (this=0x2ef45f8) at /opt/gcc/include/c++/4.9.4/functional:1688
#16 0x0000000000adb4d2 in std::thread::_Impl<std::_Bind_simple<paddle::TrainerThread::start()::<lambda()>()> >::_M_run(void) (this=0x2ef45e0) at /opt/gcc/include/c++/4.9.4/thread:115
#17 0x00007f53d363d380 in std::execute_native_thread_routine_compat (__p=<optimized out>) at ../../../../../libstdc++-v3/src/c++11/thread.cc:110
#18 0x00007f53d7578454 in start_thread (arg=0x7f53cec29700) at pthread_create.c:333
#19 0x00007f53d2da715d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) frame 0
#0  0x00007f53d32d8a15 in __ieee754_exp_avx (x=<optimized out>) at ../sysdeps/ieee754/dbl-64/e_exp.c:214
214     in ../sysdeps/ieee754/dbl-64/e_exp.c
(gdb) info locals
ctx = {env = {__control_word = <optimized out>, __glibc_reserved1 = <optimized out>, __status_word = <optimized out>, __glibc_reserved2 = <optimized out>, __tags = <optimized out>, __glibc_reserved3 = <optimized out>, __eip = <optimized out>, __cs_selector = <optimized out>, 
    __opcode = <optimized out>, __glibc_reserved4 = <optimized out>, __data_offset = <optimized out>, __data_selector = <optimized out>, __glibc_reserved5 = <optimized out>, __mxcsr = 39281}, updated_status = <optimized out>}
bexp = <optimized out>
t = 0.11041169086502123
eps = <optimized out>
del = <optimized out>
base = 0.11041259765625
y = 25769803776.110413
al = 1.1167387406605691
bet = -1.4572163044673799e-09
res = 1.1167377264919247
rem = -1.0141686444258574e-06
cor = -2.8229067033144067e-17
junk1 = <optimized out>
m = 1082538556
n = 1082538556
ex = <optimized out>
retval = <optimized out>
(gdb) frame 1
#1  0x00007f53d329847f in __GI___exp (x=711.2794189453125) at ../sysdeps/ieee754/dbl-64/w_exp.c:26
26      ../sysdeps/ieee754/dbl-64/w_exp.c: No such file or directory.
(gdb) info locals
z = <optimized out>
(gdb) frame 2
#2  0x0000000000e2c4dd in hppl::tanh (a=-355.639709) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/src/hl_cpu_functions.cc:33
33          return (2.0 / (1.0 + exp(-2.0*a))) - 1.0;
(gdb) info locals
No locals.
(gdb) frame 3
#3  0x0000000000a3dd22 in hppl::forward::lstm::operator() (this=0x7f53cec28050, valueIn=@0x7f53cec28014: -0.940858305, valueIg=@0x7f53cec28010: 0.999997735, valueFg=@0x7f53cec2800c: 0.999997735, valueOg=@0x7f53cec28008: 0.999997735, prevState=@0x7f53cec27ff4: -354.699646, 
    state=@0x7f53cec27ff8: -355.639709, stateAtv=@0x7f53cec27ff0: 0.368853271, output=@0x7f53cec27fec: 0.165656254, checkI=@0x7f53cec28004: -0.0588896535, checkF=@0x7f53cec28000: -0.0764867961, checkO=@0x7f53cec27ffc: -0.0473404899, actInput=0xe2c4ba <hppl::tanh(float)>, 
    actGate=0xe2c431 <hppl::sigmoid(float)>, actState=0xe2c4ba <hppl::tanh(float)>) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/include/hl_lstm_ops.cuh:65
65          stateAtv = actState(state);
(gdb) info locals
No locals.
(gdb) frame 4
#4  0x0000000000a3ec6c in hl_naive_lstm_forward_one_sequence<hppl::forward::lstm> (op=..., value=..., frameSize=6, active_node=HL_ACTIVATION_TANH, active_gate=HL_ACTIVATION_SIGMOID, active_state=HL_ACTIVATION_TANH)
    at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/include/hl_cpu_lstm.cuh:60
60          op(rValueIn,
(gdb) info locals
i = 3
rValueIn = -0.940858305
rValueFg = 0.999997735
rCheckO = -0.0473404899
rPrevState = -354.699646
rOut = 0.165656254
valueOg = 0x7f5324c72908
rValueIg = 0.999997735
valueIn = 0x7f5324c728c0
valueFg = 0x7f5324c728f0
rCheckI = -0.0588896535
valueIg = 0x7f5324c728d8
rValueOg = 0.999997735
rCheckF = -0.0764867961
rState = -355.639709
rStateAtv = 0.368853271
(gdb) frame 5
#5  0x0000000000a3e662 in hl_cpu_lstm_forward<hppl::forward::lstm> (op=..., value=..., frameSize=6, active_node=HL_ACTIVATION_TANH, active_gate=HL_ACTIVATION_SIGMOID, active_state=HL_ACTIVATION_TANH) at /home/foreach/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/include/hl_cpu_lstm.cuh:348
348         hl_naive_lstm_forward_one_sequence(op, value, frameSize,
(gdb) info locals
No locals.

Paddle build options:

cmake -DWITH_GPU=ON -DWITH_DOC=OFF -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=/opt/paddle ..

The BLAS backend is Intel MKL 11.3.3.210; the CPU is an Intel i5-4690K.
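
As a side note on the arithmetic itself (my own illustration, not part of the original report): frame 2 shows hppl::tanh evaluating 2.0 / (1.0 + exp(-2.0*a)) - 1.0 with a ≈ -355.64, so exp() is called at roughly 711.28 (the x shown in frame 1), which overflows a double. A minimal Python sketch of the same overflow:

    import math

    a = -355.639709  # LSTM state value taken from the backtrace above
    try:
        # The naive tanh formula from hl_cpu_functions.cc line 33:
        # exp(-2 * a) = exp(711.28) exceeds the double range.
        val = 2.0 / (1.0 + math.exp(-2.0 * a)) - 1.0
    except OverflowError as err:
        print("overflow, as in the SIGFPE report: %s" % err)

    # A saturating tanh avoids the trap.
    print("math.tanh(%s) = %s" % (a, math.tanh(a)))

The deeper question raised in the report still stands: the values in frame 3 (prevState ≈ -354.7, state ≈ -355.6) suggest the LSTM state itself is diverging during training.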

line 46: 3085 Illegal instruction

I built Paddle from source on Ubuntu 14 successfully.
But when I try to run the sentiment demo with ./train.sh, it raises the following.
What's wrong?

I0910 01:24:00.869675 3085 Util.cpp:113] Calling runInitFunctions
I0910 01:24:00.870432 3085 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-10 01:24:02,138 networks.py:1122] The input order is [word, label]
[INFO 2016-09-10 01:24:02,138 networks.py:1129] The output order is [cost_0]
I0910 01:24:02.245304 3085 Trainer.cpp:169] trainer mode: Normal
I0910 01:24:12.127163 3085 PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
[INFO 2016-09-10 01:24:12,160 dataprovider.py:22] dict len : 101745
I0910 01:24:12.680724 3085 PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
[INFO 2016-09-10 01:24:12,681 dataprovider.py:22] dict len : 101745
I0910 01:24:12.687304 3085 GradientMachine.cpp:134] Initing parameters..
I0910 01:24:13.452847 3085 GradientMachine.cpp:141] Init parameters done.
Current Layer forward/backward stack is
LayerName: lstmemory_0
LayerName: fc_layer_0
LayerName: embedding_0
LayerName: word
*** Aborted at 1473495865 (unix time) try "date -d @1473495865" if you are using GNU date ***
Current Layer forward/backward stack is
PC: @ 0x7b0581 hppl::relu()
Current Layer forward/backward stack is
*** SIGILL (@0x7b0581) received by PID 3085 (TID 0x7f38e974a700) from PID 8062337; stack trace: ***
Current Layer forward/backward stack is
@ 0x7f39014e5340 (unknown)
Current Layer forward/backward stack is
@ 0x7b0581 hppl::relu()
Current Layer forward/backward stack is
@ 0x5b722c paddle::LstmCompute::forwardOneSequence<>()
Current Layer forward/backward stack is
@ 0x5b77cb paddle::LstmCompute::forwardBatch<>()
Current Layer forward/backward stack is
@ 0x62b278 paddle::LstmLayer::forwardBatch()
Current Layer forward/backward stack is
@ 0x62cad0 paddle::LstmLayer::forward()
Current Layer forward/backward stack is
@ 0x53749c paddle::NeuralNetwork::forward()
Current Layer forward/backward stack is
@ 0x54e447 paddle::TrainerThread::forward()
Current Layer forward/backward stack is
@ 0x55027c paddle::TrainerThread::computeThread()
Current Layer forward/backward stack is
@ 0x7f39004a1a60 (unknown)
Current Layer forward/backward stack is
@ 0x7f39014dd182 start_thread
Current Layer forward/backward stack is
@ 0x7f38ffc0930d (unknown)
Current Layer forward/backward stack is
@ 0x0 (unknown)
/home/lbj/Paddle-master/build/bin/paddle: line 46: 3085 Illegal instruction (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

Error following the quick start

Hello, I set up the environment following your quick start guide, but running train.sh reports this error:
/usr/local/bin/paddle: line 46: 4977 Illegal instruction (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

Error running vgg_16_cifar

Running ./train.sh produces the following error:
I0910 08:11:05.670004 1881 GradientMachine.cpp:134] Initing parameters..
I0910 08:11:06.422868 1881 GradientMachine.cpp:141] Init parameters done.
/usr/local/bin/paddle: line 46: 1881 Killed ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}
No data to plot. Exiting!

The environment is paddledev/paddle:cpu-demo-latest, and the Paddle version is:
PaddlePaddle 0.8.0b, compiled with
with_avx: OFF
with_gpu: OFF
with_double: OFF
with_python: ON
with_rdma: OFF
with_glog: ON
with_gflags: ON
with_metric_learning:
with_timer: OFF
with_predict_sdk:

LSTM running error!

Hi all,
In the quick start demo, I have tested LR, WE+LR, and WE+CNN successfully.
But when I run WE+LSTM, the following error occurs:

I0905 12:50:46.881464 24807 Util.cpp:144] commandline: /data11/dis_ml/deeplearning/paddle/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.lstm.py --save_dir=./output_lstm --trainer_count=16 --log_period=1000 --num_passes=15 --use_gpu=false --show_parameter_stats_period=2000 --test_all_data_in_one_period=1
I0905 12:50:46.881670 24807 Util.cpp:113] Calling runInitFunctions
I0905 12:50:46.881896 24807 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-05 12:50:47,468 networks.py:1122] The input order is [word, label]
[INFO 2016-09-05 12:50:47,468 networks.py:1125] The output order is [cost_0]
I0905 12:50:47.492952 24807 Trainer.cpp:169] trainer mode: Normal
I0905 12:50:47.660745 24807 PyDataProvider2.cpp:219] loading dataprovider dataprovider_emb::process
I0905 12:50:47.684976 24807 PyDataProvider2.cpp:219] loading dataprovider dataprovider_emb::process
I0905 12:50:47.685173 24807 GradientMachine.cpp:134] Initing parameters..
I0905 12:50:47.901549 24807 GradientMachine.cpp:141] Init parameters done.
I0905 12:50:48.229571 24813 ThreadLocal.cpp:39] thread use undeterministic rand seed:24814
I0905 12:50:48.229737 24821 ThreadLocal.cpp:39] thread use undeterministic rand seed:24822
I0905 12:50:48.230121 24818 ThreadLocal.cpp:39] thread use undeterministic rand seed:24819
I0905 12:50:48.230481 24814 ThreadLocal.cpp:39] thread use undeterministic rand seed:24815
I0905 12:50:48.230881 24810 ThreadLocal.cpp:39] thread use undeterministic rand seed:24811
I0905 12:50:48.232058 24820 ThreadLocal.cpp:39] thread use undeterministic rand seed:24821
Current Layer forward/backward stack is
LayerName: lstmemory_0
LayerName: fc_layer_0
LayerName: embedding_0
LayerName: word
*** Aborted at 1473079848 (unix time) try "date -d @1473079848" if you are using GNU date ***
I0905 12:50:48.248039 24822 ThreadLocal.cpp:39] thread use undeterministic rand seed:24823
Current Layer forward/backward stack is
PC: @ 0x8024f0 (unknown)
I0905 12:50:48.253355 24811 ThreadLocal.cpp:39] thread use undeterministic rand seed:24812
I0905 12:50:48.254111 24812 ThreadLocal.cpp:39] thread use undeterministic rand seed:24813
I0905 12:50:48.256650 24816 ThreadLocal.cpp:39] thread use undeterministic rand seed:24817
I0905 12:50:48.259268 24823 ThreadLocal.cpp:39] thread use undeterministic rand seed:24824
I0905 12:50:48.260787 24819 ThreadLocal.cpp:39] thread use undeterministic rand seed:24820
I0905 12:50:48.263543 24815 ThreadLocal.cpp:39] thread use undeterministic rand seed:24816
I0905 12:50:48.264271 24808 ThreadLocal.cpp:39] thread use undeterministic rand seed:24809
I0905 12:50:48.265414 24817 ThreadLocal.cpp:39] thread use undeterministic rand seed:24818
I0905 12:50:48.271780 24809 ThreadLocal.cpp:39] thread use undeterministic rand seed:24810

demo sentiment: wrong parameter act_type

In the source file paddle/demo/sentiment/sentiment_net.py, lines 67 & 68 are:

output = fc_layer(input=dropout, size=class_dim,
                  act_type=SoftmaxActivation())

"act_type" should be changed to "act".
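
For reference, the corrected call with only that rename applied would be:

    output = fc_layer(input=dropout, size=class_dim,
                      act=SoftmaxActivation())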

Procedure to reproduce the bug:

  1. Uncomment line 39 and comment out lines 37 & 38 in paddle/demo/sentiment/trainer_config.py
  2. Run train.sh in this demo; the error log is as follows:

I0902 14:54:12.794140 2965 Util.cpp:144] commandline: /data/paddle_build/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --save_dir=./model_output --job=train --use_gpu=true --trainer_count=4 --num_passes=10 --log_period=10 --dot_period=20 --show_parameter_stats_period=100 --test_all_data_in_one_period=1
I0902 14:54:14.613126 2965 Util.cpp:113] Calling runInitFunctions
I0902 14:54:14.613312 2965 Util.cpp:126] Call runInitFunctions done.
[WARNING 2016-09-02 14:54:14,902 default_decorators.py:40] please use keyword arguments in paddle config.
[WARNING 2016-09-02 14:54:14,903 default_decorators.py:40] please use keyword arguments in paddle config.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer/config_parser.py", line 3113, in parse_config_and_serialize
config = parse_config(config_file, config_arg_str)
File "/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer/config_parser.py", line 3089, in parse_config
execfile(config_file, make_config_environment(config_file, config_args))
File "trainer_config.py", line 39, in
bidirectional_lstm_net(dict_dim, class_dim=class_dim, is_predict=is_predict)
File "/data/cactiball/paddle/demo/sentiment/sentiment_net.py", line 68, in bidirectional_lstm_net
act_type=SoftmaxActivation())
File "/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py", line 45, in wrapper
return func(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py", line 45, in wrapper
return func(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py", line 45, in wrapper
return func(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py", line 45, in wrapper
return func(_args, *_kwargs)
File "/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/layers.py", line 219, in wrapper
return method(_args, *_kwargs)
TypeError: fc_layer() got an unexpected keyword argument 'act_type'
F0902 14:54:14.905133 2965 PythonUtil.cpp:130] Check failed: (ret) != nullptr Python Error: <type 'exceptions.TypeError'> : fc_layer() got an unexpected keyword argument 'act_type'
Python Callstack:
/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer/config_parser.py : 3113
/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer/config_parser.py : 3089
trainer_config.py : 39
/data/cactiball/paddle/demo/sentiment/sentiment_net.py : 68
/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py : 45
/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py : 45
/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py : 45
/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/default_decorators.py : 45
/usr/local/lib/python2.7/dist-packages/paddle-0.8.0b-py2.7.egg/paddle/trainer_config_helpers/layers.py : 219
Call Object failed.
*** Check failure stack trace: ***
@ 0x7f6febce7daa (unknown)
@ 0x7f6febce7ce4 (unknown)
@ 0x7f6febce76e6 (unknown)
@ 0x7f6febcea687 (unknown)
@ 0x82f6d7 paddle::callPythonFuncRetPyObj()
@ 0x82fa3c paddle::callPythonFunc()
@ 0x6a8173 paddle::TrainerConfigHelper::TrainerConfigHelper()
@ 0x6a87b4 paddle::TrainerConfigHelper::createFromFlags()
@ 0x53af73 main
@ 0x7f6feaef3ec5 (unknown)
@ 0x5466b5 (unknown)
@ (nil) (unknown)
/data/paddle_build/bin/paddle: line 46: 2965 Aborted (core d


paddle.trainer.config_parser import error

I0831 15:27:18.271451 18844 Util.cpp:113] Calling runInitFunctions
I0831 15:27:18.271617 18844 Util.cpp:126] Call runInitFunctions done.
F0831 15:27:18.277050 18844 PythonUtil.cpp:120] Check failed: (pyModule) != nullptr Python Error: <type 'exceptions.ImportError'> : No module named paddle.trainer.config_parser
Python Callstack:
Import Python Modulepaddle.trainer.config_parser failed.
*** Check failure stack trace: ***
@ 0x7f6f31c25daa (unknown)
@ 0x7f6f31c25ce4 (unknown)
@ 0x7f6f31c256e6 (unknown)
@ 0x7f6f31c28687 (unknown)
@ 0x832e19 paddle::callPythonFuncRetPyObj()
@ 0x832ffc paddle::callPythonFunc()
@ 0x6a94e3 paddle::TrainerConfigHelper::TrainerConfigHelper()
@ 0x6a9b24 paddle::TrainerConfigHelper::createFromFlags()
@ 0x53af73 main
@ 0x7f6f30e31f45 (unknown)
@ 0x5466b5 (unknown)
@ (nil) (unknown)
/home/jamin/Paddle/bin/paddle: line 46: 18844 Aborted (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}


I used Anaconda Python, and at first I ran make install to /opt/Paddle.
I think this error happened because I used sudo make install?
When I use ipython I can import paddle.trainer.config_parser fine, but I can't train with ./train.sh.
Why?

Installation failed

Environment: Ubuntu 16.04 + CUDA 8.0 RC + cuDNN 5.1.5

CMake Error at paddle_cuda_generated_hl_table_apply.cu.o.cmake:262 (message):
  Error generating file
  /home/pcy/paddle/paddle/cuda/CMakeFiles/paddle_cuda.dir/src/./paddle_cuda_generated_hl_table_apply.cu.o


paddle/cuda/CMakeFiles/paddle_cuda.dir/build.make:2365: recipe for target 'paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_table_apply.cu.o' failed
make[2]: *** [paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_table_apply.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
Done processing /home/pcy/paddle/paddle/math/Vector.cpp
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/Allocator.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/BaseMatrix.h
Total errors found: 0
/home/pcy/paddle/paddle/cuda/include/hl_device_functions.cuh:39:13: error: 'double hppl::atomicAdd(double*, double)' conflicts with a previous declaration
 using hppl::atomicAdd;
             ^
/usr/local/cuda/include/sm_60_atomic_functions.h:308:46: note: previous declaration 'double atomicAdd(double*, double)'
 __SM_60_ATOMIC_FUNCTIONS_DECL__ double atomicAdd(double *address, double val) __DEF_IF_HOST
                                              ^
Done processing /home/pcy/paddle/paddle/math/Bits.h
Total errors found: 0
CMake Error at paddle_cuda_generated_hl_cuda_sequence.cu.o.cmake:262 (message):
  Error generating file
  /home/pcy/paddle/paddle/cuda/CMakeFiles/paddle_cuda.dir/src/./paddle_cuda_generated_hl_cuda_sequence.cu.o


paddle/cuda/CMakeFiles/paddle_cuda.dir/build.make:2140: recipe for target 'paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_sequence.cu.o' failed
make[2]: *** [paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_sequence.cu.o] Error 1
Done processing /home/pcy/paddle/paddle/math/CpuSparseMatrix.h
Total errors found: 0
/home/pcy/paddle/paddle/cuda/src/hl_cuda_aggregate.cu: In function 'void hl_vector_sum(paddle::real*, paddle::real*, int)':
/home/pcy/paddle/paddle/cuda/src/hl_cuda_aggregate.cu:268:21: error: comparison of constant '34' with boolean expression is always false [-Werror=bool-compare]
   } while (isNotReady == cudaErrorNotReady);
                     ^
/home/pcy/paddle/paddle/cuda/src/hl_cuda_aggregate.cu: In function 'void hl_vector_abs_sum(paddle::real*, paddle::real*, int)':
/home/pcy/paddle/paddle/cuda/src/hl_cuda_aggregate.cu:324:21: error: comparison of constant '34' with boolean expression is always false [-Werror=bool-compare]
   } while (isNotReady == cudaErrorNotReady);
                     ^
Done processing /home/pcy/paddle/paddle/math/ExecViaCpu.h
Total errors found: 0
cc1plus: all warnings being treated as errors
CMake Error at paddle_cuda_generated_hl_cuda_aggregate.cu.o.cmake:262 (message):
  Error generating file
  /home/pcy/paddle/paddle/cuda/CMakeFiles/paddle_cuda.dir/src/./paddle_cuda_generated_hl_cuda_aggregate.cu.o


paddle/cuda/CMakeFiles/paddle_cuda.dir/build.make:523: recipe for target 'paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_aggregate.cu.o' failed
make[2]: *** [paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_aggregate.cu.o] Error 1
Done processing /home/pcy/paddle/paddle/math/MathFunctions.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/MathUtils.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/Matrix.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/MemoryHandle.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/PoolAllocator.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/SIMDFunctions.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/SparseMatrix.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/SparseRowMatrix.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/Storage.h
Total errors found: 0
Done processing /home/pcy/paddle/paddle/math/Vector.h
Total errors found: 0
/home/pcy/paddle/paddle/cuda/include/hl_device_functions.cuh:39:13: error: 'double hppl::atomicAdd(double*, double)' conflicts with a previous declaration
 using hppl::atomicAdd;
             ^
/usr/local/cuda/include/sm_60_atomic_functions.h:308:46: note: previous declaration 'double atomicAdd(double*, double)'
 __SM_60_ATOMIC_FUNCTIONS_DECL__ double atomicAdd(double *address, double val) __DEF_IF_HOST
                                              ^
[ 48%] Built target paddle_math
CMake Error at paddle_gserver_generated_GruCompute.cu.o.cmake:262 (message):
  Error generating file
  /home/pcy/paddle/paddle/gserver/CMakeFiles/paddle_gserver.dir/layers/./paddle_gserver_generated_GruCompute.cu.o


paddle/gserver/CMakeFiles/paddle_gserver.dir/build.make:329: recipe for target 'paddle/gserver/CMakeFiles/paddle_gserver.dir/layers/paddle_gserver_generated_GruCompute.cu.o' failed
make[2]: *** [paddle/gserver/CMakeFiles/paddle_gserver.dir/layers/paddle_gserver_generated_GruCompute.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
/home/pcy/paddle/paddle/cuda/include/hl_device_functions.cuh:39:13: error: 'double hppl::atomicAdd(double*, double)' conflicts with a previous declaration
 using hppl::atomicAdd;
             ^
/usr/local/cuda/include/sm_60_atomic_functions.h:308:46: note: previous declaration 'double atomicAdd(double*, double)'
 __SM_60_ATOMIC_FUNCTIONS_DECL__ double atomicAdd(double *address, double val) __DEF_IF_HOST
                                              ^
CMake Error at paddle_gserver_generated_LstmCompute.cu.o.cmake:262 (message):
  Error generating file
  /home/pcy/paddle/paddle/gserver/CMakeFiles/paddle_gserver.dir/layers/./paddle_gserver_generated_LstmCompute.cu.o


paddle/gserver/CMakeFiles/paddle_gserver.dir/build.make:602: recipe for target 'paddle/gserver/CMakeFiles/paddle_gserver.dir/layers/paddle_gserver_generated_LstmCompute.cu.o' failed
make[2]: *** [paddle/gserver/CMakeFiles/paddle_gserver.dir/layers/paddle_gserver_generated_LstmCompute.cu.o] Error 1
CMakeFiles/Makefile2:519: recipe for target 'paddle/gserver/CMakeFiles/paddle_gserver.dir/all' failed
make[1]: *** [paddle/gserver/CMakeFiles/paddle_gserver.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/home/pcy/paddle/paddle/cuda/include/hl_device_functions.cuh:39:13: error: 'double hppl::atomicAdd(double*, double)' conflicts with a previous declaration
 using hppl::atomicAdd;
             ^
/usr/local/cuda/include/sm_60_atomic_functions.h:308:46: note: previous declaration 'double atomicAdd(double*, double)'
 __SM_60_ATOMIC_FUNCTIONS_DECL__ double atomicAdd(double *address, double val) __DEF_IF_HOST
                                              ^
CMake Error at paddle_cuda_generated_hl_cuda_sparse.cu.o.cmake:262 (message):
  Error generating file
  /home/pcy/paddle/paddle/cuda/CMakeFiles/paddle_cuda.dir/src/./paddle_cuda_generated_hl_cuda_sparse.cu.o


paddle/cuda/CMakeFiles/paddle_cuda.dir/build.make:1009: recipe for target 'paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_sparse.cu.o' failed
make[2]: *** [paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_sparse.cu.o] Error 1
/home/pcy/paddle/paddle/cuda/include/hl_device_functions.cuh:39:13: error: 'double hppl::atomicAdd(double*, double)' conflicts with a previous declaration
 using hppl::atomicAdd;
             ^
/usr/local/cuda/include/sm_60_atomic_functions.h:308:46: note: previous declaration 'double atomicAdd(double*, double)'
 __SM_60_ATOMIC_FUNCTIONS_DECL__ double atomicAdd(double *address, double val) __DEF_IF_HOST
                                              ^
CMake Error at paddle_cuda_generated_hl_cuda_lstm.cu.o.cmake:262 (message):
  Error generating file
  /home/pcy/paddle/paddle/cuda/CMakeFiles/paddle_cuda.dir/src/./paddle_cuda_generated_hl_cuda_lstm.cu.o


paddle/cuda/CMakeFiles/paddle_cuda.dir/build.make:1464: recipe for target 'paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_lstm.cu.o' failed
make[2]: *** [paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_lstm.cu.o] Error 1
CMakeFiles/Makefile2:299: recipe for target 'paddle/cuda/CMakeFiles/paddle_cuda.dir/all' failed
make[1]: *** [paddle/cuda/CMakeFiles/paddle_cuda.dir/all] Error 2
Makefile:149: recipe for target 'all' failed
make: *** [all] Error 2

Not defined error!

Hi, I have followed the exact installation steps mentioned in the docs.
Now when I run python trainer_config.lr.py, I get the error below:

NameError: name 'get_config_arg' is not defined
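
For what it's worth, get_config_arg is only defined when the config file is executed through Paddle's config parser (the tracebacks elsewhere in this list show parse_config calling execfile with a prepared environment), so running the file directly with python will not find it. A minimal sketch of going through the parser instead; the "is_predict=0" argument string is a hypothetical example on my part:

    from paddle.trainer.config_parser import parse_config

    # Mirrors the parse_config(config_file, config_arg_str) usage visible in the
    # tracebacks quoted in other issues above; the second argument is assumed.
    conf = parse_config("trainer_config.lr.py", "is_predict=0")
    print(conf)  # the parsed trainer configuration, if parsing succeeds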

cuda compiler error

I'm using Ubuntu 16.04 with gcc 5.4, using the hack of commenting out the #error in CUDA 7.5's host_config.h.

When building Paddle, the following compiler error is thrown:
[ 14%] Building NVCC (Device) object paddle/cuda/CMakeFiles/paddle_cuda.dir/src/paddle_cuda_generated_hl_cuda_aggregate.cu.o
/home/xxxx/devel/paddle/paddle/cuda/src/hl_cuda_aggregate.cu: In function ‘void hl_vector_sum(paddle::real*, paddle::real*, int)’:
/home/xxxx/devel/paddle/paddle/cuda/src/hl_cuda_aggregate.cu:268:21: error: comparison of constant ‘34’ with boolean expression is always false [-Werror=bool-compare]
} while (isNotReady == cudaErrorNotReady);

and other similar errors in the same file.

AttributeError: 'module' object has no attribute 'img_rnorm_layer' - while running image_classification demo training

I am trying to run the image_classification demo, but I get an error while running train.sh (see the log at the end).
I would appreciate your help in figuring out what is going wrong here.

  • Mandeep

[17:54 ~/paddle/src/paddle/demo/image_classification] > ./train.sh
I0902 17:54:59.306221 16647 Util.cpp:144] commandline: /home/deepbeast/paddle/bin/bin/../opt/paddle/bin/paddle_trainer --config=vgg_16_cifar.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=1 --trainer_count=1 --num_passes=200 --save_dir=./cifar_vgg_model
I0902 17:55:04.945137 16647 Util.cpp:113] Calling runInitFunctions
I0902 17:55:04.945307 16647 Util.cpp:126] Call runInitFunctions done.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/paddle/trainer/config_parser.py", line 3113, in parse_config_and_serialize
config = parse_config(config_file, config_arg_str)
File "/usr/local/lib/python2.7/dist-packages/paddle/trainer/config_parser.py", line 3089, in parse_config
execfile(config_file, make_config_environment(config_file, config_args))
File "vgg_16_cifar.py", line 15, in
from paddle.trainer_config_helpers import *
File "/usr/local/lib/python2.7/dist-packages/paddle/trainer_config_helpers/init.py", line 19, in
from layers import *
AttributeError: 'module' object has no attribute 'img_rnorm_layer'
F0902 17:55:05.116209 16647 PythonUtil.cpp:130] Check failed: (ret) != nullptr Python Error: <type 'exceptions.AttributeError'> : 'module' object has no attribute 'img_rnorm_layer'
Python Callstack:
/usr/local/lib/python2.7/dist-packages/paddle/trainer/config_parser.py : 3113
/usr/local/lib/python2.7/dist-packages/paddle/trainer/config_parser.py : 3089
vgg_16_cifar.py : 15
/usr/local/lib/python2.7/dist-packages/paddle/trainer_config_helpers/__init__.py : 19
Call Object failed.
*** Check failure stack trace: ***
@ 0x7fd2bd0f65cd google::LogMessage::Fail()
@ 0x7fd2bd0f8433 google::LogMessage::SendToLog()
@ 0x7fd2bd0f615b google::LogMessage::Flush()
@ 0x7fd2bd0f8e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x89fcb2 paddle::callPythonFuncRetPyObj()
@ 0x8a0176 paddle::callPythonFunc()
@ 0x6fb681 paddle::TrainerConfigHelper::TrainerConfigHelper()
@ 0x6fc99d paddle::TrainerConfigHelper::createFromFlags()
@ 0x547dc4 main
@ 0x7fd2bc280830 __libc_start_main
@ 0x5501a9 _start
@ (nil) (unknown)
/home/deepbeast/paddle/bin/bin/paddle: line 46: 16647 Aborted (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}
No data to plot. Exiting!

running error!

When I run quick_start, I get the following error:
paddle_trainer: error while loading shared libraries: libgflags.so.2: cannot open shared object file: No such file or directory
But I have installed gflags 2.2.0, which includes the following libs:
/usr/local/lib/libgflags.so /usr/local/lib/libgflags.so.2.2.0 /usr/local/lib/libgflags_nothreads.so.2.2
/usr/local/lib/libgflags.so.2.2 /usr/local/lib/libgflags_nothreads.so /usr/local/lib/libgflags_nothreads.so.2.2.0

Why does Paddle use libgflags.so.2 instead of libgflags.so.2.2.0?

Besides, when I installed gflags 2.0, which does include libgflags.so.2, I got another runtime error:
paddle: line 46: 1877 Illegal instruction ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

Why?

bug in hl_device_functions.cuh

I just did a git pull, but compiling Paddle failed. There is an extra #endif at the end of the file hl_device_functions.cuh. After commenting the extra #endif out, Paddle built fine with CUDA. Now I am going to see what difference using CUDA makes in the image_classification demo.

How to use CTC loss function?

How to use CTC loss function?
I'm planning to extend my dataset with unsegmented data, but I cannot figure out how to use ctc_error_evaluator and ctc_layer.
My model:

def stacked_gru_net_with_ctc(input_dim=24,
                     class_dim=133,
                     hid_dim=129,
                     stacked_num=3,
                     is_predict=False):
    """
    """
    assert stacked_num % 2 == 1

    linear = LinearActivation()

    data = data_layer("word", input_dim)
    fc1 = fc_layer(input=data, size=hid_dim, act=linear)
    gru1 = grumemory(input=fc1)

    inputs = [fc1, gru1]
    for i in range(2, stacked_num + 1):
        fc = fc_layer(input=inputs, size=hid_dim, act=linear)
        gru = grumemory(input=fc, reverse=(i % 2) == 0)
        inputs = [fc, gru]


    output = fc_layer(input=inputs, size=class_dim, act=SoftmaxActivation())

    if is_predict:
        outputs(output)
    else:
        # FIXME? #outputs(classification_cost(input=output, label=data_layer('label', class_dim), evaluator=ctc_error_evaluator))
        outputs(classification_cost(input=output, label=data_layer('label', class_dim), evaluator=precision_recall_evaluator))

dataprovider:

    settings.input_types = [
        dense_vector_sequence(24),
        integer_value_sequence(133)]

Also, I have no idea how to feed labels to the model if the sequences are not segmented. With a simple classification_cost each timestep has a label, but how do I use a dataprovider with unsegmented sequences and CTC?
My current dataset is 180 examples, each roughly 5000 timesteps (variable length). Each timestep is a length-24 float vector labeled with one int label in the range 0..132 (133 labels total).
Any example/suggestion would be highly appreciated. Thanks!
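
As a rough, unverified sketch only (the exact ctc_layer and ctc_error_evaluator signatures in this Paddle version should be checked against the docs, and the extra "blank" class is my assumption based on how CTC is normally set up), the cost part might be wired roughly like this:

    # Hypothetical: parameter names and the +1 blank class are assumptions.
    output = fc_layer(input=inputs, size=class_dim + 1,
                      act=SoftmaxActivation())
    label = data_layer('label', class_dim)
    cost = ctc_layer(input=output, label=label, size=class_dim + 1)
    ctc_error_evaluator(input=output, label=label)
    outputs(cost)

The dataprovider would then emit one (usually much shorter) integer_value_sequence transcript per example rather than one label per timestep.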

New LSTM error!

Once the LSTM has run more than 2000 batches in the first pass, a floating point exception occurs. Some of the printed info follows:

I0909 17:38:59.281976 23954 TrainerInternal.cpp:162] Batch=2000 samples=128000 AvgCost=0.239344 CurrentCost=0.201116 Eval: classification_error_evaluator=0.0999063 CurrentEval: classification_error_evaluator=0.0798125
I0909 17:40:21.527977 23954 Tester.cpp:111] Test samples=25000 cost=0.184603 Eval: classification_error_evaluator=0.07068
.............................................................................................................................. .........................................................................................................................................................................................................................................................................................................................................................................................................................Current Layer forward/backward stack is
LayerName: lstmemory_0
LayerName: fc_layer_0
LayerName: embedding_0
LayerName: word
*** Aborted at 1473414360 (unix time) try "date -d @1473414360" if you are using GNU date ***
Current Layer forward/backward stack is
PC: @ 0x7f0182f0ab25 __ieee754_exp
Current Layer forward/backward stack is
*** SIGFPE (@0x7f0182f0ab25) received by PID 23954 (TID 0x7f017130c700) from PID 18446744071611394853; stack trace: ***
Current Layer forward/backward stack is
@ 0x7f018414d710 (unknown)
Current Layer forward/backward stack is
@ 0x7f0182f0ab25 __ieee754_exp
Current Layer forward/backward stack is
@ 0x7f0182f20b52 __GI___exp
Current Layer forward/backward stack is
@ 0x800235 hppl::tanh()
Current Layer forward/backward stack is
@ 0x587383 paddle::LstmCompute::forwardOneSequence<>()
Current Layer forward/backward stack is
@ 0x58788a paddle::LstmCompute::forwardBatch<>()
Current Layer forward/backward stack is
@ 0x581bdc paddle::LstmLayer::forwardBatch()
Current Layer forward/backward stack is
@ 0x58521a paddle::LstmLayer::forward()
Current Layer forward/backward stack is
@ 0x614b94 paddle::NeuralNetwork::forward()
Current Layer forward/backward stack is
@ 0x61efe6 paddle::TrainerThread::forward()
Current Layer forward/backward stack is
@ 0x621194 paddle::TrainerThread::computeThread()
Current Layer forward/backward stack is
@ 0x7f01832553d2 execute_native_thread_routine
Current Layer forward/backward stack is
@ 0x7f01841459d1 start_thread
Current Layer forward/backward stack is
@ 0x7f0182a3a8fd clone
/data11/dis_ml/deeplearning/paddle/bin/paddle: line 46: 23954 Floating point exception${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

Error when running vgg_16_cifar

Running ./train.sh produces the following error:
/usr/local/bin/paddle: line 46: 209 Illegal instruction ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

The environment is paddledev/paddle:cpu-demo-latest, and the Paddle version is:
PaddlePaddle 0.8.0b, compiled with
with_avx: ON
with_gpu: OFF
with_double: OFF
with_python: ON
with_rdma: OFF
with_glog: ON
with_gflags: ON
with_metric_learning:
with_timer: OFF
with_predict_sdk:

gcc 6 ? (for archlinux + python2)

Hello,

In order to compare pyspark/tensorflow(+dask)/paddle, I created a package for Arch Linux. There are small differences, but IMHO one of them could be added upstream (here). What do you think about it?

No regression observed.

Regards

Getting hl_matrix_classification_error if using trainer_config settings.batch_size > 16

I can't run train.sh if trainer_config.py sets batch_size > 16. I get the following error.
train.log:

./train.sh
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/utils/Util.cpp:144] commandline: /opt/paddle/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --save_dir=./model_output --job=train --use_gpu=true --trainer_count=1 --num_passes=100000 --log_period=15 --dot_period=1 --show_parameter_stats_period=100 --test_all_data_in_one_period=1 --saving_period=100 --test_period=100
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/utils/Util.cpp:113] Calling runInitFunctions
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/utils/Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-06 20:10:47,439 networks.py:1122] The input order is [input, label]
[INFO 2016-09-06 20:10:47,439 networks.py:1129] The output order is [cost_0]
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/trainer/Trainer.cpp:169] trainer mode: Normal
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/dataproviders/PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/dataproviders/PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/GradientMachine.cpp:134] Initing parameters..
I /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/gserver/gradientmachines/GradientMachine.cpp:141] Init parameters done.
F /home/user/SOFT/BAIDU/PADDLE/Paddle/paddle/cuda/src/hl_cuda_matrix.cu:322] 0x933ba8[hl_matrix_classification_error] CUDA error: invalid configuration argument
/opt/paddle/bin/paddle: line 46: 10921 Aborted (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

I'm trying to solve a classification task with an LSTM model. My dataset is 180 examples, each roughly 5000 timesteps (variable length). Each timestep is a length-24 float vector labeled with an int label in the range [0, 132].

settings.input_types = [
    dense_vector_sequence(settings.inputSize),
    integer_value_sequence(settings.vocabSize)]

Smaller batch sizes, e.g. 12, give no error, but my data is not very redundant, so the gradients become unstable. My setup is a 980 Ti (6 GB VRAM); memory usage for batch_size=12 is ~20%.
trainer_config.py:
settings(batch_size=24,
         learning_rate=0.001,
         learning_method=RMSPropOptimizer())
stacked_lstm_net(input_dim=24, class_dim=133, hid_dim=24,
                 stacked_num=7, is_predict=is_predict)

stacked_lstm_net
# simple sequential lstm

lstm_act = TanhActivation()
fc_act = LinearActivation()

data = data_layer("input", size=input_dim)

fc1 = fc_layer(input=data, size=hid_dim, act=fc_act)
lstm1 = lstmemory(input=fc1, act=lstm_act)

inputs = [fc1, lstm1]
for i in range(2, stacked_num + 1):
    fc = fc_layer(input=inputs, size=hid_dim, act=fc_act)
    lstm = lstmemory(input=fc, act=lstm_act)
    inputs = [fc, lstm]

output = fc_layer(input=[inputs[0], inputs[1]], size=class_dim,
                  act=SoftmaxActivation())

if is_predict:
    outputs(output)
else:
    outputs(classification_cost(input=output, label=data_layer('label', class_dim)))

Could you please explain this error or point me to how to debug such an issue?

run vgg_16 error

I got an error when I ran "train.sh" in demo/image_classification:
I0907 14:35:07.593504 49407 Util.cpp:144] commandline: /home/hadoop/paddle/paddle/build/bin/../opt/paddle/bin/paddle_trainer --config=vgg_16_cifar.py --dot_period=10 --log_period=100 --test_all_data_in_one_period=1 --use_gpu=1 --trainer_count=1 --num_passes=200 --save_dir=./cifar_vgg_model
I0907 14:35:08.002375 49407 Util.cpp:113] Calling runInitFunctions
I0907 14:35:08.002609 49407 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-07 14:35:08,043 layers.py:1438] channels=3 size=3072
[INFO 2016-09-07 14:35:08,043 layers.py:1438] output size for conv_0 is 32
[INFO 2016-09-07 14:35:08,044 layers.py:1438] channels=64 size=65536
[INFO 2016-09-07 14:35:08,045 layers.py:1438] output size for conv_1 is 32
[INFO 2016-09-07 14:35:08,046 layers.py:1499] output size for pool_0 is 16_16
[INFO 2016-09-07 14:35:08,046 layers.py:1438] channels=64 size=16384
[INFO 2016-09-07 14:35:08,046 layers.py:1438] output size for conv_2 is 16
[INFO 2016-09-07 14:35:08,047 layers.py:1438] channels=128 size=32768
[INFO 2016-09-07 14:35:08,048 layers.py:1438] output size for conv_3 is 16
[INFO 2016-09-07 14:35:08,049 layers.py:1499] output size for pool_1 is 8_8
[INFO 2016-09-07 14:35:08,049 layers.py:1438] channels=128 size=8192
[INFO 2016-09-07 14:35:08,049 layers.py:1438] output size for conv_4 is 8
[INFO 2016-09-07 14:35:08,051 layers.py:1438] channels=256 size=16384
[INFO 2016-09-07 14:35:08,051 layers.py:1438] output size for conv_5 is 8
[INFO 2016-09-07 14:35:08,052 layers.py:1438] channels=256 size=16384
[INFO 2016-09-07 14:35:08,052 layers.py:1438] output size for conv_6 is 8
[INFO 2016-09-07 14:35:08,053 layers.py:1499] output size for pool_2 is 4_4
[INFO 2016-09-07 14:35:08,054 layers.py:1438] channels=256 size=4096
[INFO 2016-09-07 14:35:08,054 layers.py:1438] output size for conv_7 is 4
[INFO 2016-09-07 14:35:08,055 layers.py:1438] channels=512 size=8192
[INFO 2016-09-07 14:35:08,055 layers.py:1438] output size for conv_8 is 4
[INFO 2016-09-07 14:35:08,056 layers.py:1438] channels=512 size=8192
[INFO 2016-09-07 14:35:08,056 layers.py:1438] output size for conv_9 is 4
[INFO 2016-09-07 14:35:08,058 layers.py:1499] output size for pool_3 is 2_2
[INFO 2016-09-07 14:35:08,058 layers.py:1499] output size for pool_4 is 1_1
[INFO 2016-09-07 14:35:08,060 networks.py:1122] The input order is [image, label]
[INFO 2016-09-07 14:35:08,060 networks.py:1129] The output order is [cost_0]
I0907 14:35:08.067443 49407 Trainer.cpp:169] trainer mode: Normal
I0907 14:35:08.075389 49407 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-09-07 14:35:08,109 image_provider.py:52] Image size: 32
[INFO 2016-09-07 14:35:08,109 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-09-07 14:35:08,109 image_provider.py:58] DataProvider Initialization finished
I0907 14:35:08.109668 49407 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-09-07 14:35:08,109 image_provider.py:52] Image size: 32
[INFO 2016-09-07 14:35:08,109 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-09-07 14:35:08,109 image_provider.py:58] DataProvider Initialization finished
I0907 14:35:08.109978 49407 GradientMachine.cpp:134] Initing parameters..
I0907 14:35:08.554066 49407 GradientMachine.cpp:141] Init parameters done.
Current Layer forward/backward stack is
LayerName: batch_norm_10
LayerName: fc_layer_0
LayerName: dropout_0
LayerName: pool_4
LayerName: pool_3
LayerName: batch_norm_9
LayerName: conv_9
LayerName: batch_norm_8
LayerName: conv_8
LayerName: batch_norm_7
LayerName: conv_7
LayerName: pool_2
LayerName: batch_norm_6
LayerName: conv_6
LayerName: batch_norm_5
LayerName: conv_5
LayerName: batch_norm_4
LayerName: conv_4
LayerName: pool_1
LayerName: batch_norm_3
LayerName: conv_3
LayerName: batch_norm_2
LayerName: conv_2
LayerName: pool_0
LayerName: batch_norm_1
LayerName: conv_1
LayerName: batch_norm_0
LayerName: conv_0
LayerName: image
*** Aborted at 1473230129 (unix time) try "date -d @1473230129" if you are using GNU date ***
Current Layer forward/backward stack is
PC: @ 0x7fb72227a855 (unknown)
Current Layer forward/backward stack is
*** SIGSEGV (@0x130701aa00) received by PID 49407 (TID 0x7fb7386fe800) from PID 117549568; stack trace: ***
Current Layer forward/backward stack is
@ 0x7fb737fdf100 (unknown)
Current Layer forward/backward stack is
@ 0x7fb72227a855 (unknown)
Current Layer forward/backward stack is
@ 0x8b3350 hl_batch_norm_backward()
Current Layer forward/backward stack is
@ 0x5d4684 paddle::CudnnBatchNormLayer::backward()
Current Layer forward/backward stack is
@ 0x620bae paddle::NeuralNetwork::backward()
Current Layer forward/backward stack is
@ 0x69c95d paddle::TrainerInternal::forwardBackwardBatch()
Current Layer forward/backward stack is
@ 0x69cf14 paddle::TrainerInternal::trainOneBatch()
Current Layer forward/backward stack is
@ 0x698350 paddle::Trainer::trainOnePass()
Current Layer forward/backward stack is
@ 0x69ba47 paddle::Trainer::train()
Current Layer forward/backward stack is
@ 0x53aea3 main
Current Layer forward/backward stack is
@ 0x7fb73587bb15 __libc_start_main
Current Layer forward/backward stack is
@ 0x545b15 (unknown)
/home/hadoop/paddle/paddle/build/bin/paddle: line 46: 49407 Segmentation fault (core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}
No data to plot. Exiting!

Does anyone know why?

Is it OK to install requirements from source?

When building on Ubuntu, Paddle's requirements (cmake, protobuf, BLAS, etc.) can be installed in the following way:

sudo apt-get install -y g++ make cmake build-essential libatlas-base-dev python python-pip libpython-dev m4 libprotobuf-dev protobuf-compiler python-protobuf python-numpy git

But if I want to build Paddle on another platform that cannot connect to the internet, the only way is to build these requirements (cmake, g++, BLAS, protobuf, Python) from downloaded source code.

So I want to confirm whether it is OK to install these requirements from source.
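
As a related sanity check, this is roughly how I verify the prerequisites are visible before running cmake, whether they came from apt-get or from source (the tool and module lists below are my assumption of what the build needs):

from distutils.spawn import find_executable

# Command-line tools the build is expected to call.
for tool in ["cmake", "g++", "make", "protoc", "python"]:
    path = find_executable(tool)
    print("%-8s %s" % (tool, path or "NOT FOUND"))

# Python-side requirements.
for module in ["numpy", "google.protobuf"]:
    try:
        __import__(module)
        print("module %-16s OK" % module)
    except ImportError:
        print("module %-16s NOT FOUND" % module)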

thanks~

ImportError: No module named paddle.trainer.PyDataProvider2

This is my first time using Paddle and I'm not very familiar with it, so I am following the quick start docs step by step.
http://www.paddlepaddle.org/doc/demo/quick_start/index_en.html
http://www.paddlepaddle.org/doc_cn/index.html
Environment: ubuntu 12.04 and install cpu-deb.
I have finished the data preparation step, and when I try to use the dataprovider_bow.py script, it throws the error in my title. Here is the log:

cloud@cloud-virtual-machine:~/paddle-git/demo/quick_start$ python dataprovider_bow.py
Traceback (most recent call last):
File "dataprovider_bow.py", line 15, in
from paddle.trainer.PyDataProvider2 import *
ImportError: No module named paddle.trainer.PyDataProvider2

What's the problem and how can I get that package imported correctly? I also doubt whether I installed PaddlePaddle correctly; how can I confirm that? Thanks~
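
To narrow it down, a quick check of which interpreter is running and where it searches for packages (paths will differ on each machine):

import sys

print(sys.executable)         # which Python interpreter is running
print("\n".join(sys.path))    # where it searches for packages

try:
    import paddle.trainer.PyDataProvider2 as p
    print("paddle found at: %s" % p.__file__)
except ImportError as e:
    print("paddle not importable: %s" % e)

If the wheel was installed for a different interpreter than the one running the script, the import would fail exactly like this.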

make error!

[ 79%] Building CXX object paddle/utils/tests/CMakeFiles/test_CommandLineParser.dir/test_CommandLineParser.cpp.o
Linking CXX executable test_CommandLineParser
Done processing test_CommandLineParser.cpp
Total errors found: 0
/data10/anaconda2/lib/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_tmpnam':
/home/ilan/minonda/conda-bld/work/Python-2.7.12/./Modules/posixmodule.c:7631: warning: the use of `tmpnam_r' is dangerous, better use `mkstemp'
/data10/anaconda2/lib/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_tempnam':
/home/ilan/minonda/conda-bld/work/Python-2.7.12/./Modules/posixmodule.c:7578: warning: the use of `tempnam' is dangerous, better use `mkstemp'
/data10/anaconda2/lib/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_forkpty':
posixmodule.c:(.text+0x2a8f): undefined reference to `forkpty'
/data10/anaconda2/lib/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_openpty':
posixmodule.c:(.text+0x4776): undefined reference to `openpty'
collect2: error: ld returned 1 exit status
make[2]: *** [paddle/utils/tests/test_CommandLineParser] Error 1
make[1]: *** [paddle/utils/tests/CMakeFiles/test_CommandLineParser.dir/all] Error 2
make: *** [all] Error 2

Infer PADDLE_VERSION macro value from Git tag

I noticed the following code snippet in the /CMakeLists.txt file:

# add PaddlePaddle version
if(DEFINED ENV{PADDLE_VERSION})
    add_definitions(-DPADDLE_VERSION=\"$ENV{PADDLE_VERSION}\")
else()
    if(EXISTS ${PROJ_ROOT}/.svn/)
        find_package(Subversion REQUIRED)
        if(SUBVERSION_FOUND)
            Subversion_WC_INFO(${PROJ_ROOT} Project)
            add_definitions(-DPADDLE_VERSION=${Project_WC_REVISION})
        endif()
    endif()
endif()

It seems that the source code was managed by SVN but migrated to Git later, so the idea of getting the PADDLE_VERSION macro value from the .svn revision no longer works.

I saw a workaround just before the above code snippet:

set(PADDLE_MAJOR_VERSION 0)
set(PADDLE_MINOR_VERSION 8)
set(PADDLE_PATCH_VERSION 0b)
set(PADDLE_VERSION ${PADDLE_MAJOR_VERSION}.${PADDLE_MINOR_VERSION}.${PADDLE_PATCH_VERSION})

To understand how the macro PADDLE_VERSION is used, I did:

$ for i in $(du -a | cut -f 2); do if [[ -f $i ]]; then if grep PADDLE_VERSION $i; then echo === $i ===; fi; fi; done
set(CPACK_PACKAGE_VERSION ${PADDLE_VERSION})
=== ./cmake/package.cmake ===
set(PADDLE_VERSION ${PADDLE_MAJOR_VERSION}.${PADDLE_MINOR_VERSION}.${PADDLE_PATCH_VERSION})
if(DEFINED ENV{PADDLE_VERSION})
    add_definitions(-DPADDLE_VERSION=\"$ENV{PADDLE_VERSION}\")
            add_definitions(-DPADDLE_VERSION=${Project_WC_REVISION})
=== ./CMakeLists.txt ===
        echo "PaddlePaddle @PADDLE_VERSION@, compiled with"
=== ./paddle/scripts/submit_local.sh.in ===
#ifndef PADDLE_VERSION
#define PADDLE_VERSION "unknown"
  os << "paddle version: " << PADDLE_VERSION << std::endl << std::boolalpha
=== ./paddle/utils/Version.cpp ===
      version='${PADDLE_VERSION}',
=== ./python/setup.py.in ===

It seems that PADDLE_VERSION is used in several places, and it would be reasonable to infer its value from Git tags.

I confirmed that we can find the tag we are on by running:

git describe --tags --abbrev=0
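
A minimal sketch of how a build driver could consume that tag (this script is hypothetical, not part of the repo) is to export it as PADDLE_VERSION so the existing if(DEFINED ENV{PADDLE_VERSION}) branch in CMakeLists.txt picks it up:

import os
import subprocess

def git_version(default="0.8.0b"):
    # Use the nearest tag as the version string; fall back to the
    # hard-coded 0.8.0b when the tree is not a git checkout or has no tags.
    try:
        tag = subprocess.check_output(
            ["git", "describe", "--tags", "--abbrev=0"]).strip().decode()
        return tag.lstrip("v")        # e.g. "v0.8.0b" -> "0.8.0b"
    except (OSError, subprocess.CalledProcessError):
        return default

env = dict(os.environ, PADDLE_VERSION=git_version())
subprocess.check_call(["cmake", ".."], env=env)  # assumes we run from a build directory

When no tag is available, the sketch falls back to the same 0.8.0b string that is currently hard-coded.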

Sentiment analysis failures: invalid device function

Here is the error output:

./train.sh
I0903 14:10:57.917793 18690 Util.cpp:144] commandline: /data2/package/pypaddle/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --save_dir=./model_output --job=train --use_gpu=1 --trainer_count=4 --num_passes=10 --log_period=10 --dot_period=20 --show_parameter_stats_period=100 --test_all_data_in_one_period=1
I0903 14:11:01.704715 18690 Util.cpp:113] Calling runInitFunctions
I0903 14:11:01.705032 18690 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-03 14:11:02,367 networks.py:1122] The input order is [word, label]
[INFO 2016-09-03 14:11:02,368 networks.py:1129] The output order is [__cost_0__]
I0903 14:11:02.395427 18690 Trainer.cpp:169] trainer mode: Normal
I0903 14:11:02.395754 18690 MultiGradientMachine.cpp:108] numLogicalDevices=1 numThreads=4 numDevices=4
F0903 14:11:02.400593 18690 hl_gpu_matrix_kernel.cuh:181] Check failed: cudaSuccess == err (0 vs. 8) [hl_gpu_apply_unary_op failed] CUDA error: invalid device function
*** Check failure stack trace: ***
    @     0x7fc9c37175cd  google::LogMessage::Fail()
    @     0x7fc9c3719433  google::LogMessage::SendToLog()
    @     0x7fc9c371715b  google::LogMessage::Flush()
    @     0x7fc9c3719e1e  google::LogMessageFatal::~LogMessageFatal()
    @           0x7d65b2  hl_gpu_apply_unary_op<>()
    @           0x79d156  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x79ccf0  paddle::BaseMatrixT<>::applyUnary<>()
    @           0x780733  paddle::BaseMatrixT<>::zero()
    @           0x561960  paddle::Parameter::enableType()
    @           0x564531  paddle::parameterInitNN()
    @           0x567fe9  paddle::NeuralNetwork::init()
    @           0x55ee4b  paddle::TrainerThread::TrainerThread()
    @           0x55fab7  paddle::MultiGradientMachine::MultiGradientMachine()
    @           0x58788e  paddle::GradientMachine::create()
    @           0x6e296d  paddle::TrainerInternal::init()
    @           0x6dc144  paddle::Trainer::init()
    @           0x54622d  main
    @     0x7fc9c2699830  __libc_start_main
    @           0x54db19  _start
    @              (nil)  (unknown)
/data2/package/pypaddle/bin/paddle: line 46: 18690 Aborted                 ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

demo/sentiment$ ./train.sh error

lbj@ubuntu:~/work/Paddle-master/demo/sentiment$ ./train.sh
I0910 20:52:11.219410 26997 Util.cpp:144] commandline: /usr/local/bin/../opt/paddle/bin/paddle_trainer --config=trainer_config.py --save_dir=./model_output --job=train --use_gpu=false --trainer_count=4 --num_passes=10 --log_period=10 --dot_period=20 --show_parameter_stats_period=100 --test_all_data_in_one_period=1
I0910 20:52:11.220432 26997 Util.cpp:113] Calling runInitFunctions
I0910 20:52:11.221349 26997 Util.cpp:126] Call runInitFunctions done.
[INFO 2016-09-10 20:52:12,141 networks.py:1122] The input order is [word, label]
[INFO 2016-09-10 20:52:12,142 networks.py:1129] The output order is [cost_0]
I0910 20:52:12.241542 26997 Trainer.cpp:169] trainer mode: Normal
I0910 20:52:13.312613 26997 PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
[INFO 2016-09-10 20:52:13,584 dataprovider.py:22] dict len : 101745
I0910 20:52:13.779626 26997 PyDataProvider2.cpp:219] loading dataprovider dataprovider::process
[INFO 2016-09-10 20:52:13,780 dataprovider.py:22] dict len : 101745
I0910 20:52:13.781951 26997 GradientMachine.cpp:134] Initing parameters..
I0910 20:52:14.922147 26997 GradientMachine.cpp:141] Init parameters done.
I0910 20:52:25.065448 27001 ThreadLocal.cpp:39] thread use undeterministic rand seed:27002
I0910 20:52:25.857774 26999 ThreadLocal.cpp:39] thread use undeterministic rand seed:27000
I0910 20:52:26.466917 26998 ThreadLocal.cpp:39] thread use undeterministic rand seed:26999
I0910 20:52:26.739195 27000 ThreadLocal.cpp:39] thread use undeterministic rand seed:27001
/usr/local/bin/paddle: line 81: 26997 Killed ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}
