mathmanu / caffe-jacinto-models
This repository has moved. The new link can be obtained from https://github.com/TexasInstruments/jacinto-ai-devkit
Hello, I ran train_image_object_detection.sh and hit a problem in the test_quantize phase.
Here is the log.
I0910 14:36:28.498530 13312 common.cpp:475] GPU 0 'TITAN Xp' has compute capability 6.1
I0910 14:36:29.036025 13312 caffe.cpp:902] This is NVCaffe 0.17.0 started at Mon Sep 10 14:36:28 2018
I0910 14:36:29.036056 13312 caffe.cpp:904] CuDNN version: 7104
I0910 14:36:29.036072 13312 caffe.cpp:905] CuBLAS version: 9000
I0910 14:36:29.036077 13312 caffe.cpp:906] CUDA version: 9000
I0910 14:36:29.036082 13312 caffe.cpp:907] CUDA driver version: 9010
I0910 14:36:29.036089 13312 caffe.cpp:908] Arguments:
[0]: /home/junxiang/caffe-jacinto/build/tools/caffe.bin
[1]: test_detection
[2]: --model=training/voc0712/JDetNet/20180828_14-57_ds_PSP_dsFac_32_hdDS8_1/test_quantize/test.prototxt
[3]: --iterations=496
[4]: --weights=training/voc0712/JDetNet/20180828_14-57_ds_PSP_dsFac_32_hdDS8_1/sparse/voc0712_ssdJacintoNetV2_iter_120000.caffemodel
…………………………………………
I0910 14:36:48.912307 13312 net.cpp:2195] Enabling quantization at output of: Concat mbox_loc
I0910 14:36:48.912477 13312 net.cpp:2195] Enabling quantization at output of: Concat mbox_conf
I0910 14:36:48.912649 13312 net.cpp:2195] Enabling quantization at output of: Concat mbox_priorbox
I0910 14:36:48.917215 13350 common.cpp:192] New stream 0x7fa3ac006960, device 0, thread 13350
F0910 14:36:48.941680 13312 permute_layer.cu:70] Check failed: error == cudaSuccess (7 vs. 0) too many resources requested for launch
*** Check failure stack trace: ***
@ 0x7fa4660295cd google::LogMessage::Fail()
@ 0x7fa46602b433 google::LogMessage::SendToLog()
@ 0x7fa46602915b google::LogMessage::Flush()
@ 0x7fa46602be1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7fa467a7ce48 caffe::PermuteLayer<>::Forward_gpu()
@ 0x7fa4673e8a7f caffe::Layer<>::Forward()
@ 0x7fa4672561fe caffe::Net::ForwardFromTo()
@ 0x7fa46725633d caffe::Net::Forward()
@ 0x44cc4b test_detection()
@ 0x4521f2 main
@ 0x7fa4647ab830 __libc_start_main
@ 0x449699 _start
@ (nil) (unknown)
Can you tell me how to solve this problem?
I want to learn it and make some changes.
I am using v0.17
I0810 14:59:47.258464 25579 solver.cpp:352] Iteration 7800 (1.27604 iter/s, 78.3675s/100 iter), 15.1/452.4ep, loss = 5.78276
I0810 14:59:47.258651 25579 solver.cpp:376] Train net output #0: mbox_loss = 6.02961 (* 1 = 6.02961 loss)
I0810 14:59:47.258673 25579 sgd_solver.cpp:172] Iteration 7800, lr = 0.01, m = 0.9, wd = 0.0001, gs = 1
I0810 15:01:05.190842 25579 solver.cpp:352] Iteration 7900 (1.28319 iter/s, 77.9311s/100 iter), 15.3/452.4ep, loss = 5.22862
I0810 15:01:05.191012 25579 solver.cpp:376] Train net output #0: mbox_loss = 5.81463 (* 1 = 5.81463 loss)
I0810 15:01:05.191031 25579 sgd_solver.cpp:172] Iteration 7900, lr = 0.01, m = 0.9, wd = 0.0001, gs = 1
I0810 15:02:23.070077 25579 solver.cpp:905] Snapshotting to binary proto file training/voc0712/JDetNet/_ds_PSP_dsFac_32_hdDS8_1/initial/voc0712_ssdJacintoNetV2_iter_8000.caffemodel
I0810 15:02:23.083901 25580 net.cpp:1071] Ignoring source layer mbox_loss
I0810 15:02:23.112282 25579 sgd_solver.cpp:398] Snapshotting solver state to binary proto file training/voc0712/JDetNet/_ds_PSP_dsFac_32_hdDS8_1/initial/voc0712_ssdJacintoNetV2_iter_8000.solverstate
I0810 15:02:23.131805 25579 solver.cpp:635] Iteration 8000, Testing net (#0)
F0810 15:02:23.132831 25579 net.cpp:1081] Check failed: target_blobs[j]->shape() == source_blob->shape() Cannot share param 0 weights from layer 'conv1a/bn'; shape mismatch. Source param shape is 1 32 1 1 (32); target param shape is 32 (32)
*** Check failure stack trace: ***
@ 0x7f43d58d85cd google::LogMessage::Fail()
@ 0x7f43d58da433 google::LogMessage::SendToLog()
@ 0x7f43d58d815b google::LogMessage::Flush()
@ 0x7f43d58dae1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f43d6879b3e caffe::Net::ShareTrainedLayersWith()
@ 0x7f43d68edc6c caffe::Solver::TestDetection()
@ 0x7f43d68f0a87 caffe::Solver::TestAll()
@ 0x7f43d68f1347 caffe::Solver::Step()
@ 0x7f43d68f2d62 caffe::Solver::Solve()
@ 0x7f43d68e3c0a caffe::P2PSync::InternalThreadEntry()
@ 0x7f43d6472b2c caffe::InternalThread::entry()
@ 0x7f43d6474ddb boost::detail::thread_data<>::run()
@ 0x7f43d3dcf5d5 (unknown)
@ 0x7f43d38a06ba start_thread
@ 0x7f43d40eb41d clone
@ (nil) (unknown)
Aborted (core dumped)
Has any performance comparison been done for object detection between the following two models?
jdetnet21 vs mobilenet
I see that jdetnet21 is loosely based on mobilenet. Are there any particular benefits to using one over the other?
Is there any information about SSD, such as a trained model and its performance?
```python
import math

import torch


class Quantized:
    """Quantize weights to 8-bit fixed point with a power-of-two scale."""

    def __init__(self):
        pass

    @staticmethod
    def check_valid(key):
        # Skip BatchNorm statistics; only learned parameters are quantized.
        invalid = ['running_mean', 'running_var', 'num_batches_tracked']
        for iv in invalid:
            if iv in key:
                return False
        return True

    def get_range(self, metric):
        # Per-tensor range [Rmin, Rmax], read directly from the tensor.
        minr = torch.min(metric).item()
        maxr = torch.max(metric).item()
        return minr, maxr

    @staticmethod
    def clip(metric, mq):
        # Scale, round to integers, saturate to int8, then scale back.
        metric = torch.clamp(torch.round(metric * mq), -128, 127)
        return metric / mq

    def __call__(self, state_dict):
        for key in list(state_dict):
            if Quantized.check_valid(key):
                metric = state_dict[key]
                minr, maxr = self.get_range(metric)
                max_abs = max(abs(minr), abs(maxr))
                if max_abs == 0:
                    continue  # all-zero tensor: log2 is undefined, nothing to do
                # The integer length must be a whole number of bits (+1 for sign).
                int_len = math.ceil(math.log2(max_abs)) + 1
                fac_len = 8 - int_len
                mq = math.pow(2, fac_len)
                metric = Quantized.clip(metric, mq)
                state_dict[key] = metric
                print("minr: {} maxr: {}".format(minr, maxr))
        print("quantized complete ^_^")
        return state_dict
```
I wrote it in PyTorch, but the results are poor. I need your help.
1. How can I get the range [Rmin, Rmax]?
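On the range question, one common approach (not necessarily what caffe-jacinto does internally) is to take the per-tensor minimum and maximum and derive a power-of-two scale from the larger magnitude. Below is a minimal sketch in plain Python; `tensor_range`, `frac_bits` and `quantize` are hypothetical helper names, and the sample weights are made up for illustration.

```python
import math


def tensor_range(values):
    """Return (Rmin, Rmax) over a flat list of floats."""
    return min(values), max(values)


def frac_bits(rmin, rmax, num_bits=8):
    """Fractional bits for a signed power-of-two fixed-point format."""
    max_abs = max(abs(rmin), abs(rmax))
    if max_abs == 0:
        return num_bits - 1  # all zeros: any scale works
    int_bits = math.ceil(math.log2(max_abs)) + 1  # +1 for the sign bit
    return num_bits - int_bits


def quantize(values, num_bits=8):
    """Round to the fixed-point grid, saturate, and map back to float."""
    rmin, rmax = tensor_range(values)
    scale = 2.0 ** frac_bits(rmin, rmax, num_bits)
    lo, hi = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return [max(lo, min(hi, round(v * scale))) / scale for v in values]


weights = [-0.3, 0.05, 0.22]  # made-up sample values
print(tensor_range(weights))
print(quantize(weights))
```

For activations the same idea would need a range gathered over several input batches rather than a single tensor; that part is left out of this sketch.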
Or change to the address of ti.com such as
Dear,
I tried to build with:
USE_CUDNN := 0
CPU_ONLY := 1
But I always get this error:
CXX src/caffe/blob.cpp
In file included from ./include/caffe/common.hpp:48:0,
from ./include/caffe/blob.hpp:11,
from src/caffe/blob.cpp:6:
./include/caffe/util/device_alternate.hpp:3:23: fatal error: cublas_v2.h: No such file or directory
#include <cublas_v2.h>
^
compilation terminated.
make: *** [.build_release/src/caffe/blob.o] Error 1
How can I build it without a GPU?
Thanks and best regards
He Wei
I0611 00:04:31.663493 8740 net.cpp:403] Top memory (TEST) required for data: 1703675240 diff: 1703675240
I0611 00:04:31.663501 8740 net.cpp:406] Bottom memory (TEST) required for data: 1703674816 diff: 1703674816
I0611 00:04:31.663507 8740 net.cpp:409] Shared (in-place) memory (TEST) by data: 695552000 diff: 695552000
I0611 00:04:31.663511 8740 net.cpp:412] Parameters memory (TEST) required for data: 19132568 diff: 19132568
I0611 00:04:31.663516 8740 net.cpp:415] Parameters shared memory (TEST) by data: 0 diff: 0
I0611 00:04:31.663519 8740 net.cpp:421] Network initialization done.
F0611 00:04:31.663878 8740 io.cpp:55] Check failed: fd != -1 (-1 vs. -1) File not found: training/voc0712/JDetNet/20190611_00-04_ds_PSP_dsFac_32_hdDS8_1/sparse/voc0712_ssdJacintoNetV2_iter_120000.caffemodel
*** Check failure stack trace: ***
@ 0x7f6b599f75cd google::LogMessage::Fail()
@ 0x7f6b599f9433 google::LogMessage::SendToLog()
@ 0x7f6b599f715b google::LogMessage::Flush()
@ 0x7f6b599f9e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f6b5a5d84dc caffe::ReadProtoFromBinaryFile()
@ 0x7f6b5a653de6 caffe::ReadNetParamsFromBinaryFileOrDie()
@ 0x7f6b5aa99dea caffe::Net::CopyTrainedLayersFromBinaryProto()
@ 0x7f6b5aa99e8e caffe::Net::CopyTrainedLayersFrom()
@ 0x41202c test_detection()
@ 0x40d1d0 main
@ 0x7f6b58179830 __libc_start_main
@ 0x40de69 _start
@ (nil) (unknown)
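The check above fails because the weights file passed to the test stage is not on disk. One thing worth checking before rerunning is which snapshots the stage folder actually contains. A minimal sketch, assuming snapshots follow the `*_iter_<N>.caffemodel` naming seen in the log; `latest_snapshot` is a hypothetical helper and the folder in the example call is a placeholder, not a real run.

```python
import glob
import os
import re


def latest_snapshot(stage_dir):
    """Return the .caffemodel with the highest iteration in stage_dir, or None."""
    best_iter, best_path = -1, None
    for path in glob.glob(os.path.join(stage_dir, "*.caffemodel")):
        match = re.search(r"_iter_(\d+)\.caffemodel$", path)
        if match and int(match.group(1)) > best_iter:
            best_iter, best_path = int(match.group(1)), path
    return best_path


# Placeholder folder name; substitute the timestamped run folder from your log.
print(latest_snapshot("training/voc0712/JDetNet/my_run/sparse"))
```

If this prints `None` for the `sparse` folder, the earlier stage did not leave a snapshot there, so the test stage has nothing to load.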
Hi,
While running the initial training stage, I get the error below.
I0302 02:22:22.283617 5159 common.cpp:528] NVML initialized, thread 5159
I0302 02:22:22.285171 5132 net.cpp:1071] Ignoring source layer mbox_loss
F0302 02:22:22.321794 5132 solver.cpp:668] Check failed: result[j]->width() == 5 (3 vs. 5)
*** Check failure stack trace: ***
@ 0x7f65e6a015cd google::LogMessage::Fail()
@ 0x7f65e6a03433 google::LogMessage::SendToLog()
@ 0x7f65e6a0115b google::LogMessage::Flush()
@ 0x7f65e6a03e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f65e7a09a38 caffe::Solver::TestDetection()
@ 0x7f65e7a0a857 caffe::Solver::TestAll()
@ 0x7f65e7a0b3bc caffe::Solver::Step()
@ 0x7f65e7a0d512 caffe::Solver::Solve()
@ 0x410732 train()
@ 0x40d310 main
@ 0x7f65e5034830 __libc_start_main
@ 0x40dfa9 _start
@ (nil) (unknown)
Kindly share your comments.
I0908 09:01:46.720330 7901 caffe.cpp:268] Solver performance on device 0: 1.667 * 32 = 53.33 img/sec (6 itr in 2.4 sec)
I0908 09:01:46.720353 7901 caffe.cpp:271] Optimization Done in 16s
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >'
what(): boost: mutex lock failed in pthread_mutex_lock: Invalid argument
*** Aborted at 1567904507 (unix time) try "date -d @1567904507" if you are using GNU date ***
PC: @ 0x7fe0f0f8ae97 gsignal
*** SIGABRT (@0x3e800001edd) received by PID 7901 (TID 0x7fe079fff700) from PID 7901; stack trace: ***
@ 0x7fe0f0f8af20 (unknown)
@ 0x7fe0f0f8ae97 gsignal
@ 0x7fe0f0f8c801 abort
@ 0x7fe0f1d9e957 (unknown)
@ 0x7fe0f1da4ab6 (unknown)
@ 0x7fe0f1da4af1 std::terminate()
@ 0x7fe0f1da4d24 __cxa_throw
@ 0x7fe0f3389734 boost::throw_exception<>()
@ 0x7fe0f33898c7 boost::unique_lock<>::lock()
@ 0x7fe0f37c51af caffe::BlockingQueue<>::push()
@ 0x7fe0f3484e76 caffe::AnnotatedDataLayer<>::load_batch()
@ 0x7fe0f3748656 caffe::BasePrefetchingDataLayer<>::InternalThreadEntryN()
@ 0x7fe0f33688c6 caffe::InternalThread::entry()
@ 0x7fe0f336b03b boost::detail::thread_data<>::run()
@ 0x7fe0e6f92bcd (unknown)
@ 0x7fe0d0ade6db start_thread
@ 0x7fe0f106d88f clone
Hi,
When I tried with mobiledetnetv2, I got the following problem:
solver_param: {'type': 'SGD', 'max_iter': 120000, 'stepvalue': [60000, 90000, 300000], 'base_lr': 0.01, 'lr_policy': 'multistep', 'power': 1.0, 'weight_decay': 0.0001}
config_param: {'model_name': 'mobiledetnetv2-0.5', 'config_name': '/home/user/projects/caffe-jdetnet/trained_models/rovit_traffic_dataset/mobiledetnetv2-0.5/2019_01_24/ssd_256x256_ds_PSP_dsFac_32_hdDS8_1/initial', 'gpus': '0', 'threads': 8, 'pretrain_model': '/home/user/projects/caffe-jdetnet/pretrained_models/imagenet_mobilenet-0.5_iter_320000.caffemodel', 'dataset': 'rovit_traffic_dataset', 'train_data': '/media/user/DATA/data/rovit_traffic_dataset/lmdb/rovit_traffic_dataset_trainval_lmdb', 'test_data': '/media/user/DATA/data/rovit_traffic_dataset/lmdb/rovit_traffic_dataset_test_lmdb', 'name_size_file': '/media/user/DATA/data/rovit_traffic_dataset/test_name_size.txt', 'label_map_file': '/media/user/DATA/data/rovit_traffic_dataset/labelmap_rovit_traffic_dataset.prototxt', 'num_test_image': 21375, 'num_classes': 8, 'ssd_size': '256x256', 'use_batchnorm_mbox': 1, 'small_object': 1, 'mean_value': 128, 'use_batchnorm': False, 'use_scale': True, 'lr_mult': 1, 'kernel_mbox_loc_conf': 1, 'chop_num_heads': 0, 'num_intermediate': 512, 'rhead_name_non_linear': 0, 'first_hd_same_op_ch': 1, 'reg_head_at_ds8': 1, 'aspect_ratio_type': 1, 'concat_reg_head': 0, 'base_nw_3_head': 0, 'use_difficult_gt': 1, 'evaluate_difficult_gt': 0, 'ignore_difficult_gt': False, 'fully_conv_at_end': 0, 'force_color': 0, 'shuffle': 1, 'use_image_list': 1, 'log_space_steps': 0, 'min_ratio': 5, 'max_ratio': 85, 'batch_size': 16, 'accum_batch_size': 16, 'test_batch_size': 8, 'feature_stride': 32, 'num_feature': 32, 'ds_type': 'PSP', 'ds_fac': 32, 'min_dim': 256, 'resize_width': 256, 'resize_height': 256, 'crop_width': 256, 'crop_height': 256, 'run_soon': True, 'resume_training': True, 'remove_old_models': False, 'stride_list': None, 'dilation_list': None, 'freeze_layers': [], 'flip': True, 'clip': False, 'share_location': True, 'background_label_id': 0, 'normalization_mode': 1, 'code_type': 2, 'ignore_cross_boundary_bbox': False, 'mining_type': 1, 'neg_pos_ratio': 3.0, 'loc_weight': 1.0}
config_param.ds_fac: 32
config_param.stride_list: [2, 2, 2, 2, 2]
Traceback (most recent call last):
File "/home/user/projects/caffe-jdetnet/train_jdetnet.py", line 221, in <module>
train(config_param, solver_param, caffe_cmd)
File "/home/user/projects/caffe-jdetnet/models/train_jdetnet_model.py", line 768, in train
net, out_layer, out_layer_names = CoreNetwork(config_param, net, out_layer)
File "/home/user/projects/caffe-jdetnet/models/train_jdetnet_model.py", line 338, in CoreNetwork
num_intermediate=config_param['num_intermediate'])
File "/home/user/projects/caffe-jdetnet/models/mobilenetv2.py", line 230, in mobiledetnetv2
num_input = num_channels[from_layer]
KeyError: 'relu5_5/sep'
Is this a problem with the model architecture? How can I solve it?
Thank you.
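The traceback shows a plain dictionary lookup, `num_channels[from_layer]`, failing for `'relu5_5/sep'`, so a first debugging step is to see which layer names the table does contain. A minimal sketch of such a lookup; `lookup_channels` is a hypothetical helper, and the table contents below are placeholders rather than the real entries built in mobilenetv2.py.

```python
def lookup_channels(num_channels, from_layer):
    """Look up a layer's channel count, listing the known layers on failure."""
    try:
        return num_channels[from_layer]
    except KeyError:
        known = ", ".join(sorted(num_channels))
        raise KeyError(
            "no channel count for layer {!r}; known layers: {}".format(from_layer, known)
        ) from None


# Placeholder table; the real one is built inside mobilenetv2.py.
num_channels = {"layer_a": 16, "layer_b": 32}
print(lookup_channels(num_channels, "layer_a"))
```

Comparing the reported names against `from_layer` shows whether the requested layer was never created or is simply named differently for this model variant.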
Hello, I ran the ImageNet classification demo and found that it is slower with quantize=true. Is that normal?
using quantization
not using quantization
I was able to train MobileNet; however, I believe the EVE cores are overloaded, so there is no image output.
# Default - 0
randParams = 0
# 0: Caffe, 1: TensorFlow, Default - 0
modelType = 0
# 0: Fixed quantization by training framework, 1: Dynamic quantization by TIDL, Default - 1
quantizationStyle = 1
# quantRoundAdd/100 will be added while rounding to integer, Default - 50
quantRoundAdd = 25
numParamBits = 8
# 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
inElementType = 0
inputNetFile = "deploy.prototxt"
inputParamsFile = "voc0712_mobiledetnet-0.5_iter_120000.caffemodel"
outputNetFile = "NET_OD_mobilenet.bin"
outputParamsFile = "PRM_OD_mobilenet.bin"
rawSampleInData = 1
preProcType = 4
sampleInData = "trace_dump_0_768x320.y"
tidlStatsTool = "eve_test_dl_algo.out.exe"
Has anyone verified MobileNet with the TIDL OD use case? If so, could you please share the import configuration?
Thanks in advance.
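As a side note on `quantRoundAdd`: the config comment says that quantRoundAdd/100 is added while rounding to integer. A minimal sketch of what that could mean, assuming the sum is then truncated with a floor (the floor step is my assumption and is not stated in the config); `round_with_add` is a hypothetical name.

```python
import math


def round_with_add(x, quant_round_add=50):
    """Add quant_round_add/100 and floor; 50 gives ordinary round-half-up."""
    return int(math.floor(x + quant_round_add / 100.0))


for value in (2.6, 2.8):
    print(value, round_with_add(value, 50), round_with_add(value, 25))
```

Under this assumption, a setting of 25 only rounds up when the fractional part is at least 0.75, so quantized values are biased downward compared with the default of 50.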
I installed NCCL so that I could train the model on more GPUs.
Install steps:
When I train the model, I use the command below:
But I can use 2 GPUs to train.
Did I miss a step?
I have gone through all the steps and got the output below.
I tried the symbolic link solution for libturbojpeg, but nothing changed.
CXX .build_release/src/caffe/proto/caffe.pb.cc
./3rdparty/half_float/half.hpp(1659): warning: calling a host function("half_float::detail::round_half<( ::std::float_round_style)1> ") from a host device function("half_float::detail::functions::rint") is not allowed
./3rdparty/half_float/half.hpp(1659): warning: calling a host function("half_float::detail::round_half<( ::std::float_round_style)1> ") from a host device function("half_float::detail::functions::rint") is not allowed
./3rdparty/half_float/half.hpp(1659): warning: calling a host function("half_float::detail::round_half<( ::std::float_round_style)1> ") from a host device function("half_float::detail::functions::rint") is not allowed
./3rdparty/half_float/half.hpp(1659): warning: calling a host function("half_float::detail::round_half<( ::std::float_round_style)1> ") from a host device function("half_float::detail::functions::rint") is not allowed
AR -o .build_release/lib/libcaffe-nv.a
LD -o .build_release/lib/libcaffe-nv.so.0.17.0
/usr/bin/ld: cannot find -lopenblas
collect2: error: ld returned 1 exit status
Makefile:600: recipe for target '.build_release/lib/libcaffe-nv.so.0.17.0' failed
make: *** [.build_release/lib/libcaffe-nv.so.0.17.0] Error 1
My system is Ubuntu 16.04, GTX 1080 Ti, CUDA 8.0, cuDNN v6.