mathmanu / caffe-jacinto-models

This repository has moved. The new link can be obtained from https://github.com/TexasInstruments/jacinto-ai-devkit

caffe-jacinto-models's Issues

Check failed when testing quantize in JDetNet

Hello, I ran train_image_object_detection.sh and hit a problem in the test_quantize phase.
Here is the log.
I0910 14:36:28.498530 13312 common.cpp:475] GPU 0 'TITAN Xp' has compute capability 6.1
I0910 14:36:29.036025 13312 caffe.cpp:902] This is NVCaffe 0.17.0 started at Mon Sep 10 14:36:28 2018
I0910 14:36:29.036056 13312 caffe.cpp:904] CuDNN version: 7104
I0910 14:36:29.036072 13312 caffe.cpp:905] CuBLAS version: 9000
I0910 14:36:29.036077 13312 caffe.cpp:906] CUDA version: 9000
I0910 14:36:29.036082 13312 caffe.cpp:907] CUDA driver version: 9010
I0910 14:36:29.036089 13312 caffe.cpp:908] Arguments:
[0]: /home/junxiang/caffe-jacinto/build/tools/caffe.bin
[1]: test_detection
[2]: --model=training/voc0712/JDetNet/20180828_14-57_ds_PSP_dsFac_32_hdDS8_1/test_quantize/test.prototxt
[3]: --iterations=496
[4]: --weights=training/voc0712/JDetNet/20180828_14-57_ds_PSP_dsFac_32_hdDS8_1/sparse/voc0712_ssdJacintoNetV2_iter_120000.caffemodel
…………………………………………

I0910 14:36:48.912307 13312 net.cpp:2195] Enabling quantization at output of: Concat mbox_loc
I0910 14:36:48.912477 13312 net.cpp:2195] Enabling quantization at output of: Concat mbox_conf
I0910 14:36:48.912649 13312 net.cpp:2195] Enabling quantization at output of: Concat mbox_priorbox
I0910 14:36:48.917215 13350 common.cpp:192] New stream 0x7fa3ac006960, device 0, thread 13350
F0910 14:36:48.941680 13312 permute_layer.cu:70] Check failed: error == cudaSuccess (7 vs. 0) too many resources requested for launch
*** Check failure stack trace: ***
@ 0x7fa4660295cd google::LogMessage::Fail()
@ 0x7fa46602b433 google::LogMessage::SendToLog()
@ 0x7fa46602915b google::LogMessage::Flush()
@ 0x7fa46602be1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7fa467a7ce48 caffe::PermuteLayer<>::Forward_gpu()
@ 0x7fa4673e8a7f caffe::Layer<>::Forward()
@ 0x7fa4672561fe caffe::Net::ForwardFromTo()
@ 0x7fa46725633d caffe::Net::Forward()
@ 0x44cc4b test_detection()
@ 0x4521f2 main
@ 0x7fa4647ab830 __libc_start_main
@ 0x449699 _start
@ (nil) (unknown)

Can you tell me how to solve this problem?

Hi, could you explain the sparsification method used in segmentation?

I want to learn it and make some changes.

JDetNet failed while training

I am using v0.17

I0810 14:59:47.258464 25579 solver.cpp:352] Iteration 7800 (1.27604 iter/s, 78.3675s/100 iter), 15.1/452.4ep, loss = 5.78276
I0810 14:59:47.258651 25579 solver.cpp:376] Train net output #0: mbox_loss = 6.02961 (* 1 = 6.02961 loss)
I0810 14:59:47.258673 25579 sgd_solver.cpp:172] Iteration 7800, lr = 0.01, m = 0.9, wd = 0.0001, gs = 1
I0810 15:01:05.190842 25579 solver.cpp:352] Iteration 7900 (1.28319 iter/s, 77.9311s/100 iter), 15.3/452.4ep, loss = 5.22862
I0810 15:01:05.191012 25579 solver.cpp:376] Train net output #0: mbox_loss = 5.81463 (* 1 = 5.81463 loss)
I0810 15:01:05.191031 25579 sgd_solver.cpp:172] Iteration 7900, lr = 0.01, m = 0.9, wd = 0.0001, gs = 1
I0810 15:02:23.070077 25579 solver.cpp:905] Snapshotting to binary proto file training/voc0712/JDetNet/_ds_PSP_dsFac_32_hdDS8_1/initial/voc0712_ssdJacintoNetV2_iter_8000.caffemodel
I0810 15:02:23.083901 25580 net.cpp:1071] Ignoring source layer mbox_loss
I0810 15:02:23.112282 25579 sgd_solver.cpp:398] Snapshotting solver state to binary proto file training/voc0712/JDetNet/_ds_PSP_dsFac_32_hdDS8_1/initial/voc0712_ssdJacintoNetV2_iter_8000.solverstate
I0810 15:02:23.131805 25579 solver.cpp:635] Iteration 8000, Testing net (#0)
F0810 15:02:23.132831 25579 net.cpp:1081] Check failed: target_blobs[j]->shape() == source_blob->shape() Cannot share param 0 weights from layer 'conv1a/bn'; shape mismatch. Source param shape is 1 32 1 1 (32); target param shape is 32 (32)
*** Check failure stack trace: ***
@ 0x7f43d58d85cd google::LogMessage::Fail()
@ 0x7f43d58da433 google::LogMessage::SendToLog()
@ 0x7f43d58d815b google::LogMessage::Flush()
@ 0x7f43d58dae1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f43d6879b3e caffe::Net::ShareTrainedLayersWith()
@ 0x7f43d68edc6c caffe::Solver::TestDetection()
@ 0x7f43d68f0a87 caffe::Solver::TestAll()
@ 0x7f43d68f1347 caffe::Solver::Step()
@ 0x7f43d68f2d62 caffe::Solver::Solve()
@ 0x7f43d68e3c0a caffe::P2PSync::InternalThreadEntry()
@ 0x7f43d6472b2c caffe::InternalThread::entry()
@ 0x7f43d6474ddb boost::detail::thread_data<>::run()
@ 0x7f43d3dcf5d5 (unknown)
@ 0x7f43d38a06ba start_thread
@ 0x7f43d40eb41d clone
@ (nil) (unknown)
Aborted (core dumped)
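
The failed check compares the two shapes exactly. In the log, the source parameter has shape `1 32 1 1` and the target has shape `32`: both hold 32 values, but they differ in rank. The short sketch below only illustrates that distinction in plain Python. `same_count` and `squeeze` are hypothetical helpers written for this note, not Caffe functions.

```python
from math import prod


def same_count(a, b):
    """True when both shapes hold the same number of elements."""
    return prod(a) == prod(b)


def squeeze(shape):
    """Drop singleton dimensions from a shape tuple."""
    return tuple(d for d in shape if d != 1)


source = (1, 32, 1, 1)  # "Source param shape is 1 32 1 1 (32)"
target = (32,)          # "target param shape is 32 (32)"

print(source == target)                    # exact comparison, as in the failed check
print(same_count(source, target))          # element counts
print(squeeze(source) == squeeze(target))  # shapes with singleton dimensions removed
```

The exact comparison fails even though the element counts agree, which matches the message in the log.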

Regarding performance comparison

Has any performance comparison been done for object detection between the following two models?
jdetnet21 vs mobilenet
I see that jdetnet21 is loosely based on mobilenet. Are there any particular benefits to using one over the other?

Can this be used in a Windows 10 + Anaconda3 environment?

Any docs about SSD object detection?

Is there any information about SSD, such as trained models and performance figures?

no module named propagate_obj

ssd_detect_video.py

Hi, I have a problem with the quantization described in your paper.

```python
import math

import torch


class Quantized:
    """Quantize the tensors of a state_dict onto a signed 8-bit grid."""

    def __init__(self):
        pass

    @staticmethod
    def check_valid(key):
        # Skip BatchNorm running statistics and counters.
        invalid = ['running_mean', 'running_var', 'num_batches_tracked']
        return not any(iv in key for iv in invalid)

    def get_range(self, metric):
        # Observed minimum and maximum of the tensor.
        minr = torch.min(metric.cpu()).item()
        maxr = torch.max(metric.cpu()).item()
        return minr, maxr

    @staticmethod
    def clip(metric, mq):
        # Scale, round to integers, clamp to the int8 range, then scale back.
        metric = torch.clamp(torch.round(metric * mq), -128, 127)
        return metric / mq

    def __call__(self, state_dict):
        for key in state_dict:
            if Quantized.check_valid(key):
                metric = state_dict[key]
                minr, maxr = self.get_range(metric)
                max_abs = max(abs(minr), abs(maxr))
                if max_abs == 0:
                    continue  # log2(0) is undefined; an all-zero tensor is left unchanged
                int_len = math.log2(max_abs) + 1
                fac_len = 8 - int_len

                mq = math.pow(2, fac_len)
                metric = Quantized.clip(metric, mq)
                state_dict[key] = metric
                print("minr: {} maxr: {}".format(minr, maxr))
        print("quantized complete ^_^")
        return state_dict
```
I wrote it in PyTorch, but the results are poor. I need your help.
1. How can I get the range [Rmin, Rmax]?
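
One way to approach the range question, offered only as a sketch: for weights, [Rmin, Rmax] can be read directly from the tensor's own minimum and maximum, while activation ranges are typically estimated by running sample inputs through the network. The example below assumes symmetric signed quantization with a power-of-two scale, which may differ from the scheme in the paper. It uses plain Python lists so that it runs without PyTorch, and `power_of_two_scale` and `fake_quantize` are hypothetical names.

```python
import math


def power_of_two_scale(values, num_bits=8):
    """Largest power-of-two scale that keeps max|v| * scale within the signed range."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return 1.0  # nothing to scale in an all-zero tensor
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    frac_bits = math.floor(math.log2(qmax / max_abs))
    return 2.0 ** frac_bits


def fake_quantize(values, num_bits=8):
    """Round each value onto the integer grid and map it back to floating point."""
    scale = power_of_two_scale(values, num_bits)
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    return [min(max(round(v * scale), qmin), qmax) / scale for v in values]


weights = [0.5, -0.25, 0.9]
print(min(weights), max(weights))   # the observed range [Rmin, Rmax]
print(power_of_two_scale(weights))  # scale chosen from that range
print(fake_quantize(weights))       # values after quantize/dequantize
```

Taking the floor of the exponent keeps the scale an exact power of two and guarantees that the largest value does not saturate, at the cost of leaving part of the 8-bit range unused for some tensors.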

In some scripts named train_xx_.sh, weights_src URLs need to be modified from caffe-0.15 to caffe-0.17

Alternatively, point them to the ti.com addresses, such as:

https://git.ti.com/cgit/jacinto-ai-devkit/caffe-jacinto-models/about/trained/object_detection/voc0712/JDetNet/ssd512x512_ds_PSP_dsFac_32_fc_0_hdDS8_1_kerMbox_3_1stHdSameOpCh_1/initial/deploy.prototxt

https://git.ti.com/cgit/jacinto-ai-devkit/caffe-jacinto-models/about/trained/object_detection/voc0712/JDetNet/ssd512x512_ds_PSP_dsFac_32_fc_0_hdDS8_1_kerMbox_3_1stHdSameOpCh_1/initial/voc0712_ssdJacintoNetV2_iter_106000.caffemodel

How to build it using CPU_ONLY?

Dear,
I tried to build with:

USE_CUDNN := 0
CPU_ONLY := 1

But I always get this error:

CXX src/caffe/blob.cpp
In file included from ./include/caffe/common.hpp:48:0,
from ./include/caffe/blob.hpp:11,
from src/caffe/blob.cpp:6:
./include/caffe/util/device_alternate.hpp:3:23: fatal error: cublas_v2.h: No such file or directory
#include <cublas_v2.h>
^
compilation terminated.
make: *** [.build_release/src/caffe/blob.o] Error 1

How to build it without GPU?

Thanks and best regards
He Wei

training error ./scripts/train_image_object_detection.sh

I0611 00:04:31.663493 8740 net.cpp:403] Top memory (TEST) required for data: 1703675240 diff: 1703675240
I0611 00:04:31.663501 8740 net.cpp:406] Bottom memory (TEST) required for data: 1703674816 diff: 1703674816
I0611 00:04:31.663507 8740 net.cpp:409] Shared (in-place) memory (TEST) by data: 695552000 diff: 695552000
I0611 00:04:31.663511 8740 net.cpp:412] Parameters memory (TEST) required for data: 19132568 diff: 19132568
I0611 00:04:31.663516 8740 net.cpp:415] Parameters shared memory (TEST) by data: 0 diff: 0
I0611 00:04:31.663519 8740 net.cpp:421] Network initialization done.
F0611 00:04:31.663878 8740 io.cpp:55] Check failed: fd != -1 (-1 vs. -1) File not found: training/voc0712/JDetNet/20190611_00-04_ds_PSP_dsFac_32_hdDS8_1/sparse/voc0712_ssdJacintoNetV2_iter_120000.caffemodel
*** Check failure stack trace: ***
@ 0x7f6b599f75cd google::LogMessage::Fail()
@ 0x7f6b599f9433 google::LogMessage::SendToLog()
@ 0x7f6b599f715b google::LogMessage::Flush()
@ 0x7f6b599f9e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f6b5a5d84dc caffe::ReadProtoFromBinaryFile()
@ 0x7f6b5a653de6 caffe::ReadNetParamsFromBinaryFileOrDie()
@ 0x7f6b5aa99dea caffe::Net::CopyTrainedLayersFromBinaryProto()
@ 0x7f6b5aa99e8e caffe::Net::CopyTrainedLayersFrom()
@ 0x41202c test_detection()
@ 0x40d1d0 main
@ 0x7f6b58179830 __libc_start_main
@ 0x40de69 _start
@ (nil) (unknown)
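
The log shows the test step looking for an `_iter_120000.caffemodel` snapshot in the `sparse` folder and not finding it. Before launching a later stage, it can help to check which snapshots a stage folder actually contains. The sketch below is a hypothetical helper, not part of the training scripts. It assumes snapshot files follow the `..._iter_<N>.caffemodel` naming seen in the log.

```python
import re
from pathlib import Path


def latest_snapshot(stage_dir, suffix=".caffemodel"):
    """Return (iteration, path) for the highest-iteration snapshot, or None."""
    directory = Path(stage_dir)
    if not directory.is_dir():
        return None
    pattern = re.compile(r"_iter_(\d+)" + re.escape(suffix) + r"$")
    best = None
    for path in directory.iterdir():
        match = pattern.search(path.name)
        if match:
            iteration = int(match.group(1))
            if best is None or iteration > best[0]:
                best = (iteration, path)
    return best


if __name__ == "__main__":
    # Replace with the real stage folder of the run being checked.
    print(latest_snapshot("training/voc0712/JDetNet"))
```

If the function returns `None` or a lower iteration than expected, the stage that should have produced the snapshot probably did not finish.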

Error occurs while training

Hi,
While running the initial training stage, I get the error below.

I0302 02:22:22.283617 5159 common.cpp:528] NVML initialized, thread 5159
I0302 02:22:22.285171 5132 net.cpp:1071] Ignoring source layer mbox_loss
F0302 02:22:22.321794 5132 solver.cpp:668] Check failed: result[j]->width() == 5 (3 vs. 5)
*** Check failure stack trace: ***
@ 0x7f65e6a015cd google::LogMessage::Fail()
@ 0x7f65e6a03433 google::LogMessage::SendToLog()
@ 0x7f65e6a0115b google::LogMessage::Flush()
@ 0x7f65e6a03e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f65e7a09a38 caffe::Solver::TestDetection()
@ 0x7f65e7a0a857 caffe::Solver::TestAll()
@ 0x7f65e7a0b3bc caffe::Solver::Step()
@ 0x7f65e7a0d512 caffe::Solver::Solve()
@ 0x410732 train()
@ 0x40d310 main
@ 0x7f65e5034830 __libc_start_main
@ 0x40dfa9 _start
@ (nil) (unknown)

Kindly share your comments.

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >' what(): boost: mutex lock failed in pthread_mutex_lock: Invalid argument

Hi,
When I train my models, a problem occurs at the end of every stage ('initial', 'l1reg', 'sparse', and so on). The main work of each stage seems to complete, but no result charts are saved, unlike the examples stored in the './trained' folder. The problem appears to be related to multithreading. The run log is as follows.

I0908 09:01:46.720330 7901 caffe.cpp:268] Solver performance on device 0: 1.667 * 32 = 53.33 img/sec (6 itr in 2.4 sec)
I0908 09:01:46.720353 7901 caffe.cpp:271] Optimization Done in 16s
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error> >'
what(): boost: mutex lock failed in pthread_mutex_lock: Invalid argument
*** Aborted at 1567904507 (unix time) try "date -d @1567904507" if you are using GNU date ***
PC: @ 0x7fe0f0f8ae97 gsignal
*** SIGABRT (@0x3e800001edd) received by PID 7901 (TID 0x7fe079fff700) from PID 7901; stack trace: ***
@ 0x7fe0f0f8af20 (unknown)
@ 0x7fe0f0f8ae97 gsignal
@ 0x7fe0f0f8c801 abort
@ 0x7fe0f1d9e957 (unknown)
@ 0x7fe0f1da4ab6 (unknown)
@ 0x7fe0f1da4af1 std::terminate()
@ 0x7fe0f1da4d24 __cxa_throw
@ 0x7fe0f3389734 boost::throw_exception<>()
@ 0x7fe0f33898c7 boost::unique_lock<>::lock()
@ 0x7fe0f37c51af caffe::BlockingQueue<>::push()
@ 0x7fe0f3484e76 caffe::AnnotatedDataLayer<>::load_batch()
@ 0x7fe0f3748656 caffe::BasePrefetchingDataLayer<>::InternalThreadEntryN()
@ 0x7fe0f33688c6 caffe::InternalThread::entry()
@ 0x7fe0f336b03b boost::detail::thread_data<>::run()
@ 0x7fe0e6f92bcd (unknown)
@ 0x7fe0d0ade6db start_thread
@ 0x7fe0f106d88f clone

Problem with mobiledetnetv2

Hi,
When I tried with mobiledetnetv2, I got the following problem:
solver_param: {'type': 'SGD', 'max_iter': 120000, 'stepvalue': [60000, 90000, 300000], 'base_lr': 0.01, 'lr_policy': 'multistep', 'power': 1.0, 'weight_decay': 0.0001}
config_param: {'model_name': 'mobiledetnetv2-0.5', 'config_name': '/home/user/projects/caffe-jdetnet/trained_models/rovit_traffic_dataset/mobiledetnetv2-0.5/2019_01_24/ssd_256x256_ds_PSP_dsFac_32_hdDS8_1/initial', 'gpus': '0', 'threads': 8, 'pretrain_model': '/home/user/projects/caffe-jdetnet/pretrained_models/imagenet_mobilenet-0.5_iter_320000.caffemodel', 'dataset': 'rovit_traffic_dataset', 'train_data': '/media/user/DATA/data/rovit_traffic_dataset/lmdb/rovit_traffic_dataset_trainval_lmdb', 'test_data': '/media/user/DATA/data/rovit_traffic_dataset/lmdb/rovit_traffic_dataset_test_lmdb', 'name_size_file': '/media/user/DATA/data/rovit_traffic_dataset/test_name_size.txt', 'label_map_file': '/media/user/DATA/data/rovit_traffic_dataset/labelmap_rovit_traffic_dataset.prototxt', 'num_test_image': 21375, 'num_classes': 8, 'ssd_size': '256x256', 'use_batchnorm_mbox': 1, 'small_object': 1, 'mean_value': 128, 'use_batchnorm': False, 'use_scale': True, 'lr_mult': 1, 'kernel_mbox_loc_conf': 1, 'chop_num_heads': 0, 'num_intermediate': 512, 'rhead_name_non_linear': 0, 'first_hd_same_op_ch': 1, 'reg_head_at_ds8': 1, 'aspect_ratio_type': 1, 'concat_reg_head': 0, 'base_nw_3_head': 0, 'use_difficult_gt': 1, 'evaluate_difficult_gt': 0, 'ignore_difficult_gt': False, 'fully_conv_at_end': 0, 'force_color': 0, 'shuffle': 1, 'use_image_list': 1, 'log_space_steps': 0, 'min_ratio': 5, 'max_ratio': 85, 'batch_size': 16, 'accum_batch_size': 16, 'test_batch_size': 8, 'feature_stride': 32, 'num_feature': 32, 'ds_type': 'PSP', 'ds_fac': 32, 'min_dim': 256, 'resize_width': 256, 'resize_height': 256, 'crop_width': 256, 'crop_height': 256, 'run_soon': True, 'resume_training': True, 'remove_old_models': False, 'stride_list': None, 'dilation_list': None, 'freeze_layers': [], 'flip': True, 'clip': False, 'share_location': True, 'background_label_id': 0, 'normalization_mode': 1, 'code_type': 2, 'ignore_cross_boundary_bbox': False, 'mining_type': 1, 'neg_pos_ratio': 3.0, 'loc_weight': 1.0}
config_param.ds_fac: 32
config_param.stride_list: [2, 2, 2, 2, 2]
Traceback (most recent call last):
File "/home/user/projects/caffe-jdetnet/train_jdetnet.py", line 221, in <module>
train(config_param, solver_param, caffe_cmd)
File "/home/user/projects/caffe-jdetnet/models/train_jdetnet_model.py", line 768, in train
net, out_layer, out_layer_names = CoreNetwork(config_param, net, out_layer)
File "/home/user/projects/caffe-jdetnet/models/train_jdetnet_model.py", line 338, in CoreNetwork
num_intermediate=config_param['num_intermediate'])
File "/home/user/projects/caffe-jdetnet/models/mobilenetv2.py", line 230, in mobiledetnetv2
num_input = num_channels[from_layer]
KeyError: 'relu5_5/sep'
Is this a problem with the model architecture? How can I solve it?
Thank you.
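
The traceback ends at the lookup `num_channels[from_layer]`, so the layer name `relu5_5/sep` is not a key of that dictionary at that point. A quick way to see which names are registered is to list the closest existing keys. The sketch below uses `difflib` from the standard library. The dictionary contents are invented for illustration and are not taken from `mobilenetv2.py`.

```python
import difflib


def closest_layers(num_channels, from_layer, n=3):
    """Return registered layer names that resemble a missing one."""
    if from_layer in num_channels:
        return []
    return difflib.get_close_matches(from_layer, list(num_channels), n=n, cutoff=0.6)


# Hypothetical contents, for illustration only.
num_channels = {"relu5_4/sep": 256, "relu5_6/sep": 512, "conv1": 16}

print(closest_layers(num_channels, "relu5_5/sep"))
```

If the nearby names differ only by an index, the requested layer may simply not be created for the chosen model variant or settings.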

using quantization is slower

Hello, I ran the ImageNet classification demo and found that it is slower with quantize=true. Is this normal?
[Image: using quantization]
[Image: not using quantization]

caffe-0.17 mobilenet object detection TIDL import configuration

I was able to train mobilenet; however, I believe the EVE cores are overloaded, so no image is shown.

I might be missing a step in configuring it for the TIDL import tool.
So far I have:

# Default - 0
randParams         = 0

# 0: Caffe, 1: TensorFlow, Default - 0
modelType          = 0

# 0: Fixed quantization by training framework, 1: Dynamic quantization by TIDL, Default - 1
quantizationStyle  = 1

# quantRoundAdd/100 will be added while rounding to integer, Default - 50
quantRoundAdd      = 25

numParamBits       = 8
# 0 : 8bit Unsigned, 1 : 8bit Signed Default - 1
inElementType      = 0

inputNetFile       = "deploy.prototxt"
inputParamsFile       = "voc0712_mobiledetnet-0.5_iter_120000.caffemodel"
outputNetFile      = "NET_OD_mobilenet.bin"
outputParamsFile   = "PRM_OD_mobilenet.bin"

rawSampleInData = 1
preProcType   = 4
sampleInData = "trace_dump_0_768x320.y"
tidlStatsTool = "eve_test_dl_algo.out.exe"

Has anyone verified mobilenet with the TIDL OD use case? If so, could you please share the import configuration?

Additionally, I am using deploy.prototxt and voc0712_mobiledetnet-0.5_iter_120000.caffemodel from the scripts/training/../initial folder, as the other folders do not contain a caffemodel. Is this correct?

Thanks in advance.
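
Import configurations like this one are plain `key = value` text, so a small script can catch simple slips, such as an unterminated quote, before the file is passed to the import tool. The sketch below is a hypothetical sanity check and does not reproduce the TIDL import tool's own parsing. The sample text is shortened and contains a deliberately broken line.

```python
def check_import_config(text):
    """Return a list of (line_number, message) problems found in key = value text."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            problems.append((number, "expected 'key = value'"))
            continue
        if value.count('"') % 2 != 0:
            problems.append((number, "unbalanced double quote in value for " + key.strip()))
    return problems


# Shortened sample; the third line is intentionally missing its closing quote.
sample = '''# 0: Caffe, 1: TensorFlow, Default - 0
modelType = 0
inputNetFile = "deploy.prototxt
outputNetFile = "NET_OD_mobilenet.bin"
'''

for number, message in check_import_config(sample):
    print("line {}: {}".format(number, message))
```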

Error when using 3 or 4 GPUs to train the model

I installed NCCL so that I could train the model on more GPUs.
Installation steps:

git clone https://github.com/NVIDIA/nccl.git
cd nccl
sudo make install -j8
uncomment USE_NCCL in Makefile.config

To train the model I use the command below:
$CAFFE_ROOT/build/tools/caffe train --solver="models/ssd/${PROJECT}/initial/solver.prototxt" --weights="models/ssd/${PROJECT}/initial/${PRETRAINED}" -gpu 0,1,2

It fails with an error (not captured here).

However, training works with 2 GPUs.
Did I miss a step?

make: *** [.build_release/lib/libcaffe-nv.so.0.17.0] Error 1

I went through all the steps and got the output below.

I tried the symbolic link solution for libturbojpeg, but nothing changed.

CXX .build_release/src/caffe/proto/caffe.pb.cc
./3rdparty/half_float/half.hpp(1659): warning: calling a __host__ function("half_float::detail::round_half<( ::std::float_round_style)1> ") from a __host__ __device__ function("half_float::detail::functions::rint") is not allowed

./3rdparty/half_float/half.hpp(1659): warning: calling a __host__ function("half_float::detail::round_half<( ::std::float_round_style)1> ") from a __host__ __device__ function("half_float::detail::functions::rint") is not allowed

AR -o .build_release/lib/libcaffe-nv.a
LD -o .build_release/lib/libcaffe-nv.so.0.17.0
/usr/bin/ld: cannot find -lopenblas
collect2: error: ld returned 1 exit status
Makefile:600: recipe for target '.build_release/lib/libcaffe-nv.so.0.17.0' failed
make: *** [.build_release/lib/libcaffe-nv.so.0.17.0] Error 1

My system is Ubuntu 16.04, GTX 1080 Ti, CUDA 8.0, cuDNN v6.
