duankaiwen / cpndet
Corner Proposal Network for Anchor-free, Two-stage Object Detection
License: MIT License
I have trained my own dataset on this project, but got final box scores > 1. What's wrong?
ModuleNotFoundError: No module named 'roi_align_cuda'
Hi,
I came here following your suggestion to use this instead of CenterNet. I actually managed to train CenterNet with promising results but decided to try this out since the code base is very similar, and has more promising inference speed.
So the setup is almost exactly the same (compiling custom layers etc.), but I'm having trouble with CUDA out-of-memory errors 12% into training, which seems bizarre to me.
Env setup:
Using the same instance as for training CenterNet, which worked fine.
I set up the HG52 config file with various chunk sizes (even batch size 4, 1 image per GPU) and the same error persists.
Attaching the log below.
I will be very grateful for any advice that you might have.
LOG:
iter 1165, all: 4.9353, focal: 3.4306, grouping:0.8367, region: 0.5174, regr: 0.1507
iter 1170, all: 5.6282, focal: 3.9427, grouping:0.9505, region: 0.5869, regr: 0.1482
iter 1175, all: 5.3373, focal: 3.7071, grouping:0.8346, region: 0.6310, regr: 0.1647
iter 1180, all: 5.3797, focal: 3.7966, grouping:0.8720, region: 0.5691, regr: 0.1419
12%|████▎ | 1184/10000 [42:00<5:12:45, 2.13s/it]
Traceback (most recent call last):
File "train.py", line 203, in <module>
train(training_dbs, validation_db, args.start_iter)
File "train.py", line 143, in train
training_loss, focal_loss, grouping_loss, region_loss,regr_loss = nnet.train(**training)
File "/home/ubuntu/CPNDet/code/nnet/py_factory.py", line 97, in train
loss.backward()
File "/home/ubuntu/CenterNet/env/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/ubuntu/CenterNet/env/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 3.86 GiB (GPU 0; 15.75 GiB total capacity; 10.16 GiB already allocated; 3.86 GiB free; 554.53 MiB cached)
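Note: the figures quoted in the error message above can be checked with a little arithmetic. The sketch below is only an illustration; parse_oom is a hypothetical helper (not part of PyTorch or CPNDet) that pulls the sizes out of the message text and adds the request to what was already allocated.

```python
import re

# The message exactly as it appears in the log above.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 3.86 GiB "
    "(GPU 0; 15.75 GiB total capacity; 10.16 GiB already allocated; "
    "3.86 GiB free; 554.53 MiB cached)"
)

# Conversion factors to GiB for the two units used in the message.
_TO_GIB = {"MiB": 1.0 / 1024.0, "GiB": 1.0}


def parse_oom(message):
    """Hypothetical helper: return the sizes quoted in the message, in GiB."""
    labelled = re.compile(
        r"([\d.]+) (MiB|GiB) (total capacity|already allocated|free|cached)"
    )
    sizes = {
        label: float(value) * _TO_GIB[unit]
        for value, unit, label in labelled.findall(message)
    }
    request = re.search(r"Tried to allocate ([\d.]+) (MiB|GiB)", message)
    sizes["requested"] = float(request.group(1)) * _TO_GIB[request.group(2)]
    return sizes


sizes = parse_oom(OOM_MESSAGE)
# Memory that would be in use if the failed request had succeeded.
peak_needed = sizes["already allocated"] + sizes["requested"]
print(f"requested {sizes['requested']:.2f} GiB, free {sizes['free']:.2f} GiB")
print(f"peak if granted: {peak_needed:.2f} of {sizes['total capacity']:.2f} GiB")
```

The single failed request is as large as the reported free memory, and it arrives after more than a thousand successful iterations. That pattern is at least consistent with one unusually expensive batch rather than a steady leak, though the log alone does not prove it.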
ModuleNotFoundError: No module named 'models./home/lenovo/lpf/CPNDet-master/code/config/DLA34-multi_scale'
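Note: the module name in the error above is "models." followed by a full filesystem path, which suggests the import name is built from the --cfg_file argument (the test command quoted elsewhere on this page passes a bare name, --cfg_file HG52). Assuming that is the case, passing only the config's base name should avoid the error. The helper below is hypothetical, and the ".json" extension is an assumption about how config files are stored.

```python
import os


def cfg_name(cfg_arg):
    """Hypothetical helper: reduce a config path to the bare config name."""
    base = os.path.basename(cfg_arg)
    # Assumption: config files end in ".json"; strip only that suffix so
    # names such as "DLA34-multi_scale" are left untouched.
    if base.endswith(".json"):
        base = base[: -len(".json")]
    return base


print(cfg_name("/home/lenovo/lpf/CPNDet-master/code/config/DLA34-multi_scale"))
print(cfg_name("HG52"))
```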
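How much does classification accuracy differ between embedding the classifier for end-to-end training, as done here, and training a separate classifier?
I currently use an anchor-free model plus an independent classifier model to deal with false positives, and I would like to try your end-to-end training approach. Any advice would be appreciated.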
I haven't looked at the code yet, but I'm surprised that this work builds on CornerNet and adds a classifier. Why would inference be faster?
.local/lib/python3.6/site-packages/roi_align-0.0.0-py3.6-linux-x86_64.egg/roi_align_cuda.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Has anyone encountered this problem?
Hi,
I trained a model on a single class (human heads, for object detection), and now when running the testing script I'm having problems with filtering boxes.
The log below narrows the problem down to the line where the region scores are gathered. However, I think the cause is the way sigmoid is used to obtain them in the first place, which works well for many classes but fails in my one-class detection.
Do you have any advice on this matter?
Log below:
Traceback (most recent call last):
File "test.py", line 105, in <module>
test(testing_db, args.cfg_file, args.split, args.testiter, args.debug, args.no_flip, args.suffix)
File "test.py", line 67, in test
testing(db, cfg_file , nnet, result_dir, debug=debug, no_flip = no_flip)
File "/home/ubuntu/CPNDet/code/test/coco.py", line 342, in testing
return globals()[system_configs.sampling_function](db, cfg_file, nnet, result_dir, debug=debug, no_flip=no_flip)
File "/home/ubuntu/CPNDet/code/test/coco.py", line 238, in kp_detection
dets = decode_func(nnet, images, K, no_flip, ae_threshold=ae_threshold, kernel=nms_kernel, image_idx = image_idx)
File "/home/ubuntu/CPNDet/code/test/coco.py", line 62, in kp_decode
detections = nnet.test([images], ae_threshold=ae_threshold, K=K, no_flip = no_flip, kernel=kernel, image_idx = image_idx)
File "/home/ubuntu/CPNDet/code/nnet/py_factory.py", line 115, in test
return self.model(*xs, **kwargs)
File "/home/ubuntu/CPNDet/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/CPNDet/code/nnet/py_factory.py", line 33, in forward
return self.module(*xs, **kwargs)
File "/home/ubuntu/CPNDet/env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/CPNDet/code/models/py_utils/HG52.py", line 452, in forward
return self._test(*xs, **kwargs)
File "/home/ubuntu/CPNDet/code/models/py_utils/HG52.py", line 433, in _test
_filter_bboxes(ht_boxes, tl_clses, region_scores, grouping_scores, self.gr_threshold)
File "/home/ubuntu/CPNDet/code/models/py_utils/kp_utils.py", line 159, in _filter_bboxes
specific_rscores = region_scores.gather(1, ppos_ht_score_cls[:,1].long()).squeeze()
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
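Note: the range [-1, 0] in the error above means region_scores had a single axis when gather(1, ...) was called. One possible cause, offered as an assumption and not verified against the repository, is a squeeze without a dim argument somewhere upstream: with one class, a (num_boxes, 1) score tensor loses its class axis. The torch-free sketch below only tracks shapes; both function names are hypothetical.

```python
def squeeze_shape(shape):
    """Shape after a squeeze with no dim argument: every size-1 axis is dropped."""
    return tuple(n for n in shape if n != 1)


def can_gather_on_dim1(shape):
    """Gathering along dim 1 requires the tensor to have at least two axes."""
    return len(shape) >= 2


num_boxes = 100
multi_class = (num_boxes, 80)   # e.g. 80 classes: squeeze leaves the shape alone
single_class = (num_boxes, 1)   # one class: squeeze removes the class axis

for shape in (multi_class, single_class):
    squeezed = squeeze_shape(shape)
    print(shape, "->", squeezed, "gather on dim 1 ok:", can_gather_on_dim1(squeezed))
```

If this is what happens, keeping the class axis (for example by squeezing only a named dimension, or by restoring the axis before the gather) would be the direction to look in. The sigmoid itself does not change tensor shapes.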
training is slow.
The size of my test image is 512x512, but when I run the command: python test.py --cfg_file HG52 --testiter 270000 --split testing, the size of the input image is 639x639 before self.pre. I looked through all the files and couldn't find the value 639 set anywhere. (My training images are also 512x512, and there the input size is 512x512.) How can I make the test input size match my actual image size?
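Note: one possible explanation for 639, offered as an assumption and not as a reading of this repository: CornerNet-style test code pads the network input by bitwise OR-ing each side length with 127 (for example inp_height = new_height | 127), and 512 | 127 is exactly 639. That would also explain why the literal 639 appears nowhere in the files. padded_side below is a hypothetical name for that one operation.

```python
def padded_side(side, mask=127):
    """Bitwise-OR the side length with the mask, setting its low seven bits."""
    return side | mask


for side in (511, 512):
    # 511 already has its low seven bits set, so it is unchanged;
    # 512 has none of them set, so 127 is added.
    print(side, "->", padded_side(side))
```

Under this assumption the padded side always has its low seven bits set, so an exact 512 input cannot come out of that step without changing it. A 511x511 image would pass through unchanged.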
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:114: UserWarning: torch.gt received 'out' parameter with dtype torch.uint8, this behavior is now deprecated,please use 'out' parameter with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:38: UserWarning: masked_scatter_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:114: UserWarning: torch.gt received 'out' parameter with dtype torch.uint8, this behavior is now deprecated,please use 'out' parameter with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:38: UserWarning: masked_scatter_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:38: UserWarning: masked_scatter_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:14: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
I get this warning during training. How should I fix it? The environment was set up according to the instructions.
Could you provide the pre-trained HG104 model? Thanks.