
Comments (12)

WongKinYiu commented on August 23, 2024

@amusi Hello,

I saw your article; here are some comparisons of the PyTorch versions of YOLOv3, YOLOv4, and YOLOv5. (All experiments were run on the same Tesla V100 GPU.)

PyTorch version

Train with YOLOv3 setting (416x416)

Trained on the COCO 2014 trainvalno5k set and tested on the COCO 2014 5k set.

YOLOv3-SPP:

yolov3-spp 43.1% AP @ 608x608
Model Summary: 152 layers, 6.29719e+07 parameters, 6.29719e+07 gradients
Speed: 6.8/1.6/8.3 ms inference/NMS/total per 608x608 image at batch-size 16

Train with YOLOv4 setting (512x512)

Trained on the COCO 2014 trainvalno5k set and tested on the COCO 2014 5k set.

YOLOv3-SPP:

yolov3-spp 43.6% AP @ 608x608
Model Summary: 152 layers, 6.29719e+07 parameters, 6.29719e+07 gradients
Speed: 6.8/1.6/8.3 ms inference/NMS/total per 608x608 image at batch-size 16

CSPDarknet53s-YOSPP: (~YOLOv4(Leaky) backbone + YOLOv3 head)

cd53s-yospp 43.7% AP @ 608x608
Model Summary: 184 layers, 4.89836e+07 parameters, 4.89836e+07 gradients
Speed: 6.3/1.6/7.8 ms inference/NMS/total per 608x608 image at batch-size 16

CSPDarknet53s-YOSPP-Mish: (~YOLOv4 backbone + YOLOv3 head)

cd53s-yospp-mish 44.3% AP @ 608x608
Model Summary: 184 layers, 4.89836e+07 parameters, 4.89836e+07 gradients
Speed: 7.9/1.6/9.6 ms inference/NMS/total per 608x608 image at batch-size 16

CSPDarknet53s-PASPP: (~YOLOv4(Leaky))

cd53s-paspp 44.5% AP @ 608x608
Model Summary: 212 layers, 6.43092e+07 parameters, 6.43092e+07 gradients
Speed: 6.9/1.6/8.5 ms inference/NMS/total per 608x608 image at batch-size 16

CSPDarknet53s-PASPP-Mish: (~YOLOv4)

cd53s-paspp-mish 45.0% AP @ 608x608
Model Summary: 212 layers, 6.43092e+07 parameters, 6.43092e+07 gradients
Speed: 8.7/1.6/10.3 ms inference/NMS/total per 608x608 image at batch-size 16

CSPDarknet53s-PACSP:

cd53s-paspp-cspt 45.1% AP @ 608x608
Model Summary: 222 layers, 5.84596e+07 parameters, 5.84596e+07 gradients
Speed: 6.6/1.5/8.1 ms inference/NMS/total per 608x608 image at batch-size 16
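
As an aside, these per-image latencies convert directly to throughput. A quick worked example with the cd53s-pacsp numbers above (plain arithmetic, not from the original logs):

    inference_ms, nms_ms = 6.6, 1.5    # cd53s-pacsp, per 608x608 image at batch-size 16
    total_ms = inference_ms + nms_ms   # 8.1 ms total per image
    fps = 1000.0 / total_ms            # ~123 images per second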

Train with YOLOv5 setting (640x640)

Trained on the COCO 2017 train set and tested on the COCO 2017 val (5k) set.

YOLOv3-SPP:

yolov3-spp 45.5% AP @ 736x736
Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
Speed: 10.4/2.1/12.6 ms inference/NMS/total per 736x736 image at batch-size 16

YOLOv5s:

yolov5s 33.1% AP @ 736x736
Model Summary: 99 layers, 6.99302e+06 parameters, 6.99302e+06 gradients
Speed: 2.2/2.1/4.4 ms inference/NMS/total per 736x736 image at batch-size 16

YOLOv5m:

yolov5m 41.5% AP @ 736x736
Model Summary: 165 layers, 2.51928e+07 parameters, 2.51928e+07 gradients
Speed: 5.4/1.8/7.2 ms inference/NMS/total per 736x736 image at batch-size 16

YOLOv5l:

yolov5l 44.2% AP @ 736x736
Model Summary: 231 layers, 6.17556e+07 parameters, 6.17556e+07 gradients
Speed: 11.3/2.2/13.5 ms inference/NMS/total per 736x736 image at batch-size 16

YOLOv5x:

yolov5x 47.1% AP @ 736x736
Model Summary: 297 layers, 1.23102e+08 parameters, 1.23102e+08 gradients
Speed: 20.3/2.2/22.5 ms inference/NMS/total per 736x736 image at batch-size 16
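
For reference, numbers in this style come from the Ultralytics repo's test script; a hedged sketch of the invocation (flag names assumed from the 2020-era ultralytics/yolov5 test.py, so verify against your checkout):

    python test.py --data coco.yaml --weights yolov5l.pt --img-size 736 --batch-size 16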


AlexeyAB commented on August 23, 2024

@WongKinYiu Hi,

It is obvious that CSPDarknet53s-PASPP-Mish (~YOLOv4) is much better than the YOLOv5l (640x640) from amusi's article (batch-size 16):

  • CSPDarknet53s-PASPP-Mish (~YOLOv4), trained 512x512 / tested 608x608: 45.0% AP, Speed: 8.7/1.6/10.3 ms
  • YOLOv5l, trained 640x640 / tested 736x736: 44.2% AP, Speed: 11.3/2.2/13.5 ms

While our new YOLOv4 model is even better:

  • CSPDarknet53s-PACSP: 45.1% AP - Speed: 6.6/1.5/8.1 ms

  1. Does it use inference-time data augmentation?
  2. Why is batch-size 16 used here?
  3. Is there a GitHub repo with the amusi YOLOv5l (640x640)?

Train with YOLOv5 setting (640x640)

Trained on the COCO 2017 train set and tested on the COCO 2017 val (5k) set.

YOLOv3-SPP:

yolov3-spp 45.5% AP
Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
Speed: 10.4/2.1/12.6 ms inference/NMS/total per 736x736 image at batch-size 16
  4. Is the better AP for YOLOv3-SPP achieved just by using the 640x640 network resolution, or by something else?


WongKinYiu commented on August 23, 2024

@AlexeyAB

  • Does CSPDarknet53s give improvements for training on both Ultralytics and Darknet?

I am not sure about Darknet, because I have not trained it on ImageNet, but yes for Ultralytics.

  • Interesting, what AP will a P6 model give if trained at 640x640 and tested at 736x736?

To achieve this goal, I have to look at how to construct a P6 model using the new Ultralytics repository. Then I need to construct the YOLOv4 model; the repository does not currently support all of YOLOv4's blocks.
(Or maybe I will directly modify the PyTorch code I currently use.)
I think I will design a training scheme to train the P6 model on Darknet first.


WongKinYiu commented on August 23, 2024

@AlexeyAB

OK, I will train this setting on tiny-yolov4 with width=640 and height=640.
If this works well, users will be able to train YOLO on cheaper GPUs.
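
A hedged sketch of the Darknet side of that plan (the standard AlexeyAB/darknet workflow; the cfg edit and pretrained-weights filename are illustrative assumptions, not a tested recipe):

    # in cfg/yolov4-tiny.cfg, [net] section, set:
    #   width=640
    #   height=640
    ./darknet detector train cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.conv.29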


WongKinYiu commented on August 23, 2024

@AlexeyAB

cd53s-paspp-mish.cfg
cd53s-paspp-mish.pt


WongKinYiu commented on August 23, 2024

@AlexeyAB

  1. Does it use inference-time data augmentation?

No, there is no inference-time augmentation.

  2. Why is batch-size 16 used here?

I just follow the Ultralytics testing protocol, which uses batch-size 16.

  3. Is there a GitHub repo with the amusi YOLOv5l (640x640)?

It is not amusi's repo; it is Ultralytics' new repo.

  4. Is the better AP for YOLOv3-SPP achieved just by using the 640x640 network resolution, or by something else?

There are some modifications in Ultralytics' new repo, but yes, I think the main source of the improvement is 640x640 training.
Also, the new repo seems to use an affine transform instead of multi-resolution training, so the new training won't use too much GPU RAM. (I still need to check the code in detail; see the training log for details.)

I am training CSPDarknet53-PACSP-(SAM)-Mish with Darknet on MS COCO 2017.


AlexeyAB commented on August 23, 2024

Also, the new repo seems to use an affine transform instead of multi-resolution training.

Yes:

  1. scale=0.5 https://github.com/ultralytics/yolov5/blob/391492ee5b56ef36424b4a9257c18f7c784a8f44/train.py#L44
  2. python train.py --data coco.yaml --cfg yolov5s.yaml --weights '' --batch-size 16
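
A minimal sketch of what such scale-jitter augmentation looks like via an affine warp (an assumed simplification for illustration, not the repo's actual implementation):

    import cv2
    import numpy as np

    def random_scale_affine(img, scale=0.5):
        # Jitter the image scale by a random factor in [1-scale, 1+scale]
        # while keeping the canvas size fixed, so the network resolution
        # stays constant (unlike multi-resolution training).
        h, w = img.shape[:2]
        s = np.random.uniform(1 - scale, 1 + scale)
        M = cv2.getRotationMatrix2D((w / 2, h / 2), 0, s)  # no rotation, scale by s
        return cv2.warpAffine(img, M, (w, h), borderValue=(114, 114, 114))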

Maybe we should also use random=0 resize=1.5 instead of random=1 in Darknet?
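
For context, random= lives in the [yolo] sections of a Darknet .cfg; a hedged sketch of the proposed setting (whether resize= is supported depends on the fork and commit):

    [yolo]
    # ...mask, anchors, classes as usual...
    random=0     # disable multi-resolution training
    resize=1.5   # random resize jitter at a fixed network size instead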


WongKinYiu commented on August 23, 2024

@AlexeyAB Hello,

Yes, the AP benefits from 640x640 training.
CSPDarknet53s-YOSPP gets 12.5% faster model inference speed and 0.1% higher AP than YOLOv3-SPP.
CSPDarknet53s-YOSPP gets 19.5% faster model inference speed and 1.4% higher AP than YOLOv5l.

YOLOv3-SPP:

yolov3-spp: 45.5% AP @ 736x736
Model Summary: 225 layers, 6.29987e+07 parameters, 6.29987e+07 gradients
Speed: 10.4/2.1/12.6 ms inference/NMS/total per 736x736 image at batch-size 16

CSPDarknet53s-YOSPP: (~YOLOv4(Leaky) backbone + YOLOv3 head)

cd53s-yospp: 45.6% AP @ 736x736
Model Summary: 225 layers, 4.90092e+07 parameters, 4.90092e+07 gradients
Speed: 9.1/2.0/11.1 ms inference/NMS/total per 736x736 image at batch-size 16

YOLOv5l:

yolov5l 44.2% AP @ 736x736
Model Summary: 231 layers, 6.17556e+07 parameters, 6.17556e+07 gradients
Speed: 11.3/2.2/13.5 ms inference/NMS/total per 736x736 image at batch-size 16
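
The percentages quoted above follow directly from these logs; a quick arithmetic check:

    v3_spp, yospp, v5l = 10.4, 9.1, 11.3   # inference ms/image from the logs
    print((v3_spp - yospp) / v3_spp)       # 0.125  -> 12.5% faster than YOLOv3-SPP
    print((v5l - yospp) / v5l)             # ~0.195 -> 19.5% faster than YOLOv5l
    print(45.6 - 45.5, 45.6 - 44.2)        # +0.1 and +1.4 AP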


AlexeyAB commented on August 23, 2024

@WongKinYiu Nice.

  • Does CSPDarknet53s give improvements for training on both Ultralytics and Darknet?
  • Interesting, what AP will a P6 model give if trained at 640x640 and tested at 736x736?


AlexeyAB commented on August 23, 2024

@WongKinYiu Hi,

Can you share the cfg/weights files for this model?

CSPDarknet53s-PASPP-Mish: (~YOLOv4) - trained 512x512, tested 608x608

cd53s-paspp-mish 45.0% AP @ 608x608
Model Summary: 212 layers, 6.43092e+07 parameters, 6.43092e+07 gradients
Speed: 8.7/1.6/10.3 ms inference/NMS/total per 608x608 image at batch-size 16


clw5180 commented on August 23, 2024

Hi WongKinYiu, what does -PACSP mean? I can't find its config and weight files. Thanks a lot!


WongKinYiu commented on August 23, 2024

Hello, PACSP means applying CSP to PANet. The model is still training; I will release the .weights file after training finishes.
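
For readers unfamiliar with the idea, here is a minimal PyTorch sketch of a Cross Stage Partial wrapper (an assumed simplification for illustration; WongKinYiu's actual PACSP layers differ in detail): split the channels, run only one half through the inner ops (e.g. a PANet conv stack), then re-merge.

    import torch
    import torch.nn as nn

    class CSPBlock(nn.Module):
        # Split channels, transform one half, concatenate, fuse with a 1x1 conv.
        def __init__(self, channels, inner):
            super().__init__()
            self.split = channels // 2
            self.inner = inner  # must preserve its input channel count
            self.fuse = nn.Conv2d(channels, channels, 1, bias=False)

        def forward(self, x):
            a, b = x[:, :self.split], x[:, self.split:]
            return self.fuse(torch.cat([a, self.inner(b)], dim=1))

    # Example: wrap a small conv stack so only half the channels pass through it.
    block = CSPBlock(64, nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.LeakyReLU(0.1)))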

