Comments (16)
Please check the docs of YOLOv7End2EndORT:
This class does not support the TRT backend. To use the TRT::EfficientNMS_TRT op, use YOLOv7End2EndTRT instead; please refer to:
Export YOLOv7 with TRT NMS (leave max-wh as None):
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt
# Export an ONNX file with TRT_NMS (tip: this corresponds to the YOLOv7 release v0.1 code)
python export.py --weights yolov7.pt --grid --end2end --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640
# The command for exporting other models is similar. Replace yolov7.pt with yolov7x.pt, yolov7-d6.pt, yolov7-w6.pt, ...
# When using YOLOv7End2EndTRT, you only need to provide the ONNX file. There is no need to convert it to a TRT file yourself; it is converted automatically during inference.
Alternatively, download the model from vision/detection/yolov7end2end_trt.
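As a rough sketch of the step described above, the exported ONNX file might be loaded as follows. This assumes the GPU build of the `fastdeploy` Python package is installed; the helper name `load_end2end_trt` and the file names in the usage note are illustrative and do not come from the thread.

```python
def load_end2end_trt(onnx_path, device_id=0):
    """Build a YOLOv7End2EndTRT model that runs on the TensorRT backend.

    Only the ONNX file is passed in; per the reply above, the conversion
    to a TensorRT engine is expected to happen automatically.
    """
    # Imported lazily so that this sketch can be read and imported
    # on a machine without FastDeploy installed.
    import fastdeploy as fd

    option = fd.RuntimeOption()
    option.use_gpu(device_id)    # run on the given GPU
    option.use_trt_backend()     # select TensorRT instead of ONNX Runtime
    return fd.vision.detection.YOLOv7End2EndTRT(onnx_path, runtime_option=option)
```

A call such as `load_end2end_trt("yolov7.onnx").predict(image)`, where `image` is a BGR array read with OpenCV, would then run detection; both the file name and the image source are placeholders.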
from fastdeploy.
@DefTruth Thanks, I noticed the problem.
Another interesting thing: I quantized the model exported to ONNX (small object detection / VisDrone, Paddle). It was successfully quantized to QUInt8 and runs inference.
The problem is that the quantized ONNX model is much slower than the original FP32 model, about 10 times slower.
Meanwhile, I will give an INT8 chip (Hailo-8) a shot.
To export ONNX, some converters ask for calibration images, as for
weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
How can I quantize with calibration? weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
Best
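For the calibration question, one option is ONNX Runtime's static quantization, which takes a `CalibrationDataReader` that feeds sample inputs. The sketch below is not the FastDeploy tool and was not used in this thread. The helper names are hypothetical, and the input names in the usage note (`image`, `scale_factor`) are assumptions that should be checked against the exported model.

```python
def build_feeds(input_name, batches, constant_inputs=None):
    """Turn preprocessed batches into the input dicts ONNX Runtime expects.

    `constant_inputs` holds inputs that are identical for every batch.
    """
    constant_inputs = constant_inputs or {}
    return [{input_name: batch, **constant_inputs} for batch in batches]


def quantize_with_calibration(fp32_path, int8_path, feeds):
    """Statically quantize an ONNX model, calibrating activations on `feeds`."""
    # Imported lazily so the helper above works without onnxruntime installed.
    from onnxruntime.quantization import (
        CalibrationDataReader,
        QuantFormat,
        QuantType,
        quantize_static,
    )

    class FeedReader(CalibrationDataReader):
        def __init__(self, items):
            self._iter = iter(items)

        def get_next(self):
            # Returning None signals that the calibration data is exhausted.
            return next(self._iter, None)

    quantize_static(
        fp32_path,
        int8_path,
        FeedReader(feeds),
        quant_format=QuantFormat.QDQ,
        activation_type=QuantType.QInt8,
        weight_type=QuantType.QInt8,
    )
```

With NumPy arrays as batches, a call could look like `quantize_with_calibration("model.onnx", "model_int8.onnx", build_feeds("image", batches, {"scale_factor": scale}))`. The calibration images should go through the same preprocessing as at inference time.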
Are you running inference on CPU or GPU with the quantized ONNX model?
@jiangjiajun Inference runs on GPU, a 2080 Ti.
Which tool are you using to quantize your ONNX model?
@jiangjiajun
onnx:
`from onnxruntime.quantization import quantize_dynamic, QuantType
model_fp32 = '/Users/tulpar/Project/devPaddleDetection/sliced_visdrone.onnx'
model_quant = '/Users/tulpar/Project/devPaddleDetection/88quant_sliced_visdrone.onnx'
# quantized_model = quantize_dynamic(model_fp32, model_quant)
# quantize_dynamic writes the quantized model to model_quant and returns None
quantize_dynamic(model_fp32, model_quant, weight_type=QuantType.QUInt8)
`
This quantization tool is not currently supported by TensorRT. Refer to this doc: https://onnxruntime.ai/docs/performance/quantization.html#quantization-on-gpu
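To put a number on the slowdown reported earlier in the thread, both models can be timed in the same way. The following is a minimal, backend-agnostic sketch; `measure_latency_ms` is a hypothetical helper, and `run_once` stands for whatever call runs one inference.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=10, iterations=100):
    """Return (mean, median) latency in milliseconds for a zero-argument callable."""
    # Warm-up runs are not timed, so one-off costs such as engine building
    # or memory allocation do not distort the measurement.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.median(samples)
```

Passing the same input to both models, for example `measure_latency_ms(lambda: session.run(None, feed))` with an ONNX Runtime session for each, gives directly comparable figures.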
Hi, FastDeploy will provide tools for quantizing models that are better suited to deployment on FastDeploy. See the current tutorials at https://github.com/PaddlePaddle/FastDeploy/tree/develop/tools/quantization. We will also release examples of how to deploy INT8 models (YOLO series) on FastDeploy in two days.
Which model do you want to quantize and deploy on FastDeploy? We can give you support.
The model is: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
We are in a competition for a large object detection project that uses exactly the model above, but we need to reach at least 100 FPS on the Xavier NX at 640x480 px.
Accuracy is perfect with this model. We need to speed it up to reach that performance.
This is why I tried INT8 quantization.
I tried C++ inference 👍
./main --model_dir=/data/dProjects/devPaddleDetection/output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025 --video_file=/data/RTSP_oz/20221010_14_30:00_rtsp029.mp4 --device=gpu --run_mode=trt_int8 --batch_size=8 --output_dir=/data/int8.mp4
But there was not much speedup: it still takes 40 ms on the RTX 2080 Ti.
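The gap between the 100 FPS target and the 40 ms measurement can be worked out directly. The thread does not say whether 40 ms is per frame or per batch of 8, so the sketch below shows both readings; the function names are illustrative.

```python
def frame_budget_ms(target_fps):
    """Time available per frame, in milliseconds, at the target frame rate."""
    return 1000.0 / target_fps


def throughput_fps(latency_ms, batch_size=1):
    """Frames per second when one call takes latency_ms and handles batch_size frames."""
    return batch_size * 1000.0 / latency_ms


budget = frame_budget_ms(100)                   # 10.0 ms per frame for 100 FPS
fps_if_per_frame = throughput_fps(40)           # 25.0 FPS if 40 ms is per frame
fps_if_per_batch = throughput_fps(40, 8)        # 200.0 FPS if 40 ms covers a batch of 8
```

Under the per-frame reading the model would need to be four times faster on the 2080 Ti alone, before accounting for the Xavier NX being a much smaller device.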
We have tried quantizing ppyoloe_crn_l_300e_coco, and it works well on FastDeploy.
Maybe we can help you. How about joining our Slack channel for further support? Link: https://fastdeployworkspace.slack.com/ssb/redirect
@yunyaoXYY
I will join.
Was there any speed improvement?
@yunyaoXYY Did you try ppyoloe_crn_l_80e_sliced_visdrone_640_025? Any speedup?
@yunyaoXYY I need an invitation to join Slack.
Hi, please try this:
https://join.slack.com/t/fastdeployworkspace/shared_invite/zt-1hm4rrdqs-RZEm6_EAanuwEVZ8EJsG~g
This issue has not been updated for a year and will be closed. If needed, you can update it again to reopen it.