chuzhixing / yolov5_obb_from_cv_hub

This project is forked from cvhub520/yolov5_obb

License: GNU General Public License v3.0

Shell 0.47% C++ 32.01% Python 62.75% Cuda 4.11% Makefile 0.01% CMake 0.16% Cython 0.15% Dockerfile 0.32% SWIG 0.03%

Introduction

YOLOv5-OBB is a variant of YOLOv5 that supports oriented bounding boxes. This model is designed to yield predictions that better fit objects that are positioned at an angle.
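
To make the idea of an oriented bounding box concrete, the following is a minimal sketch that turns a rotated box into its four corner points. It assumes a (cx, cy, w, h, angle) parameterization with the angle in radians and positive angles rotating from the x axis towards the y axis; `obb_to_polygon` is a hypothetical helper for illustration, not a function from this repository, and the angle convention used by the model may differ.

```python
import math


def obb_to_polygon(cx, cy, w, h, angle):
    """Return the four corners of a box centred at (cx, cy), rotated by `angle` radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Corner offsets of the unrotated box, relative to its centre.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Apply the 2-D rotation matrix to each offset, then shift by the centre.
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in offsets
    ]
```

With `angle=0` the result is an ordinary axis-aligned rectangle; any other angle yields the tilted quadrilateral that the polygon label format stores as x1 y1 ... x4 y4.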

X-AnyLabeling is more than an annotation tool: it integrates AI models to automate data annotation. With a focus on practical applications, it aims to be an industrial-grade, feature-rich tool that helps developers automate annotation and data processing for a wide range of complex tasks.

Annotation

Installation

cd yolov5_obb
git submodule update --init --recursive
cd X-AnyLabeling
pip install -r requirements.txt
# pip install -r requirements-gpu.txt
python anylabeling/app.py

Tutorial

  • Prepare a predefined category label file (refer to this).
  • Click on the 'Format' option in the top menu bar, select 'DOTA' and import the file prepared in the previous step.

[Option-1] Basic usage

  • Press the shortcut key "O" to create a rotation shape.
  • Open edit mode (shortcut: "Ctrl+J") and click to select the rotation box.
  • Rotate the selected box with the shortcut keys z, x, c, and v, where:
    • z: Large counterclockwise rotation
    • x: Small counterclockwise rotation
    • c: Small clockwise rotation
    • v: Large clockwise rotation

[Option-2] Additionally, you can use the model to batch pre-label the current dataset.

  • Press the shortcut key "Ctrl+A" to open the Auto-Labeling mode.
  • Choose an appropriate default model or load a custom model.
  • Press the shortcut key "Ctrl+M" to run the model on all images once.

YOLOv5m_obb_dota_result

For more detail, you can refer to this document.

Getting started

Installation

  • Requirements
    • Python 3.7+
    • PyTorch ≥ 1.7
    • CUDA 9.0 or higher
    • Ubuntu 16.04/18.04

Note:

  1. If you downloaded the source code from the original repo, you should modify the poly_nms_cuda.cu file as needed. Otherwise, compilation is likely to fail.
  2. Windows users: please refer to this issue if you have difficulty generating utils/nms_rotated_ext.cpython-XX-XX-XX-XX.so.
  • Install

a. Create a conda virtual environment and activate it:

conda create -n yolov5_obb python=3.8 -y 
source activate yolov5_obb

b. Make sure your CUDA runtime API version is no newer than the CUDA version supported by your driver (for example, 11.3 ≤ 11.4):

nvcc -V
nvidia-smi
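
Once you have read the two version numbers from the commands above (`nvcc -V` reports the runtime toolkit, `nvidia-smi` reports the highest CUDA version the driver supports), the comparison can be sketched as follows. This is a minimal illustration; `parse_version` and `runtime_is_supported` are hypothetical helpers, and the version strings are typed in by hand rather than parsed from the tools' output.

```python
def parse_version(text):
    """Turn a version string such as '11.3' into a tuple of ints, e.g. (11, 3)."""
    return tuple(int(part) for part in text.strip().split("."))


def runtime_is_supported(runtime, driver):
    """Return True when the CUDA runtime version is <= the driver's CUDA version."""
    # Tuples compare component by component, so '9.2' < '11.4' as expected.
    return parse_version(runtime) <= parse_version(driver)
```

Comparing tuples rather than raw strings avoids the pitfall that "9.2" sorts after "11.4" alphabetically.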

c. Install PyTorch and torchvision following the official instructions for your environment, and make sure the cudatoolkit version matches the CUDA runtime API version, e.g.,

pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
nvcc -V
python
>>> import torch
>>> torch.version.cuda
>>> exit()

d. Clone the modified version of the YOLOv5_OBB repository.

git clone https://github.com/CVHub520/yolov5_obb.git
cd yolov5_obb

e. Install yolov5-obb.

pip install -r requirements.txt
cd utils/nms_rotated
python setup.py develop  # or "pip install -v -e ."
  • DOTA_devkit [Optional]

If you need to split the high-resolution image and evaluate the oriented bounding boxes (OBB), it is recommended to use the following tool:

cd yolov5_obb/DOTA_devkit
sudo apt-get install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace

Datasets

Prepare custom dataset files

Note: Ensure that the label format is [polygon classname difficulty]. You can set difficulty=0 unless otherwise specified.

  x1      y1       x2        y2       x3       y3       x4       y4       classname     difficulty

1686.0   1517.0   1695.0   1511.0   1711.0   1535.0   1700.0   1541.0   large-vehicle      1
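
A label line in this format can be read with a few lines of Python. The sketch below assumes whitespace-separated fields and a class name without spaces, as in the example above; `ObbLabel` and `parse_label_line` are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple, Tuple


class ObbLabel(NamedTuple):
    polygon: List[Tuple[float, float]]  # four (x, y) corner points
    classname: str
    difficulty: int


def parse_label_line(line):
    """Parse 'x1 y1 x2 y2 x3 y3 x4 y4 classname difficulty' into an ObbLabel."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}: {line!r}")
    coords = [float(value) for value in fields[:8]]
    # Pair up the flat coordinate list as (x1, y1), (x2, y2), ...
    polygon = list(zip(coords[0::2], coords[1::2]))
    return ObbLabel(polygon, fields[8], int(fields[9]))
```

Rejecting lines that do not have exactly ten fields catches the most common labelling mistakes, such as a missing difficulty value.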

Then, modify the path parameters and run this script if there is no need to split the high-resolution images. Otherwise, you can follow the steps below.

cd yolov5_obb
python DOTA_devkit/ImgSplit_multi_process.py

Ensure that your dataset is organized in the directory structure as shown below:

.
└── dataset_demo
    ├── images
    │   └── P0032.png
    └── labelTxt
        └── P0032.txt
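
Before training, it can help to confirm that every image has a label file with the same base name. The following is a minimal sketch under the layout shown above; `find_unpaired` is a hypothetical helper, and the set of image extensions is an assumption you may need to adjust.

```python
from pathlib import Path

# Assumed image extensions; extend this set if your dataset uses others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def find_unpaired(root):
    """Return (images without a label, labels without an image) as sorted stem lists."""
    root = Path(root)
    images = {
        path.stem
        for path in (root / "images").iterdir()
        if path.suffix.lower() in IMAGE_SUFFIXES
    }
    labels = {path.stem for path in (root / "labelTxt").glob("*.txt")}
    return sorted(images - labels), sorted(labels - images)
```

Calling `find_unpaired("dataset_demo")` on a correctly organized dataset returns two empty lists.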

Finally, you can create a custom data yaml file, e.g., yolov5obb_demo.yaml.

Note:

  • DOTA is a high-resolution image dataset, so it needs to be split before training/testing to get better performance.
  • For single-class problems, it is recommended to add a "None" class, effectively making it a 2-class task, e.g., DroneVehicle_poly.yaml
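
To illustrate what splitting a high-resolution image involves, the sketch below computes where tiles start along one image axis. It assumes that the gap is the overlap between neighbouring tiles and that the last tile is shifted back so it stays inside the image; `window_starts` is a hypothetical helper, not the code in DOTA_devkit/ImgSplit_multi_process.py.

```python
def window_starts(length, subsize=1024, gap=200):
    """Return the start offsets of overlapping tiles along an axis of `length` pixels."""
    if gap >= subsize:
        raise ValueError("gap must be smaller than subsize")
    if length <= subsize:
        # The image fits in a single tile along this axis.
        return [0]
    stride = subsize - gap  # distance between consecutive tile origins
    starts = list(range(0, length - subsize, stride))
    # Final tile is aligned with the image border so no pixels are left out.
    starts.append(length - subsize)
    return starts
```

For a 2048-pixel axis with the default values, `window_starts(2048)` returns `[0, 824, 1024]`. Applying the function to both width and height gives the grid of tile origins.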

Train/Val/Detect

Before starting a training run, consider the following recommendations:

  1. Ensure that the input resolution is set to a multiple of 32.
  2. By default, set the batch size to 8. If you increase it to 16 or larger, adjust the scaling factor for the box loss to help the angle (theta) prediction converge.
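
For the first recommendation, an arbitrary image size can be rounded up to the nearest multiple of 32 before it is passed to --img. This is a minimal sketch; `round_up_to_multiple` is a hypothetical helper, not part of this repository.

```python
def round_up_to_multiple(size, multiple=32):
    """Round `size` up to the nearest positive multiple of `multiple`."""
    if size <= 0 or multiple <= 0:
        raise ValueError("size and multiple must be positive")
    # Integer arithmetic avoids floating-point rounding issues.
    return ((size + multiple - 1) // multiple) * multiple
```

For example, a requested size of 1000 becomes 1024, while 864 and 1024 are already valid and stay unchanged.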
  • To train on multiple GPUs with Distributed Data Parallel (DDP) mode, please refer to this shell script.

  • To train on the original demo dataset without splitting it, please refer to the following command:

python train.py \
  --weights weights/yolov5n.pt \
  --data data/task.yaml \
  --hyp data/hyps/obb/hyp.finetune_dota.yaml \
  --epochs 300 \
  --batch-size 1 \
  --img 1024 \
  --device 0 \
  --name /path/to/save_dir
  • To detect a custom image file/folder/video, please refer to the following command:
python detect.py \
    --weights /path/to/*.pt \
    --source /path/to/image \
    --img 1024 \
    --device 0 \
    --conf-thres 0.25 \
    --iou-thres 0.2 \
    --name /path/to/save_dir

Note:

For more details, please refer to this document.

Deploy

  • Export *.onnx file:
python export.py \
    --weights runs/train/task/weights/best.pt \
    --data data/task.yaml \
    --imgsz 1024 \
    --simplify \
    --opset 12 \
    --include onnx

Python

  • Detect with the exported onnx file using onnxruntime:
python deploy/onnxruntime/python/main.py \
    --model /path/to/*.onnx \
    --image /path/to/image

C++

  • Enter the directory:
cd opencv/cpp

.
└── cpp
    ├── CMakeLists.txt
    ├── build
    ├── image
    │   ├── demo.jpg
    ├── main.cpp
    ├── model
    │   └── yolov5m_obb_csl_dotav15.onnx
    └── obb
        ├── include
        └── src

Note: OpenCV 4.6.0 or newer is recommended; v4.7.0 has been tested successfully.

  • Place the images and model files in the specified directory.
  • Modify the contents of the CMakeLists.txt, main.cpp, and yolo_obb.h files according to your specific requirements and use case.
  • Run the demo:
mkdir build && cd build
cmake ..
make

Model Zoo

The results on the DOTA_subsize1024_gap200_rate1.0 test-dev set are shown in the table below. (password: yolo)

Model                   Size      TTA  OBB mAP@0.5 (test)            Speed (ms)                     params  FLOPs
(download link)         (pixels)       DOTAv1.0  DOTAv1.5  DOTAv2.0  CPU b1  2080Ti b1  2080Ti b16  (M)     @640 (B)
yolov5m [baidu/google]  1024      ×    77.3      73.2      58.0      328.2   16.9       11.3        21.6    50.5
yolov5s [baidu]         1024      ×    76.8      -         -         -       15.6       -           7.5     17.5
yolov5n [baidu]         1024      ×    73.3      -         -         -       15.2       -           2.0     5.0

TTA: multi-scale/rotate testing.

Table Notes
  • All checkpoints are trained for 300 epochs from COCO pre-trained checkpoints, with default settings and hyperparameters.
  • The OBB mAP (test) values are for single-model, single-scale testing on the DOTA(1024,1024,200,1.0) dataset.
    Reproduce example:
python val.py --data 'data/dotav15_poly.yaml' --img 1024 --conf 0.01 --iou 0.4 --task 'test' --batch 16 --save-json --name 'dotav15_test_split'
python tools/TestJson2VocClassTxt.py --json_path 'runs/val/dotav15_test_split/best_obb_predictions.json' --save_path 'runs/val/dotav15_test_split/obb_predictions_Txt'
python DOTA_devkit/ResultMerge_multi_process.py --scrpath 'runs/val/dotav15_test_split/obb_predictions_Txt' --dstpath 'runs/val/dotav15_test_split/obb_predictions_Txt_Merged'
    Then zip the poly-format result files and submit them to https://captain-whu.github.io/DOTA/evaluation.html
  • Speed is averaged over DOTAv1.5 val_split_subsize1024_gap200 images using a 2080Ti GPU. NMS and pre-processing times are included.
    Reproduce by python val.py --data 'data/dotav15_poly.yaml' --img 1024 --task speed --batch 1

Model Name                      File Size  Input Size  Configuration File
yolov5n_obb_drone_vehicle.onnx  8.39MB     864         yolov5n_obb_drone_vehicle.yaml
yolov5s_obb_csl_dotav10.onnx    29.8MB     1024        dotav1_poly.yaml
yolov5m_obb_csl_dotav15.onnx    83.6MB     1024        dotav15_poly.yaml
yolov5m_obb_csl_dotav20.onnx    83.6MB     1024        dotav2_poly.yaml

Acknowledgements

This project relies on the following open-source projects and resources:
