yingkunwu / r-yolov4

This is a PyTorch-based R-YOLOv4 implementation which combines YOLOv4 model and loss function from R3Det for arbitrary oriented object detection.

Python 98.40% Shell 0.50% Dockerfile 1.10%

yolov4 pytorch oriented-object-detection

r-yolov4's Issues

OSError: [WinError 126] The specified module could not be found.

Hello. Both of my servers report this error. Is there a compilation step I need to run?

Training

![Snipaste_2021-11-25_15-17-15](https://user-images.githubusercontent.com/77728649/143396711-7c37b453-ab8d-4988-b1e9-dca767ac068e.png)

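I converted the XML annotations of a dataset I found into the txt format shown below, and then train.py failed, reporting that the file is empty. Could you help me with this? Thanks.

Dataset format problem

The label txt format in your dataset differs from the txt format used in r3det. Which version of the rolabelimg tool did you use? My converted txt files do not match yours. Thanks.

Dataset is missing class annotations

In the sample dataset under /data/train, each annotation line contains only position information and no class information. How does the program obtain the class information?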

Traceback (most recent call last):
File "/home/zero/pjpompom/R-YOLOv4-main(test)/test.py", line 65, in
sample_metrics += get_batch_statistics(outputs, targets, iou_threshold=args.iou_thres)
File "/home/zero/pjpompom/R-YOLOv4-main(test)/tools/utils.py", line 191, in get_batch_statistics
if pred_label not in target_labels:
File "/home/zero/anaconda3/envs/pjpompom/lib/python3.8/site-packages/torch/tensor.py", line 646, in contains
return (element == self).any().item() # type: ignore[union-attr]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

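Problem with dataset txt labels

This model seems to require the 13-column label format. I tried the 5-column labels produced with rolabelimg2 and it raised an error, as follows:

On which dataset were your yolov4 pretrained weights trained?

A small suggestion as well: tools/logger.py uses tensorflow,
but I would rather not install tensorflow just for logging.
You might take a look at torch.utils.tensorboard.

# Official example code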
from torch.utils.tensorboard import SummaryWriter
import numpy as np

writer = SummaryWriter()

for n_iter in range(100):
    writer.add_scalar('Loss/train', np.random.random(), n_iter)

Problem training on a custom dataset

Hello, when I run train.py to train on my own dataset, I get the following error:
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
My PyTorch version is 1.7.0. Is the version the cause?
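
This message is often seen when a checkpoint file is truncated or only partly downloaded; that is an assumption here, since the thread does not confirm the cause. Checkpoints written by torch.save in PyTorch 1.6 and later are zip archives by default, so a standard-library check can tell whether the file is intact before torch.load is called. A minimal sketch, where the helper name is hypothetical:

```python
import os
import zipfile


def looks_like_valid_checkpoint(path):
    """Return True if `path` exists and is a readable zip archive.

    Assumption: the checkpoint was written by torch.save in the zip-based
    format (the default since PyTorch 1.6). Legacy pickle checkpoints are
    not zip files and would return False here even when intact.
    """
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        return archive.testzip() is None
```

If the check fails for the pretrained weight file, downloading it again is a reasonable first step before suspecting the PyTorch version.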

Values normalization

It is not very clear what input the model expects. As I understand it, each line of labels.txt should look like this:

<class_id> <centerX/imgsz> <centerY/imgsz> <w/imgsz> <h/imgsz> <theta/360 or 2P?>

What is expected for theta?
Or maybe the values don't have to be normalized at all?

Training while labeling with label-studio

Hi! I'm trying to implement your project as a ML backend for label-studio and I'm having some trouble. Predicting labels works without any problems and even training will work the first time. But when I try to train a second time I'll get the following error:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 7.79 GiB total capacity; 2.62 GiB already allocated; 37.62 MiB free; 2.72 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This is my implementation of the ML backend:

import os, sys
currentdir = os.path.dirname(os.path.realpath(__file__))
parentdir = os.path.dirname(currentdir)
sys.path.append(parentdir)

from label_studio_ml.model import LabelStudioMLBase
from label_studio_ml.utils import get_image_size, get_single_tag_keys, is_skipped
from label_studio.core.utils.io import json_load, get_data_dir 
from label_studio.core.settings.base import DATA_UNDEFINED_NAME

import time
import random
import numpy as np
import torch
import shutil
import json
from terminaltables import AsciiTable
import glob

from model.yolo import Yolo
from lib.utils import load_class_names
from lib.scheduler import CosineAnnealingWarmupRestarts
from lib.post_process import post_process
from lib.logger import *
from lib.options import LabelStudioOptions
from lib.plot import rescale_boxes
import label_studio_sdk
from datasets.label_studio_dataset import ImageDataset, LabelStudioDataset, get_transformed_image
import cv2 as cv

from urllib.parse import urlparse

from PIL import Image

print("LabelStudioSdk Version: ", label_studio_sdk.__version__)

LABEL_STUDIO_HOST = os.getenv('LABEL_STUDIO_HOST', 'http://localhost:8080')
LABEL_STUDIO_API_KEY = os.getenv('LABEL_STUDIO_API_KEY', '4c23feec13e2118e053b9a9940f73ed96c0e0841')

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

def weights_init_normal(m):
    if isinstance(m, torch.nn.Conv2d):
        torch.nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif isinstance(m, torch.nn.BatchNorm2d):
        torch.nn.init.normal_(m.weight.data, 1.0, 0.02)
        torch.nn.init.constant_(m.bias.data, 0.0)

def init():
    random.seed(42)
    np.random.seed(42)
    torch.manual_seed(42)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

class RotBBoxModel(object):
    def __init__(self, num_classes, args):
        self.args = args
        
        self.model = Yolo(n_classes=num_classes)
        self.model = self.model.to(device)

        self.logger = None
        self.model_path = None

        self.optimizer = torch.optim.Adam(self.model.parameters(), lr=self.args.lr)

    def log(self, total_loss, num_epochs, epoch, global_step, total_step, start_time):
        log = "\n---- [Epoch %d/%d] ----\n" % (epoch + 1, num_epochs)

        tensorboard_log = {}
        loss_table_name = ["Step: %d/%d" % (global_step, total_step),
                            "loss", "reg_loss", "conf_loss", "cls_loss"]
        loss_table = [loss_table_name]

        temp = ["YoloLayer1"]
        for name, metric in self.model.yolo1.metrics.items():
            if name in loss_table_name:
                temp.append(metric)
            tensorboard_log[f"{name}_1"] = metric
        loss_table.append(temp)

        temp = ["YoloLayer2"]
        for name, metric in self.model.yolo2.metrics.items():
            if name in loss_table_name:
                temp.append(metric)
            tensorboard_log[f"{name}_2"] = metric
        loss_table.append(temp)

        temp = ["YoloLayer3"]
        for name, metric in self.model.yolo3.metrics.items():
            if name in loss_table_name:
                temp.append(metric)
            tensorboard_log[f"{name}_3"] = metric
        loss_table.append(temp)

        tensorboard_log["total_loss"] = total_loss
        self.logger.list_of_scalars_summary(tensorboard_log, global_step)

        log += AsciiTable(loss_table).table
        log += "\nTotal Loss: %f, Runtime: %f\n" % (total_loss, time.time() - start_time)
        print(log)

    def save(self, path):
        print("Model saved in: ", path)
        torch.save(self.model.state_dict(), path)

    def load(self, path, train=False):
        print("Loading model...")
        if not train:
            print("Loading model for prediction...")
            self.model_path = path
            if os.path.exists(self.model_path):
                weight_path = glob.glob(os.path.join(self.model_path, "*.pth"))
                if len(weight_path) == 0:
                    assert False, "Model weight not found"
                elif len(weight_path) > 1:
                    assert False, "Multiple weights are found. Please keep only one weight in your model directory"
                else:
                    weight_path = weight_path[0]
            else:
                assert False, "Model is not exist"
            pretrained_dict = torch.load(weight_path, map_location=device)
            self.model.load_state_dict(pretrained_dict)
            self.model.eval()
        else:
            print("Loading model for training...")
            # if os.path.exists(path):
            #     weight_path = glob.glob(os.path.join(path, "*.pth"))[0]
            #     print("weight_path: ", weight_path)
            # else:
            #     print("Path does not exist")
            weight_path = "weights/pretrained/yolov4.pth"
            pretrained_dict = torch.load(weight_path, map_location=device)
            model_dict = self.model.state_dict()

            # 1. filter out unnecessary keys
            # pretrained_dict = {k: v for k, v in pretrained_dict.items() if np.shape(model_dict[k]) == np.shape(v)}
            pretrained_dict = {k: v for i, (k, v) in enumerate(pretrained_dict.items()) if i < 552}
            # 2. overwrite entries in the existing state dict
            model_dict.update(pretrained_dict)
            # 3. load the new state dict
            self.model.apply(weights_init_normal)
            self.model.load_state_dict(model_dict)
            self.model.eval()

    def predict(self, image_urls):
        images = torch.stack([get_transformed_image(url, self.args.img_size) for url in image_urls]).to(device)
        
        with torch.no_grad():
            temp = time.time()
            output, _ = self.model(images)  # batch=1 -> [1, n, n], batch=3 -> [3, n, n]
            temp1 = time.time()
            boxes = post_process(output, self.args.conf_thres, self.args.nms_thres)
            temp2 = time.time()
            print('-----------------------------------')
            num = 0
            for b in boxes:
                if b is None:
                    # An image without detections should not stop the count for the rest of the batch
                    continue
                num += len(b)
            print("{} objects found".format(num))
            print("Inference time : ", round(temp1 - temp, 5))
            print("Post-processing time : ", round(temp2 - temp1, 5))
            print('-----------------------------------')
            return boxes

    def train(self, dataloader, num_epochs=5):
        init()
        if self.model_path is None:
            self.model_path = os.path.join("weights", self.args.model_name)
        self.logger = Logger(os.path.join(self.model_path, "logs"))

        num_iters_per_epoch = len(dataloader)
        scheduler_iters = round(num_epochs * len(dataloader) / self.args.subdivisions)
        total_step = num_iters_per_epoch * num_epochs

        scheduler = CosineAnnealingWarmupRestarts(self.optimizer,
                                                first_cycle_steps=round(scheduler_iters),
                                                max_lr=self.args.lr,
                                                min_lr=1e-5,
                                                warmup_steps=round(scheduler_iters * 0.1),
                                                cycle_mult=1,
                                                gamma=1)

        start_time = time.time()
        self.model.train()
        for epoch in range(num_epochs):
            print('Epoch {}/{}'.format(epoch, num_epochs - 1))
            print('-' * 10)

            for batch, (_, imgs, targets) in enumerate(dataloader):
                global_step = num_iters_per_epoch * epoch + batch + 1
                imgs = imgs.to(device)
                targets = targets.to(device)

                outputs, loss = self.model(imgs, targets)

                loss.backward()
                total_loss = loss.detach().item()

                if global_step % self.args.subdivisions == 0:
                    self.optimizer.step()
                    self.optimizer.zero_grad()
                    scheduler.step()

                self.log(total_loss, num_epochs, epoch, global_step, total_step, start_time)

        print()

        time_elapsed = time.time() - start_time
        print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))

        return self.model

class RotBBoxModelApi(LabelStudioMLBase):
    
    def __init__(self, **kwargs):
        # don't forget to initialize base class...
        super(RotBBoxModelApi, self).__init__(**kwargs)
        
        parser = LabelStudioOptions()
        self.args = parser.parse()
        
        self.from_name, self.to_name, self.value, self.labels_in_config = get_single_tag_keys(
            self.parsed_label_config, 'RectangleLabels', 'Image'
        )

        print("from_name: ", self.from_name)
        print("to_name: ", self.to_name)
        print("value: ", self.value)
        print("labels_in_config: ", self.labels_in_config)
        print("parsed_label_config: ", self.parsed_label_config)
        print("train_output: ", self.train_output)

        # self.model = RotBBoxModel(len(self.labels_in_config), self.args)
        # self.model_path = os.path.join("weights", self.args.model_name)
        # print(self.model_path)
        # self.model.load(self.model_path)

        if self.train_output:
            self.model = RotBBoxModel(len(self.labels_in_config), self.args)
            self.model.load(self.train_output['model_path'], self.train_output)
        else:
            self.model = RotBBoxModel(len(self.labels_in_config), self.args)
            model_path = os.path.join("weights", self.args.model_name)
            print(model_path)
            self.model.load(model_path)

    def reset_model(self):
        self.model = RotBBoxModel(len(self.labels_in_config), self.args)
        self.model_path = os.path.join("weights", self.args.model_name)
        self.model.load(self.model_path)
   
    def predict(self, tasks, **kwargs):
        """ This is where inference happens:
            model returns the list of predictions based on input list of tasks

            :param tasks: Label Studio tasks in JSON format
        """
        image_urls = [task['data'][self.value] for task in tasks]
        print(image_urls)
        model_results = self.model.predict(image_urls)
        results = []
        all_scores = []
        avg_score = 0

        for i, (url, box) in enumerate(zip(image_urls, model_results)):
            if box is not None:
                image_path = self.get_local_path(url)
                # image_shape = get_image_shape(url)
                img_width, img_height = get_image_size(image_path)
                boxes = rescale_boxes(box, self.args.img_size, (img_height, img_width))
                boxes = np.array(boxes)

                # Use a separate index so the outer loop variable `i` is not shadowed
                for j in range(len(boxes)):
                    bbox = boxes[j]
                    center_x, center_y, w, h, theta = bbox[0], bbox[1], bbox[2], bbox[3], bbox[4]
                    score = round(bbox[5] * bbox[6], 2)
                    cls_id = int(bbox[7])

                    # Calculate top left corner of rotated bbox (box center is origin)
                    left_local = -w/2
                    top_local = -h/2
                    rotated_left_local = np.cos(theta) * left_local - np.sin(theta) * top_local
                    rotated_top_local = np.sin(theta) * left_local + np.cos(theta) * top_local
                    rotated_left = center_x + rotated_left_local
                    rotated_top = center_y + rotated_top_local

                    x_percent = ( (rotated_left / img_width) * 100.0).item()
                    y_percent = ( (rotated_top / img_height) * 100.0).item()
                    w_percent = ( (w / img_width) * 100.0).item()
                    h_percent = ( (h / img_height) * 100.0).item()

                    results.append({
                        'from_name': self.from_name,
                        'to_name': self.to_name,
                        'type': 'rectanglelabels',
                        'value': {
                            'rectanglelabels': [self.labels_in_config[cls_id]],
                            'x': x_percent,
                            'y': y_percent,
                            'width': w_percent,
                            'height': h_percent,
                            'rotation': np.rad2deg(theta).item()
                        },
                        'score': score.item()
                    })
                    all_scores.append(score)
                avg_score = sum(all_scores) / max(len(all_scores), 1)
        if(avg_score != 0):
            avg_score = avg_score.item()
        return [{
            'result': results,
            'score': avg_score
        }]

    def download_tasks(self, project):
        """
        Download all labeled tasks from project using the Label Studio SDK.
        Read more about SDK here https://labelstud.io/sdk/
        :param project: project ID
        :return:
        """
        ls = label_studio_sdk.Client(LABEL_STUDIO_HOST, LABEL_STUDIO_API_KEY)
        project = ls.get_project(id=project)
        tasks = project.get_labeled_tasks()
        return tasks

    def fit(self, tasks, workdir=None, batch_size=32, num_epochs=10, **kwargs):
        """
        This method is called each time an annotation is created or updated
        :param kwargs: contains "data" and "event" key, that could be used to retrieve project ID and annotation event type
                        (read more in https://labelstud.io/guide/webhook_reference.html#Annotation-Created)
        :return: dictionary with trained model artefacts that could be used further in code with self.train_output
        """
        if 'data' not in kwargs:
            raise KeyError(f'Project is not identified. Go to Project Settings -> Webhooks, and ensure you have "Send Payload" enabled')
        
        data = kwargs['data']
        project = data['project']['id']
        tasks = self.download_tasks(project)
        if len(tasks) > 0:
            print(f'{len(tasks)} labeled tasks downloaded for project {project}')
            
            image_urls, image_labels = [], []
            print('Collecting annotations...')
            for task in tasks:
                
                if is_skipped(task):
                    continue

                filepath = self.get_local_path(task['data'][self.value])
                image_urls.append(filepath)
                image_labels.append(task['annotations'][0]['result'])


            # augment = False if self.args.no_augmentation else True
            # mosaic = False if self.args.no_mosaic else True
            # multiscale = False if self.args.no_multiscale else True

            augment = False
            mosaic = False
            multiscale = False

            print(f'Creating dataset with {len(image_urls)} images...')
            dataset = LabelStudioDataset(image_urls, image_labels, self.labels_in_config, 
                                         self.args.img_size, self.args.sample_size,
                                         augment=augment, mosaic=mosaic, multiscale=multiscale)
            dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True, pin_memory=True, collate_fn=dataset.collate_fn)
            
            print('Train model...')
            self.reset_model()
            self.model.train(dataloader, num_epochs=num_epochs)

            print('Save model...')
            # model_path = os.path.join(workdir, 'model.pt')
            model_path = os.path.join(self.model_path, "ryolov4.pth")
            self.model.save(model_path)

            return {
                'model_path': model_path, 
                'labels': image_labels
            }

        else:
            print('No labeled tasks found: make some annotations...')
            return {}

This is basically just your code combined from detect.py and train.py.

The testing is performed with the trash dataset and a model that was also trained on it. I'm not really familiar with pytorch and don't know if I implemented it correctly for this kind of application. I guess that the out-of-memory error is caused by reloading the model without clearing some old variables first? I have no idea which though.

Could you please take a look at it if you have the time? Maybe I'm just loading the model the wrong way.
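
One possible contributor, consistent with the guess above but not confirmed in this thread, is that the old model (and the optimizer state that references its parameters) is still alive on the GPU when reset_model builds the new one. A minimal sketch of dropping the old reference first; `release_gpu_model`, `holder`, and `attr` are hypothetical names, and PyTorch is imported optionally so the helper also loads on a machine without it:

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch load on machines without PyTorch
    torch = None


def release_gpu_model(holder, attr="model"):
    """Drop `holder.<attr>` and hand cached CUDA memory back to the driver.

    `holder` is whatever object owns the model; in the backend above that
    would be the RotBBoxModelApi instance (an assumption).
    """
    setattr(holder, attr, None)   # remove this reference to the old model
    gc.collect()                  # collect reference cycles that may pin tensors
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached blocks that are no longer used
```

Calling `release_gpu_model(self)` as the first statement of reset_model would be one place to try it. Memory is only freed if no other object still references the old model, so a second backend instance kept alive elsewhere would defeat it.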

Error partway through training on my own dataset (at a middle epoch)

---- [Epoch 12/100] ----
+------------------+--------------------+--------------------+---------------------+---------------------+
| Step: 4563/38900 | loss | reg_loss | conf_loss | cls_loss |
+------------------+--------------------+--------------------+---------------------+---------------------+
| YoloLayer1 | 0.5429285764694214 | 0.3865797519683838 | 0.11499390006065369 | 0.04135490208864212 |
| YoloLayer2 | 0.7612576484680176 | 0.4948746860027313 | 0.16640125215053558 | 0.09998173266649246 |
| YoloLayer3 | 1.0451996326446533 | 0.6731263399124146 | 0.20551152527332306 | 0.16656182706356049 |
+------------------+--------------------+--------------------+---------------------+---------------------+
Total Loss: 2.349386, Runtime: 6990.246672
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [25,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [26,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [27,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [28,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [29,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [30,0,0] Assertion input_val >= zero && input_val <= one failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/Loss.cu:102: block: [0,0,0], thread: [31,0,0] Assertion input_val >= zero && input_val <= one failed.
Traceback (most recent call last):
File "D:/WorkSpace/PythonWorkSpace/R-YOLOv4/train.py", line 174, in
t.train()
File "D:/WorkSpace/PythonWorkSpace/R-YOLOv4/train.py", line 151, in train
outputs, loss = self.model(imgs, targets)
File "D:\Software\Anconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\WorkSpace\PythonWorkSpace\R-YOLOv4\model\yolo.py", line 35, in forward
y1, loss1 = self.yolo1(x2, target)
File "D:\Software\Anconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\WorkSpace\PythonWorkSpace\R-YOLOv4\model\yololayer.py", line 202, in forward
cls_loss += F.binary_cross_entropy(pred_cls[obj_mask], tcls[obj_mask], reduction=self.reduction)
RuntimeError: CUDA error: device-side assert triggered

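After 100 epochs of training, AP is 0 in validation and inference results are mostly wrong

Hello, could you help me take a look? I trained on my own dataset (200 images). After annotating it and converting to txt files with the xml2txt.py you provide, I ran display_inputs.py to check the training images and found no problems. After training, the losses in TensorBoard had mostly converged, and per-class precision and recall looked normal. However, when I run test.py for validation, AP is 0 for every class, and detect.py produces spurious boxes that are much smaller than my annotated ground truth. When I downloaded your trash dataset and trained for 10 epochs, AP was already high. My training and validation parameters are essentially the same as yours, with no changes.

I still suspect the dataset conversion, but I can't find exactly what is wrong. Could you take a look? Thanks!
mAP is 0:
Some PR curves:
display_inputs.py output:

Incorrect boxes at inference:

Problem saving detection results

The images that were run through detection are not saved.

How do I get detection output images like these? Thanks.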

How does it work with m x n input image size ?

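I have 8k images. What should I pass as the img_size param of train.py?

Is the vertical_flip implementation incorrect?

Hello, I have recently been visualizing the preprocessed images, and I found that the colors are wrong after vertical_flip.
The three images below are, in order, the original, your vertical_flip implementation, and the PIL implementation.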
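
The repository's vertical_flip is not quoted in this issue, so the cause is unknown. One way a flip can change colors (an assumption, not a diagnosis) is reversing the channel axis instead of the row axis, for example flipping dimension 0 of a channels-first tensor. The difference can be sketched with nested lists in height x width x channel layout; both function names are hypothetical:

```python
def flip_rows(image):
    """Vertical flip: reverse the order of rows, so the top row becomes the bottom."""
    return image[::-1]


def flip_channels(image):
    """Reverse each pixel's channel order (RGB -> BGR); positions stay, colors change."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 2x1 image: a red pixel on top, a green pixel below.
image = [[[255, 0, 0]],
         [[0, 255, 0]]]

flipped = flip_rows(image)        # green on top, red below; colors intact
recolored = flip_channels(image)  # rows unchanged; the red pixel turns blue
```

Comparing the two results against the original image is a quick way to tell which axis an implementation actually reverses.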

box2 is sometimes empty causing assertions

I'm not sure if this is an issue or not.
Using the full code from this project, I have tried to train the model to predict rotated bounding boxes on a custom dataset with only 2 classes.
After the first training epoch completes, during the test cycle, I always get the following error:

     44 def skewiou(box1, box2):
     45
---> 46     assert len(box1) == 5 and len(box2[0]) == 5
     47     # End of diamantis additions. if you remove that, please re enable assert.
     48

IndexError: index 0 is out of bounds for dimension 0 with size 0

After extensive debugging to see what is going wrong, I noticed that box2 is indeed empty, which crashes the training process.
The model always generates predictions for box2 too; I verified that in the step below using these 2 prints:
#print("box1:", detect[0, :5])
#print("box2:", detect[:, :5])
large_overlap = iou(detect[0, :5], detect[:, :5], nms_thres)

The issue happens when the matrices are passed to iou. It does not happen from the start of the epoch: the data is processed successfully multiple times first (meaning that it does not crash at the first iteration inside post_process).

Any idea of what is going wrong or what I'm doing wrong would indeed be very helpful.
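
The traceback shows `box2[0]` being indexed while box2 has no rows. A guard that returns an empty result in that case avoids the crash, although it does not explain why box2 ends up empty. A minimal sketch; `safe_pairwise_iou` and `iou_fn` are hypothetical names, with `iou_fn` standing in for the project's skewiou-based function:

```python
def safe_pairwise_iou(box1, boxes2, iou_fn):
    """Call `iou_fn(box1, boxes2)` only when there is something to compare.

    An empty `boxes2` yields an empty result instead of an IndexError.
    Each box is assumed to be (x, y, w, h, theta), as in the assert above.
    """
    if len(boxes2) == 0:
        return []
    if len(box1) != 5 or any(len(b) != 5 for b in boxes2):
        raise ValueError("each box must have 5 values: x, y, w, h, theta")
    return iou_fn(box1, boxes2)
```

Logging the inputs whenever the empty branch is taken would help narrow down which batch produces it.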

Detect

How do I detect on an image or a video instead of using dataloader like you?

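Data format conversion problem

Hello, how do I convert the XML files produced by annotating images with labimg2 into the txt files this program uses? Could you share the code?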

Transfer Learning

Hello,
I have a question regarding training a custom dataset.

How can I transfer learning of some specific classes from the pre-trained weights (e.g. dota) to my custom training if my custom classes are different from the pre-trained classes?

Best regards and Thank you

multiple values for argument 'img_size' ["bug"]

I'm running a training script:

python train.py --data ../dataset/dataset.yaml --config /home/prefect/Work/PalmDetector/Model/OBB/R-YOLOv4/data/hyp.yaml --img_size 640 --epochs 300

And I get the following:

Namespace(batch_size=4, config='/home/prefect/Work/PalmDetector/Model/OBB/R-YOLOv4/data/hyp.yaml', data='../dataset/dataset.yaml', epochs=300, img_size=640, lr=0.01, mode='csl', model_name='trash', optimizer='SGD', ver='yolov5', weights_path='')
2023-12-13 00:56:13 WARNING  Model name exists, do you want to override the previous model?
>> [Y:N]y
Traceback (most recent call last):
  File "train.py", line 266, in <module>
    t.train()
  File "train.py", line 145, in train
    train_dataset, train_dataloader = load_data(
  File "/home/prefect/Work/PalmDetector/Model/OBB/R-YOLOv4/lib/load.py", line 15, in load_data
    dataset = CustomDataset(data_dir, class_names, hyp, img_size=640, augment=augment, csl=csl)
TypeError: __init__() got multiple values for argument 'img_size'

Debugging the sources, I found the following:
lib/load.py:

        dataset = CustomDataset(data_dir, class_names, hyp, img_size=img_size, augment=augment, csl=csl)

datasets/custom_dataset.py:

class CustomDataset(BaseDataset):
    def __init__(self, data_dir, img_size=416, augment=True, mosaic=True, multiscale=True, normalized_labels=False):

The arguments passed do not match the actual signature of the class constructor, which is why the error appears when training on a custom dataset.

I suspect that the same problem exists in the CustomDataset class with the hyp parameter.
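
The mismatch can be reproduced in isolation. In this sketch the constructor signature is the one quoted from datasets/custom_dataset.py (base class omitted so it runs on its own), and the argument values are placeholders chosen for illustration:

```python
class CustomDataset:
    # Signature as quoted above from datasets/custom_dataset.py.
    def __init__(self, data_dir, img_size=416, augment=True, mosaic=True,
                 multiscale=True, normalized_labels=False):
        self.data_dir = data_dir
        self.img_size = img_size


def try_call():
    """Repeat the call shape from lib/load.py with placeholder values."""
    data_dir, class_names, hyp = "data/train", ["palm"], {}
    try:
        # class_names lands in the img_size slot and hyp in the augment slot,
        # then img_size is supplied a second time by keyword.
        CustomDataset(data_dir, class_names, hyp, img_size=640, augment=True, csl=True)
    except TypeError as err:
        return str(err)
    return None


message = try_call()
```

The second positional argument already fills `img_size`, so the keyword `img_size=640` is rejected. Either the constructor has to accept `class_names`, `hyp`, and `csl`, or the call has to match the older signature.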

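AP is low after training and testing on DOTA1.0

Hello, I used this code with DOTA1.0, cropped into 1024-pixel tiles, with epochs=50. The final AP is very low and compares poorly with your outputs. With nms=0.2 there are many spurious boxes at the end. I don't know what went wrong.

Could the weights be shared on Baidu Cloud?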

Extra classes

Do you think it would be possible to train this on my own dataset or add extra classes to this?
What would I need to change to have more than plane and car classes?

Thank you

How is theta defined?

How do I use a local camera for real-time detection?

Reg loss calculation

The regression loss considered is smooth_l1_loss (for angle) - ciou (for xywh).

Given that, shouldn't the ciou loss be computed as ciou = bbox_loss_scale * (0.0 - ciou)
instead of ciou = bbox_loss_scale * (1.0 - ciou) on the line below?
https://github.com/kunnnnethan/R-YOLOv4/blob/ab85440b135cd029c8151d6a0d120632db0b35bc/model/yololayer.py#L113

https://github.com/kunnnnethan/R-YOLOv4/blob/ab85440b135cd029c8151d6a0d120632db0b35bc/model/yololayer.py#L119

https://github.com/kunnnnethan/R-YOLOv4/blob/ab85440b135cd029c8151d6a0d120632db0b35bc/model/yololayer.py#L197

With the current implementation, the effective regression loss calculation is smooth_l1_loss (for angle) + 1 - ciou (for xywh).
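
The two forms differ only by a term that does not depend on the prediction (assuming bbox_loss_scale depends only on the targets), so their gradients are identical; 1 - ciou additionally stays non-negative because CIoU is at most 1. A small numerical check illustrates this, using a toy one-dimensional overlap ratio as a stand-in for CIoU (not the repository's implementation; all names are hypothetical):

```python
def iou_1d(pred_w, target_w=2.0):
    """Overlap ratio of two intervals sharing a left edge (toy stand-in for CIoU)."""
    inter = min(pred_w, target_w)
    union = max(pred_w, target_w)
    return inter / union


def loss_one_minus(pred_w):
    return 1.0 - iou_1d(pred_w)


def loss_zero_minus(pred_w):
    return 0.0 - iou_1d(pred_w)


def numeric_grad(fn, x, eps=1e-6):
    """Central finite-difference estimate of d fn / d x."""
    return (fn(x + eps) - fn(x - eps)) / (2 * eps)


w = 1.5
grad_a = numeric_grad(loss_one_minus, w)            # gradient of 1 - iou
grad_b = numeric_grad(loss_zero_minus, w)           # gradient of 0 - iou
offset = loss_one_minus(w) - loss_zero_minus(w)     # constant gap between the two
```

Both gradients agree and the losses differ by a constant, so the choice changes the reported loss value but not the parameter updates.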