Comments (13)

github-actions commented on June 18, 2024

👋 Hello @azizche, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher commented on June 18, 2024

Hey there! It seems like the issue you're encountering is related to mixing images of different sizes in a single batch, which isn't supported in the current setup. The albumentations RandomSizedBBoxSafeCrop is correctly resizing to 1024x1024, but your original dataset contains images of size 4000x4000 that are not being resized consistently.

A possible solution would be to ensure all images and their corresponding bounding boxes are resized to the same dimensions before batching them. You could apply resizing directly in your dataset preprocessing step to make all images 1024x1024 (or any consistent size that suits your needs), then proceed to apply the other albumentations transforms.

Here’s a small example of how you might adjust your dataset preprocessing:

import random

import albumentations as A


class Albumentations:
    def __init__(self, p=1.0):
        self.p = p
        self.resize_transform = A.Resize(1024, 1024)  # uniform size before any other transform
        self.augmentations = A.Compose([
            A.HorizontalFlip(p=0.5),
            A.RandomSizedBBoxSafeCrop(1024, 1024, p=0.9),
            # ... other transforms
        ], bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]))

    def __call__(self, labels):
        im = labels['img']
        bboxes = labels['instances'].bboxes
        cls = labels['cls']

        # Resize first so every image enters the augmentation pipeline at 1024x1024
        im = self.resize_transform(image=im)['image']

        # Apply the remaining augmentations with probability p
        if self.augmentations and random.random() < self.p:
            transformed = self.augmentations(image=im, bboxes=bboxes, class_labels=cls)
            im = transformed['image']
            bboxes = transformed['bboxes']
            cls = transformed['class_labels']

        labels.update({'img': im, 'bboxes': bboxes, 'cls': cls})
        return labels

This script first ensures that every image is resized uniformly before applying further augmentations. This should circumvent the RuntimeError regarding differing tensor sizes during stacking in your data loader. Hope this helps, and good luck with your training! 😊👍

azizche commented on June 18, 2024

Isn't the image supposed to be resized before augmenting it with albumentations? (From what I understood, the LetterBox class is responsible for resizing the images.)
I am trying to resize images to a large size (4000x4000), and during augmentation I only want to take image crops of size 1024x1024, so I set the probability of A.RandomSizedBBoxSafeCrop to 1.
Also, from the error I got, it seems that training expects the images to all be the same size. If that's the case, how does the model deal with cropping augmentations (and any other augmentation that changes the image size)?

Note: I tried the solution you proposed, but I still get this error:
~/Experiments/ultralytics/ultralytics/engine/model.py in train(self, trainer, **kwargs)
    655 
    656         self.trainer.hub_session = self.session  # attach optional HUB session
--> 657         self.trainer.train()
    658         # Update model and cfg after training
    659         if RANK in (-1, 0):

~/Experiments/ultralytics/ultralytics/engine/trainer.py in train(self)
    211 
    212         else:
--> 213             self._do_train(world_size)
    214 
    215     def _setup_scheduler(self):

~/Experiments/ultralytics/ultralytics/engine/trainer.py in _do_train(self, world_size)
    361             self.tloss = None
    362             self.optimizer.zero_grad()
--> 363             for i, batch in pbar:
    364                 self.run_callbacks("on_train_batch_start")
    365                 # Warmup

/opt/conda/lib/python3.9/site-packages/tqdm/std.py in __iter__(self)
   1178 
   1179         try:
-> 1180             for obj in iterable:
   1181                 yield obj
   1182                 # Update and possibly print the progressbar.

~/Experiments/ultralytics/ultralytics/data/build.py in __iter__(self)
     47         """Creates a sampler that repeats indefinitely."""
     48         for _ in range(len(self)):
---> 49             yield next(self.iterator)
     50 
     51     def reset(self):

/opt/conda/lib/python3.9/site-packages/torch/utils/data/dataloader.py in __next__(self)
    629                 # TODO(https://github.com/pytorch/pytorch/issues/76750)
    630                 self._reset()  # type: ignore[call-arg]
--> 631             data = self._next_data()
    632             self._num_yielded += 1
    633             if self._dataset_kind == _DatasetKind.Iterable and \

/opt/conda/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _next_data(self)
   1344             else:
   1345                 del self._task_info[idx]
-> 1346                 return self._process_data(data)
   1347 
   1348     def _try_put_index(self):

/opt/conda/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _process_data(self, data)
   1370         self._try_put_index()
   1371         if isinstance(data, ExceptionWrapper):
-> 1372             data.reraise()
   1373         return data
   1374 

/opt/conda/lib/python3.9/site-packages/torch/_utils.py in reraise(self)
    703             # instantiate since we don't know how to
    704             raise RuntimeError(msg) from None
--> 705         raise exception
    706 
    707 

ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/opt/conda/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/jovyan/Experiments/ultralytics/ultralytics/data/base.py", line 253, in __getitem__
    return self.transforms(self.get_image_and_label(index))
  File "/home/jovyan/Experiments/ultralytics/ultralytics/data/augment.py", line 74, in __call__
    data = t(data)
  File "/home/jovyan/Experiments/ultralytics/ultralytics/data/augment.py", line 841, in __call__
    transformed = self.augmentations(image=im, bboxes=bboxes, class_labels=cls)
  File "/opt/conda/lib/python3.9/site-packages/albumentations/core/composition.py", line 228, in __call__
    p.preprocess(data)
  File "/opt/conda/lib/python3.9/site-packages/albumentations/core/utils.py", line 90, in preprocess
    data[data_name] = self.check_and_convert(data[data_name], rows, cols, direction="to")
  File "/opt/conda/lib/python3.9/site-packages/albumentations/core/utils.py", line 104, in check_and_convert
    return self.convert_to_albumentations(data, rows, cols)
  File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 151, in convert_to_albumentations
    return convert_bboxes_to_albumentations(data, self.params.format, rows, cols, check_validity=True)
  File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 433, in convert_bboxes_to_albumentations
    return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
  File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 433, in <listcomp>
    return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
  File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 352, in convert_bbox_to_albumentations
    raise ValueError(msg)
ValueError: In YOLO format all coordinates must be float and in range (0, 1]

Thank you for your prompt answer @glenn-jocher. Much appreciated 😀

glenn-jocher commented on June 18, 2024

@azizche hi there! 😊 It appears that after your transformations, some bboxes may not be formatted correctly or have fallen out of the expected range (0, 1]. This can happen particularly after aggressive crops or other transformations that change the image size dramatically. Here’s an approach that might help:

Make sure to normalize the bbox coordinates relative to the new image size after the crop and ensure they're correctly bounded within (0, 1]. Albumentations should handle this, but double-check your transformations to see if the order or configuration might be affecting the output.

Here's a code snippet that can enforce these constraints:

import albumentations as A

# Ensure the bbox params are correctly specified
bbox_params = A.BboxParams(format='yolo', label_fields=['class_labels'], min_visibility=0.1)

# Include this bbox_params in your Compose
self.augmentations = A.Compose([
    A.RandomSizedBBoxSafeCrop(1024, 1024, p=1.0),
    # further transformations
], bbox_params=bbox_params)

  • The min_visibility argument can assist in filtering out bboxes that become too small or invisible after augmentation, removing potentially problematic cases.

It might also be useful to log the transformation outputs in your training loop to see how they affect bboxes throughout the augmentation process.
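
For example, a quick way to sanity-check one sample (a rough sketch; debug_transform is just an illustrative helper, and it assumes your Compose was built with bbox_params in YOLO format):

def debug_transform(transform, image, bboxes, class_labels):
    """Run the Compose once, printing shapes and flagging bboxes outside YOLO's (0, 1] range."""
    out = transform(image=image, bboxes=bboxes, class_labels=class_labels)
    print(f"image: {image.shape} -> {out['image'].shape}, bboxes: {len(bboxes)} -> {len(out['bboxes'])}")
    for bbox in out["bboxes"]:
        if not all(0 < v <= 1 for v in bbox):
            print(f"Out-of-range bbox after transform: {bbox}")
    return out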

If errors persist, consider simplifying or adjusting the sequence of your transformations to identify which specific augmentation might be the cause. Stay tuned! 🚀

azizche commented on June 18, 2024

Nothing worked... The problem I am getting in the first place does not make sense. If I am cropping the images so that their size is 1024x1024 (with a probability of 1), why would it tell me that it's trying to stack images of 3008x3008 (note that this is the image size I specified in the training function) and 1024x1024?
To answer your question, the cropping augmentation is the one causing the problem.

glenn-jocher commented on June 18, 2024

Hey @azizche! It sounds like you're dealing with a tricky problem. Since the crop is set to fire at 100% probability and output 1024x1024 images, but larger dimensions (like 3008x3008) are still appearing, let's double-check:

  1. Ensure that all preprocessing steps, including resizing and cropping, are applied consistently in your pipeline.
  2. Verify that the function where your dataset is prepared (e.g., load_images_and_labels()) applies this transformation before the images reach the batching phase.

Here's a quick snippet to ensure all images are uniformly handled:

import albumentations as A

def get_transform():
    return A.Compose([
        A.RandomSizedBBoxSafeCrop(1024, 1024, p=1),
        # add any other transformations here
    ], bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))

# Apply this transform in your data loading function
# (image_array, bboxes, and labels are placeholders for your own data)
transform = get_transform()
data = transform(image=image_array, bboxes=bboxes, class_labels=labels)

This ensures every image and its corresponding bboxes are processed to the desired dimensions. Make sure similar preprocessing is maintained across your workflow; a quick check just before batching can confirm it (see below). 😊
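
For instance, you could assert that all images share one shape right before they are stacked (a rough sketch; batch_images is a placeholder for whatever list your loader collates):

# Hypothetical pre-collate check: every image in the batch must have the same shape
shapes = {tuple(img.shape) for img in batch_images}
assert len(shapes) == 1, f"Inconsistent image sizes in batch: {shapes}"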

azizche commented on June 18, 2024

Hi again @glenn-jocher, I found a problem in the code that was causing this mismatch in image sizes.
In fact, in the original ultralytics code, if the image has no labels, the albumentations augmentation gets skipped. For reference, here is the part of the code I'm talking about:

if self.transform and random.random() < self.p:
    new = self.transform(image=im, bboxes=bboxes, class_labels=cls)  # transformed
    if len(new["class_labels"]) > 0:  # skip update if no bbox in new im
        labels["img"] = new["image"]
        labels["cls"] = np.array(new["class_labels"])
        bboxes = np.array(new["bboxes"], dtype=np.float32)

This creates an error because images with no labels don't get cropped while images with labels do. So I changed it to update the image every time (it should be like this by default, don't you think?).
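
Concretely, my change looks roughly like this (paraphrasing my local edit; im, bboxes, and cls come from the surrounding __call__, as in the snippet above):

if self.transform and random.random() < self.p:
    new = self.transform(image=im, bboxes=bboxes, class_labels=cls)  # transformed
    labels["img"] = new["image"]  # always keep the transformed image, labels or not
    if len(new["class_labels"]) > 0:  # only update labels when bboxes survive the crop
        labels["cls"] = np.array(new["class_labels"])
        bboxes = np.array(new["bboxes"], dtype=np.float32)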
Still, for some reason, the images reach the collate_fn function (where the stacking happens, located in dataset.py) and an error occurs because of a mismatch between image sizes. I think that might indicate that the albumentations augmentation does not happen consistently for every image. Can that be the case?
Note: I am now 100% sure that all the images after the augmentation have the same size.

azizche commented on June 18, 2024

Oh, it worked! There was another if statement I did not see:

if len(cls):
    labels["instances"].convert_bbox("xywh")
    labels["instances"].normalize(*im.shape[:2][::-1])
    bboxes = labels["instances"].bboxes
    # TODO: add supports of segments and keypoints
    if self.transform and random.random() < self.p:
        new = self.transform(image=im, bboxes=bboxes, class_labels=cls)  # transformed
        if len(new["class_labels"]) > 0:  # skip update if no bbox in new im
            labels["img"] = new["image"]
            labels["cls"] = np.array(new["class_labels"])
            bboxes = np.array(new["bboxes"], dtype=np.float32)
    labels["instances"].update(bboxes=bboxes)
return labels

The augmentation happens only where there are labels; the changes happen under the if len(cls) statement. Again, I don't think that should be the case: the image transformation needs to happen regardless. Should I submit a PR ensuring that?
Another thing I want to point out: what if someone wants to crop with a probability less than 1? In that case, we need to expect images of different sizes, so I suggest resizing the images after every augmentation, as sketched below.
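
For example, something like this (just a sketch) would guarantee a fixed output size even when the probabilistic crop is skipped for a given image:

import albumentations as A

transform = A.Compose([
    A.RandomSizedBBoxSafeCrop(1024, 1024, p=0.5),  # may or may not fire per image
    A.Resize(1024, 1024),                          # always fires, so the output size is fixed
], bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]))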

azizche commented on June 18, 2024

Another issue I encountered is that the update of the bboxes should happen even when the transformed image has no bounding boxes: the cropped image can end up containing no bounding boxes at all, so the labels should be updated accordingly.

azizche commented on June 18, 2024

Can you suggest a way to update the bounding boxes of the image to be empty in the Albumentations class?

glenn-jocher commented on June 18, 2024

Hello! Great question! To update the bounding boxes to be empty when no objects are present in the transformed image, you can modify the condition to update the labels regardless of whether new bounding boxes are detected. Here's a quick example:

if self.transform and random.random() < self.p:
    new = self.transform(image=im, bboxes=bboxes, class_labels=cls)
    labels["img"] = new["image"]
    labels["cls"] = np.array(new["class_labels"]) if new["class_labels"] else np.array([])
    bboxes = np.array(new["bboxes"], dtype=np.float32) if new["bboxes"] else np.array([])

labels["instances"].update(bboxes=bboxes)

This ensures that the image and labels are always updated, even if no bounding boxes are present after the transformation. Hope this helps! 😊

azizche commented on June 18, 2024

Hello @glenn-jocher, I tried that and unfortunately it did not work. The Instances update function first instantiates a Bboxes class:

def update(self, bboxes, segments=None, keypoints=None):
    """Updates instance variables."""
    self._bboxes = Bboxes(bboxes, format=self._bboxes.format)

The Bboxes class asserts that the bounding boxes have shape (N, 4), which fails for empty arrays:

def __init__(self, bboxes, format="xyxy") -> None:
    """Initializes the Bboxes class with bounding box data in a specified format."""
    assert format in _formats, f"Invalid bounding box format: {format}, format must be one of {_formats}"
    bboxes = bboxes[None, :] if bboxes.ndim == 1 else bboxes
    assert bboxes.ndim == 2
    assert bboxes.shape[1] == 4

which is surprising! I know that YOLO accepts images with no bounding boxes. But how can that be, if the Bboxes class does not accept empty bounding boxes?

glenn-jocher commented on June 18, 2024

@azizche hello! It looks like the issue arises because the Bboxes class expects non-empty bounding boxes due to the assertions. To handle cases with empty bounding boxes, you might consider modifying the Bboxes class to allow for empty inputs. Here's a quick example of how you could adjust the initialization method to bypass the assertion when no bounding boxes are present:

def __init__(self, bboxes, format="xyxy") -> None:
    """Initializes the Bboxes class with bounding box data in a specified format."""
    assert format in _formats, f"Invalid bounding box format: {format}, format must be one of {_formats}"
    if bboxes.size == 0:
        self.bboxes = np.empty((0, 4))  # keep the expected (N, 4) shape even with no boxes
    else:
        bboxes = bboxes[None, :] if bboxes.ndim == 1 else bboxes
        assert bboxes.ndim == 2 and bboxes.shape[1] == 4
        self.bboxes = bboxes
    self.format = format  # still store the format so Instances.update can read it back

This modification checks if bboxes is empty and, if so, initializes an empty array with the correct shape. This should allow the Bboxes class to handle cases without bounding boxes gracefully. 😊
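
With that change, a quick sanity check like the following should pass (hypothetical usage; it assumes the patched Bboxes class is the one on your path):

import numpy as np

empty = Bboxes(np.empty((0, 4)), format="xyxy")  # an image with no detections
print(empty.bboxes.shape)  # -> (0, 4)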
