Comments (13)
👋 Hello @azizche, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.
If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
Install
Pip install the ultralytics package, including all requirements, in a Python>=3.8 environment with PyTorch>=1.8:
pip install ultralytics
Environments
YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
- Notebooks with free GPU:
- Google Cloud Deep Learning VM. See GCP Quickstart Guide
- Amazon Deep Learning AMI. See AWS Quickstart Guide
- Docker Image. See Docker Quickstart Guide
Status
If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
Hey there! It seems like the issue you're encountering is related to mixing images of different sizes in a single batch, which isn't supported in the current setup. The albumentations RandomSizedBBoxSafeCrop is correctly resizing to 1024x1024, but it seems your original dataset contains images of size 4000x4000 which are not being resized consistently.
A possible solution would be to ensure all images and their corresponding bounding boxes are resized to the same dimensions before batching them. You could apply resizing directly in your dataset preprocessing step to make all images 1024x1024 (or any consistent size that suits your needs), then proceed to apply the other albumentations transforms.
Here’s a small example on how you might adjust your dataset preprocessing:
import random

import albumentations as A


class Albumentations:
    def __init__(self, p=1.0):
        self.p = p
        # Resize every image to a fixed size before the other augmentations
        self.resize_transform = A.Resize(1024, 1024)
        self.augmentations = A.Compose([
            A.HorizontalFlip(p=0.5),
            A.RandomSizedBBoxSafeCrop(1024, 1024, p=0.9),
            # ... other transforms
        ], bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]))

    def __call__(self, labels):
        im = labels['img']
        bboxes = labels['instances'].bboxes
        cls = labels['cls']
        # Resize first
        im = self.resize_transform(image=im)['image']
        # Apply other augmentations
        if self.augmentations and random.random() < self.p:
            transformed = self.augmentations(image=im, bboxes=bboxes, class_labels=cls)
            im = transformed['image']
            bboxes = transformed['bboxes']
            cls = transformed['class_labels']
        labels.update({'img': im, 'bboxes': bboxes, 'cls': cls})
        return labels
This script first ensures that every image is resized uniformly before applying further augmentations. This should circumvent the RuntimeError regarding differing tensor sizes during stacking in your data loader. Hope this helps, and good luck with your training! 😊👍
Isn't the image supposed to be resized before it is augmented with albumentations? (From what I understood, the LetterBox class is responsible for resizing the images.)
I am trying to resize images to a large image size (4000x4000), and during augmentation I only want to take image crops of size 1024x1024, so I set the probability of A.RandomSizedBBoxSafeCrop to 1.
Plus, from the error I got, it seems that training expects all the images to be the same size. If that's the case, how does the model deal with cropping augmentation (and any other augmentation that changes the image size)?
Note: I tried the solution you proposed, but I still get this error:
~/Experiments/ultralytics/ultralytics/engine/model.py in train(self, trainer, **kwargs)
655
656 self.trainer.hub_session = self.session # attach optional HUB session
--> 657 self.trainer.train()
658 # Update model and cfg after training
659 if RANK in (-1, 0):
~/Experiments/ultralytics/ultralytics/engine/trainer.py in train(self)
211
212 else:
--> 213 self._do_train(world_size)
214
215 def _setup_scheduler(self):
~/Experiments/ultralytics/ultralytics/engine/trainer.py in _do_train(self, world_size)
361 self.tloss = None
362 self.optimizer.zero_grad()
--> 363 for i, batch in pbar:
364 self.run_callbacks("on_train_batch_start")
365 # Warmup
/opt/conda/lib/python3.9/site-packages/tqdm/std.py in __iter__(self)
1178
1179 try:
-> 1180 for obj in iterable:
1181 yield obj
1182 # Update and possibly print the progressbar.
~/Experiments/ultralytics/ultralytics/data/build.py in __iter__(self)
47 """Creates a sampler that repeats indefinitely."""
48 for _ in range(len(self)):
---> 49 yield next(self.iterator)
50
51 def reset(self):
/opt/conda/lib/python3.9/site-packages/torch/utils/data/dataloader.py in __next__(self)
629 # TODO(https://github.com/pytorch/pytorch/issues/76750)
630 self._reset() # type: ignore[call-arg]
--> 631 data = self._next_data()
632 self._num_yielded += 1
633 if self._dataset_kind == _DatasetKind.Iterable and \
/opt/conda/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _next_data(self)
1344 else:
1345 del self._task_info[idx]
-> 1346 return self._process_data(data)
1347
1348 def _try_put_index(self):
/opt/conda/lib/python3.9/site-packages/torch/utils/data/dataloader.py in _process_data(self, data)
1370 self._try_put_index()
1371 if isinstance(data, ExceptionWrapper):
-> 1372 data.reraise()
1373 return data
1374
/opt/conda/lib/python3.9/site-packages/torch/_utils.py in reraise(self)
703 # instantiate since we don't know how to
704 raise RuntimeError(msg) from None
--> 705 raise exception
706
707
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index) # type: ignore[possibly-undefined]
File "/opt/conda/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/opt/conda/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/jovyan/Experiments/ultralytics/ultralytics/data/base.py", line 253, in __getitem__
return self.transforms(self.get_image_and_label(index))
File "/home/jovyan/Experiments/ultralytics/ultralytics/data/augment.py", line 74, in __call__
data = t(data)
File "/home/jovyan/Experiments/ultralytics/ultralytics/data/augment.py", line 841, in __call__
transformed = self.augmentations(image=im, bboxes=bboxes, class_labels=cls)
File "/opt/conda/lib/python3.9/site-packages/albumentations/core/composition.py", line 228, in __call__
p.preprocess(data)
File "/opt/conda/lib/python3.9/site-packages/albumentations/core/utils.py", line 90, in preprocess
data[data_name] = self.check_and_convert(data[data_name], rows, cols, direction="to")
File "/opt/conda/lib/python3.9/site-packages/albumentations/core/utils.py", line 104, in check_and_convert
return self.convert_to_albumentations(data, rows, cols)
File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 151, in convert_to_albumentations
return convert_bboxes_to_albumentations(data, self.params.format, rows, cols, check_validity=True)
File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 433, in convert_bboxes_to_albumentations
return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 433, in <listcomp>
return [convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity) for bbox in bboxes]
File "/opt/conda/lib/python3.9/site-packages/albumentations/core/bbox_utils.py", line 352, in convert_bbox_to_albumentations
raise ValueError(msg)
ValueError: In YOLO format all coordinates must be float and in range (0, 1]
Thank you for your prompt answer @glenn-jocher. Much appreciated 😀
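The ValueError above says that every coordinate passed with format="yolo" must lie in (0, 1]. One possible cause (an assumption, the thread does not confirm it) is that the boxes were still in pixel coordinates when they reached the transform. The sketch below uses two hypothetical helpers, xyxy_pixels_to_yolo and is_valid_yolo, to show the conversion to normalized (x_center, y_center, width, height) and a check that mirrors the range named in the error message. It is not the validation code albumentations itself runs.

```python
import numpy as np


def xyxy_pixels_to_yolo(boxes_xyxy, img_w, img_h):
    """Convert pixel (x1, y1, x2, y2) boxes to normalized (xc, yc, w, h)."""
    boxes = np.asarray(boxes_xyxy, dtype=np.float32).reshape(-1, 4)
    x1, y1, x2, y2 = boxes.T
    return np.stack(
        [
            (x1 + x2) / 2 / img_w,  # box center x, as a fraction of image width
            (y1 + y2) / 2 / img_h,  # box center y, as a fraction of image height
            (x2 - x1) / img_w,      # box width, as a fraction of image width
            (y2 - y1) / img_h,      # box height, as a fraction of image height
        ],
        axis=1,
    )


def is_valid_yolo(boxes):
    """True when every coordinate lies in (0, 1], the range named in the error."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    return bool(np.all((boxes > 0) & (boxes <= 1)))


pixel_boxes = [[100, 200, 300, 400]]
yolo_boxes = xyxy_pixels_to_yolo(pixel_boxes, img_w=1000, img_h=1000)
print(is_valid_yolo(pixel_boxes), is_valid_yolo(yolo_boxes))
```

Running a check like this on the boxes just before the Compose call can show whether they are already normalized.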
@azizche hi there! 😊 It appears that after your transformations, some bboxes may not be formatted correctly or have fallen out of the expected range (0, 1]. This can happen particularly after aggressive crops or other transformations change the image size dramatically. Here’s an approach that might help:
Make sure to normalize the bbox coordinates relative to the new image size after the crop and ensure they're correctly bounded within (0, 1]. Albumentations should handle this, but double-check your transformations to see if the order or configuration might be affecting the output.
Here's a code snippet that can enforce these constraints:
import albumentations as A
from albumentations import BboxParams

# Ensure the bbox params are correctly specified
bbox_params = BboxParams(format='yolo', label_fields=['class_labels'], min_visibility=0.1)

# Include this bbox_params in your Compose (inside your transform class's __init__)
self.augmentations = A.Compose([
    A.RandomSizedBBoxSafeCrop(1024, 1024, p=1.0),
    # further transformations
], bbox_params=bbox_params)
- The min_visibility argument can assist in filtering out bboxes that become too small or invisible after augmentation, potentially removing problematic cases.
It might also be useful to log the transformation outputs in your training loop to see how they affect bboxes throughout the augmentation process.
If errors persist, consider simplifying or adjusting the sequence of your transformations to identify which specific augmentation might be the cause. Stay tuned! 🚀
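To make the min_visibility idea concrete, here is a minimal sketch of the ratio it is based on: the area of a box that survives a crop divided by its original area. The helpers visibility_after_crop and filter_by_visibility are hypothetical names written for illustration. Albumentations computes this internally, and its exact implementation may differ.

```python
def visibility_after_crop(box_xyxy, crop_xyxy):
    """Fraction of a box's area that remains inside a crop window."""
    bx1, by1, bx2, by2 = box_xyxy
    cx1, cy1, cx2, cy2 = crop_xyxy
    # Width and height of the overlap between the box and the crop window
    inter_w = max(0.0, min(bx2, cx2) - max(bx1, cx1))
    inter_h = max(0.0, min(by2, cy2) - max(by1, cy1))
    area = (bx2 - bx1) * (by2 - by1)
    return (inter_w * inter_h) / area if area > 0 else 0.0


def filter_by_visibility(boxes, crop, min_visibility=0.1):
    """Keep only the boxes whose visible fraction reaches min_visibility."""
    return [b for b in boxes if visibility_after_crop(b, crop) >= min_visibility]


boxes = [(0, 0, 100, 100), (150, 150, 190, 190)]
crop = (95, 95, 200, 200)
print(filter_by_visibility(boxes, crop))
```

In this example the first box keeps only a 5x5 corner inside the crop and is dropped, while the second lies fully inside and is kept.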
Nothing worked... The problem I am getting in the first place does not make sense. If I am cropping the images so that their size is 1024 * 1024 (with a probability of 1), why would it tell me that it's trying to stack images of 3008 * 3008 (note that this is the image size I specified in the training function) and 1024 * 1024?
To answer your question, the cropping augmentation is the one that is causing the problem.
Hey @azizche! It sounds like you're dealing with a tricky problem there. Since the cropping is set at 100% probability to process images to 1024x1024, but you're seeing some inconsistency with larger dimensions appearing (like 3008x3008), let's double-check:
- Ensure that all preprocessing steps, including resizing and cropping, are applied consistently in your pipeline.
- Verify if the load_images_and_labels() function or wherever your dataset is being prepared respects this transformation before the images reach the batching phase.
Here's a quick snippet to ensure all images are uniformly handled:
import albumentations as A


def get_transform():
    return A.Compose([
        A.RandomSizedBBoxSafeCrop(1024, 1024, p=1),
        # add any other transformations here
    ], bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))


# Apply this transform in your data loading function
# (image_array, bboxes and labels come from your own loader)
transform = get_transform()
data = transform(image=image_array, bboxes=bboxes, class_labels=labels)
This ensures every image and corresponding bbox is processed to the desired dimensions. Let’s ensure similar preprocessing is maintained across your workflow. 😊
Hi again @glenn-jocher, I found a problem in the code that was causing this mismatch in image sizes.
In the original ultralytics code, if the image has no labels, the albumentations augmentation gets skipped. For reference, here is the part of the code that I'm talking about:
if self.transform and random.random() < self.p:
    new = self.transform(image=im, bboxes=bboxes, class_labels=cls)  # transformed
    if len(new["class_labels"]) > 0:  # skip update if no bbox in new im
        labels["img"] = new["image"]
        labels["cls"] = np.array(new["class_labels"])
        bboxes = np.array(new["bboxes"], dtype=np.float32)
This creates an error because images with no labels don't get cropped while images with labels do. So I changed it to update the image every time (it should be like this by default, don't you think?).
Still, for some reason, the images reach the collate_fn function (where the stacking happens, located in dataset.py) and an error occurs because of a mismatch between image sizes. I think that might indicate that the albumentations augmentation is not applied consistently to every image. Can that be the case?
Note: I am now 100% sure that all the images have similar sizes after the augmentation.
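A quick way to test the "not applied consistently" hypothesis is to count the image shapes in a batch before they are stacked. The sketch below is a stand-in, not the Ultralytics collate_fn: it uses np.stack where a PyTorch pipeline would stack tensors, and summarize_shapes and safe_stack are hypothetical helper names.

```python
from collections import Counter

import numpy as np


def summarize_shapes(images):
    """Count how many images in a batch have each (H, W, C) shape."""
    return Counter(tuple(im.shape) for im in images)


def safe_stack(images):
    """Stack images into one array, with a readable error on size mismatch."""
    shapes = summarize_shapes(images)
    if len(shapes) > 1:
        raise ValueError(f"Cannot stack images with different shapes: {dict(shapes)}")
    return np.stack(images, axis=0)


# Two "cropped" images and one that skipped the crop (small sizes for illustration)
batch = [np.zeros((4, 4, 3)), np.zeros((4, 4, 3)), np.zeros((8, 8, 3))]
print(summarize_shapes(batch))
```

If more than one shape shows up, at least one code path returns the image without applying the crop, which matches the skipped-update branch quoted above.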
Oh it worked! There was another if statement I did not see
if len(cls):
    labels["instances"].convert_bbox("xywh")
    labels["instances"].normalize(*im.shape[:2][::-1])
    bboxes = labels["instances"].bboxes
    # TODO: add supports of segments and keypoints
    if self.transform and random.random() < self.p:
        new = self.transform(image=im, bboxes=bboxes, class_labels=cls)  # transformed
        if len(new["class_labels"]) > 0:  # skip update if no bbox in new im
            labels["img"] = new["image"]
            labels["cls"] = np.array(new["class_labels"])
            bboxes = np.array(new["bboxes"], dtype=np.float32)
    labels["instances"].update(bboxes=bboxes)
return labels
The augmentation happens only when there are labels: the changes happen under the if len(cls) statement. Again, I don't think that should be the case. The image transformation needs to happen regardless. Should I submit a PR ensuring that?
Another thing I want to point out: what if someone wants to apply a crop with a probability less than 1? In that case, we have to expect images of different sizes. So I suggest resizing the images after every augmentation.
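The "resize after every augmentation" suggestion can be sketched as follows. This is an illustration under stated assumptions, not Ultralytics or albumentations code: resize_nearest is a hypothetical NumPy-only nearest-neighbor resize, and maybe_crop_then_resize handles the image only. A real pipeline would also have to shift and rescale the bounding boxes for the crop.

```python
import random

import numpy as np


def resize_nearest(im, out_h, out_w):
    """Nearest-neighbor resize of an (H, W, C) array using index lookup."""
    in_h, in_w = im.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return im[rows][:, cols]


def maybe_crop_then_resize(im, crop_size, out_size, p=0.5, rng=random):
    """Crop with probability p, then always resize to a fixed output size."""
    if rng.random() < p:
        h, w = im.shape[:2]
        top = rng.randint(0, h - crop_size)
        left = rng.randint(0, w - crop_size)
        im = im[top:top + crop_size, left:left + crop_size]
    # The final resize runs on every image, cropped or not
    return resize_nearest(im, out_size, out_size)


im = np.zeros((64, 48, 3), dtype=np.uint8)
shapes = {maybe_crop_then_resize(im, 16, 32).shape for _ in range(20)}
print(shapes)
```

Because the final resize is unconditional, every output has the same shape whether or not the crop fired, so the batch can be stacked.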
Another issue I encountered is that the bboxes should be updated even when the transformed image has no bounding boxes: the new cropped image can contain no boxes at all, so the labels should be updated accordingly.
Can you suggest a way to update the bounding boxes of the image to be empty in the albumentation class?
Hello! Great question! To update the bounding boxes to be empty when no objects are present in the transformed image, you can modify the condition to update the labels regardless of whether new bounding boxes are detected. Here's a quick example:
if self.transform and random.random() < self.p:
    new = self.transform(image=im, bboxes=bboxes, class_labels=cls)
    labels["img"] = new["image"]
    labels["cls"] = np.array(new["class_labels"]) if new["class_labels"] else np.array([])
    bboxes = np.array(new["bboxes"], dtype=np.float32) if new["bboxes"] else np.array([])
    labels["instances"].update(bboxes=bboxes)
This ensures that the image and labels are always updated, even if no bounding boxes are present after the transformation. Hope this helps! 😊
Hello @glenn-jocher, I tried that and unfortunately it did not work. The Instances update function first instantiates a Bboxes class:
def update(self, bboxes, segments=None, keypoints=None):
    """Updates instance variables."""
    self._bboxes = Bboxes(bboxes, format=self._bboxes.format)
The Bboxes class asserts that the bounding boxes are not empty:
def __init__(self, bboxes, format="xyxy") -> None:
    """Initializes the Bboxes class with bounding box data in a specified format."""
    assert format in _formats, f"Invalid bounding box format: {format}, format must be one of {_formats}"
    bboxes = bboxes[None, :] if bboxes.ndim == 1 else bboxes
    assert bboxes.ndim == 2
    assert bboxes.shape[1] == 4
which is surprising! I know that YOLO accepts images with empty bounding boxes. But how so, if the Bboxes class does not accept empty bounding boxes?
@azizche hello! It looks like the issue arises because the Bboxes class expects non-empty bounding boxes due to the assertions. To handle cases with empty bounding boxes, you might consider modifying the Bboxes class to allow for empty inputs. Here's a quick example of how you could adjust the initialization method to bypass the assertion when no bounding boxes are present:
def __init__(self, bboxes, format="xyxy") -> None:
    """Initializes the Bboxes class with bounding box data in a specified format."""
    assert format in _formats, f"Invalid bounding box format: {format}, format must be one of {_formats}"
    if bboxes.size == 0:
        # No boxes: store an empty array that still has four columns
        self.bboxes = np.empty((0, 4))
    else:
        bboxes = bboxes[None, :] if bboxes.ndim == 1 else bboxes
        assert bboxes.ndim == 2 and bboxes.shape[1] == 4
        self.bboxes = bboxes
    # Keep the format on the instance so later conversions can read it
    self.format = format
This modification checks if bboxes is empty and, if so, initializes an empty array with the correct shape. This should allow the Bboxes class to handle cases without bounding boxes gracefully. 😊
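An alternative that leaves the class untouched, sketched here as an assumption rather than a confirmed fix: the assertions quoted earlier reject np.array([]) because it is 1-D with zero columns, but they accept an empty array of shape (0, 4). The function check_bboxes below copies only the shape checks from the quoted __init__, and to_bbox_array is a hypothetical helper.

```python
import numpy as np


def check_bboxes(bboxes):
    """Apply the same shape checks as the quoted Bboxes.__init__."""
    bboxes = bboxes[None, :] if bboxes.ndim == 1 else bboxes
    assert bboxes.ndim == 2
    assert bboxes.shape[1] == 4
    return bboxes


def to_bbox_array(bboxes):
    """Convert a possibly empty list of boxes to an (N, 4) float32 array."""
    return np.asarray(bboxes, dtype=np.float32).reshape(-1, 4)


empty = to_bbox_array([])
print(empty.shape)         # an empty array that still has four columns
print(check_bboxes(empty).shape)

try:
    check_bboxes(np.array([]))  # 1-D empty array becomes shape (1, 0)
except AssertionError:
    print("np.array([]) fails the shape[1] == 4 check")
```

Building the boxes with reshape(-1, 4) in the augmentation step would therefore let labels["instances"].update(bboxes=bboxes) receive an empty array in a shape those checks accept.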