Comments (10)
Hello @LaoXianYud, thank you for your interest in Ultralytics YOLOv8! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.
If this is a Bug Report, please provide a minimum reproducible example to help us debug it.
If this is a custom training Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.
Join the vibrant Ultralytics Discord community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
Install
Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.
pip install ultralytics
Environments
YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
- Notebooks with free GPU:
- Google Cloud Deep Learning VM. See GCP Quickstart Guide
- Amazon Deep Learning AMI. See AWS Quickstart Guide
- Docker Image. See Docker Quickstart Guide
Status
If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.
from ultralytics.
@LaoXianYud hello,
Thank you for reaching out and providing detailed information about your issue. To better assist you, could you please provide a minimum reproducible code example? This will help us understand the exact steps you're taking and reproduce the issue on our end. You can find guidelines on creating a minimum reproducible example here. This is crucial for us to investigate and resolve the problem effectively.
Additionally, please ensure that you are using the latest versions of torch and ultralytics. You can update your packages using the following commands:
pip install --upgrade torch
pip install --upgrade ultralytics
Regarding your question about the probability distribution differences between YOLOv8 and other versions like YOLOv5 and YOLOv9, it's possible that variations in image preprocessing or model architecture could lead to differences in probability outputs. YOLOv8 might employ different normalization techniques or other preprocessing steps that could affect the confidence scores.
If you can share the specific code snippets you used for training and prediction, it would be very helpful. This will allow us to provide more targeted advice and potentially identify any discrepancies in the preprocessing steps.
Looking forward to your response!
from ultralytics.
To better assist you, could you please provide a minimum reproducible code example?
from ultralytics import YOLO

# Load a model
model = YOLO(r"F:\VSEE\yolov8\runs\classify\train_0619\weights\best.pt")  # custom-trained YOLOv8 classification weights

# Run batched inference on a list of images
results = model([r"F:\projects\1\tr\1.bmp",
                 r"F:\projects\1\tr\2.bmp"])  # return a list of Results objects

# Process results list
for result in results:
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs
The code above is the YOLOv8 prediction code I run.
The same parameters are used when predicting with the YOLOv5 code, as follows:
import argparse
import os
import platform
import sys
from pathlib import Path

import numpy as np
from PIL import Image
import torch
import torch.nn.functional as F

FILE = Path(__file__).resolve()
ROOT = FILE.parents[1]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.common import DetectMultiBackend
from utils.augmentations import classify_transforms
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements, colorstr, cv2,
                           increment_path, print_args, strip_optimizer)
from utils.plots import Annotator
from utils.torch_utils import select_device, smart_inference_mode


@smart_inference_mode()
def run(
        weights=ROOT / 'yolov5s-cls.onnx',  # model.pt path(s)
        source=ROOT / 'data/images',  # file/dir/URL/glob/screen/0(webcam)
        data=ROOT / 'data/coco128.yaml',  # dataset.yaml path
        imgsz=(160, 160),  # inference size (height, width)
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        view_img=False,  # show results
        save_txt=False,  # save results to *.txt
        nosave=False,  # do not save images/videos
        augment=False,  # augmented inference
        visualize=False,  # visualize features
        update=False,  # update all models
        project=ROOT / 'runs/predict-cls',  # save results to project/name
        name='exp',  # save results to project/name
        exist_ok=False,  # existing project/name ok, do not increment
        half=False,  # use FP16 half-precision inference
        dnn=False,  # use OpenCV DNN for ONNX inference
        vid_stride=1,  # video frame-rate stride
):
    source = str(source)
    save_img = not nosave and not source.endswith('.txt')  # save inference images
    is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
    is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
    webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
    screenshot = source.lower().startswith('screen')
    if is_url and is_file:
        source = check_file(source)  # download

    # Directories
    save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)  # increment run
    (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

    # Load model
    device = select_device(device)
    model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
    stride, names, pt = model.stride, model.names, model.pt
    imgsz = check_img_size(imgsz, s=stride)  # check image size

    # Dataloader
    bs = 1  # batch_size
    if webcam:
        view_img = check_imshow(warn=True)
        dataset = LoadStreams(source, img_size=imgsz, transforms=classify_transforms(imgsz[0]), vid_stride=vid_stride)
        bs = len(dataset)
    elif screenshot:
        dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
    else:
        dataset = LoadImages(source, img_size=imgsz, transforms=classify_transforms(imgsz[0]), vid_stride=vid_stride)
    vid_path, vid_writer = [None] * bs, [None] * bs

    # Run inference
    model.warmup(imgsz=(1 if pt else bs, 3, *imgsz))  # warmup
    seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
    for path, im, im0s, vid_cap, s in dataset:
        with dt[0]:
            im = torch.Tensor(im).to(model.device)
            im = im.half() if model.fp16 else im.float()  # uint8 to fp16/32
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim

        # Inference
        with dt[1]:
            results = model(im)

        # Post-process
        with dt[2]:
            pred = F.softmax(results, dim=1)  # probabilities

        # Process predictions
        for i, prob in enumerate(pred):  # per image
            seen += 1
            if webcam:  # batch_size >= 1
                p, im0, frame = path[i], im0s[i].copy(), dataset.count
                s += f'{i}: '
            else:
                p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)

            p = Path(p)  # to Path
            save_path = str(save_dir / p.name)  # im.jpg
            txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # im.txt

            s += '%gx%g ' % im.shape[2:]  # print string
            annotator = Annotator(im0, example=str(names), pil=True)

            # Print results
            top5i = prob.argsort(0, descending=True)[:5].tolist()  # top 5 indices
            s += f"{', '.join(f'{names[j]} {prob[j]:.2f}' for j in top5i)}, "

            # Write results
            text = '\n'.join(f'{prob[j]:.2f} {names[j]}' for j in top5i)
            if save_img or view_img:  # Add bbox to image
                annotator.text((32, 32), text, txt_color=(255, 255, 255))
            if save_txt:  # Write to file
                with open(f'{txt_path}.txt', 'a') as f:
                    f.write(text + '\n')

            # Stream results
            im0 = annotator.result()
            if view_img:
                if platform.system() == 'Linux' and p not in windows:
                    windows.append(p)
                    cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # allow window resize (Linux)
                    cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
                cv2.imshow(str(p), im0)
                cv2.waitKey(1)  # 1 millisecond

            # Save results (image with detections)
            if save_img:
                if dataset.mode == 'image':
                    cv2.imwrite(save_path, im0)
                else:  # 'video' or 'stream'
                    if vid_path[i] != save_path:  # new video
                        vid_path[i] = save_path
                        if isinstance(vid_writer[i], cv2.VideoWriter):
                            vid_writer[i].release()  # release previous video writer
                        if vid_cap:  # video
                            fps = vid_cap.get(cv2.CAP_PROP_FPS)
                            w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                            h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                        else:  # stream
                            fps, w, h = 30, im0.shape[1], im0.shape[0]
                        save_path = str(Path(save_path).with_suffix('.mp4'))  # force *.mp4 suffix on results videos
                        vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
                    vid_writer[i].write(im0)

        # Print time (inference-only)
        LOGGER.info(f"{s}{dt[1].dt * 1E3:.1f}ms")

    # Print results
    t = tuple(x.t / seen * 1E3 for x in dt)  # speeds per image
    LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
    if save_txt or save_img:
        s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
        LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
    if update:
        strip_optimizer(weights[0])  # update model (to fix SourceChangeWarning)


def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=r'F:\VSEE\yolov8\runs\classify\train_0619\weights\best.pt', help='model path(s)')
    parser.add_argument('--source', type=str, default=r'F:\projects\1\tr', help='file/dir/URL/glob/screen/0(webcam)')
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='(optional) dataset.yaml path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[192], help='inference size h,w')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default=ROOT / 'runs/predict-cls', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
    opt = parser.parse_args()
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # expand
    print_args(vars(opt))
    return opt


def main(opt):
    check_requirements(exclude=('tensorboard', 'thop'))
    run(**vars(opt))


if __name__ == "__main__":
    opt = parse_opt()
    main(opt)
from ultralytics.
Hello @LaoXianYud,
Thank you for providing the detailed code examples. This is very helpful for understanding the issue you're encountering.
From your description, it seems that there are discrepancies in the probability distributions when using YOLOv8 compared to YOLOv5 and YOLOv9, despite the classification results being correct. This could indeed be due to differences in image preprocessing or model architecture between the versions.
Here are a few steps to help diagnose and potentially resolve the issue:
1. Ensure Consistent Preprocessing: Verify that the image preprocessing steps are consistent across all versions. Differences in normalization, resizing, or augmentation can affect the model's output probabilities. For YOLOv8, you can refer to the preprocessing steps in the classify_transforms function.
2. Check Model Configuration: Ensure that the model configurations (e.g., input size, batch size, etc.) are consistent across YOLOv5, YOLOv8, and YOLOv9. Even slight differences can lead to variations in output probabilities.
3. Update to Latest Versions: Make sure you are using the latest versions of the ultralytics package and torch. This ensures that you have the latest bug fixes and improvements.
   pip install --upgrade torch
   pip install --upgrade ultralytics
4. Compare Softmax Outputs: Since you mentioned using F.softmax for post-processing in YOLOv5, ensure that the same softmax operation is applied in YOLOv8. This can be done by explicitly applying softmax to the model outputs if not already done.
5. Debugging with Sample Images: Run predictions on a few sample images using both YOLOv8 and YOLOv5/YOLOv9, and compare the raw outputs before applying softmax. This can help identify if the issue lies in the raw model outputs or the post-processing steps.
Here is an example of how you might explicitly apply softmax in YOLOv8:
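from ultralytics import YOLO
import torch.nn.functional as F

# Load the model
model = YOLO(r"F:\VSEE\yolov8\runs\classify\train_0619\weights\best.pt")

# Run inference
results = model([r"F:\projects\1\tr\1.bmp", r"F:\projects\1\tr\2.bmp"])

# Apply softmax to the class scores of each result.
# result.probs is a Probs object; its .data attribute holds the 1-D score tensor.
# Note: as confirmed later in this thread, the YOLOv8 Classify head already applies
# softmax at inference time, so this second softmax is useful only for comparison.
for result in results:
    probs = F.softmax(result.probs.data, dim=0)
    print(probs)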
If the issue persists, please provide additional details such as the specific preprocessing steps used in YOLOv8 and YOLOv5/YOLOv9, and any differences in the model configurations. This will help us further investigate the root cause.
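For the "Debugging with Sample Images" step, a small dependency-free helper can make the comparison concrete. This is only a sketch: compare_outputs is a hypothetical helper, not part of ultralytics or YOLOv5, and the two score lists are made-up numbers standing in for the per-class outputs of two pipelines on the same image.

```python
def compare_outputs(a, b, tol=1e-3):
    """Summarise how two per-class score vectors differ."""
    if len(a) != len(b):
        raise ValueError("score vectors must have the same length")
    max_abs_diff = max(abs(x - y) for x, y in zip(a, b))
    return {
        "same_top1": a.index(max(a)) == b.index(max(b)),  # same predicted class?
        "max_abs_diff": max_abs_diff,                     # largest per-class gap
        "sum_a": sum(a),                                  # about 1.0 if already normalised
        "sum_b": sum(b),
        "close": max_abs_diff <= tol,
    }


# Made-up scores for one image from two hypothetical pipelines
scores_a = [0.926, 0.046, 0.028]
scores_b = [0.549, 0.228, 0.223]
report = compare_outputs(scores_a, scores_b)
print(report)
```

If same_top1 is true while max_abs_diff is large, the two pipelines agree on the class but not on the confidence, which points at post-processing rather than at the model weights.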
Thank you for your patience and cooperation. Let's work together to resolve this issue!
from ultralytics.
4. Compare Softmax Outputs:
From my troubleshooting, I have confirmed that the image preprocessing in v5 and v8 does not affect the results; the image inputs on both sides are essentially the same. I suspect the cause is mainly post-processing, since v5 uses softmax to obtain the probabilities. So I would like to know how v8 obtains the probabilities for the classification model, and where is the specific code in
from ultralytics.
4. Compare Softmax Outputs:
I tested it, and the results do differ before and after applying softmax: without the extra softmax I get the v8 result, and with it I get the v5 result. In normal use, however, I convert the .pt model to ONNX.
So how do I remove softmax in ONNX and get the v8 results?
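The effect described here can be reproduced without any model. The sketch below uses made-up logits and a plain-Python softmax (an illustration, not the ultralytics or YOLOv5 implementation) to show what happens when scores that are already probabilities go through softmax a second time.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 1.0, 0.5]   # made-up raw scores for three classes
once = softmax(logits)     # probabilities after a single softmax
twice = softmax(once)      # the same probabilities pushed through softmax again

print([round(p, 3) for p in once])
print([round(p, 3) for p in twice])
```

The predicted class stays the same because softmax preserves ordering, but the second pass flattens the distribution, so the reported confidence drops even though the classification is still correct.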
from ultralytics.
4. Compare Softmax Outputs
OK, I have now removed the softmax step from my post-processing, and the YOLOv8 inference results are reproduced exactly.
But I have a question: I noticed that in v8 softmax is already applied in both the .pt and ONNX models, so applying softmax again at that point definitely gives the wrong result. Is there a way to avoid this problem other than modifying my inference code?
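One possible guard on the post-processing side is to check whether the output already looks like a probability vector before normalising it. The following is a minimal sketch under that assumption; looks_like_probabilities and to_probabilities are hypothetical helper names, and the check is a heuristic rather than a guarantee, since raw logits could in principle also lie in [0, 1] and sum to 1.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def looks_like_probabilities(values, tol=1e-4):
    """Heuristic: every entry lies in [0, 1] and the entries sum to about 1."""
    in_range = all(-tol <= v <= 1.0 + tol for v in values)
    return in_range and abs(sum(values) - 1.0) <= tol


def to_probabilities(values):
    """Apply softmax only when the input does not already look normalised."""
    if looks_like_probabilities(values):
        return list(values)
    return softmax(values)


raw_logits = [4.0, 1.0, 0.5]                   # made-up raw scores
already_probs = to_probabilities(raw_logits)   # softmax is applied here
unchanged = to_probabilities(already_probs)    # softmax is skipped here
print(already_probs == unchanged)
```

Knowing for certain what the exported model returns is more reliable than guessing from the values, so a check like this is best treated as a safety net.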
from ultralytics.
Thank you for your patience and cooperation. Let's work together to resolve this issue!
The problem has been solved: in the Classify head, the return statement applies softmax to the results when the model is not in training mode, so the results can be returned directly.
from ultralytics.
Hello @LaoXianYud,
I'm glad to hear that you've identified the issue and resolved it!
Indeed, the softmax operation is typically applied in the classification head to convert logits to probabilities. If you want to avoid applying softmax again during inference, especially when converting to ONNX, you can modify the model to return logits directly.
For those who might encounter similar issues, here's a concise way to handle this:
- Modify the Model: Adjust the model to return logits instead of applying softmax in the forward pass. This ensures that the exported ONNX model will also return logits.
- Post-Processing: Apply softmax only when necessary during post-processing, ensuring that you don't apply it twice.
Here's a quick example of how you might modify the model:
import torch
import torch.nn as nn


class CustomModel(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Define your layers here (placeholder classifier for illustration only)
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(3, num_classes),
        )

    def forward(self, x):
        # Your forward pass
        logits = self.classifier(x)
        return logits  # Return logits directly


# When exporting to ONNX
model = CustomModel().eval()
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model.onnx", export_params=True)
By doing this, you ensure that the ONNX model outputs logits, and you can apply softmax as needed during inference.
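On the post-processing side, applying softmax exactly once to the exported model's logits and then picking the top entries might look like the sketch below. It is plain Python with made-up logits and class names; top_k is a hypothetical helper, and the print format simply mirrors the "probability name" lines written by the YOLOv5 script earlier in this thread.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, names, k=5):
    """Return the k most probable (name, probability) pairs from raw logits."""
    probs = softmax(logits)  # the only place softmax is applied
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(names[i], probs[i]) for i in order[:k]]


# Made-up values standing in for one row of a logits-only model's output
logits = [0.2, 3.1, -1.0, 1.4]
names = ["cat", "dog", "bird", "fish"]
best = top_k(logits, names, k=2)
for name, p in best:
    print(f"{p:.2f} {name}")
```

Keeping softmax inside a single helper like this makes it harder to apply it twice by accident.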
If you have any more questions or need further assistance, feel free to ask. We're here to help!
from ultralytics.
Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
- Docs: https://docs.ultralytics.com
- HUB: https://hub.ultralytics.com
- Community: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO and Vision AI
from ultralytics.