
Comments (9)

github-actions commented on July 25, 2024

👋 Hello @LaoXianYud, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a πŸ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of our up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled).

Status

If the Ultralytics CI badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.


glenn-jocher commented on July 25, 2024

@LaoXianYud hello,

Thank you for reaching out and providing detailed information about your issue. To better assist you, could you please provide a minimum reproducible code example? This will help us understand the exact steps you're taking and reproduce the issue on our end. You can find guidelines on creating a minimum reproducible example here. This is crucial for us to investigate and resolve the problem effectively.

Additionally, please ensure that you are using the latest versions of torch and ultralytics. You can update your packages using the following commands:

pip install --upgrade torch
pip install --upgrade ultralytics

Regarding your question about the probability distribution differences between YOLOv8 and other versions like YOLOv5 and YOLOv9, it's possible that variations in image preprocessing or model architecture could lead to differences in probability outputs. YOLOv8 might employ different normalization techniques or other preprocessing steps that could affect the confidence scores.
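
One quick way to check what YOLOv8 actually applies is to print its evaluation transforms. A minimal sketch, assuming the classify_transforms helper lives in ultralytics.data.augment (verify the module path in your installed version):

from ultralytics.data.augment import classify_transforms

# Inspect the eval-time classification transforms YOLOv8 applies
transforms = classify_transforms(size=224)  # size value is illustrative
print(transforms)  # e.g. resize/center-crop, to-tensor, normalize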

If you can share the specific code snippets you used for training and prediction, it would be very helpful. This will allow us to provide more targeted advice and potentially identify any discrepancies in the preprocessing steps.

Looking forward to your response!


LaoXianYud commented on July 25, 2024

To better assist you, could you please provide a minimum reproducible code example?

from ultralytics import YOLO

# Load a model
model = YOLO(r"F:\VSEE\yolov8\runs\classify\train_0619\weights\best.pt")  # custom YOLOv8 classification model

# Run batched inference on a list of images
results = model([r"F:\projects\1\tr\1.bmp",
                 r"F:\projects\1\tr\2.bmp"])  # return a list of Results objects

# Process results list
for result in results:
    boxes = result.boxes  # Boxes object for bounding box outputs (None for classify models)
    masks = result.masks  # Masks object for segmentation mask outputs (None for classify models)
    keypoints = result.keypoints  # Keypoints object for pose outputs (None for classify models)
    probs = result.probs  # Probs object for classification outputs
    obb = result.obb  # Oriented boxes object for OBB outputs (None for classify models)

As mentioned above, this is the YOLOv8 prediction code I run.
The same parameters are used when predicting with the YOLOv5 classification script, as follows:

import argparse
import os
import platform
import sys
from pathlib import Path
import numpy as np
from PIL import Image

import torch
import torch.nn.functional as F

FILE = Path(__file__).resolve()
ROOT = FILE.parents[1]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.common import DetectMultiBackend
from utils.augmentations import classify_transforms
from utils.dataloaders import IMG_FORMATS, VID_FORMATS, LoadImages, LoadScreenshots, LoadStreams
from utils.general import (LOGGER, Profile, check_file, check_img_size, check_imshow, check_requirements, colorstr, cv2,
                           increment_path, print_args, strip_optimizer)
from utils.plots import Annotator
from utils.torch_utils import select_device, smart_inference_mode


@smart_inference_mode()
def run(
        weights=ROOT / 'yolov5s-cls.onnx',  # model.pt path(s)
        source=ROOT / 'data/images',  # file/dir/URL/glob/screen/0(webcam)
        data=ROOT / 'data/coco128.yaml',  # dataset.yaml path
        imgsz=(160, 160),  # inference size (height, width)
        device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        view_img=False,  # show results
        save_txt=False,  # save results to *.txt
        nosave=False,  # do not save images/videos
        augment=False,  # augmented inference
        visualize=False,  # visualize features
        update=False,  # update all models
        project=ROOT / 'runs/predict-cls',  # save results to project/name
        name='exp',  # save results to project/name
        exist_ok=False,  # existing project/name ok, do not increment
        half=False,  # use FP16 half-precision inference
        dnn=False,  # use OpenCV DNN for ONNX inference
        vid_stride=1,  # video frame-rate stride
):
    source = str(source)
    save_img = not nosave and not source.endswith('.txt')  # save inference images
    is_file = Path(source).suffix[1:] in (IMG_FORMATS + VID_FORMATS)
    is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://'))
    webcam = source.isnumeric() or source.endswith('.txt') or (is_url and not is_file)
    screenshot = source.lower().startswith('screen')
    if is_url and is_file:
        source = check_file(source)  # download

    # Directories
    save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)  # increment run
    (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

    # Load model
    device = select_device(device)
    model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
    stride, names, pt = model.stride, model.names, model.pt
    imgsz = check_img_size(imgsz, s=stride)  # check image size

    # Dataloader
    bs = 1  # batch_size
    if webcam:
        view_img = check_imshow(warn=True)
        dataset = LoadStreams(source, img_size=imgsz, transforms=classify_transforms(imgsz[0]), vid_stride=vid_stride)
        bs = len(dataset)
    elif screenshot:
        dataset = LoadScreenshots(source, img_size=imgsz, stride=stride, auto=pt)
    else:
        dataset = LoadImages(source, img_size=imgsz, transforms=classify_transforms(imgsz[0]), vid_stride=vid_stride)
    vid_path, vid_writer = [None] * bs, [None] * bs

    # Run inference
    model.warmup(imgsz=(1 if pt else bs, 3, *imgsz))  # warmup
    seen, windows, dt = 0, [], (Profile(), Profile(), Profile())
    for path, im, im0s, vid_cap, s in dataset:
        with dt[0]:
            im = torch.Tensor(im).to(model.device)
            im = im.half() if model.fp16 else im.float()  # uint8 to fp16/32
            if len(im.shape) == 3:
                im = im[None]  # expand for batch dim

        # Inference
        with dt[1]:
            results = model(im)

        # Post-process
        with dt[2]:
            pred = F.softmax(results, dim=1)  # probabilities

        # Process predictions
        for i, prob in enumerate(pred):  # per image
            seen += 1
            if webcam:  # batch_size >= 1
                p, im0, frame = path[i], im0s[i].copy(), dataset.count
                s += f'{i}: '
            else:
                p, im0, frame = path, im0s.copy(), getattr(dataset, 'frame', 0)

            p = Path(p)  # to Path
            save_path = str(save_dir / p.name)  # im.jpg
            txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # im.txt

            s += '%gx%g ' % im.shape[2:]  # print string
            annotator = Annotator(im0, example=str(names), pil=True)

            # Print results
            top5i = prob.argsort(0, descending=True)[:5].tolist()  # top 5 indices
            s += f"{', '.join(f'{names[j]} {prob[j]:.2f}' for j in top5i)}, "

            # Write results
            text = '\n'.join(f'{prob[j]:.2f} {names[j]}' for j in top5i)
            if save_img or view_img:  # Add bbox to image
                annotator.text((32, 32), text, txt_color=(255, 255, 255))
            if save_txt:  # Write to file
                with open(f'{txt_path}.txt', 'a') as f:
                    f.write(text + '\n')

            # Stream results
            im0 = annotator.result()
            if view_img:
                if platform.system() == 'Linux' and p not in windows:
                    windows.append(p)
                    cv2.namedWindow(str(p), cv2.WINDOW_NORMAL | cv2.WINDOW_KEEPRATIO)  # allow window resize (Linux)
                    cv2.resizeWindow(str(p), im0.shape[1], im0.shape[0])
                cv2.imshow(str(p), im0)
                cv2.waitKey(1)  # 1 millisecond

            # Save results (image with detections)
            if save_img:
                if dataset.mode == 'image':
                    cv2.imwrite(save_path, im0)
                else:  # 'video' or 'stream'
                    if vid_path[i] != save_path:  # new video
                        vid_path[i] = save_path
                        if isinstance(vid_writer[i], cv2.VideoWriter):
                            vid_writer[i].release()  # release previous video writer
                        if vid_cap:  # video
                            fps = vid_cap.get(cv2.CAP_PROP_FPS)
                            w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
                            h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
                        else:  # stream
                            fps, w, h = 30, im0.shape[1], im0.shape[0]
                        save_path = str(Path(save_path).with_suffix('.mp4'))  # force *.mp4 suffix on results videos
                        vid_writer[i] = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
                    vid_writer[i].write(im0)

        # Print time (inference-only)
        LOGGER.info(f"{s}{dt[1].dt * 1E3:.1f}ms")

    # Print results
    t = tuple(x.t / seen * 1E3 for x in dt)  # speeds per image
    LOGGER.info(f'Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {(1, 3, *imgsz)}' % t)
    if save_txt or save_img:
        s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ''
        LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
    if update:
        strip_optimizer(weights[0])  # update model (to fix SourceChangeWarning)


def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default=r'F:\VSEE\yolov8\runs\classify\train_0619\weights\best.pt', help='model path(s)')
    parser.add_argument('--source', type=str, default=r'F:\projects\1\tr', help='file/dir/URL/glob/screen/0(webcam)')
    parser.add_argument('--data', type=str, default=ROOT / 'data/coco128.yaml', help='(optional) dataset.yaml path')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[192], help='inference size h,w')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default=ROOT / 'runs/predict-cls', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    parser.add_argument('--dnn', action='store_true', help='use OpenCV DNN for ONNX inference')
    parser.add_argument('--vid-stride', type=int, default=1, help='video frame-rate stride')
    opt = parser.parse_args()
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # expand
    print_args(vars(opt))
    return opt


def main(opt):
    check_requirements(exclude=('tensorboard', 'thop'))
    run(**vars(opt))


if __name__ == "__main__":
    opt = parse_opt()
    main(opt)


glenn-jocher commented on July 25, 2024

Hello @LaoXianYud,

Thank you for providing the detailed code examples. This is very helpful for understanding the issue you're encountering.

From your description, it seems that there are discrepancies in the probability distributions when using YOLOv8 compared to YOLOv5 and YOLOv9, despite the classification results being correct. This could indeed be due to differences in image preprocessing or model architecture between the versions.

Here are a few steps to help diagnose and potentially resolve the issue:

  1. Ensure Consistent Preprocessing: Verify that the image preprocessing steps are consistent across all versions. Differences in normalization, resizing, or augmentation can affect the model's output probabilities. For YOLOv8, you can refer to the preprocessing steps in the classify_transforms function.

  2. Check Model Configuration: Ensure that the model configurations (e.g., input size, batch size, etc.) are consistent across YOLOv5, YOLOv8, and YOLOv9. Even slight differences can lead to variations in output probabilities.

  3. Update to Latest Versions: Make sure you are using the latest versions of the ultralytics package and torch. This ensures that you have the latest bug fixes and improvements.

    pip install --upgrade torch
    pip install --upgrade ultralytics

  4. Compare Softmax Outputs: Since you mentioned using F.softmax for post-processing in YOLOv5, ensure that the same softmax operation is applied in YOLOv8. This can be done by explicitly applying softmax to the model outputs if not already done.

  5. Debugging with Sample Images: Run predictions on a few sample images using both YOLOv8 and YOLOv5/YOLOv9, and compare the raw outputs before applying softmax. This can help identify if the issue lies in the raw model outputs or the post-processing steps.

Here is an example of how you might explicitly apply softmax in YOLOv8:

from ultralytics import YOLO
import torch.nn.functional as F

# Load the model
model = YOLO(r"F:\VSEE\yolov8\runs\classify\train_0619\weights\best.pt")

# Run inference
results = model([r"F:\projects\1\tr\1.bmp", r"F:\projects\1\tr\2.bmp"])

# Apply softmax to the outputs (result.probs is a Probs object; its tensor is .data)
for result in results:
    probs = F.softmax(result.probs.data, dim=-1)
    print(probs)

If the issue persists, please provide additional details such as the specific preprocessing steps used in YOLOv8 and YOLOv5/YOLOv9, and any differences in the model configurations. This will help us further investigate the root cause.

Thank you for your patience and cooperation. Let's work together to resolve this issue! 😊


LaoXianYud commented on July 25, 2024

4. Compare Softmax Outputs:

According to my troubleshooting, I have confirmed that the image preprocessing in v5 and v8 has no effect on the results; the image inputs on both sides are essentially the same. I therefore suspect the difference comes mainly from post-processing: v5 applies softmax to obtain probabilities. So I would like to know how the classification model obtains its probabilities in v8, and where the specific code for this lives.
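
For reference, the Python API exposes classification probabilities through the Probs object on each Result. A minimal sketch, with hypothetical paths:

from ultralytics import YOLO

model = YOLO("best.pt")         # hypothetical classification checkpoint
result = model("image.bmp")[0]  # hypothetical image

probs = result.probs                # Probs object for classification results
print(probs.data)                   # underlying tensor of per-class confidences
print(probs.top1, probs.top1conf)   # top-1 class index and its confidence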


LaoXianYud commented on July 25, 2024

4. Compare Softmax Outputs:

I tested it, and the results are indeed different before and after applying softmax: without softmax I get the v8 result, and with it I get the v5 result. In normal use, however, I convert the .pt model to ONNX.
So how do I remove the softmax for the ONNX model and get the v8 results?
[Two screenshots comparing the outputs were attached here.]
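
One option, if re-exporting the model is not possible, is to strip the trailing Softmax node from the exported graph. A hedged sketch using the onnx package, assuming the Softmax is the final node (verify the graph, e.g. in Netron, first):

import onnx

m = onnx.load("best.onnx")  # hypothetical export path
last = m.graph.node[-1]
if last.op_type == "Softmax":
    m.graph.output[0].name = last.input[0]  # expose the Softmax input as the graph output
    m.graph.node.remove(last)               # drop the Softmax node itself
onnx.save(m, "best_logits.onnx")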


LaoXianYud commented on July 25, 2024

4. Compare Softmax Outputs

OK, I've now removed the softmax post-processing from my results, and the YOLOv8 inference results now match perfectly.
But I have a question: I noticed that in v8 a softmax is already applied inside both the .pt and the ONNX model, so applying softmax again at that point necessarily gives wrong results. How can this problem be avoided, other than by modifying my inference code?


LaoXianYud commented on July 25, 2024

Thank you for your patience and cooperation. Let's work together to resolve this issue! 😊

The problem has been solved: in the Classify head, the return statement applies a softmax to the outputs when the model is not in training mode, so the results can be returned directly without any further softmax.
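
For context, the behavior described above can be sketched as follows. This is a paraphrase of the Classify head's forward pass, not the verbatim source (see ultralytics/nn/modules/head.py in your installed version):

import torch
import torch.nn as nn

class ClassifyHeadSketch(nn.Module):
    # Paraphrased sketch of the softmax behavior in the Classify head
    def __init__(self, c_in=1280, num_classes=10):
        super().__init__()
        self.linear = nn.Linear(c_in, num_classes)

    def forward(self, x):
        x = self.linear(x.flatten(1))
        return x if self.training else x.softmax(1)  # softmax only at inference

head = ClassifyHeadSketch().eval()
print(head(torch.randn(1, 1280)).sum())  # rows sum to 1 in eval mode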


glenn-jocher commented on July 25, 2024

Hello @LaoXianYud,

I'm glad to hear that you've identified the issue and resolved it! 🎉

Indeed, the softmax operation is typically applied in the classification head to convert logits to probabilities. If you want to avoid applying softmax again during inference, especially when converting to ONNX, you can modify the model to return logits directly.

For those who might encounter similar issues, here's a concise way to handle this:

  1. Modify the Model: Adjust the model to return logits instead of applying softmax in the forward pass. This ensures that the exported ONNX model will also return logits.

  2. Post-Processing: Apply softmax only when necessary during post-processing, ensuring that you don't apply it twice.

Here's a quick example of how you might modify the model:

import torch
import torch.nn as nn

class CustomModel(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Define your layers here; a minimal classifier head for illustration
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, num_classes))

    def forward(self, x):
        # Your forward pass: no softmax applied here
        logits = self.classifier(x)
        return logits  # Return logits directly

# When exporting to ONNX
model = CustomModel()
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model.onnx", export_params=True)

By doing this, you ensure that the ONNX model outputs logits, and you can apply softmax as needed during inference.
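
For completeness, a minimal sketch of running the logits-only export with onnxruntime and applying softmax in post-processing (paths and shapes are illustrative):

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy input
logits = session.run(None, {session.get_inputs()[0].name: x})[0]

# Numerically stable softmax over the class dimension
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)
print(probs)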

If you have any more questions or need further assistance, feel free to ask. We're here to help! 😊

