Training Abnormality about ultralytics HOT 16 CLOSED

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024
Training Abnormality

from ultralytics.

Comments (16)

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

2

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

Uploading 1.png…

glenn-jocher avatar glenn-jocher commented on August 27, 2024

@DaCheng1823 hello,

Thank you for reaching out and providing the error details. It looks like there might be a mismatch in the channel dimensions during the model's forward pass.

To help us diagnose the issue more effectively, could you please provide a minimum reproducible example of your code? This will allow us to better understand the context and configuration you're using. You can find guidelines on how to create a reproducible example here.

Additionally, please ensure that you are using the latest version of the Ultralytics package and dependencies. Sometimes, issues are resolved in newer releases.

Looking forward to your response so we can assist you further!

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

23
Hello, lines 4, 5, 6, and 7 of the model file seem to be abnormal. What causes this?

glenn-jocher avatar glenn-jocher commented on August 27, 2024

Hello @DaCheng1823,

Thank you for providing the details and the screenshot. It appears there might be a configuration issue in the model file, leading to the channel mismatch error.

To help us diagnose the issue more effectively, could you please provide a minimum reproducible example of your code? This will allow us to better understand the context and configuration you're using. You can find guidelines on how to create a reproducible example here.

Additionally, please ensure that you are using the latest version of the Ultralytics package and dependencies. Sometimes, issues are resolved in newer releases.

Looking forward to your response so we can assist you further! 😊

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

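from collections import OrderedDict

import torch.nn as nn
import torch.nn.functional as F


class ConvNormLayer(nn.Module):
    """Conv2d -> BatchNorm2d -> optional activation."""

    def __init__(self, ch_in, ch_out, kernel_size, stride, act=None, padding=None, bias=False):
        super().__init__()
        self.conv = nn.Conv2d(
            ch_in,
            ch_out,
            kernel_size,
            stride,
            # Default padding keeps the spatial size for odd kernels at stride 1
            padding=(kernel_size - 1) // 2 if padding is None else padding,
            bias=bias)
        self.norm = nn.BatchNorm2d(ch_out)
        # act is the name of a function in torch.nn.functional, e.g. 'relu'
        self.act = nn.Identity() if act is None else getattr(F, act)

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))

# ResNet18, 34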

class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, ch_in, ch_out, stride, shortcut, act='relu', variant='b'):
        super().__init__()

        self.shortcut = shortcut

        self.act = act

        if not shortcut:
            if variant == 'd' and stride == 2:
                self.short = nn.Sequential(OrderedDict([
                    ('pool', nn.AvgPool2d(2, 2, 0, ceil_mode=True)),
                    ('conv', ConvNormLayer(ch_in, ch_out, 1, 1))
                ]))
            else:
                self.short = ConvNormLayer(ch_in, ch_out, 1, stride)

        self.branch2a = ConvNormLayer(ch_in, ch_out, 3, stride, act=act)
        self.branch2b = ConvNormLayer(ch_out, ch_out, 3, 1, act=None)
        self.act = nn.Identity() if act is None else getattr(F, self.act)

    def forward(self, x):
        out = self.branch2a(x)
        out = self.branch2b(out)
        if self.shortcut:
            short = x
        else:
            short = self.short(x)

        out = out + short
        out = self.act(out)

        return out

class Blocks(nn.Module):
    def __init__(self, ch_in, ch_out, block, count, stage_num, act='relu', variant='b'):
        super().__init__()

        # block is passed as a string name; map it to the class
        if block == "BasicBlock":
            block = BasicBlock
        elif block == "BottleneckBlock":
            block = BottleNeck  # BottleNeck is not shown in this snippet
        else:
            # __init__ must not return a value, so raise instead of `return False`
            raise ValueError(f"Unknown block type: {block!r}")

        self.blocks = nn.ModuleList()
        for i in range(count):
            self.blocks.append(
                block(
                    ch_in,
                    ch_out,
                    stride=2 if i == 0 and stage_num != 2 else 1,
                    shortcut=False if i == 0 else True,
                    variant=variant,
                    act=act)
            )

            if i == 0:
                ch_in = ch_out * block.expansion

    def forward(self, x):
        out = x
        for block in self.blocks:
            out = block(out)
        return out
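
To see which channel counts each block receives, the loop in Blocks can be traced without PyTorch. This is only a minimal sketch: `trace_blocks` is a hypothetical helper that mirrors the loop above and records the arguments instead of building modules.

```python
def trace_blocks(ch_in, ch_out, count, stage_num, expansion=1):
    """Record the arguments each block in a stage would be constructed with."""
    rows = []
    for i in range(count):
        rows.append({
            "ch_in": ch_in,
            "ch_out": ch_out,
            # Only the first block of a stage downsamples, and not in stage 2
            "stride": 2 if i == 0 and stage_num != 2 else 1,
            # The first block uses a projection shortcut, later ones the identity
            "shortcut": i != 0,
        })
        if i == 0:
            ch_in = ch_out * expansion
    return rows


for row in trace_blocks(64, 128, count=2, stage_num=3):
    print(row)
# {'ch_in': 64, 'ch_out': 128, 'stride': 2, 'shortcut': False}
# {'ch_in': 128, 'ch_out': 128, 'stride': 1, 'shortcut': True}
```

Blocks after the first add the input directly to the branch output, so they only work when the input already has ch_out * expansion channels.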

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

image
This is my YAML, but I got a strange model.

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

image

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

image

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

image
When I debugged it, it looked normal.

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

When I remove the 'b' from the YAML file, it works fine. Is there a bug? Also, if I set up only the RT-DETR network structure to match the paper, but not the related parameter settings, and then build my improvements on this model, can I still call RT-DETR the baseline model?
image

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

This exceeds the memory of my GPU.
image

glenn-jocher avatar glenn-jocher commented on August 27, 2024

Hello @DaCheng1823,

Thank you for providing the details and the screenshots. It looks like the issue might be related to the specific configuration in your YAML file. When you removed the 'b', it worked fine, which suggests that there might be a bug or a misconfiguration related to that parameter.

Regarding your memory issue, here are a few suggestions to help manage GPU memory usage:

  1. Reduce Batch Size: Lowering the batch size can significantly reduce memory usage.

    batch: 4  # Example of reducing batch size
  2. Image Size: Reducing the image size can also help manage memory usage.

    imgsz: 512  # Example of reducing image size
  3. Mixed Precision Training: If supported, using mixed precision training can help reduce memory usage.

    model.train(data="coco8.yaml", epochs=100, imgsz=640, amp=True)
  4. Model Pruning: Simplifying the model architecture by reducing the number of layers or channels can also help.
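
The first three suggestions can be combined in a single training call. The following is a minimal sketch, assuming the `ultralytics` package is installed. The helper names `low_memory_train_args` and `train_with_low_memory` and the model file `rtdetr-custom.yaml` are hypothetical; `batch`, `imgsz`, and `amp` are the training arguments mentioned above.

```python
def low_memory_train_args(batch=4, imgsz=512, amp=True):
    """Collect the memory-related training arguments suggested above."""
    return {"batch": batch, "imgsz": imgsz, "amp": amp}


def train_with_low_memory(model_cfg, data_cfg, epochs=100):
    """Train with the reduced-memory settings (requires ultralytics)."""
    # Imported lazily so the helper above can be used without the package
    from ultralytics import RTDETR

    model = RTDETR(model_cfg)
    return model.train(data=data_cfg, epochs=epochs, **low_memory_train_args())


if __name__ == "__main__":
    print(low_memory_train_args())
    # Hypothetical file names; uncomment to start a real training run:
    # train_with_low_memory("rtdetr-custom.yaml", "coco8.yaml")
```

If memory is still exceeded, lowering `batch` further is usually the first thing to try, since it does not change the input resolution the model sees.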

Regarding your question about using RT-DETR as a baseline model: If you are using the RT-DETR network structure as described in the paper but modifying the parameters or making improvements, it is still valid to refer to RT-DETR as your baseline model. Just make sure to clearly document the changes and improvements you have made in your work.

If you continue to experience issues, please provide a minimum reproducible example of your code here so we can assist you further.

Feel free to reach out if you have any more questions or need further assistance! 😊

DaCheng1823 avatar DaCheng1823 commented on August 27, 2024

When the block parameter is passed in, it is a string value, but in self.blocks.append, block needs to be an object. How can I change this? block is a class that inherits from nn.Module.
image

glenn-jocher avatar glenn-jocher commented on August 27, 2024

Hello @DaCheng1823,

Thank you for your question! It looks like you're trying to dynamically instantiate a class based on a string value. You can do this by looking the class up by name in the module's globals() dictionary, which maps the string to a class reference. Here's an example of how you can modify your code:

class Blocks(nn.Module):
    def __init__(self, ch_in, ch_out, block, count, stage_num, act='relu', variant='b'):
        super().__init__()

        # Dynamically get the class from the string
        block_class = globals()[block]

        self.blocks = nn.ModuleList()
        for i in range(count):
            self.blocks.append(
                block_class(
                    ch_in,
                    ch_out,
                    stride=2 if i == 0 and stage_num != 2 else 1,
                    shortcut=False if i == 0 else True,
                    variant=variant,
                    act=act)
            )

            if i == 0:
                ch_in = ch_out * block_class.expansion

    def forward(self, x):
        out = x
        for block in self.blocks:
            out = block(out)
        return out

In this example, globals()[block] dynamically retrieves the class from the string name. Make sure that the class name provided in the block parameter matches exactly with the class name defined in your code.
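
An explicit registry is a common alternative to `globals()`: it limits which names are accepted and gives a clear error for a typo. It also keeps aliases explicit, since with `globals()[block]` the string "BottleneckBlock" from the earlier snippet would not resolve to a class named `BottleNeck`. The following is a minimal sketch. `BLOCK_REGISTRY` and `resolve_block` are hypothetical names, the two placeholder classes stand in for the real nn.Module subclasses, and the bottleneck expansion of 4 is an assumption taken from the standard ResNet design.

```python
class BasicBlock:
    """Placeholder for the real nn.Module subclass."""
    expansion = 1


class BottleNeck:
    """Placeholder for the real nn.Module subclass (expansion is assumed)."""
    expansion = 4


# Names accepted in the model configuration, mapped to their classes
BLOCK_REGISTRY = {
    "BasicBlock": BasicBlock,
    "BottleneckBlock": BottleNeck,
}


def resolve_block(block):
    """Return the block class for a class object or a registered name."""
    if isinstance(block, type):
        return block  # already a class, nothing to look up
    try:
        return BLOCK_REGISTRY[block]
    except KeyError:
        known = ", ".join(sorted(BLOCK_REGISTRY))
        raise ValueError(f"Unknown block {block!r}; expected one of: {known}") from None


block_class = resolve_block("BottleneckBlock")
print(block_class.__name__, block_class.expansion)
```

Inside `Blocks.__init__`, the lookup line would then become `block_class = resolve_block(block)`, with the rest of the loop unchanged.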

If you encounter any further issues, please provide a minimum reproducible example here to help us better understand and assist you.

Feel free to reach out if you have any more questions! 😊

github-actions avatar github-actions commented on August 27, 2024

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
