Currently we have an Item class that is handling all

Rework of Item about icevision HOT 13 CLOSED

airctic commented on July 30, 2024

Rework of Item

from icevision.

Comments (13)

lgvaz commented on July 30, 2024

The next step is to think about how each record would be transformed into its specific item.

lgvaz commented on July 30, 2024

If we pay close attention, Item is already a very specialized version for torchvision RCNNs. This makes the case for subdividing it into classes even stronger.

lgvaz commented on July 30, 2024

Record is also already very specialized.

This is because Parsers cannot return a specific item directly.

We can try having the parser return one specific type of item, defined by the user and linked to the model.

The records can be built with mixins:

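from abc import ABC

# Each mixin contributes the fields for one kind of annotation
class MaskRecordMixin(ABC):
    ...

class BBoxRecordMixin(ABC):
    ...

# Record is the existing base record class; mixins are listed before it
class MaskRCNNRecord(MaskRecordMixin, BBoxRecordMixin, Record):
    ...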

lgvaz commented on July 30, 2024

If working with mixins, we cannot rely on constructors like so:

ImageInfo(imageid=42, filepath='mypath/hello.jpg', h=420, w=480)

Because mixins do not combine well with explicit constructors: every mixin would need its own cooperative __init__.

Instead we would need to do something like:

info = ImageInfo()
info.imageid = 42
info.filepath = 'mypath/hello.jpg'

But that is bad because it does not immediately tell the user which fields have to be set.
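For comparison, here is a minimal sketch of what cooperative constructors in mixins could look like. It is not something proposed in this thread, and the mixin names (ImageidMixin, FilepathMixin, SizeMixin) are hypothetical. Each mixin takes its own keyword-only arguments and forwards the rest along the MRO.

```python
class ImageidMixin:
    def __init__(self, *, imageid: int, **kwargs):
        # Forward the remaining keyword arguments to the next class in the MRO
        super().__init__(**kwargs)
        self.imageid = imageid


class FilepathMixin:
    def __init__(self, *, filepath: str, **kwargs):
        super().__init__(**kwargs)
        self.filepath = filepath


class SizeMixin:
    def __init__(self, *, h: int, w: int, **kwargs):
        super().__init__(**kwargs)
        self.h = h
        self.w = w


# Hypothetical stand-in for ImageInfo, assembled purely from mixins
class ImageInfo(ImageidMixin, FilepathMixin, SizeMixin):
    pass


info = ImageInfo(imageid=42, filepath="mypath/hello.jpg", h=420, w=480)
```

A missing field raises a TypeError at construction time, but the combined signature is hidden behind **kwargs, so editors will probably not show the user which fields exist. That is the same discoverability concern raised above.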

lgvaz commented on July 30, 2024

So, mixins ended up working quite well:

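from abc import ABC, abstractmethod


class ParserMixin(ABC):
    # Root of the chain: returns the functions collected so far
    @abstractmethod
    def collect_parse_funcs(self, funcs=None):
        return funcs or {}


class ImageidParserMixin(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        # Ask the rest of the MRO first, then register our own function
        funcs = super().collect_parse_funcs(funcs)
        return {"imageid": self.imageid, **funcs}

    @abstractmethod
    def imageid(self, o) -> int:
        pass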

This way we can stack as many mixins as we want:

class DefaultImageInfoParser(
    ImageInfoParser,
    ImageidParserMixin,
    FilepathParserMixin,
    SizeParserMixin,
    SplitParserMixin,
    ABC,
):
    pass
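As a rough, self-contained sketch of how such a stack could be used end to end: FilepathParserMixin is assumed here to mirror ImageidParserMixin, and CsvRowParser and parse_one are hypothetical names that are not part of the library.

```python
from abc import ABC, abstractmethod


class ParserMixin(ABC):
    @abstractmethod
    def collect_parse_funcs(self, funcs=None):
        return funcs or {}


class ImageidParserMixin(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        funcs = super().collect_parse_funcs(funcs)
        return {"imageid": self.imageid, **funcs}

    @abstractmethod
    def imageid(self, o) -> int:
        pass


# Assumed to follow the same pattern as ImageidParserMixin
class FilepathParserMixin(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        funcs = super().collect_parse_funcs(funcs)
        return {"filepath": self.filepath, **funcs}

    @abstractmethod
    def filepath(self, o) -> str:
        pass


# Hypothetical concrete parser over dict-like rows
class CsvRowParser(ImageidParserMixin, FilepathParserMixin):
    def imageid(self, o) -> int:
        return int(o["id"])

    def filepath(self, o) -> str:
        return o["path"]

    def parse_one(self, o) -> dict:
        # Call every collected function on the raw row
        funcs = self.collect_parse_funcs()
        return {name: func(o) for name, func in funcs.items()}


parser = CsvRowParser()
sample = parser.parse_one({"id": "42", "path": "mypath/hello.jpg"})
print(sample)
```

Each mixin only knows about its own key. The super() chain is what merges them, so adding a new field means adding one more mixin to the class definition.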

lgvaz commented on July 30, 2024

The parser currently returns a dict, unlike before, when it returned a Record.

I'm still not sure if it's better to have a dict or a different Record for each use case. The problem with using Record is that we will have a lot of different permutations: BBoxRecord, BBoxMaskRecord, MaskRecord, and so on.

The dynamic nature of a dict is a real advantage here.

Now, the problem is: how do we make this standard across the whole library? When using a dict we lose auto completion =/
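One option that keeps the dict while recovering some editor support is typing.TypedDict from the standard library. This is only a sketch of the idea, not something decided in this thread. The Sample class and its keys are assumptions chosen to match the fields discussed here.

```python
from typing import List, TypedDict


# total=False makes every key optional, so one type can cover
# bbox-only, mask-only and bbox+mask samples
class Sample(TypedDict, total=False):
    imageid: int
    filepath: str
    bboxes: List[List[float]]
    masks: List[str]


def describe(sample: Sample) -> str:
    # Type checkers and editors know which keys are valid here
    keys = ", ".join(sorted(sample))
    return f"sample {sample['imageid']} has: {keys}"


sample: Sample = {
    "imageid": 42,
    "filepath": "mypath/hello.jpg",
    "bboxes": [[0.0, 0.0, 10.0, 10.0]],
}
print(describe(sample))
```

At runtime a TypedDict is still a plain dict, so the dynamic behaviour is unchanged. The key names are only checked by static tools.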

lgvaz commented on July 30, 2024

Previously we went from Record -> Item, and the Item was what the transforms used and what was subsequently fed to the model.

Should we also replace Item with a dict?

In the Dataset we have to "prepare" our data, meaning we have to open images, convert masks, etc.

The thing is that, again, we are not sure which keys we have in the dict... Mixins again? =)

We can do something like

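class Dataset:
    def __getitem__(self, i):
        record = self.records[i]
        # Let the mixin chain prepare each value
        data = {k: self.prepare_data(k, v) for k, v in record.items()}
        return data

class BBoxMixin:
    def prepare_data(self, k, v):
        if k == "bbox":
            return v
        # Not our key: defer to the next class in the MRO,
        # which must also define prepare_data
        return super().prepare_data(k, v)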

lgvaz commented on July 30, 2024

We only need to implement the previously mentioned mixins for keys whose values need to change (like image and mask); for every other key we can return the value unchanged.
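A minimal runnable sketch of this per-key preparation follows. BaseDataset, ImageMixin, MaskMixin and MaskDataset are hypothetical names, and the image loading and mask decoding are stand-ins so that the example has no dependencies.

```python
class BaseDataset:
    def __init__(self, records):
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, i):
        record = self.records[i]
        return {k: self.prepare_data(k, v) for k, v in record.items()}

    def prepare_data(self, k, v):
        # Default: keys that no mixin handles pass through unchanged
        return v


class ImageMixin:
    def prepare_data(self, k, v):
        if k == "filepath":
            # Stand-in for actually opening the image file
            return f"<image loaded from {v}>"
        return super().prepare_data(k, v)


class MaskMixin:
    def prepare_data(self, k, v):
        if k == "mask":
            # Stand-in decoding: rows of '0'/'1' characters to lists of ints
            return [[int(c) for c in row] for row in v]
        return super().prepare_data(k, v)


# Mixins come first so their prepare_data runs before the default
class MaskDataset(ImageMixin, MaskMixin, BaseDataset):
    pass


ds = MaskDataset(
    [{"imageid": 42, "filepath": "mypath/hello.jpg", "mask": ["01", "10"], "bbox": [0, 0, 10, 10]}]
)
item = ds[0]
print(item)
```

Only "filepath" and "mask" are touched. "imageid" and "bbox" fall through to the default in BaseDataset.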

lgvaz commented on July 30, 2024

We can define a parser for each type of model:

MaskRCNNParser
DetrParser

lgvaz commented on July 30, 2024

How to combine ImageInfo and Annotation?

Like before, this will be done inside a class DataParser, but it's not so clear what this class should return. In the previous version a Record dataclass was returned. Should we stick with that?

Record can always have the fields: imageid, image_info, annotation

lgvaz commented on July 30, 2024

The split should be decided inside DataParser

lgvaz commented on July 30, 2024

Instead of separating imageid, image_info and annotation we can just combine everything in a single dict.

This way we can actually remove the difference between an AnnotationParser and an ImageInfoParser: everything is a parser!

lgvaz commented on July 30, 2024

The dream is to be able to write something like this:

class MyParser(DefaultImageInfoParser, FasterRCNNParser):
    pass

You basically can mix and match all you want =)
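To make the idea concrete, here is a small sketch of such a combined parser producing a single dict per sample. The bodies of DefaultImageInfoParser and FasterRCNNParser below are simplified stand-ins written for this example, not the library's implementation. The parse method and the field names of the raw rows are also assumptions.

```python
from abc import ABC, abstractmethod


class ParserMixin(ABC):
    def collect_parse_funcs(self, funcs=None):
        return funcs or {}

    def parse(self, data):
        # One flat dict per sample: image info and annotations together
        funcs = self.collect_parse_funcs()
        return [{name: func(o) for name, func in funcs.items()} for o in data]


# Simplified stand-in: contributes the image-info keys
class DefaultImageInfoParser(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        funcs = super().collect_parse_funcs(funcs)
        return {"imageid": self.imageid, "filepath": self.filepath, **funcs}

    @abstractmethod
    def imageid(self, o) -> int: ...

    @abstractmethod
    def filepath(self, o) -> str: ...


# Simplified stand-in: contributes the annotation keys a detector needs
class FasterRCNNParser(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        funcs = super().collect_parse_funcs(funcs)
        return {"labels": self.labels, "bboxes": self.bboxes, **funcs}

    @abstractmethod
    def labels(self, o) -> list: ...

    @abstractmethod
    def bboxes(self, o) -> list: ...


class MyParser(DefaultImageInfoParser, FasterRCNNParser):
    # The user only describes how to read each field from a raw row
    def imageid(self, o) -> int:
        return o["id"]

    def filepath(self, o) -> str:
        return o["path"]

    def labels(self, o) -> list:
        return o["labels"]

    def bboxes(self, o) -> list:
        return o["boxes"]


rows = [{"id": 1, "path": "a.jpg", "labels": [1], "boxes": [[0, 0, 5, 5]]}]
records = MyParser().parse(rows)
print(records)
```

Because both parents cooperate through super(), swapping FasterRCNNParser for another model-specific parser changes the keys in the output without touching the image-info part.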
