Rework of Item (icevision, closed)

airctic commented on July 30, 2024
Rework of Item

from icevision.

Comments (13)

lgvaz commented on July 30, 2024

The next step is to think about how each record would be transformed into its specific item.

lgvaz commented on July 30, 2024

If we pay close attention, Item is already a very specialized version for torchvision RCNNs. This makes the case for subdividing it into classes even stronger.

lgvaz commented on July 30, 2024

Record is also already very specialized.

This is because Parsers currently cannot return a specific item.

We can try having the parser return one specific type of item, defined by the user and linked to the model.

The records can be built with mixins:

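from abc import ABC


# Each mixin contributes the fields for one kind of annotation
class MaskRecordMixin(ABC):
    ...

class BBoxRecordMixin(ABC):
    ...

# Record is the existing base record class
class MaskRCNNRecord(MaskRecordMixin, BBoxRecordMixin, Record):
    ...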

lgvaz commented on July 30, 2024

If working with mixins, we cannot rely on constructors like so:

ImageInfo(imageid=42, filepath='mypath/hello.jpg', h=420, w=480)

Because mixins cannot easily have constructors of their own...

Instead we would need to do something like:

info = ImageInfo()
info.imageid = 42
info.filepath = 'mypath/hello.jpg'

But that is bad because it does not immediately tell the user which fields need to be set.
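Strictly speaking, Python mixins can define `__init__`, as long as every class in the chain takes keyword-only arguments and forwards the rest through `super()`. A minimal sketch of that cooperative pattern follows; the mixin names `ImageidMixin`, `FilepathMixin` and `SizeMixin` are hypothetical and not part of icevision:

```python
class ImageidMixin:
    def __init__(self, *, imageid: int, **kwargs):
        # Forward whatever this mixin does not consume to the next class in the MRO
        super().__init__(**kwargs)
        self.imageid = imageid


class FilepathMixin:
    def __init__(self, *, filepath: str, **kwargs):
        super().__init__(**kwargs)
        self.filepath = filepath


class SizeMixin:
    def __init__(self, *, h: int, w: int, **kwargs):
        super().__init__(**kwargs)
        self.h, self.w = h, w


class ImageInfo(ImageidMixin, FilepathMixin, SizeMixin):
    pass


# The constructor call from the comment above works again
info = ImageInfo(imageid=42, filepath="mypath/hello.jpg", h=420, w=480)
```

The cost is that the accepted arguments are spread over several classes, so the signature is still not visible in one place, which is the same discoverability problem in a milder form.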

lgvaz commented on July 30, 2024

So, mixins ended up working quite well:

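from abc import ABC, abstractmethod


class ParserMixin(ABC):
    @abstractmethod
    def collect_parse_funcs(self, funcs=None):
        # Base of the chain: subclasses reach this via super()
        return funcs or {}


class ImageidParserMixin(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        # Add this mixin's parse function to those collected further up the MRO
        funcs = super().collect_parse_funcs(funcs)
        return {"imageid": self.imageid, **funcs}

    @abstractmethod
    def imageid(self, o) -> int:
        pass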

This way we can stack as many mixins as we want:

class DefaultImageInfoParser(
    ImageInfoParser,
    ImageidParserMixin,
    FilepathParserMixin,
    SizeParserMixin,
    SplitParserMixin,
    ABC,
):
    pass
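To make the stacking concrete, here is a self-contained sketch of how the collected functions could be applied to raw rows. `FilepathParserMixin` is written by analogy with `ImageidParserMixin`, and the `parse` method, the `MyParser` class and the row keys `id` and `file_name` are assumptions made for illustration only:

```python
from abc import ABC, abstractmethod


class ParserMixin(ABC):
    @abstractmethod
    def collect_parse_funcs(self, funcs=None):
        return funcs or {}


class ImageidParserMixin(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        funcs = super().collect_parse_funcs(funcs)
        return {"imageid": self.imageid, **funcs}

    @abstractmethod
    def imageid(self, o) -> int:
        pass


class FilepathParserMixin(ParserMixin):
    def collect_parse_funcs(self, funcs=None):
        funcs = super().collect_parse_funcs(funcs)
        return {"filepath": self.filepath, **funcs}

    @abstractmethod
    def filepath(self, o) -> str:
        pass


class MyParser(ImageidParserMixin, FilepathParserMixin):
    # The user only fills in how each field is read from a raw row
    def imageid(self, o) -> int:
        return o["id"]

    def filepath(self, o) -> str:
        return o["file_name"]

    def parse(self, rows):
        # Hypothetical driver: apply every collected function to every row
        funcs = self.collect_parse_funcs()
        return [{name: func(o) for name, func in funcs.items()} for o in rows]


records = MyParser().parse([{"id": 42, "file_name": "mypath/hello.jpg"}])
```

Each mixin calls `super().collect_parse_funcs`, so the method resolution order walks through every stacked mixin exactly once before reaching `ParserMixin`.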

lgvaz commented on July 30, 2024

The parser currently returns a dict, unlike before, when it returned a Record.

I'm still not sure if it's better to have a dict or a different Record for each use case. The problem with using Record is that we will have a lot of different permutations: BBoxRecord, BBoxMaskRecord, MaskRecord, and so on.

The dynamic nature of a dict is a real advantage here.

Now, the problem is, how do we make this standard across the whole library? When using a dict we lose autocompletion =/
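One possible middle ground, which is not something proposed in this thread, is `typing.TypedDict` with `total=False`: the value is still a plain dict at runtime, but editors and type checkers know which keys may appear. A sketch, assuming Python 3.8+ and a hypothetical `RecordDict` whose keys are only examples:

```python
from typing import List, TypedDict


class RecordDict(TypedDict, total=False):
    # total=False makes every key optional, so any subset is a valid record
    imageid: int
    filepath: str
    bbox: List[float]
    mask: str


def describe(record: RecordDict) -> str:
    # Works like any dict at runtime
    return ", ".join(sorted(record))


record: RecordDict = {"imageid": 42, "filepath": "mypath/hello.jpg"}
record["bbox"] = [10, 20, 30, 40]  # editors can complete the key name here
```

This only helps with static tooling; nothing is validated at runtime.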

lgvaz commented on July 30, 2024

Previously we went from Record -> Item, and the Item was what was used in the transforms and subsequently fed to the model.

Should we also replace Item with a dict?

In the Dataset we have to "prepare" our data, meaning we have to open images, convert masks, etc.

The thing is that, again, we are not sure which keys the dict contains... Mixins again? =)

We can do something like:

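class Dataset:
    def __getitem__(self, i):
        record = self.records[i]
        # Each key is prepared by whichever mixin knows how to handle it
        return {k: self.prepare_data(k, v) for k, v in record.items()}

    def prepare_data(self, k, v):
        # Default: return the value unchanged
        return v


class BBoxMixin:
    def prepare_data(self, k, v):
        if k == "bbox":
            return v
        return super().prepare_data(k, v)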
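A runnable sketch of this idea, under a few assumptions: `BaseDataset`, `FilepathMixin` and `MyDataset` are hypothetical names, converting the path to a `Path` stands in for actually opening the image, and the bbox is assumed to be stored as `(x, y, w, h)`:

```python
from pathlib import Path


class BaseDataset:
    def __init__(self, records):
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, i):
        record = self.records[i]
        return {k: self.prepare_data(k, v) for k, v in record.items()}

    def prepare_data(self, k, v):
        # End of the chain: keys nobody handles are returned unchanged
        return v


class FilepathMixin:
    def prepare_data(self, k, v):
        if k == "filepath":
            return Path(v)  # placeholder for opening the image
        return super().prepare_data(k, v)


class BBoxMixin:
    def prepare_data(self, k, v):
        if k == "bbox":
            x, y, w, h = v
            return (x, y, x + w, y + h)  # convert xywh to xyxy
        return super().prepare_data(k, v)


# Mixins go before the base class so their prepare_data runs first
class MyDataset(BBoxMixin, FilepathMixin, BaseDataset):
    pass


ds = MyDataset([{"imageid": 42, "filepath": "mypath/hello.jpg", "bbox": (10, 20, 30, 40)}])
item = ds[0]
```

Because every mixin falls through to `super().prepare_data`, a key such as `imageid` that no mixin claims simply passes through.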

lgvaz commented on July 30, 2024

We only need to implement the previously mentioned mixins for keys whose values need to change (like image and mask); for every other key we can return the value unchanged.

lgvaz commented on July 30, 2024

We can define a parser for each type of model:

MaskRCNNParser
DetrParser

lgvaz commented on July 30, 2024

How to combine ImageInfo and Annotation?

Like before, this will be done inside a DataParser class, but it is not so clear what this class should return. The previous version returned a Record dataclass; should we stick with that?

Record can always have the fields: imageid, image_info, annotation
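For illustration, such a Record could look roughly like the sketch below. The field types and the `combine` helper are assumptions: here `image_info` and `annotation` are plain dicts, and the two parser outputs are assumed to be keyed by `imageid`:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Record:
    imageid: int
    image_info: Dict[str, Any] = field(default_factory=dict)
    annotation: Dict[str, Any] = field(default_factory=dict)


def combine(image_infos: Dict[int, dict], annotations: Dict[int, dict]) -> List[Record]:
    # Hypothetical join: one Record per image, with empty annotation if none was parsed
    return [
        Record(imageid=i, image_info=info, annotation=annotations.get(i, {}))
        for i, info in image_infos.items()
    ]


records = combine(
    {42: {"filepath": "mypath/hello.jpg", "h": 420, "w": 480}},
    {42: {"bbox": [10, 20, 30, 40]}},
)
```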

lgvaz commented on July 30, 2024

The split should be decided inside DataParser

lgvaz commented on July 30, 2024

Instead of separating imageid, image_info and annotation, we can just combine everything in a single dict.

This way we can actually remove the difference between an AnnotationParser and an ImageInfoParser: everything is just a parser!

lgvaz commented on July 30, 2024

The dream is to be able to write something like this:

class MyParser(DefaultImageInfoParser, FasterRCNNParser):
    pass

You basically can mix and match all you want =)
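A simplified, self-contained sketch of how that could behave end to end. Everything here is an assumption made for illustration: the `Parser` base, the reduced versions of `DefaultImageInfoParser` and `FasterRCNNParser`, the `BBoxParserMixin` and `LabelParserMixin` names, and the raw row keys:

```python
class Parser:
    def collect_parse_funcs(self):
        return {}

    def parse(self, rows):
        # One flat dict per row, with a key for every stacked mixin
        funcs = self.collect_parse_funcs()
        return [{k: f(o) for k, f in funcs.items()} for o in rows]


class ImageidParserMixin:
    def collect_parse_funcs(self):
        return {"imageid": self.imageid, **super().collect_parse_funcs()}


class FilepathParserMixin:
    def collect_parse_funcs(self):
        return {"filepath": self.filepath, **super().collect_parse_funcs()}


class BBoxParserMixin:
    def collect_parse_funcs(self):
        return {"bbox": self.bbox, **super().collect_parse_funcs()}


class LabelParserMixin:
    def collect_parse_funcs(self):
        return {"label": self.label, **super().collect_parse_funcs()}


class DefaultImageInfoParser(ImageidParserMixin, FilepathParserMixin, Parser):
    pass


class FasterRCNNParser(BBoxParserMixin, LabelParserMixin, Parser):
    pass


class MyParser(DefaultImageInfoParser, FasterRCNNParser):
    def imageid(self, o): return o["id"]
    def filepath(self, o): return o["file_name"]
    def bbox(self, o): return o["box"]
    def label(self, o): return o["category"]


row = {"id": 42, "file_name": "mypath/hello.jpg", "box": [10, 20, 30, 40], "category": 1}
parsed = MyParser().parse([row])
```

Both parser groups share the same `Parser` base, so Python's method resolution order visits all four mixins before reaching it, and the image info and annotation fields end up in a single dict.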
