lhackel-tub / configilm

A Library for configurable combination of pre-configured and possibly pre-trained Image and Language Models

Home Page: https://lhackel-tub.github.io/ConfigILM/

License: MIT License

Python 100.00%

configilm's People

rsim-tu-berlin lamoe291 kai-tub elseviersoftwarex

configilm's Issues

Rename ILMTypes to Image as well

ILMTypes are currently called Vision... instead of Image..., even though everything else is called Image...
They should be renamed.

Tests are very slow

There are a lot of tests, and each of them is very small. Optimizing the tests could improve development speed.

Installation conflict with torchvision and torch 2.0.0

The current naming suggests that any Vision-related task is included. However, the Library is specific to Images. Therefore the library should be renamed to ConfigILM - including all references for vision within.

Google Colab is not working

The example written for Google Colab is not working, as the package cannot be installed.

Add benchmark models from the literature to be able to test fast against new methods

If the library implemented some benchmark models using the configurations, it would be easier for end users to quickly test against existing methods. In addition, these benchmarks would serve as examples of how the library can be used.

Possible starting points could be:

Add a simple training script without logging

The current Scripts use WandB for logging. However, this requires a WandB account and Internet access to work properly. To showcase how to use the framework, logging is not necessary.

Extend Docstrings

Documentation (docstrings) should be extended/created for

Examples for pure Pytorch

Currently, the example scripts are only written in PyTorch Lightning. Add example scripts (one per use case) in pure PyTorch as well.

can't load full RSVQALR dataset

The RSVQALR dataset cannot be loaded because a KeyError is raised in _get_question_answers when loading the answers. The problem is that inactive answers are still part of the answer list before they are filtered, so the question_id key is accessed for inactive elements, which do not have this attribute.
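A minimal sketch of one possible fix: filter out inactive entries before reading question_id. The field names (`active`, `answer`) and the helper `collect_active_answers` are hypothetical and only illustrate the ordering; they are not the library's actual data layout.

```python
def collect_active_answers(answers):
    """Map question_id -> answer text, skipping inactive entries first."""
    result = {}
    for entry in answers:
        # Filter before touching "question_id": inactive entries lack that key.
        if not entry.get("active", False):
            continue
        result[entry["question_id"]] = entry["answer"]
    return result


# Hypothetical answer list with one inactive entry that has no question_id.
answers = [
    {"active": True, "question_id": 0, "answer": "yes"},
    {"active": False},
    {"active": True, "question_id": 2, "answer": "3"},
]
print(collect_active_answers(answers))
```

Accessing `entry["question_id"]` before the `active` check would raise the KeyError described above on the second entry.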

Training is very slow

The current version uses a pytorch/huggingface (or maybe other dependencies) combination that is very slow, as shown in this and this issue.

The current workaround would be to use version 0.3.0 and manually update the packages (e.g., timm) after installing ConfigILM until a fix is given in the dependencies or otherwise known.

psutil not a requirement

psutil is not listed as a requirement at installation; however, it is used in the RSVQAxBEN data module. The import fails unless psutil is installed separately.
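Besides declaring psutil as a dependency, the import could be guarded so the module still loads without it. The following is a sketch under assumptions: the helper names `has_module` and `available_memory_bytes` are hypothetical, and what the data module actually uses psutil for is not stated here.

```python
import importlib.util


def has_module(name):
    """Return True if the module can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


def available_memory_bytes():
    """Return available system memory, or None when psutil is missing."""
    if not has_module("psutil"):
        return None
    # Imported lazily so this file can be imported without psutil installed.
    import psutil

    return psutil.virtual_memory().available


print(available_memory_bytes())
```

Callers then have to handle the `None` case, for example by falling back to a conservative default.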

API Documentation

Documentation of the API, including the descriptions from the docstrings, should be added to the guide so that users can look up, e.g., parameter names for specific parts of the library.

optional omitting of ground truth label entries

BENLMDBReader should have an option to omit specified classes from the returned ground truth label vectors (and shorten the ground truth vectors accordingly). This can be helpful when working with subsets of BigEarthNet that do not contain all the classes.
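The requested behavior could look roughly like the sketch below. It uses plain Python lists; the function name `drop_classes` and the four-class example are illustrative assumptions, not part of BENLMDBReader.

```python
def drop_classes(label_vector, class_names, omit):
    """Return the label vector and class list without the omitted classes."""
    omit = set(omit)
    unknown = omit - set(class_names)
    if unknown:
        raise ValueError(f"Unknown classes: {sorted(unknown)}")
    # Keep only positions whose class name is not in the omit set.
    kept = [(v, n) for v, n in zip(label_vector, class_names) if n not in omit]
    return [v for v, _ in kept], [n for _, n in kept]


# Illustrative subset of class names and a matching multi-hot vector.
classes = ["Arable land", "Pastures", "Inland waters", "Marine waters"]
labels = [1, 0, 1, 0]
reduced, remaining = drop_classes(labels, classes, omit=["Marine waters"])
print(reduced, remaining)
```

Returning the remaining class names alongside the shortened vector keeps the index-to-class mapping explicit after the reduction.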

torchvision antialias user warning thrown for default data modules

The transformations in the RSVQAxBEN and BEN datamodules throw user warnings due to changed default behavior in torchvision 0.17. This can be mitigated by setting antialias=True in the affected transforms in all Compose calls.

Choose More Intuitive Parameter Names For Loading Pretrained Models

The parameters load_hf_if_available and load_timm_if_available of the configilm.ILMConfiguration class are switches for loading a pretrained huggingface or timm model according to this docstring. Their purpose could be made clearer and more explicit by renaming them to something like load_pretrained_hf_if_available and load_pretrained_timm_if_available respectively.

This is actually done in a less user-facing function here.
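One way to perform such a rename without breaking existing callers is to accept the old names for a transition period and emit a deprecation warning. The class `ExampleConfig` below is a hypothetical stand-in, not the real ILMConfiguration, and the default values are assumptions.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleConfig:
    # New, more explicit names (defaults are assumed for illustration).
    load_pretrained_hf_if_available: bool = True
    load_pretrained_timm_if_available: bool = True
    # Old names, kept temporarily; None means "not set by the caller".
    load_hf_if_available: Optional[bool] = None
    load_timm_if_available: Optional[bool] = None

    def __post_init__(self):
        renames = (
            ("load_hf_if_available", "load_pretrained_hf_if_available"),
            ("load_timm_if_available", "load_pretrained_timm_if_available"),
        )
        for old, new in renames:
            value = getattr(self, old)
            if value is not None:
                # Forward the old value to the new field and warn the caller.
                warnings.warn(
                    f"{old} is deprecated, use {new}",
                    DeprecationWarning,
                    stacklevel=2,
                )
                setattr(self, new, value)


cfg = ExampleConfig(load_pretrained_hf_if_available=False)
print(cfg.load_pretrained_hf_if_available, cfg.load_pretrained_timm_if_available)
```

The old fields can be removed entirely once a release or two has shipped with the warning.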

Add support for multiple python versions

Supporting only Python 3.10 and up seems very restrictive. Maybe this threshold could be relaxed, with tests run against the different supported versions?

Classes in DataModules

The provided datamodules do not expose a way of changing the number of classes in the datasets. Therefore, the datamodules are not usable when a different number of classes is required or wanted.

Create an abstract class that all datamodules inherit from

The datamodules contain a lot of redundant code. To make them more flexible and reduce duplication, an abstract class (based on the Lightning datamodule) should be implemented that all datamodules inherit from.
The tests should also use this class to test all basic functionality, so that only the additional, datamodule-specific functionality remains in the individual test classes.
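The idea can be sketched with the standard library's `abc` module. In the library the base class would presumably derive from Lightning's datamodule class; plain `ABC` is used here only to keep the example self-contained, and the names `BaseDataModule`, `build_dataset`, and `ToyDataModule` are hypothetical.

```python
from abc import ABC, abstractmethod


class BaseDataModule(ABC):
    """Shared setup logic; subclasses only say how to build a dataset."""

    def __init__(self, batch_size=16):
        self.batch_size = batch_size
        self.train_ds = None
        self.val_ds = None

    @abstractmethod
    def build_dataset(self, split):
        """Return the dataset for the split 'train' or 'val'."""

    def setup(self):
        # Common code lives here once instead of in every datamodule.
        self.train_ds = self.build_dataset("train")
        self.val_ds = self.build_dataset("val")


class ToyDataModule(BaseDataModule):
    """Concrete datamodule that only implements the dataset-specific part."""

    def build_dataset(self, split):
        return [f"{split}-{i}" for i in range(3)]


dm = ToyDataModule(batch_size=4)
dm.setup()
print(dm.train_ds, dm.val_ds)
```

A shared test suite could then exercise `setup` and the common attributes once against every subclass, leaving only subclass-specific behavior to the individual test classes.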

Include definitions of fusion functions

Some existing fusion methods could be added for simplicity such that more model combinations can be directly applied without having to re-implement the fusion functions.

Possible candidates:

MUTAN
MIDF (implementation)

Multi-Dim Fusion

The current fusion implementation only supports fusion types with the same input and output dimensions. However, there are some fusion types, e.g. MUTAN as in this implementation, where the fusion output dimension can differ from the input dimension.

Update to lightning

Right now, the library uses pytorch_lightning, which was renamed to lightning. To reflect this change, the dependency should be updated.

Add HRVQA dataset

Dataset HRVQA was released and could be a nice addition.

wheel size reduction

The current state of the library cannot be published, as there are too many examples in the mock data. Only up to 100 MB can be published; therefore, the number of examples in the mock data has to be reduced significantly.

RSVQA quantization

The datasets as implemented treat every answer as its own class. However, in the paper the datasets use quantization (II.B, page 5), so that e.g. area buckets and number buckets are created, which results in drastically fewer answer classes. A flag should be added to the datasets to allow the same quantization approach.
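A bucketing function for numeric answers could be sketched as follows. The bucket edges and labels are assumptions chosen for illustration and should be checked against the paper before use; the function name `quantize_count` is hypothetical.

```python
def quantize_count(value):
    """Map a non-negative integer answer to a coarse bucket label."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    # Assumed bucket edges; verify against the paper before relying on them.
    buckets = (
        (10, "between 1 and 10"),
        (100, "between 11 and 100"),
        (1000, "between 101 and 1000"),
    )
    for upper, label in buckets:
        if value <= upper:
            return label
    return "more than 1000"


for raw in (0, 7, 11, 1000, 1001):
    print(raw, "->", quantize_count(raw))
```

With such a mapping, all numeric answers collapse into five classes instead of one class per distinct value.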

Document Design Ideas of Library

The documentation seems to explain features only by example so far. I would like to see a more abstract documentation page which answers questions like:

What are the main user-facing classes of this library? (Answer as far as I understand: ILMConfiguration and ConfigILM)
What are the roles of these classes?

Logos in Documentation not displayed correctly

The logos in the Documentation are not displayed. Instead, the alt text is shown. This is expected, as relative paths are not trivial in static HTML.
https://pradyunsg.me/furo/customisation/footer/

example_scripts and baselines

The folder example_scripts should be called only scripts and the scripts from baselines folder should also be moved to scripts

Reduce Dependencies

The current state of dependencies always installs everything. This should not be the case and some dependencies should be optional - therefore the name extra for the subpackage

Example Script only works with Classification

The current state of the example scripts only works for classification. The scripts should be divided into one example for classification and one for VQA.

Add RSVQA DataSet and DataModule

Currently the original RSVQA data sets are not included. As they are publicly available, they should be added.

ben19_list_to_onehot does not use bigearthnetcommon ben_19_labels_to_multi_hot

The function should use the library's function to ensure identical behavior.

In File: extra/BEN_lmdb_utils.py:117

Option to return patch-names for BENDataSet

For visualization of image retrieval results using BEN, it is necessary to assign patch-names to samples returned from the BENDataSet.__getitem__() method. A convenient extension to this class would therefore be to (optionally) return the patch-names (key) in addition to image and labels.
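The optional return value could be controlled by a constructor flag, as in this sketch. `ToyPatchDataset` and the flag name `return_patchname` are hypothetical and stand in for BENDataSet only to show the shape of the change.

```python
class ToyPatchDataset:
    """Toy stand-in for a dataset keyed by patch name."""

    def __init__(self, samples, return_patchname=False):
        # samples: dict mapping patch name -> (image, labels)
        self.samples = samples
        self.keys = sorted(samples)
        self.return_patchname = return_patchname

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        key = self.keys[idx]
        image, labels = self.samples[key]
        if self.return_patchname:
            # Append the key so retrieval results can be traced to a patch.
            return image, labels, key
        return image, labels


# Placeholder images and labels; the patch names are made up.
data = {"patch_b": ("img_b", [0, 1]), "patch_a": ("img_a", [1, 0])}
print(ToyPatchDataset(data)[0])
print(ToyPatchDataset(data, return_patchname=True)[0])
```

Keeping the flag off by default leaves the existing two-element return value unchanged for current users.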
