Comments (3)
Hi, there are two issues you need to think about.
1. The image generator for a batch. I personally think padding should be avoided, as it can either change image quality or change the convolution results. I grouped images by resolution so that all images served in a batch have the same resolution.
2. In principle, TRIQ can handle arbitrary resolutions. However, when I developed the first version of TRIQ (i.e., the current repo), I simply used the largest resolution in the image set. Therefore, you can define maximum_position_encoding (line 127 in transformer_iqa.py) according to your images. This value should be set to H*W/(32*32) + 1, where H and W are the height and width of the largest resolution among your images. I have also improved TRIQ using a spatial pooling method and got comparable performance; I will release it later. For now, if you want to test TRIQ, you can just set maximum_position_encoding.
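As a rough illustration of the formula above, here is a minimal sketch that derives a value for maximum_position_encoding from a list of image sizes. The function name `max_position_encoding` and the `stride` parameter are hypothetical and not part of the TRIQ repo. Rounding each side up to a multiple of 32 is an assumption made so that sizes not divisible by 32 do not give too small a value; for sizes divisible by 32 it reduces to H*W/(32*32) + 1.

```python
import math


def max_position_encoding(resolutions, stride=32):
    """Return the largest H*W/(stride*stride) + 1 over all (height, width) pairs.

    Hypothetical helper, not taken from the TRIQ code base. Each side is
    rounded up (assumption) so non-multiples of `stride` are still covered.
    """
    largest = 0
    for height, width in resolutions:
        # Number of feature-map positions after downsampling by `stride`.
        positions = math.ceil(height / stride) * math.ceil(width / stride)
        largest = max(largest, positions)
    # One extra position, as in the H*W/(32*32) + 1 formula.
    return largest + 1


if __name__ == "__main__":
    sizes = [(768, 1024), (500, 500)]
    print(max_position_encoding(sizes))
```

The printed value would then be used for maximum_position_encoding in transformer_iqa.py.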
from triq.
The grouping solution sounded best to me. The problem with our dataset is that the classes are not equally distributed across every possible resolution. For example, for images of size 2000x2000 we might have only three of the four classes. This is why I couldn't go ahead with the grouping approach.
The best remaining solution seemed to be padding only during the training phase, which would give a fixed-size feature map on top of which the transformer could be used. What do you think about this?
from triq.
Hi, the way I handle this is to split the train/val/test sets carefully and then use augmentation (in my case only horizontal flip) to make sure the images in each batch have the same resolution. I first group images by resolution. Then, in each batch (you probably cannot use a large batch size), I serve only images with the same resolution. If the number of images with a given resolution is smaller than the batch size, I fill the batch with duplicates of those images and their horizontally flipped versions. I personally don't think padding is a good solution, as it potentially changes image quality and definitely changes the convolution results.
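The batching described above could be sketched roughly as follows. This is only an illustration, not the generator used in the TRIQ repo: the function name `make_resolution_batches`, the `(image_id, (height, width))` input format, and the order in which flipped copies and duplicates are used as fillers are all assumptions. It only plans the batches; loading and flipping the actual images is left to the data generator.

```python
from collections import defaultdict
from itertools import cycle, islice


def make_resolution_batches(items, batch_size):
    """Plan batches in which every image has the same resolution.

    items: list of (image_id, (height, width)) pairs (hypothetical format).
    Returns a list of (resolution, batch) pairs, where batch is a list of
    (image_id, flipped) entries; flipped=True means the image should be
    horizontally flipped when it is loaded.
    """
    # Group image ids by their resolution.
    groups = defaultdict(list)
    for image_id, resolution in items:
        groups[resolution].append(image_id)

    batches = []
    for resolution, ids in groups.items():
        for start in range(0, len(ids), batch_size):
            originals = [(i, False) for i in ids[start:start + batch_size]]
            batch = list(originals)
            missing = batch_size - len(batch)
            if missing > 0:
                # Fill with flipped copies first, then plain duplicates,
                # repeating as often as needed.
                flipped = [(i, True) for i, _ in originals]
                batch += list(islice(cycle(flipped + originals), missing))
            batches.append((resolution, batch))
    return batches


if __name__ == "__main__":
    demo = [("a", (100, 200)), ("b", (100, 200)),
            ("c", (100, 200)), ("d", (50, 50))]
    for resolution, batch in make_resolution_batches(demo, batch_size=2):
        print(resolution, batch)
```

Note that duplicating images this way repeats them within an epoch, so resolutions with few images are effectively oversampled.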
from triq.