
roatienza / straug

240 stars · 5 watchers · 36 forks · 2.61 MB

Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

License: Apache License 2.0

Language: Python (100.00%)
Topics: str, data-augmentation, scene-text-recognition

straug's Issues

Gaussian blur kernel size for small images

Currently, the kernel size is fixed to 31x31 (straug/blur.py, line 38 at commit 43f9ca9):

kernel = (31, 31)

This causes an error internally in the call to reflection_pad2d() if one of the image's dimensions is less than the kernel size:

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (15, 15) at dimension 3 of input 4

Should the kernel size be a percentage of the image's dimensions instead, e.g. 30-50% of the smaller dimension?
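
A minimal sketch of what that could look like, assuming the 30-50% heuristic from the question; the helper name and the clamping are illustrative, not part of straug:

def adaptive_kernel_size(width, height, frac=0.3):
    # frac ~ 0.3-0.5 of the smaller dimension, as suggested above
    k = max(int(min(width, height) * frac), 3)  # keep the kernel at least 3x3
    if k % 2 == 0:                              # Gaussian kernels need an odd size
        k += 1
    return (k, k)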

question about training speed

Thanks for your excellent work! Training seems very slow when I use straug (about 6x slower than without it). What speed did you observe in your tests? My augmentation code is below.

import random
import sys

import numpy as np
from PIL import Image


class RecStraugRandAug(object):
    """Apply `num_aug` randomly chosen straug transforms to each image."""

    def __init__(self, num_aug=2, prob=0.5, **kwargs):
        super().__init__()
        self.num_aug = num_aug
        self.prob = prob
        try:
            from straug.blur import GaussianBlur, DefocusBlur, MotionBlur, GlassBlur
            from straug.camera import Contrast, Brightness, JpegCompression, Pixelate
            from straug.geometry import Perspective, Shrink, Rotate
            from straug.noise import GaussianNoise, ShotNoise, ImpulseNoise, SpeckleNoise
            from straug.pattern import Grid, VGrid, HGrid, RectGrid, EllipseGrid
            from straug.process import Posterize, Solarize, Invert, Equalize, AutoContrast, Sharpness, Color
            from straug.warp import Stretch, Distort, Curve
            from straug.weather import Fog, Snow, Frost, Rain, Shadow
            # One group per straug module; a transform is drawn by first
            # picking a group, then an op within that group.
            self.augs = [
                [GaussianBlur(), DefocusBlur(), MotionBlur(), GlassBlur()],
                [Contrast(), Brightness(), JpegCompression(), Pixelate()],
                [Perspective(), Shrink(), Rotate()],
                [GaussianNoise(), ShotNoise(), ImpulseNoise(), SpeckleNoise()],
                [Grid(), VGrid(), HGrid(), RectGrid(), EllipseGrid()],
                [Posterize(), Solarize(), Invert(), Equalize(), AutoContrast(), Sharpness(), Color()],
                [Stretch(), Distort(), Curve()],
                [Fog(), Snow(), Frost(), Rain(), Shadow()],
            ]
        except Exception as ex:
            print(f"exception: {ex}, you can install straug using `pip install straug`")
            sys.exit(1)

    def __call__(self, data):
        img = Image.fromarray(data["image"])
        for idx in range(self.num_aug):
            aug_type_idx = np.random.randint(0, len(self.augs))
            aug_idx = np.random.randint(0, len(self.augs[aug_type_idx]))
            # mag drawn from {-1, 0, 1, 2}; the op is applied with probability self.prob
            img = self.augs[aug_type_idx][aug_idx](img, mag=random.randint(-1, 2), prob=self.prob)
        data["image"] = np.array(img)
        return data
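
One common mitigation, assuming a PyTorch training loop: run the transform inside DataLoader worker processes so the CPU-bound augmentation overlaps with GPU work. A minimal sketch (the dataset name and batch size are illustrative):

from torch.utils.data import DataLoader

# `train_dataset` is assumed to apply RecStraugRandAug inside __getitem__;
# num_workers > 0 moves the augmentation into parallel worker processes.
loader = DataLoader(train_dataset, batch_size=64, shuffle=True, num_workers=8)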

RandAugment

Hi, thanks for your work.

Have you implemented RandAugment?

In your paper:

geometry = [Rotate(), Perspective(), Shrink()]
noise = [GaussianNoise()]
blur = [MotionBlur()]
augmentations = [geometry, noise, blur]
img = RandAugment(img, augmentations, N=3)
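
A hedged sketch of what a RandAugment-style wrapper over straug ops could look like, following the paper's pseudocode above; straug does not appear to export a RandAugment function, so the sampling logic below is an assumption:

import random

def rand_augment(img, augmentations, N=3):
    # Assumption: draw N groups (with replacement), pick one op per group,
    # and chain the ops on the PIL image.
    for _ in range(N):
        group = random.choice(augmentations)
        op = random.choice(group)
        img = op(img, mag=random.randint(0, 2))
    return img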

How to deal with underfitting?

Hello, I am a new researcher, and I recently came across your code, which solves my problem to some extent. My project is also scene text recognition, but my dataset is much more irregular, and your paper gives me constructive guidance. However, there is still a problem: when N (the number of functions in each channel) grows larger, say 3 or 4, the model can hardly fit, and training-set accuracy stays around 90%. Moreover, if I add random cropping as a preprocessing step, the accuracy stays around 80%. Could you give me some suggestions for dealing with these problems? Thanks.

Questions about paper

First of all, thank you for your great work.
I read the paper and have two questions:
1. Why is there no CRNN line in Figure 13? (screenshot attached)
2. What is the N corresponding to Table 5? Is it the best result from a grid search?

enhance a dataset

Thanks for your work! How can I perform batch operations? I want to augment an entire dataset.
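
A minimal sketch, assuming a directory of cropped word images; the paths and the choice of Distort are illustrative, not a straug batch API:

from pathlib import Path
from PIL import Image
from straug.warp import Distort

distort = Distort()
src = Path("data/crops")        # assumed input folder of word crops
dst = Path("data/crops_aug")
dst.mkdir(parents=True, exist_ok=True)
for path in src.glob("*.png"):
    img = Image.open(path).convert("RGB")
    distort(img, mag=1).save(dst / path.name)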

how to create a glared image

@roatienza, thanks for the great repo.
I want to augment an image so that it looks glared, like this: (example image attached)

How can I do this kind of augmentation? I think it is called a "glared image".
Thanks for your help.
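
As far as I can tell, straug has no dedicated glare transform, so here is a hedged sketch that approximates glare by compositing a radial white highlight onto the image; the center, radius, and strength defaults are illustrative:

import numpy as np
from PIL import Image

def add_glare(img, center=(0.7, 0.3), radius=0.4, strength=0.8):
    # center is in fractional (x, y) image coordinates; radius is a fraction
    # of the shorter side; all defaults are assumptions for illustration.
    w, h = img.size
    yy, xx = np.mgrid[0:h, 0:w]
    cx, cy = center[0] * w, center[1] * h
    d = np.sqrt((xx - cx) ** 2 + (yy - cy) ** 2) / (radius * min(w, h))
    mask = np.clip(1.0 - d, 0.0, 1.0) * strength   # radial falloff to zero
    arr = np.asarray(img.convert("RGB")).astype(np.float32)
    out = arr + mask[..., None] * (255.0 - arr)    # blend each pixel toward white
    return Image.fromarray(out.astype(np.uint8))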
