juliendenize / torchaug

Library to perform efficient vision data augmentations for CPU/GPU per-sample/batched data.

Home Page: https://torchaug.readthedocs.io/en/latest/

License: CeCILL-C

data-augmentation deep-learning gpu image-processing pytorch video-processing

torchaug's Introduction

Efficient vision data augmentations for CPU/GPU per-sample/batched data.

Under active development; the API is subject to change.


Torchaug

Introduction

Torchaug is a data augmentation library for the PyTorch ecosystem. It is meant to deal efficiently with tensors that are either on CPU or GPU and either per-sample or batched.

It enriches Torchvision (v2), which was built on top of PyTorch and Pillow to, among other things, perform data augmentations. Because Torchvision was designed first with per-sample CPU data augmentations in mind, it has several drawbacks that limit its efficiency:

  • For data augmentations on GPU, some CPU/GPU synchronizations cannot be avoided.
  • For data augmentations applied on a batch, the randomness is sampled once for the whole batch and not per sample.

Torchaug removes these issues, and its transforms are meant to be used in place of Torchvision's. It is based on the Torchvision code base and therefore follows the same nomenclature, with functional augmentations and transform class wrappers. However, Torchaug does not support transforms on Pillow images.

More details can be found in the documentation.

To ensure Torchaug retrieves the same data augmentations as Torchvision, its components are tested to match Torchvision's outputs. We made a speed comparison here.

If you find any unexpected behavior or want to suggest a change, please open an issue.

How to use

  1. Install Torchaug.
pip install torchaug
  2. Import data augmentations from the torchaug.transforms package just as for Torchvision.
from torchaug.transforms import (
    RandomColorJitter,
    RandomGaussianBlur,
    SequentialTransform
)


transform = SequentialTransform([
    RandomColorJitter(...),
    RandomGaussianBlur(...)
])
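
The pipeline can then be applied directly to per-sample or batched tensors. Below is a minimal sketch: the parameter values are illustrative only, and the exact signatures should be checked against the documentation.

import torch
from torchaug.transforms import (
    RandomColorJitter,
    RandomGaussianBlur,
    SequentialTransform,
)

# Illustrative parameters; see the documentation for exact signatures.
transform = SequentialTransform([
    RandomColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1, p=0.5),
    RandomGaussianBlur(kernel_size=9, sigma=(0.1, 2.0), p=0.5),
])

images = torch.rand(8, 3, 224, 224)  # a batch of 8 RGB images in [0, 1]
out = transform(images)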

For a complete list of transforms, please see the documentation.

How to contribute

Feel free to contribute to this library by opening issues and/or pull requests. For each feature you implement, add tests to make sure it works, and please update the documentation accordingly.

Credits

We would like to thank the authors of Torchvision for generously opening their source code. Portions of Torchaug were originally taken from Torchvision, which is released under the BSD 3-Clause License. Please see their repository and their BSD 3-Clause License for more details.

LICENSE

Torchaug is licensed under the CeCILL-C license.

torchaug's People

Contributors: juliendenize, pre-commit-ci[bot], vfdev-5

torchaug's Issues

Remove headers of the files

In v0.4.1 the license was changed to CeCILL-C and the copyright was transferred to CEA, as the library was first developed in my spare time and as part of my diploma thesis, and is now developed as part of my job.

For this release, I modified the header of every file to mention the following:

# @Copyright: CEA-LIST/DIASI/SIALV/ (2023-    )
# @Author: CEA-LIST/DIASI/SIALV/ <[email protected]>
# @License: CECILL-C

With some of them also crediting Torchvision by adding:

#
# Code partially based on Torchvision (BSD 3-Clause License), available at:
#   https://github.com/pytorch/vision

We should consider removing these mentions to get rid of boilerplate headers:

  • The COPYRIGHT file already contains a template that people can use to refer to Torchaug and that credits Torchvision. Torchvision is also credited in the README, which is packaged alongside the code on PyPI.
  • Do such mentions have legal value? And even if so, do they have to be attached to each file when the LICENSE is packaged?

AttributeError from flat_inputs in RandomApplyTransform

These lines of code

params = self._get_params(
    [inpt for (inpt, needs_transform) in zip(flat_inputs, needs_transform_list) if needs_transform],
    num_chunks=1,
    chunks_indices=(torch.tensor([0], device=flat_inputs[0].device),),
)[0]

raise an AttributeError: object has no attribute 'device' when flat_inputs[0] is not a subclass of Tensor.

To get the input device, we should follow a heuristic that retrieves the device from the first Tensor subclass found.
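
A minimal sketch of such a heuristic (the helper name is hypothetical):

import torch

def _get_input_device(flat_inputs):
    # Return the device of the first tensor (or tensor subclass)
    # found among the flattened inputs.
    for inpt in flat_inputs:
        if isinstance(inpt, torch.Tensor):
            return inpt.device
    raise TypeError("No tensor found among the inputs; cannot infer a device.")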

Support for Tensors of different shapes

Some models accept inputs of different shapes (object detection, segmentation, ...).

Torchaug should be able to deal with lists of Tensors as a new data type:

class TANestedTensor:
    ...

class BatchImagesNestedTensor(TANestedTensor):
    ...
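
As a rough illustration (not the final API), such a container could wrap a plain list of tensors:

from typing import List

import torch

class TANestedTensor:
    # Illustrative container for tensors of heterogeneous shapes.
    def __init__(self, tensors: List[torch.Tensor]) -> None:
        self.tensors = list(tensors)

    def __len__(self) -> int:
        return len(self.tensors)

    def to(self, device: torch.device) -> "TANestedTensor":
        # Shapes may differ, so elements are moved one by one
        # rather than stacked into a single tensor.
        return TANestedTensor([t.to(device) for t in self.tensors])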

Increase width of the documentation content

Description

Furo's default content width does not fit well with our content, as Torchaug follows a line length of 119 characters and not 79.

Therefore, navigating through the code requires using a horizontal scrollbar.

Desired improvement

Increase the width of the documentation content.
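
One possible approach (a sketch, assuming a standard Sphinx setup; the exact Furo selector or CSS variable should be verified) is to register a small CSS override:

# docs/source/conf.py (hypothetical paths; adapt to the actual layout)
html_static_path = ["_static"]
html_css_files = ["custom.css"]
# _static/custom.css would then contain a single rule along the lines of:
#   .content { width: 62em; }   /* widen Furo's default content area */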

Updating video_format via wrappers is too late

Description

VideoWrapper and BatchVideoWrapper update the video_format of their transforms, such as VideoNormalize, but this happens too late if the behavior of the transform is fixed at instantiation by the video_format argument and is not updated through a setter method.

Minimal bug reproduction

import torch
from torchaug.batch_transforms import BatchVideoWrapper
from torchaug.transforms import VideoNormalize

transform = BatchVideoWrapper([
    VideoNormalize([0.225, 0.225, 0.225], 225)
], video_format="TCHW")

print(transform)
transform(torch.randn(2, 4, 3, 224, 224))

>>> BatchVideoWrapper(
    inplace=False,
    same_on_frames=True,
    video_format=TCHW,
    transforms=ModuleList(
      (0): VideoNormalize(mean=[[[0.22499999403953552]], [[0.22499999403953552]], [[0.22499999403953552]]], std=[[[225]]], cast_dtype=None, inplace=True, value_check=False, video_format=TCHW)
    )
)
>>> ...
>>> The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 2

while the following works:

import torch
from torchaug.batch_transforms import BatchVideoWrapper
from torchaug.transforms import VideoNormalize

transform = BatchVideoWrapper([
    VideoNormalize([0.225, 0.225, 0.225], 225, video_format="TCHW")
], video_format="TCHW")

print(transform)
transform(torch.randn(2, 4, 3, 224, 224))

>>> BatchVideoWrapper(
    inplace=False,
    same_on_frames=True,
    video_format=CTHW,
    transforms=ModuleList(
      (0): VideoNormalize(mean=[[[0.22499999403953552]], [[0.22499999403953552]], [[0.22499999403953552]]], std=[[[225]]], cast_dtype=None, inplace=True, value_check=False, video_format=TCHW)
    )
)

Fix proposition

Add a setter for the video_format attribute to all video transforms.
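
A minimal sketch of such a setter (illustrative only; the real transforms would also need to refresh any format-dependent internal state):

class VideoTransformSketch:
    # Illustrative stand-in for a video transform such as VideoNormalize.
    def __init__(self, video_format: str = "CTHW") -> None:
        self._video_format = video_format

    @property
    def video_format(self) -> str:
        return self._video_format

    @video_format.setter
    def video_format(self, value: str) -> None:
        # Wrappers such as BatchVideoWrapper can now change the format
        # after instantiation instead of only at __init__ time.
        self._video_format = value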

Improve onboarding

To encourage people to contribute, there is a need for:

  • better contribution documentation: environment, style guides, ...
  • a stable pre-commit process with GitHub workflows

Pin memory in dataloaders removes the typing of tensors

Context

Since the release of PyTorch 2.3.0, using pin_memory=True in dataloaders removes the typing of Torchaug tensors, which is impractical.

Reproduce

import torch
from torchaug.data.dataloader import default_collate
from torchaug.ta_tensors import Image
from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    def __init__(self):
        super().__init__()
    
    def __getitem__(self, index):
        return Image(torch.rand(1, 3, 224, 224))
    
    def __len__(self):
        return 10

dataset = MyDataset()
dataloader = DataLoader(dataset, batch_size=2, collate_fn=default_collate, pin_memory=True)

batch = next(iter(dataloader))
print(type(batch))
>>> torch.Tensor

dataloader = DataLoader(dataset, batch_size=2, collate_fn=default_collate, pin_memory=False)

batch = next(iter(dataloader))
print(type(batch))
>>> torchaug.ta_tensors._batch_images.BatchImages

Suggestions for a fix

  1. Until it is fixed, remove pin_memory: a short-term solution that can cause slowdowns.
  2. Partly rewrite the dataloader and expose it in Torchaug.
  3. Recast the types of the tensors after collation, which forces metadata to be stored for several Torchaug tensors (e.g. the number of masks per sample); see the sketch below.
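
A rough sketch of option 3 (the helper is hypothetical and assumes the subclass can be rebuilt from the plain tensor alone, which is not true for types carrying extra metadata):

import torch
from torchaug.ta_tensors import BatchImages

def recast_after_pinning(batch: torch.Tensor) -> BatchImages:
    # Pinning returns a plain torch.Tensor, dropping the Torchaug
    # subclass, so wrap the result again to restore the type.
    return BatchImages(batch)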

In-place error after expand

Description

When a tensor produced by an expand is written to by a subsequent in-place operation, PyTorch raises an error. This surfaces in BatchWrapper when no copies are made (for example, when the probability of applying the first operation is 1).

Minimal bug reproduction

import torch
from torchaug.batch_transforms import BatchImageWrapper, BatchRandomGrayScale
from torchaug.transforms import Normalize

transform = BatchImageWrapper([
    BatchRandomGrayScale(1.), # Call expand
    Normalize(225, 225)
])

transform(torch.randn(4, 3, 224, 224))

>>> RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation.

Fix proposition

Add a .contiguous() call in the wrappers to avoid this situation.
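
The underlying PyTorch behavior, and why .contiguous() resolves it, can be reproduced with plain tensors, independently of Torchaug:

import torch

# expand creates a view whose elements share memory along the expanded dim.
x = torch.randn(4, 1, 224, 224).expand(4, 3, 224, 224)

try:
    x.sub_(0.5)  # in-place write to overlapping memory
except RuntimeError as err:
    print(err)  # "unsupported operation: more than one element ... refers to a single memory location"

y = x.contiguous()  # materializes the view into its own memory
y.sub_(0.5)  # now safe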

[Bug] Chunks are not correctly formed

Issue

Chunks are not correctly formed for BatchLabels, BatchBoundingBoxes and BatchMasks.

Example for BatchMasks:

def get_chunk(self, chunk_indices: torch.Tensor) -> BatchMasks:
    chunk_idx_sample = torch.tensor(
        [0] + [self.idx_sample[chunk_indice + 1] - self.idx_sample[chunk_indice] for chunk_indice in chunk_indices]
    )

    chunk_idx_sample = chunk_idx_sample.cumsum(0).tolist()

    return BatchMasks(
        self[chunk_indices],  # wrong select
        idx_sample=chunk_idx_sample,
        device=self.device,
        requires_grad=self.requires_grad,
    )

Suggestions

  1. Create a _BatchConcatenatedTATensor private class to inherit from
  2. Fix the select for each classes
  3. Make tests for such classes
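
A rough sketch of the corrected indexing (the helper name is hypothetical; it assumes idx_sample stores cumulative sample boundaries in the flat data dimension, as the get_chunk code above suggests):

import torch

def _chunk_data_indices(idx_sample: list, chunk_indices: torch.Tensor) -> torch.Tensor:
    # Sample i owns the flat rows [idx_sample[i], idx_sample[i + 1]),
    # so selecting samples means gathering those ranges rather than
    # indexing the flat data directly with `chunk_indices`.
    return torch.cat(
        [torch.arange(idx_sample[i], idx_sample[i + 1]) for i in chunk_indices.tolist()]
    )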

Add Pillow and OpenCV support for conversion

Unlike Torchvision, Torchaug does not support transforms on Pillow images, as they do not make sense for batches.

However, it makes sense to be able to convert Pillow images and OpenCV arrays to Torchaug images through, among other things, the to_image function.
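
A sketch of what such a conversion could look like (the function name and behavior are illustrative, and OpenCV's BGR channel order is deliberately ignored here):

import numpy as np
import torch
from PIL import Image as PILImage
from torchaug.ta_tensors import Image

def pil_or_cv_to_image(pic) -> Image:
    # Accept a PIL image or an OpenCV-style (H, W, C) numpy array
    # and return a Torchaug Image laid out as (C, H, W).
    if isinstance(pic, PILImage.Image):
        pic = np.asarray(pic)
    if isinstance(pic, np.ndarray):
        return Image(torch.from_numpy(np.ascontiguousarray(pic)).permute(2, 0, 1))
    raise TypeError(f"Unsupported input type: {type(pic)}")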
