juliendenize / torchaug

Library to perform efficient vision data augmentations for CPU/GPU per-sample/batched data.

Home Page: https://torchaug.readthedocs.io/en/latest/

License: CeCILL-C

data-augmentation deep-learning gpu image-processing pytorch video-processing

torchaug's Introduction

Efficient vision data augmentations for CPU/GPU per-sample/batched data.

Under active development; the API is subject to change.


Torchaug

Introduction

Torchaug is a data augmentation library for the PyTorch ecosystem. It is meant to deal efficiently with tensors that are either on CPU or GPU and either per-sample or batched.

It enriches Torchvision (v2), which was built on top of PyTorch and Pillow to, among other things, perform data augmentations. Because Torchvision was designed first with per-sample CPU data augmentations in mind, it has several drawbacks that limit its efficiency:

  • For data augmentations on GPU, some CPU/GPU synchronizations cannot be avoided.
  • For data augmentations applied on a batch, the randomness is sampled once for the whole batch and not per sample.

Torchaug removes these issues, and its transforms are meant to be used in place of Torchvision's. It is based on the Torchvision code base and therefore follows the same nomenclature, with functional augmentations and transform class wrappers. However, Torchaug does not support transforms on Pillow images.

More details can be found in the documentation.

To ensure Torchaug retrieves the same data augmentations as Torchvision, its components are tested to match Torchvision's outputs. We made a speed comparison here.

If you find any unexpected behavior or want to suggest a change, please open an issue.

How to use

  1. Install Torchaug.
pip install torchaug
  2. Import data augmentations from the torchaug.transforms package just as for Torchvision.
from torchaug.transforms import (
    RandomColorJitter,
    RandomGaussianBlur,
    SequentialTransform
)


transform = SequentialTransform([
    RandomColorJitter(...),
    RandomGaussianBlur(...)
])
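
The pipeline can then be applied directly to per-sample or batched tensors. Below is a minimal sketch: the parameter values are illustrative only, and the exact signatures should be checked against the documentation.

import torch
from torchaug.transforms import (
    RandomColorJitter,
    RandomGaussianBlur,
    SequentialTransform,
)

# Illustrative parameters; see the documentation for exact signatures.
transform = SequentialTransform([
    RandomColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1, p=0.5),
    RandomGaussianBlur(kernel_size=9, sigma=(0.1, 2.0), p=0.5),
])

images = torch.rand(8, 3, 224, 224)  # a batch of 8 RGB images in [0, 1]
out = transform(images)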

For a complete list of transforms, please see the documentation.

How to contribute

Feel free to contribute to this library by opening issues and/or pull requests. For each feature you implement, add tests to make sure it works, and please update the documentation accordingly.

Credits

We would like to thank the authors of Torchvision for generously opening their source code. Portions of Torchaug were originally taken from Torchvision, which is released under the BSD 3-Clause License. Please see their repository and their BSD 3-Clause License for more details.

LICENSE

Torchaug is licensed under the CeCILL-C license.

torchaug's People

Contributors: juliendenize, pre-commit-ci[bot], vfdev-5

torchaug's Issues

Remove headers of the files

In v0.4.1 the license was changed to CeCILL-C and the copyright was transferred to CEA, as the library was first developed in my spare time and as part of my diploma thesis, and is now developed as part of my job.

For this release, I modified the header of every file to mention the following:

# @Copyright: CEA-LIST/DIASI/SIALV/ (2023-    )
# @Author: CEA-LIST/DIASI/SIALV/ <[email protected]>
# @License: CECILL-C

With some of them also crediting Torchvision by adding:

#
# Code partially based on Torchvision (BSD 3-Clause License), available at:
#   https://github.com/pytorch/vision

We should consider removing these mentions to get rid of boilerplate headers:

  • The COPYRIGHT file already contains a template that people can use to refer to Torchaug and that credits Torchvision. Torchvision is also credited in the README, which is packaged alongside the code on PyPI.
  • Do such mentions have legal value? And even if so, do they have to be attached to each file when the LICENSE is packaged?

AttributeError from flat_inputs in RandomApplyTransform

These lines of code

params = self._get_params(
    [inpt for (inpt, needs_transform) in zip(flat_inputs, needs_transform_list) if needs_transform],
    num_chunks=1,
    chunks_indices=(torch.tensor([0], device=flat_inputs[0].device),),
)[0]

raise an AttributeError: object has no attribute 'device' when flat_inputs[0] is not a subclass of Tensor.

To get the input device, we should follow a heuristic that retrieves the device from the first Tensor subclass found.
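
A minimal sketch of such a heuristic (the helper name is hypothetical):

import torch

def _get_input_device(flat_inputs):
    # Return the device of the first tensor (or tensor subclass)
    # found among the flattened inputs.
    for inpt in flat_inputs:
        if isinstance(inpt, torch.Tensor):
            return inpt.device
    raise TypeError("No tensor found among the inputs; cannot infer a device.")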

Support for Tensors of different shapes

Some models accept inputs of different shapes (object detection, segmentation, ...).

Torchaug should be able to deal with lists of Tensors as a new data type:

class TANestedTensor:
    ...

class BatchImagesNestedTensor(TANestedTensor):
    ...
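
As a rough illustration (not the final API), such a container could wrap a plain list of tensors:

from typing import List

import torch

class TANestedTensor:
    # Illustrative container for tensors of heterogeneous shapes.
    def __init__(self, tensors: List[torch.Tensor]) -> None:
        self.tensors = list(tensors)

    def __len__(self) -> int:
        return len(self.tensors)

    def to(self, device: torch.device) -> "TANestedTensor":
        # Shapes may differ, so elements are moved one by one
        # rather than stacked into a single tensor.
        return TANestedTensor([t.to(device) for t in self.tensors])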

Increase width of the documentation content

Description

Furo's default content width does not fit well with our content, as Torchaug follows a line length of 119 characters and not 79.

Therefore, navigating through the code requires using a horizontal scrollbar.

Desired improvement

Increase the width of the documentation content.
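
One possible approach (a sketch, assuming a standard Sphinx setup; the exact Furo selector or CSS variable should be verified) is to register a small CSS override:

# docs/source/conf.py (hypothetical paths; adapt to the actual layout)
html_static_path = ["_static"]
html_css_files = ["custom.css"]
# _static/custom.css would then contain a single rule along the lines of:
#   .content { width: 62em; }   /* widen Furo's default content area */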

Updating video_format via wrappers is too late

Description

VideoWrapper and BatchVideoWrapper update the video_format of their transforms, such as VideoNormalize, but this happens too late if the behavior of the transform is fixed at instantiation by the video_format argument and is not updated through a setter method.

Minimal bug reproduction

import torch
from torchaug.batch_transforms import BatchVideoWrapper
from torchaug.transforms import VideoNormalize

transform = BatchVideoWrapper([
    VideoNormalize([0.225, 0.225, 0.225], 225)
], video_format="TCHW")

print(transform)
transform(torch.randn(2, 4, 3, 224, 224))

>>> BatchVideoWrapper(
    inplace=False,
    same_on_frames=True,
    video_format=TCHW,
    transforms=ModuleList(
      (0): VideoNormalize(mean=[[[0.22499999403953552]], [[0.22499999403953552]], [[0.22499999403953552]]], std=[[[225]]], cast_dtype=None, inplace=True, value_check=False, video_format=TCHW)
    )
)
>>> ...
>>> The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 2

while the following works:

import torch
from torchaug.batch_transforms import BatchVideoWrapper
from torchaug.transforms import VideoNormalize

transform = BatchVideoWrapper([
    VideoNormalize([0.225, 0.225, 0.225], 225, video_format="TCHW")
], video_format="TCHW")

print(transform)
transform(torch.randn(2, 4, 3, 224, 224))

>>> BatchVideoWrapper(
    inplace=False,
    same_on_frames=True,
    video_format=CTHW,
    transforms=ModuleList(
      (0): VideoNormalize(mean=[[[0.22499999403953552]], [[0.22499999403953552]], [[0.22499999403953552]]], std=[[[225]]], cast_dtype=None, inplace=True, value_check=False, video_format=TCHW)
    )
)

Fix proposition

Add a setter for the video_format attribute to all video transforms.
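
A minimal sketch of such a setter (illustrative only; the real transforms would also need to refresh any format-dependent internal state):

class VideoTransformSketch:
    # Illustrative stand-in for a video transform such as VideoNormalize.
    def __init__(self, video_format: str = "CTHW") -> None:
        self._video_format = video_format

    @property
    def video_format(self) -> str:
        return self._video_format

    @video_format.setter
    def video_format(self, value: str) -> None:
        # Wrappers such as BatchVideoWrapper can now change the format
        # after instantiation instead of only at __init__ time.
        self._video_format = value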

Improve onboarding

To encourage people to contribute, there is a need for:

  • better contribution documentation: environment, style guides, ...
  • a stable pre-commit process with GitHub workflows

Pin memory in dataloaders removes the typing of tensors

Context

Since the release of PyTorch 2.3.0, using pin_memory=True in dataloaders removes the typing of Torchaug tensors, which is impractical.

Reproduce

import torch
from torchaug.data.dataloader import default_collate
from torchaug.ta_tensors import Image
from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    def __init__(self):
        super().__init__()
    
    def __getitem__(self, index):
        return Image(torch.rand(1, 3, 224, 224))
    
    def __len__(self):
        return 10

dataset = MyDataset()
dataloader = DataLoader(dataset, batch_size=2, collate_fn=default_collate, pin_memory=True)

batch = next(iter(dataloader))
print(type(batch))
>>> torch.Tensor

dataloader = DataLoader(dataset, batch_size=2, collate_fn=default_collate, pin_memory=False)

batch = next(iter(dataloader))
print(type(batch))
>>> torchaug.ta_tensors._batch_images.BatchImages

Suggestions for a fix

  1. Until it is fixed, remove pin_memory: a short-term solution that can cause slowdowns.
  2. Partly rewrite the dataloader and expose it in Torchaug.
  3. Recast the types of the tensors after collation, which forces metadata to be stored for several Torchaug tensors (e.g. the number of masks per sample); see the sketch below.
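
A rough sketch of option 3 (the helper is hypothetical and assumes the subclass can be rebuilt from the plain tensor alone, which is not true for types carrying extra metadata):

import torch
from torchaug.ta_tensors import BatchImages

def recast_after_pinning(batch: torch.Tensor) -> BatchImages:
    # Pinning returns a plain torch.Tensor, dropping the Torchaug
    # subclass, so wrap the result again to restore the type.
    return BatchImages(batch)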

In-place error after expand

Description

When a tensor produced by an expand is written to by a subsequent in-place operation, PyTorch raises an error. This surfaces in BatchWrapper when no copies are made (for example, when the probability of applying the first operation is 1).

Minimal bug reproduction

import torch
from torchaug.batch_transforms import BatchImageWrapper, BatchRandomGrayScale
from torchaug.transforms import Normalize

transform = BatchImageWrapper([
    BatchRandomGrayScale(1.), # Call expand
    Normalize(225, 225)
])

transform(torch.randn(4, 3, 224, 224))

>>> RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation.

Fix proposition

Add a .contiguous() call in the wrappers to avoid this situation.
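
The underlying PyTorch behavior, and why .contiguous() resolves it, can be reproduced with plain tensors, independently of Torchaug:

import torch

# expand creates a view whose elements share memory along the expanded dim.
x = torch.randn(4, 1, 224, 224).expand(4, 3, 224, 224)

try:
    x.sub_(0.5)  # in-place write to overlapping memory
except RuntimeError as err:
    print(err)  # "unsupported operation: more than one element ... refers to a single memory location"

y = x.contiguous()  # materializes the view into its own memory
y.sub_(0.5)  # now safe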

[Bug] Chunks are not correctly formed

Issue

Chunks are not correctly formed for BatchLabels, BatchBoundingBoxes and BatchMasks.

Example for BatchMasks:

def get_chunk(self, chunk_indices: torch.Tensor) -> BatchMasks:
    chunk_idx_sample = torch.tensor(
        [0] + [self.idx_sample[chunk_indice + 1] - self.idx_sample[chunk_indice] for chunk_indice in chunk_indices]
    )

    chunk_idx_sample = chunk_idx_sample.cumsum(0).tolist()

    return BatchMasks(
        self[chunk_indices],  # wrong select
        idx_sample=chunk_idx_sample,
        device=self.device,
        requires_grad=self.requires_grad,
    )

Suggestions

  1. Create a _BatchConcatenatedTATensor private class to inherit from
  2. Fix the select for each classes
  3. Make tests for such classes
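
A rough sketch of the corrected indexing (the helper name is hypothetical; it assumes idx_sample stores cumulative sample boundaries in the flat data dimension, as the get_chunk code above suggests):

import torch

def _chunk_data_indices(idx_sample: list, chunk_indices: torch.Tensor) -> torch.Tensor:
    # Sample i owns the flat rows [idx_sample[i], idx_sample[i + 1]),
    # so selecting samples means gathering those ranges rather than
    # indexing the flat data directly with `chunk_indices`.
    return torch.cat(
        [torch.arange(idx_sample[i], idx_sample[i + 1]) for i in chunk_indices.tolist()]
    )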

Add Pillow and OpenCV support for conversion

Unlike Torchvision, Torchaug does not support transforms on Pillow images, as they do not make sense for batches.

However, it makes sense to be able to convert Pillow images and OpenCV arrays to Torchaug images through, among other things, the to_image function.
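
A sketch of what such a conversion could look like (the function name and behavior are illustrative, and OpenCV's BGR channel order is deliberately ignored here):

import numpy as np
import torch
from PIL import Image as PILImage
from torchaug.ta_tensors import Image

def pil_or_cv_to_image(pic) -> Image:
    # Accept a PIL image or an OpenCV-style (H, W, C) numpy array
    # and return a Torchaug Image laid out as (C, H, W).
    if isinstance(pic, PILImage.Image):
        pic = np.asarray(pic)
    if isinstance(pic, np.ndarray):
        return Image(torch.from_numpy(np.ascontiguousarray(pic)).permute(2, 0, 1))
    raise TypeError(f"Unsupported input type: {type(pic)}")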
