sitta's Introduction

SITTA

This repo contains the official PyTorch implementation of the paper SITTA: Single Image Texture Translation for Data Augmentation.

European Conference on Computer Vision (ECCV) Workshops, 2022

Authors: Boyi Li, Yin Cui, Tsung-Yi Lin, Serge Belongie

Overview

Recent advances in image synthesis enable one to translate images by learning the mapping between a source domain and a target domain. Existing methods tend to learn the distributions by training a model on a variety of datasets, with results evaluated largely in a subjective manner. Relatively few works in this area, however, study the potential use of semantic image translation methods for image recognition tasks. In this paper, we explore the use of Single Image Texture Translation (SITT) for data augmentation. We first propose a lightweight model for translating texture to images based on a single input of source texture, allowing for fast training and testing. Based on SITT, we then explore the use of augmented data in long-tailed and few-shot image classification tasks. We find that the proposed method is capable of translating input data into a target domain, leading to consistently improved image recognition performance. Finally, we examine how SITT and related image translation methods can provide a basis for a data-efficient, augmentation engineering approach to model training.

Usage

Environment

CUDA 10.1, PyTorch 1.3.1
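
To check whether a local setup matches the versions listed here, a small script along these lines may help. It is only a sketch: `report_environment` is a hypothetical helper, not part of this repo, and it simply reports what PyTorch exposes through `torch.__version__` and `torch.version.cuda`.

```python
import importlib


def report_environment():
    """Return the installed PyTorch version and the CUDA version it was built with.

    Hypothetical helper (not part of the SITTA repo). Both values are None
    when PyTorch is not installed; "cuda" is also None for CPU-only builds.
    """
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return {"torch": None, "cuda": None}
    return {"torch": torch.__version__, "cuda": torch.version.cuda}


if __name__ == "__main__":
    env = report_environment()
    print("PyTorch:", env["torch"])
    print("CUDA (build):", env["cuda"])
```

If the reported versions differ from the ones listed here, the code may still run, but that is untested as far as this README states.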

Dataset Preparation

dataset: SITT leaves images from Plant Pathology 2020
url: download

Running

bash run.sh

If you find this repo useful, please cite:

@InProceedings{10.1007/978-3-031-25063-7_1,
author="Li, Boyi
and Cui, Yin
and Lin, Tsung-Yi
and Belongie, Serge",
editor="Karlinsky, Leonid
and Michaeli, Tomer
and Nishino, Ko",
title="SITTA: Single Image Texture Translation for Data Augmentation",
booktitle="Computer Vision -- ECCV 2022 Workshops",
year="2023",
publisher="Springer Nature Switzerland",
address="Cham",
pages="3--20",
abstract="Recent advances in data augmentation enable one to translate images by learning the mapping between a source domain and a target domain. Existing methods tend to learn the distributions by training a model on a variety of datasets, with results evaluated largely in a subjective manner. Relatively few works in this area, however, study the potential use of image synthesis methods for recognition tasks. In this paper, we propose and explore the problem of image translation for data augmentation. We first propose a lightweight yet efficient model for translating texture to augment images based on a single input of source texture, allowing for fast training and testing, referred to as Single Image Texture Translation for data Augmentation (SITTA). Then we explore the use of augmented data in long-tailed and few-shot image classification tasks. We find the proposed augmentation method and workflow is capable of translating the texture of input data into a target domain, leading to consistently improved image recognition performance. Finally, we examine how SITTA and related image translation methods can provide a basis for a data-efficient, ``augmentation engineering'' approach to model training.",
isbn="978-3-031-25063-7"
}

sitta's Issues

Thoughts about the textures

Hello! Great project! I wonder whether the texture latent vector could be altered by perturbing it directly, instead of generating it with the Texture Encoder in your model. That way, there would be no need to find image pairs that contain the same kind of object, such as birds, and people could arbitrarily change the textures of a single image.
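
One way to picture the perturbation described in this question is adding Gaussian noise to a latent vector. The sketch below is purely illustrative and makes several assumptions: the latent is treated as a flat list of floats, `perturb_latent` and `sigma` are hypothetical names, and nothing here reflects how the model in this repo actually represents or consumes texture latents.

```python
import random


def perturb_latent(latent, sigma=0.1, seed=None):
    """Return a copy of `latent` with i.i.d. Gaussian noise N(0, sigma^2) added.

    Hypothetical helper for illustration; `latent` is assumed to be a flat
    sequence of floats, which may not match the repo's tensor layout.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in latent]


# Made-up latent values, used only to show the call
texture_latent = [0.0, 1.2, 0.3, 2.0]
perturbed = perturb_latent(texture_latent, sigma=0.05, seed=0)
```

Whether a decoder produces plausible textures from such perturbed latents would have to be checked empirically.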

hyperparameters

Dear @Boyiliee ,
I am trying to implement your paper in code.
Some hyperparameters are omitted from the paper, so it is difficult to implement.
Could you please tell me the batch size and the lambda values of the loss?
Thank you.

demo data

Dear @Boyiliee ,

Thanks for sharing the code.

Would it be possible for you to provide some demo data to illustrate how to use the code with a new dataset?

Question about the kl divergence loss

According to the paper, the KL divergence loss is computed between the textures t_A, t_ba and t_B, t_ab. However, the computation:

loss_netG_A_texture = -0.5 * (F.kl_div(t_A, t_ba) + F.kl_div(t_ba, t_A))
loss_netG_B_texture = -0.5 * (F.kl_div(t_B, t_ab) + F.kl_div(t_ab, t_B))

Looks more like the JS divergence but with a negative sign.

Also, the inputs to the KL divergence loss are supposed to be in log-softmax space, but the textures t_A, t_ba, t_B, t_ab come from the ReLU "space". Is this the reason behind the negative sign in the equation, or am I missing some detail of the implementation?
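
For reference, the quantities discussed in this question can be compared in a few lines of plain Python. This is a sketch under stated assumptions, not the repo's loss: the feature values are made up, and softmax normalization is an assumption introduced here so that the vectors are valid distributions. With valid inputs, 0.5 * (KL(p||q) + KL(q||p)) is the symmetrized KL divergence, while the Jensen-Shannon divergence compares both distributions against their mixture m = 0.5 * (p + q). Note also that PyTorch's `F.kl_div(input, target)` expects `input` to hold log-probabilities and, by default, `target` to hold probabilities.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetrized KL: 0.5 * (KL(p||q) + KL(q||p))."""
    return 0.5 * (kl(p, q) + kl(q, p))


def js(p, q):
    """Jensen-Shannon divergence, defined via the mixture m = 0.5 * (p + q)."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up non-negative (ReLU-like) features, normalized with softmax (assumption)
t_a = [0.0, 1.2, 0.3, 2.0]
t_ba = [0.1, 0.9, 0.0, 2.5]
p, q = softmax(t_a), softmax(t_ba)

sym = symmetric_kl(p, q)
jsd = js(p, q)
print("symmetrized KL:", sym)
print("JS divergence: ", jsd)
```

Both quantities are non-negative for valid distributions, and the JS divergence is bounded above by log 2, whereas the symmetrized KL is unbounded. Passing raw ReLU outputs in place of log-probabilities gives values that no longer correspond to either divergence.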
