blackfeather-wang / isda-for-deep-networks

An efficient implicit semantic augmentation method, complementary to existing non-semantic techniques.

Python 93.45% Shell 0.29% Cuda 3.31% C 0.70% C++ 2.24%

isda-for-deep-networks's Issues

Questions about visualization on other datasets

Hi, ISDA is innovative.
I want to use it to improve performance on other datasets (not ImageNet).
Can I still use your visualization method (BigGAN) to visualize my dataset? If not, could you give me some suggestions on how to visualize my dataset?
Thanks a lot!

What's the difference between cifar-ISDA and imagenet-ISDA

cifar-ISDA:

    def forward(self, model, fc, x, target_x, ratio):
        features = model(x)
        y = fc(features)
        self.estimator.update_CV(features.detach(), target_x)
        isda_aug_y = self.isda_aug(fc, features, y, target_x, self.estimator.CoVariance.detach(), ratio)
        loss = self.cross_entropy(isda_aug_y, target_x)
        return loss, y

imagenet-ISDA:

    def forward(self, model, x, target_x, ratio):
        y, features = model(x, isda=True)
        # y = fc(features)
        self.estimator.update_CV(features.detach(), target_x)
        isda_aug_y = self.isda_aug(model.module.fc, features, y, target_x, self.estimator.CoVariance.detach(), ratio)
        loss = self.cross_entropy(isda_aug_y, target_x)
        return loss, y

Why is there a difference?
To apply ISDA to other models, the final fully connected layer does not need to be explicitly defined in the ImageNet version, right?
It's a little bit like center loss, where the model needs to return both x and the features, right?

Question about the effect of ISDA on shallow layers of the network

Hi, ISDA is nice and very impressive work.

I have a question about the effect of ISDA on the shallow layers of the network. ISDA augments the deep features, which are the output of the last conv layer before the FC layer. So it seems the augmented features are only used to train a robust FC layer. I'd like to know how ISDA affects the training of the earlier conv layers.

Looking forward to your reply. Thanks!

RuntimeError: copy_if failed to synchronize: device-side assert triggered

    weight_CV: tensor([[[nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan],
             ...,
             [nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan]],

            [[nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan],
             ...,
             [nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan],
             [nan, nan, nan,  ..., nan, nan, nan]],

I got this error when training on my own data; weight_CV is always NaN.
I tried the following fixes, but neither worked:
1. Adjusting the learning rate
2. Normalizing the FC weights and the features

My task has more than 3,000 classes. From the code, weight_CV looks like it is built from a one-hot conversion of the labels, so in principle this error should not occur.
I hope the author can give me some pointers. Thanks.
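
A note for anyone debugging something similar: a device-side assert raised during indexing is often caused by a label outside [0, class_num), and a NaN in a count-based weight is often a 0/0 for a class that has not been seen yet. The sketch below is a generic NumPy sanity check under those assumptions. `check_batch` and `running_weight` are hypothetical helpers, not functions from the repository.

```python
import numpy as np

def check_batch(features, labels, class_num):
    """Return a list of human-readable problems found in one batch."""
    problems = []
    labels = np.asarray(labels)
    features = np.asarray(features, dtype=np.float64)
    # Labels must be valid indices for a one-hot / scatter step.
    if labels.min() < 0 or labels.max() >= class_num:
        problems.append("label outside [0, class_num)")
    # NaN or inf in the features will propagate into any running statistic.
    if not np.isfinite(features).all():
        problems.append("non-finite value in features")
    return problems

def running_weight(batch_count, total_count):
    """Per-class weight batch_count / (batch_count + total_count), mapping 0/0 to 0."""
    batch_count = np.asarray(batch_count, dtype=np.float64)
    total_count = np.asarray(total_count, dtype=np.float64)
    denom = batch_count + total_count
    safe = np.where(denom > 0, denom, 1.0)  # avoid dividing by zero
    return np.where(denom > 0, batch_count / safe, 0.0)
```

Running the same batch on the CPU usually gives a clearer error message than the CUDA assert does.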

confused about where the augmentation is done

Hi,

I read your paper. It is wonderful work.

What I am confused about is related to #33.

In #33, you said that because the last FC layer (the classifier) is implemented outside of the model, y is the embedding feature, not the logits.

Then why do you update the covariance using the features variable rather than the y variable before entering the isda_aug method?

My understanding from the paper is that the proposed augmentation is applied to the embedding feature a_i using the covariance matrix of class y_i, which is built by computing the element-wise variance of the features whose class is y_i. Am I right?

According to my understanding, the covariance should be updated using the y variable.

Thanks a lot!
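
One way to read the two steps in the forward code quoted earlier on this page: the covariance is estimated from the features, while the effect of augmenting those features appears as an extra term added to the logits. Below is a minimal NumPy sketch of that reading, assuming a diagonal per-class variance `class_var` and the quadratic term (lambda / 2) * v^T Sigma v with v = w_j - w_{y_i}. All names are hypothetical, and this is not the repository's implementation.

```python
import numpy as np

def isda_logits(features, labels, weight, bias, class_var, lam):
    """
    features:  [N, A] deep features a_i
    labels:    [N]    integer class indices y_i
    weight:    [C, A] classifier weights, row j is w_j
    bias:      [C]    classifier biases
    class_var: [C, A] per-class diagonal variance of the features (assumption)
    lam:       augmentation strength
    """
    logits = features @ weight.T + bias                      # [N, C]
    # v[i, j] = w_j - w_{y_i}
    diff = weight[None, :, :] - weight[labels][:, None, :]   # [N, C, A]
    # Quadratic form v^T diag(var_{y_i}) v for every (sample, class) pair.
    quad = (diff ** 2 * class_var[labels][:, None, :]).sum(axis=-1)  # [N, C]
    return logits + 0.5 * lam * quad
```

The statistics have shape [C, A] because they describe the features. The term they produce has shape [N, C] because it is added to the logits, and the entry for the true class is always zero.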

How does ISDA work?

Hi, I'm reading your paper and your code. It is very useful work, but I don't fully understand how ISDA works yet. My question is: is ISDA applied while the model is being trained (for example, we train a detection model and use ISDA at the same time)? Or do we run ISDA first to generate data, and then train the model?
I currently think it is the first way, but I'm not sure. Am I right?
Thanks a lot.

code for semi-supervised learning setting?

Hi,

Thanks for sharing the code. Is this code only for supervised learning?
What about the code for ISDA on unlabeled samples? Could you share that as well?

I would really appreciate it.

Thanks.

how to search the hyper-parameter lambda_0

Thank you very much for open-sourcing the code.

I have read the paper and trained with ISDA on our own datasets: a vehicle-classification task (20 IDs, 10 thousand images) and a vehicle re-ID task (200 IDs, 10 thousand images).
It worked with both ResNeSt-50 and ResNet-50 using lambda_0 = 7.5,
but when I changed the backbone to ResNet-101, it did not work with lambda_0 = 7.5.

  1. Do you have any good ideas for searching for lambda_0?
  2. Is it related to the backbone and the task?
    2.1) With more IDs in the dataset, can lambda_0 be chosen larger?
    I found that you chose lambda_0 = 0.5 on CIFAR and 7.5 on ImageNet. It seems that a harder task with more IDs can use a larger lambda_0 (CIFAR has only 10/100 IDs, while ImageNet has many more)?
    2.2) How should lambda_0 be chosen for deeper networks, such as ResNet-101/152?
How does it explain in segmentation?

Wonderful work! But it still leaves me with a question.
It makes sense for image classification, since the main object in the image can undergo semantic transformations. But in a segmentation task, is ISDA applied to each pixel? How should the semantic transformation of a single pixel be interpreted?

About the covariance matrix

It seems that you only use the diagonal elements of the covariance matrix. I wonder why not use the complete matrix to augment the feature? Looking forward to your reply.
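
A back-of-the-envelope sketch of what the diagonal choice changes, assuming storage cost is the concern (the author's actual reason is not stated on this page). `covariance_elements`, `quad_full` and `quad_diag` are hypothetical helper names.

```python
import numpy as np

def covariance_elements(class_num, feature_num):
    """Number of stored values for full vs. diagonal per-class covariance."""
    full = class_num * feature_num * feature_num
    diag = class_num * feature_num
    return full, diag

def quad_full(v, cov):
    """v^T Sigma v with a full [A, A] covariance matrix."""
    return float(v @ cov @ v)

def quad_diag(v, var):
    """v^T diag(var) v using only the [A] vector of variances."""
    return float(np.sum(var * v * v))
```

With 1000 classes and 2048-dimensional features, the full form stores 2048 times more values than the diagonal form. The two quadratic forms agree exactly whenever the off-diagonal entries are zero, so the diagonal version drops only the cross-feature correlations.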

Questions about ISDA for ImageNet

Thank you for your great work!
Sorry to bother you, but I am confused about the implementation of ISDA for ImageNet, as shown in the following image.
image
Why do you only consider the diagonal elements of sigma2?

what if replace last fc with conv in segmentation task

Dear author,

Thank you very much for your release. I have a question. For the last layer of the network, you use an FC layer of shape (feature_dim, class_dim) to express your theory in code. What if a conv layer of shape (feature_dim, class_dim, kernel_size, kernel_size) has to be used for a specific task? Would there be any differences? I am confused about whether the kernel dimensions should be taken into account, and the code would also need to change.
Is this right? NxW_ij = weight_m.expand(N,C,A) -> NxW_ij = weight_m.expand(N,C,A,K,K)?

Looking forward to your reply.
Best wishes
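
For the special case kernel_size = 1, a conv classifier is the same linear map applied at every pixel, so one option is to view the weight as [C, A] and treat each pixel as a sample. The NumPy sketch below checks only that equivalence. It says nothing about kernel_size > 1, the function names are hypothetical, and it is not the author's answer.

```python
import numpy as np

def conv1x1(x, weight, bias):
    """x: [N, A, H, W], weight: [C, A, 1, 1], bias: [C] -> logits [N, C, H, W]."""
    return np.einsum('nahw,ca->nchw', x, weight[:, :, 0, 0]) + bias[None, :, None, None]

def per_pixel_fc(x, weight, bias):
    """Same result, computed as an FC layer over N*H*W pixel 'samples'."""
    n, a, h, w = x.shape
    pixels = x.transpose(0, 2, 3, 1).reshape(-1, a)       # [N*H*W, A]
    fc_weight = weight.reshape(weight.shape[0], a)         # [C, A]
    logits = pixels @ fc_weight.T + bias                   # [N*H*W, C]
    return logits.reshape(n, h, w, -1).transpose(0, 3, 1, 2)
```

Under this view the expand(N, C, A) pattern keeps its shape logic, with N replaced by N*H*W and the labels flattened the same way.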

question about equation 7 in the paper

Hi,

Thanks for the work. I see from equation 7 in the paper that the probability used in the cross entropy is not a pure softmax activation. In the numerator, the input of exp is wx+b, while in the denominator, the inputs of the exps are wx+b+(\lam vt \sigma v)/2. Why do you use nn.CrossEntropyLoss directly in the code? Shouldn't it be based on softmax?

Also, as far as I know, the covariance matrix of a Gaussian distribution is a square matrix. Why is it defined with the shape class_num x feature_num?
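
A short derivation that may help with the first point, using the expression as described in the question and assuming v is the difference between classifier weights, v_{j y_i} = w_j - w_{y_i}. The last line also assumes that only the diagonal of the covariance is stored, which would explain a class_num x feature_num shape.

```latex
% Per-sample loss as described in the question
\mathcal{L}_i = -\log
  \frac{\exp\left(w_{y_i}^{\top} a_i + b_{y_i}\right)}
       {\sum_{j=1}^{C} \exp\left(w_j^{\top} a_i + b_j
         + \frac{\lambda}{2}\, v_{j y_i}^{\top} \Sigma_{y_i} v_{j y_i}\right)},
\qquad v_{j y_i} = w_j - w_{y_i}.

% Define augmented logits
\tilde{z}_{ij} = w_j^{\top} a_i + b_j
  + \frac{\lambda}{2}\, v_{j y_i}^{\top} \Sigma_{y_i} v_{j y_i}.

% For j = y_i the extra term vanishes because v_{y_i y_i} = 0
\tilde{z}_{i y_i} = w_{y_i}^{\top} a_i + b_{y_i}
\;\Longrightarrow\;
\mathcal{L}_i = -\log
  \frac{\exp(\tilde{z}_{i y_i})}{\sum_{j=1}^{C} \exp(\tilde{z}_{ij})}.

% With a diagonal covariance \Sigma_{y_i} = \mathrm{diag}(\sigma_{y_i}^2)
v_{j y_i}^{\top} \Sigma_{y_i} v_{j y_i}
  = \sum_{a=1}^{A} \sigma_{y_i,a}^{2}\, v_{j y_i, a}^{2}.
```

So the numerator is already the true-class entry of the augmented logits, and the whole expression is an ordinary softmax cross entropy over those logits. That is consistent with passing them to nn.CrossEntropyLoss.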

RuntimeError: size mismatch

Thank you for this repo.
I am trying to use ISDA with my own model on the CIFAR-10 dataset, but I am running into an error.
My model is defined below:

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.conv1 = nn.Conv2d(3, 64, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(64, 256, 5)
            self.dp1 = nn.Dropout2d(p=.25)
            self.fc1 = nn.Linear(256 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            residual = x
            x = self.pool(mish(self.conv1(x)))
            x = self.pool(mish(self.conv2(x)))
            x = x.view(-1, 256 * 5 * 5)
            x = self.dp1(x)
            if residual.shape == x.shape:
                x += residual
            x = mish(self.fc1(x))
            x = mish(self.fc2(x))
            x = self.fc3(x)

            return x

    net = Net()

    batch_size = 4
    feature_num = 84
    num_classes = len(classes) #10
    fc = Full_layer(feature_num, num_classes)

I believe I have done everything else in the train function correctly, but for some reason I am getting:

    RuntimeError: size mismatch, m1: [4 x 10], m2: [84 x 10] at /pytorch/aten/src/TH/generic/THTensorMath.cpp:197
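
Reading the shapes in the error: m1 is [4 x 10], a batch of 4 with 10 values each, while the external classifier expects 84 inputs. That pattern suggests the model output being passed on is the 10-way result of fc3 rather than the 84-dimensional features. Here is a simplified sketch of one possible fix, in which the model stops at fc2 and returns features. It uses relu in place of mish, drops the dropout and residual lines for brevity, and uses nn.Linear as a stand-in for Full_layer. `FeatureNet` is a hypothetical name.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureNet(nn.Module):
    """Same conv/fc layout as the Net in the question, minus fc3."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(64, 256, 5)
        self.fc1 = nn.Linear(256 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.feature_num = 84

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # relu stands in for mish
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)                # [N, 256 * 5 * 5]
        x = F.relu(self.fc1(x))
        return F.relu(self.fc2(x))             # [N, 84] features, no logits

net = FeatureNet()
fc = nn.Linear(net.feature_num, 10)   # stand-in for Full_layer(feature_num, num_classes)
features = net(torch.randn(4, 3, 32, 32))
logits = fc(features)
```

With this split, the features are [4, 84] and match the 84 inputs of the external classifier, which then produces the [4, 10] logits.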

long-tailed problem

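Hello,
Thank you for sharing. I have a question: does this augmentation method work for the long-tailed problem? In other words, if many classes have only a handful of images each, can the method still help?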

Performance improvements on ImageNet

I noticed that you updated the arXiv paper on April 25 and that the performance on ImageNet improved considerably, from a 0.28% gain to a 1.1% gain on ResNet-50. Is there any particular change that led to the larger improvement? Thanks.

Understanding number of features?

Hi, thanks for sharing the code. I had a look and cannot understand why, in ./networks/resnet.py at line 169, self.feature_num = 64 is the same for all models, whereas for WideResNet and the other networks the value is different.

Thanks

Do we need to augment the label data as well?

Thanks for sharing your code!!!
I was wondering if I use your ISDA method to augment the training data for the Cityscapes' semantic segmentation task,
do I need to augment the label data as well? Because after reading your code, I only found the augmented image data.
Looking forward to your reply!

Question about ISDA implementation in object detection

Thanks for your brilliant work! May I ask how ISDA is implemented in the object detection task? I can see the results of object detection on MS COCO, but object detection also includes regression, may I ask how ISDA is applied to the bounding box regression in object detection? (Or maybe ISDA is only applied to classification?)

Lastly, I think it will be great if you could please provide the code of ISDA implementation on the object detection task. Thanks a lot!

Question about changing datasets

Your work is wonderful and it helps me a lot! I want to apply your code to the Tiny-ImageNet and CelebA datasets, but I ran into a code error, and there are many places that need changing. Is there a convenient way to change the dataset? I'd appreciate it if you could reply.

Code error:
image

augmentation features or logits?

Hi, thanks a lot for your work. There are two questions that confuse me a lot:

(1) I think CV_temp is the covariance matrix corresponding to each class,
but the subsequent augmentation is applied to y, and y holds the logits. If the shape of features is [N, A], then the shape of y is [N, C].
As your paper says, the augmentation should happen in feature space, so shouldn't we augment features rather than y?
Since sigma2 is [N, C] and features is [N, A], what should I change if I'd like to augment the features directly?

(2) And what is the meaning of these two steps in the sigma2 computation?

Thanks a lot !
image

Image classification raises this error, can you help me?

Hello, an error occurs:

    y, features = model(x, isda=True)
    File "C:\ProgramData\Anaconda3\envs\chango1\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
    TypeError: forward() got an unexpected keyword argument 'isda'

Can you help me?
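
The traceback says the model's forward() has no `isda` parameter, while the ImageNet-style forward quoted earlier on this page calls model(x, isda=True) and expects both logits and features back. A minimal sketch of a forward that accepts the keyword follows. `TinyClassifier` and its layer sizes are hypothetical, so adapt the same pattern to your own network.

```python
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self, feature_num=16, class_num=5):
        super().__init__()
        # Toy backbone; a real model would put its conv stack here.
        self.backbone = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 8 * 8, feature_num),
            nn.ReLU(),
        )
        self.fc = nn.Linear(feature_num, class_num)

    def forward(self, x, isda=False):
        features = self.backbone(x)    # [N, feature_num]
        logits = self.fc(features)     # [N, class_num]
        if isda:
            return logits, features    # what the ISDA loss call unpacks
        return logits                  # unchanged behaviour for evaluation

model = TinyClassifier()
x = torch.randn(2, 3, 8, 8)
y, features = model(x, isda=True)
plain = model(x)
```

Keeping `isda=False` as the default means existing evaluation code that calls model(x) keeps working.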
