donkeyshot21 / uno

Official implementation of "A Unified Objective for Novel Class Discovery", ICCV 2021 (Oral)

License: MIT License
Thank you for sharing the code of your paper! I have a lot of questions about how to reproduce and cite the experimental results of your paper.

- Should I use `base_lr=0.1` as described in the paper, rather than the value `base_lr=0.4` in the current commit?
- Should the batch size be 512 as described in the paper? (Actually, I am not sure, since it was set to 256 in earlier commits of this repo.)
- Should I use `max_epochs=200` for all datasets as described in the paper, rather than `max_epochs=500` for cifar10/cifar100-20/cifar100-50 and `max_epochs=60` for ImageNet as in the current commit?
- For comparison, the baselines [1, 2] use `batch_size=128`, `max_epoch=200`, and none of the complex augmentations used in your paper.

And as shown in your update as well as in my experiments, some changes, like the number of discovery epochs or the batch size, had significant effects on the final performance. So I am really confused about how to make fair comparisons...
References:
[1] Automatically discovering and learning new visual categories with ranking statistics. ICLR 2020.
[2] AutoNovel: Automatically discovering and learning novel visual categories. TPAMI, 2021.
Dear authors,
Thank you for your exciting work and very clean code. I am having trouble reproducing the results mentioned in the paper and would appreciate it if you could help me.
1. Reproducing the UNO results from table 4.
I was trying to get the scores on the samples of novel classes from the test split (Table 4 in the paper).
I have executed the commands for CIFAR10, CIFAR80-20, CIFAR50-50, and used Wandb for logging. However, the results on all datasets did not match the ones that I see in the paper. I took the results from `incremental/unlabel/test/acc`.
| Dataset | Paper (avg/best) | Reproduced (avg/best) |
|---|---|---|
| CIFAR10 | 93.3 / 93.3 | 90.8 / 90.8 |
| CIFAR80-20 | 72.7 / 73.1 | 65.3 / 65.3 |
| CIFAR50-50 | 50.6 / 50.7 | 44.9 / 45.7 |
Potential issues:
- Maybe I took the results from the wrong metric (I used `incremental/unlabel/test/acc`). However, if you check my screenshot, for CIFAR80-20 all the other metrics are significantly different anyway (a value close to 72.7/73.1 does not appear anywhere).

2. How exactly the RankStats algorithm was evaluated on CIFAR50-50.
Could you please share whether you performed any hyperparameter tuning for the CIFAR50-50 dataset when running the RankStats algorithm on it? I ran multiple experiments, and my training was very unstable; the algorithm always ends up scoring ~20/17 on known/novel classes.
Thanks a lot for your time.
For the uno_v2 results on CIFAR100, is the acc you provide the mean over multiple runs of the best-head accuracy? Do you also have the standard deviation?
I would also like to know whether the ACC is taken from the best validation epoch or from the last epoch.
Hi, I think the calculation of `loss_per_head` is wrong.

```python
def cross_entropy_loss(self, preds, targets):  # preds: [n_heads, batch_size, logits]
    preds = F.log_softmax(preds / self.hparams.temperature, dim=-1)
    return -torch.mean(torch.sum(targets * preds, dim=-1))
```
This code averages over both dim 0 and dim 1, so the loss is not head-wise. When executing `self.loss_per_head += loss_cluster.clone().detach()`, each head's loss accumulates the same mean value. I would revise it to:

```python
return -torch.mean(torch.sum(targets * preds, dim=-1), dim=-1)  # revised: keep the head dimension
```

I think this would be good. If there's anything wrong with my understanding, please correct me. Thanks.
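As a sanity check, here is a standalone snippet showing the shapes produced by the two reductions (the sizes and temperature are illustrative, not the repo's values):

```python
import torch
import torch.nn.functional as F

n_heads, batch_size, n_classes = 4, 8, 10  # illustrative sizes
preds = torch.randn(n_heads, batch_size, n_classes)
targets = F.softmax(torch.randn(n_heads, batch_size, n_classes), dim=-1)

log_probs = F.log_softmax(preds / 0.1, dim=-1)       # 0.1 = assumed temperature
per_sample = torch.sum(targets * log_probs, dim=-1)  # [n_heads, batch_size]
loss_scalar = -torch.mean(per_sample)                # one value shared by all heads
loss_per_head = -torch.mean(per_sample, dim=-1)      # [n_heads], head-wise as intended
print(loss_scalar.shape, loss_per_head.shape)        # torch.Size([]) torch.Size([4])
```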
Hi,
Thanks for your amazing work and code. Can you kindly add a license to your repo?
Thanks,
Joseph
Excellent work. I have a question about Eq. 4.
Is optimizing Eq. 4 with the Sinkhorn-Knopp algorithm equivalent to a cross-entropy loss with label smoothing?
What are the advantages of using the Sinkhorn-Knopp algorithm instead of a plain cross-entropy loss?
Thank you! I look forward to your reply.
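For context, here is a minimal sketch of Sinkhorn-Knopp pseudo-labeling in the style popularized by SwAV, which UNO builds on; the `epsilon` and `n_iters` values are assumptions, not the repo's exact code:

```python
import torch

def sinkhorn(logits, epsilon=0.05, n_iters=3):
    # logits: [batch_size, n_classes]; returns soft pseudo-labels whose
    # class marginals are approximately uniform across the batch
    Q = torch.exp(logits / epsilon).t()  # [n_classes, batch_size]
    Q /= Q.sum()
    K, B = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(dim=1, keepdim=True)  # rows: equal total mass per class
        Q /= K
        Q /= Q.sum(dim=0, keepdim=True)  # columns: each sample's labels sum to 1/B
        Q /= B
    return (Q * B).t()  # rescale so each sample's pseudo-label sums to 1
```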
Hi, thanks for your great work!
I wonder if this implementation can be used to detect a completely new class without any training, just using the pre-trained model?
Thank you!
Hi Enrico,
Thanks for your work and clean code!
From your paper, I see that accuracy increases dramatically after concatenating the labeled and unlabeled heads' logits.
And I really want to see how to implement the non-concatenated version; a sketch of what I mean follows.
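To make the question concrete, here are the two evaluation modes as I understand them (tensor shapes and names are my assumptions, not the repo's code):

```python
import torch

batch, n_lab, n_unlab = 4, 80, 20           # e.g., an 80/20 class split (illustrative)
logits_lab = torch.randn(batch, n_lab)      # labeled head output
logits_unlab = torch.randn(batch, n_unlab)  # unlabeled head output

# Concatenated: one argmax over labeled + unlabeled classes jointly
preds_concat = torch.cat([logits_lab, logits_unlab], dim=-1).argmax(dim=-1)

# Non-concatenated: evaluate the unlabeled head on its own
preds_unlab_only = logits_unlab.argmax(dim=-1)
```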
Looking forward to your reply.
Thanks for your nice work~
However, I have a question: why do we compute swapped_prediction in a loop over num_heads (Line 180 in 50022c9)? It seems the swapped_prediction computation is identical for every head.
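For reference, here is a minimal self-contained sketch of SwAV-style swapped prediction between two views (the names and temperature are my assumptions, not the repo's exact code):

```python
import torch
import torch.nn.functional as F

def soft_xent(logits, targets, temperature=0.1):
    # cross-entropy against soft targets, averaged over the batch
    log_probs = F.log_softmax(logits / temperature, dim=-1)
    return -(targets * log_probs).sum(dim=-1).mean()

def swapped_prediction(logits, targets):
    # logits, targets: [2, batch, n_classes] for two augmented views;
    # each view's pseudo-labels supervise the *other* view's logits
    loss = soft_xent(logits[0], targets[1]) + soft_xent(logits[1], targets[0])
    return loss / 2
```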
Hoping for your reply.
Hi Enrico,
Thanks again for your nice work and clean code.
I am just wondering how long the ImageNet experiments take? Let's say ImageNet-A. And what GPU are you using?
Thanks,
Joseph
If I want to save the inference images to a given folder, what should I do?
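One possible approach (a sketch, not part of this repo): collect the input batches at inference time and write them out with torchvision:

```python
import os
from torchvision.utils import save_image

def save_batch(images, out_dir, step):
    # images: [batch, C, H, W] tensor with values in [0, 1]
    os.makedirs(out_dir, exist_ok=True)
    for i, img in enumerate(images):
        save_image(img, os.path.join(out_dir, f"step{step}_img{i}.png"))
```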
Hi, thank you for sharing your code.
I'm trying to follow your instructions, but when I run the discovery code, it fails to load the pretrained model.
My environment is:
My errors are:
```
Traceback (most recent call last):
  File "main_discover.py", line 280, in <module>
    main(args)
  File "main_discover.py", line 266, in main
    model = Discoverer(**args.__dict__)
  File "main_discover.py", line 70, in __init__
    state_dict = torch.load(self.hparams.pretrained, map_location=self.device)
  File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 594, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 853, in _load
    result = unpickler.load()
  File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 845, in persistent_load
    load_tensor(data_type, size, key, _maybe_decode_ascii(location))
  File "/home/dircon/anaconda3/envs/uno/lib/python3.8/site-packages/torch/serialization.py", line 833, in load_tensor
    storage = zip_file.get_storage_from_record(name, size, dtype).storage()
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading file data/94820505364016: invalid header or archive is corrupted
```
I believe it's due to distributed data parallel (DDP), but how can I stop the multiple processes from all saving the model?
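A common remedy (a sketch, under the assumption that concurrent writes from several DDP processes corrupt the checkpoint): save only from rank 0 and synchronize before any rank reads the file:

```python
import torch
import torch.distributed as dist

def save_on_rank_zero(state_dict, path):
    # write the checkpoint from a single process to avoid corrupted archives
    if not dist.is_initialized() or dist.get_rank() == 0:
        torch.save(state_dict, path)
    if dist.is_initialized():
        dist.barrier()  # ensure the file is complete before other ranks proceed
```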
Hi Enrico,
Just a clarification question. If I understand correctly, `num_large_crops` basically controls the number of augmented versions of an image. Can you confirm?
As `num_large_crops` is set to 2, we basically have two augmented versions of the same image in each mini-batch, and these are what is used for the swapped prediction. Is my understanding correct?
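In other words, something like the following sketch (the transform pipeline is an assumption, not the repo's exact augmentations):

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(32),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

def multi_crop(img, num_large_crops=2):
    # returns num_large_crops independently augmented views of the same image
    return [augment(img) for _ in range(num_large_crops)]
```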
Please do get back when you get a chance.
Thanks,
Joseph
Hi, author:
Thanks for your nice code; I benefit a lot from it.
For UNOv2, I can reproduce the results on CIFAR100, but I can't reproduce the results on CIFAR10 (93.6±0.2), which are worse than UNOv1 (96.1±0.5). I have tried many times but haven't figured out why.
I would very much appreciate it if you could provide some help.