
nayeemrizve / trssl

37 stars · 1 watcher · 3 forks · 33 KB

"Towards Realistic Semi-Supervised Learning" by Mamshad Nayeem Rizve, Navid Kardan, Mubarak Shah (ECCV 2022)

License: MIT License

Python 100.00%
deep-learning machine-learning novel-class-discovery open-world pytorch semi-supervised-learning clustering open-world-learning open-world-semi-supervised-learning optimal-transport

trssl's People

Contributors: nayeemrizve

trssl's Issues

Training loss encounters "nan" values

Hello, when I run your code on the CIFAR-10 dataset, the training loss becomes NaN. Can you tell me how to solve this problem?

python train.py --dataset cifar10 --lbl-percent 10 --novel-percent 50 --arch resnet18

Epoch: [68 | 200] LR: 0.100000
Training |################################| (1024/1024) Data: 0.075s | Batch: 0.190s | Total: 0:03:14 | ETA: 0:00:01 | Loss: nan
test epoch: 68/ 200. itr: 40/ 40. btime: 0.024s.: 100%|██████████████████████████████████████████████████████████████████████████████| 40/40 [00:01<00:00, 38.94it/s]
test epoch: 68/ 200. itr: 20/ 20. btime: 0.032s.: 100%|██████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 26.34it/s]
test epoch: 68/ 200. itr: 20/ 20. btime: 0.032s. loss: nan. top1: 10.24. top5: 80.46. : 100%|████████████████████████████████████████| 20/20 [00:00<00:00, 26.47it/s]
UncrGen Iter: 186/ 186. Data: 0.270s. Batch: 0.395s.: 100%|██████████████████████████████████████████████████████████████████████████| 186/186 [01:13<00:00, 2.52it/s]
Files already downloaded and verified
Files already downloaded and verified
epoch: 67, acc-seen: 10.24
epoch: 67, acc-novel: 0.0, nmi-novel: 0.0
epoch: 67, acc-all: 10.02, nmi-all: 0.0008095158137027309, best-acc: 10.280000000000001
NaN or Inf found in input tensor.
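Not an answer from the authors, but for anyone debugging this: a frequent cause of NaN losses in pseudo-labeling pipelines is taking the log of a probability that has underflowed to zero. A minimal, framework-free sketch of the failure mode and the usual clamp fix (the function and the eps=1e-8 threshold are illustrative, not taken from this repo):

```python
import math

def cross_entropy(p, q, eps=None):
    """Cross-entropy H(p, q) = -sum_k p_k * log(q_k).

    With eps=None, a zero in q raises a math domain error (or yields
    -inf/NaN in array frameworks), which then poisons the averaged
    training loss. Clamping q at a small eps keeps the loss finite.
    """
    total = 0.0
    for pk, qk in zip(p, q):
        if eps is not None:
            qk = max(qk, eps)  # clamp to avoid log(0)
        total -= pk * math.log(qk)
    return total

p = [0.5, 0.5, 0.0]
q = [0.7, 0.3, 0.0]  # a predicted probability underflowed to exactly 0

# cross_entropy(p, q)  -> raises ValueError: math domain error
loss = cross_entropy(p, q, eps=1e-8)  # finite
```

In PyTorch the equivalent guard is clamping probabilities (or adding a small constant) before `torch.log`.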

Question about the results

Hi, thanks for your great work!
I have some questions:

  • Regarding the performance reported in Tables 1 and 2 for the four datasets (CIFAR and ImageNet): are you assuming the number of unknown classes is known?
  • Regarding the imbalanced setting: the Sinkhorn-Knopp code in this repo seems to assume the ground-truth imbalance distribution, while your paper uses an iterative prior-update strategy. Is my understanding correct?
    Looking forward to your reply.

Best regards,

Can't reproduce results on CIFAR-100 with 10% labels

Hi,

Thanks for the great work. I downloaded the code and ran it several times without modification: 'CUDA_VISIBLE_DEVICES=2,3 python3 train.py --dataset cifar100 --lbl-percent 10 --novel-percent 50 --arch resnet18'.
For the seen classes I get 68% accuracy, but for the novel classes I cannot do better than 49%, while the paper reports 52.1%. Even with the random seed fixed, I cannot reproduce the training results, and the final results are not stable. Did I miss something?

Thanks!

Why does the Sinkhorn implementation take only the logits, not the prior vector ρ?

Hello!
Thank you for your great work. I have a question: in the paper's pseudo-code, Sinkhorn's algorithm takes two parameters, the logits and the prior distribution vector ρ, but in the actual implementation only the logits are passed. Why is it possible to do so?
Your code is as follows:
def forward(self, logits):
    # get assignments
    q = logits / self.epsilon
    M = torch.max(q)
    q -= M
    q = torch.exp(q).t()
    return self.iterate(q)
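One way to see why ρ can drop out (an observation, not the authors' statement): in Sinkhorn-Knopp the prior only enters as the class marginals that the transport plan is scaled towards, and for a uniform ρ that scaling reduces to plain column normalization, which the iteration already performs. A hypothetical, framework-free sketch with an explicit ρ (names and defaults are illustrative, not the repo's code):

```python
import math

def sinkhorn(logits, rho, epsilon=0.05, n_iters=3):
    """Sinkhorn-Knopp sketch with an explicit prior (hypothetical).

    logits: B rows of K scores; rho: prior over the K classes, used as
    the class (column) marginals of the transport plan.
    Returns B x K soft assignments, each row summing to 1.
    """
    B, K = len(logits), len(logits[0])
    # exp(logit / eps), shifted by the global max for numerical stability
    m = max(max(row) for row in logits)
    q = [[math.exp((v - m) / epsilon) for v in row] for row in logits]
    for _ in range(n_iters):
        # scale columns so class k carries total mass rho[k];
        # for uniform rho this is just column normalization
        col = [sum(q[b][k] for b in range(B)) for k in range(K)]
        q = [[q[b][k] * rho[k] / col[k] for k in range(K)] for b in range(B)]
        # normalize rows so each sample carries mass 1/B
        for b in range(B):
            s = sum(q[b])
            q[b] = [v / (s * B) for v in q[b]]
    # rescale rows so each sample's assignment sums to 1
    return [[v * B for v in row] for row in q]

assignments = sinkhorn([[2.0, 0.0], [0.0, 2.0]], rho=[0.5, 0.5])
# each row of `assignments` is a probability distribution over the K classes
```

Under this reading, an implementation that assumes a uniform prior can absorb ρ into the normalization and take only the logits.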

Reproducing the paper results

Thank you for your exciting work and very clean code. I am having trouble reproducing the results mentioned in the paper and would appreciate it if you could help me.

  1. Reproducing the results of cifar100
    I was trying to reproduce the scores from Table 2 in the paper, but the results did not match on any dataset. I ran your code twice with "python3 train.py --dataset cifar100 --lbl-percent 10 --novel-percent 50 --arch resnet18". The results on seen, novel, and all classes are 1-2% lower than those in your paper.
    logcifar100_label10_1.txt
    logcifar100_label10_2.txt

  2. For CIFAR-100 with 50% labels, the paper suggests adjusting the temperature to 0.2. However, the result on novel classes is 41%, versus 49% in the paper.
    logcifar100_label50_t0.2.txt
    With the temperature set to 0.1, the result on novel classes is 1% lower than in your paper.
    logcifar100_label50_t0.1.txt

  3. I noticed that the CosineAnnealingLR with warm-up mentioned in your paper has been removed from the code.

Thanks a lot for your time.
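For reference, the schedule described in the paper ("cosine annealing based learning rate scheduler accompanied by a linear warmup", base learning rate 0.1, warmup length 10 epochs) can be sketched per epoch as follows. This is an illustrative reconstruction, not the code that was removed; in PyTorch one would typically wrap such a rule in `torch.optim.lr_scheduler.LambdaLR`:

```python
import math

def lr_at(epoch, base_lr=0.1, warmup_epochs=10, total_epochs=200):
    """Cosine annealing with linear warmup (epoch-granular sketch).

    Values follow the paper's description; the function itself is an
    illustrative reconstruction, not the repo's removed scheduler.
    """
    if epoch < warmup_epochs:
        # linear ramp from base_lr / warmup_epochs up to base_lr
        return base_lr * (epoch + 1) / warmup_epochs
    # cosine decay from base_lr down to 0 over the remaining epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# warmup ramps 0.01 -> 0.1 over the first 10 epochs, then cosine decay
```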

About the results on Fine-grained datasets: aircraft.

  1. I reproduced your code on Aircraft with your settings, but my results are Seen: 60.98%, Unseen: 36.83%, All: 48.90%, while your paper reports:
    (screenshot of the paper's results; image not recoverable)

  2. About the learning-rate schedule: "We use a cosine annealing based learning rate scheduler accompanied by a linear warmup, where we set the base learning rate to 0.1 and set the warmup length to 10 epochs". In the latest version of the code this has been removed, and it does affect performance. Should this scheduler be applied to all datasets?

  3. About the fine-grained datasets preprocess:
    In your papers, the preprocess is as follows:

The input resolution of CIFAR-10 and CIFAR-100 images is 32×32; Tiny ImageNet images are slightly larger, i.e., 64×64. For the fine-grained datasets the images vary in size and aspect ratio. Therefore, for computational efficiency, we pre-process the images for fine-grained datasets and resize them to 256×256 resolution; this pre-processing operation is performed for both train and test images in all of our experiments.

But in your code, the data transforms are as follows:

    self.transform_train = transforms.Compose([
        transforms.RandomResizedCrop(224, (0.5, 1.0)),
        transforms.RandomHorizontalFlip(),
        transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.2, 0.1)], p=0.5),
        transforms.RandomGrayscale(p=0.2),
        transforms.RandomApply([GaussianBlur([0.1, 2.0])], p=0.2),
        transforms.ToTensor(),
        transforms.Normalize(imgnet_mean, imgnet_std),
    ])

    self.transform_val = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=imgnet_mean, std=imgnet_std)
    ])

So is the network input size 224×224 or 256×256? Thanks~

Number of novel classes

Hello!

I am interested in your paper. You claim that the number of novel classes does not need to be known in advance. However, I noticed that in your code, when the network is defined, its output dimension exactly matches the total number of classes in the dataset; for CIFAR-10, for example, the output dimension is 10. Have you considered the case where the output dimension of the network is larger than the total number of classes?

Thanks!

Dataset utilized

First of all, thanks for the open code! I encountered some issues when running it. Which ImageNet-100 dataset do you use in your paper? To my knowledge, the ImageNet-100 used in ORCA (the baseline) uses a different division from the ImageNet-100 published on Kaggle. Can you post the categories of the ImageNet-100 used in your paper?

About data splits

Thanks for the great work!
I wonder how the authors split the train/val datasets on ImageNet-100 and Aircraft?

Also, why is the number of training samples in your paper not an integer multiple of the number of classes?
