Comments (7)
It was indeed an issue with the libraries you mentioned; I completely overlooked that they affect the optimization.
Thank you for your prompt reply and your time, @DonkeyShot21.
from uno.
As a side comment, I noticed that to evaluate your algorithm in the "task-agnostic" setup, you generate predictions and solve the assignment problem separately for samples of known and novel classes.
Correct me if I'm wrong, but this may be the wrong way to perform such an evaluation, because one should have no knowledge of whether a group of samples belongs to known or novel classes, and this evaluation assumes exactly that. What I expected instead is a joint dataloader that mixes all the samples, with the Hungarian algorithm applied to all the resulting predictions at once.
It probably has negligible implications for this work, because the classifier is trained well and the assignment solver assigns only one logit index to each class. But, as far as I can see, if some novel class were confused for a known class the majority of the time (i.e., its highest logit usually came from the first head), such an evaluation would not catch that as an error.
It might be, however, that such an evaluation is standard and I just didn't get that from the paper, so I'm just sharing my thoughts here.
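The joint evaluation described above can be sketched with SciPy's `linear_sum_assignment` (a minimal sketch; the function and variable names are my own, not the repo's):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_accuracy(y_true, y_pred, num_classes):
    """Cluster accuracy: find the one-to-one logit-index-to-label mapping
    that maximizes agreement, then score all samples under that mapping."""
    # w[i, j] counts samples predicted as index i whose true label is j.
    w = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, t in zip(y_pred, y_true):
        w[p, t] += 1
    # Hungarian matching that maximizes the total number of matched samples.
    row, col = linear_sum_assignment(w, maximize=True)
    return w[row, col].sum() / len(y_true)

# Known and novel samples mixed and matched in a single call:
y_true = [0, 0, 1, 1, 2, 2, 3, 3]            # 0-1 known, 2-3 novel
y_pred = [0, 0, 1, 1, 3, 3, 2, 2]            # novel clusters permuted
print(hungarian_accuracy(y_true, y_pred, 4))  # 1.0 (permutation is resolved)
```

Because all samples enter one matching problem, a logit index claimed by a known class cannot also be credited to a novel class.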
Happy to help! I added a note in the README that warns about package versions.
Regarding the evaluation, I think the procedure I am following is correct, because I first concatenate the logits (preds_inc) and then take the max of those concatenated logits. By doing this I lose the information about the task. Then, in the compute() method of the metric class, I compute the best mapping on all classes (not separately).
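Concretely, the concatenate-then-argmax step looks like this (a hypothetical shape sketch; only the name preds_inc comes from the discussion above, the head sizes and other names are assumed):

```python
import torch

# Hypothetical per-head logits for a batch of 8 samples.
logits_lab = torch.randn(8, 80)    # labeled head (e.g. 80 known classes)
logits_unlab = torch.randn(8, 20)  # unlabeled head (e.g. 20 novel classes)

# Concatenate along the class dimension: after this, a prediction is just
# an index in [0, 100) and carries no information about which head it came from.
preds_inc = torch.cat([logits_lab, logits_unlab], dim=-1)
pred_classes = preds_inc.argmax(dim=-1)
print(preds_inc.shape)  # torch.Size([8, 100])
```

The task identity is only recoverable afterwards by checking whether the argmax index falls below or above the known-class boundary.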
Ok, now I understand. You are right, this is a potential problem. However, the assignments are quite stable (they are computed on the whole validation set) and, as you said, the potential issue never happens in practice. I remember once trying to remove the unwanted assignments (the ones that contradicted the labeled head), but the results were exactly the same while the code was more complicated, so I removed it. Also, if I remember correctly, Ranking Statistics uses the same evaluation procedure, so I just stuck to that.
I have just noticed that the batch size mentioned in the paper is 512, while the one in the README is 256. I suspect this could be the issue; I will test it soon.
Hi! I have just rerun an experiment with batch size 256 on CIFAR100-80_20 and I got 71.7 for incremental/unlab/test/acc/avg, which is very close to the result published in the paper with batch size 512.
Also, if you check the shape of the curve, it looks very different from your curves, so I suspect something is wrong on your side. It is very likely due to the versions of pytorch-lightning and/or lightning-bolts. Try using the exact versions I specified in the README.
Regarding Ranking Statistics, I remember I had some problems with hyperparameters too, but in the end it was quite easy to get running. Going through my logs now, it seems I used this command for RS+:
auto_novel.py --dataset_root ./data/datasets/CIFAR/ --exp_root ./data/experiments/ --warmup_model_dir ./data/experiments/supervised_learning/resnet_rotnet_cifar100-50.pth --lr 0.1 --gamma 0.1 --weight_decay 1e-4 --step_size 340 --batch_size 256 --epochs 400 --rampup_length 300 --rampup_coefficient 25 --num_labeled_classes 50 --num_unlabeled_classes 50 --dataset_name cifar100 --IL --increment_coefficient 0.05 --seed 0 --model_name resnet_IL_cifar100 --mode train --comment RS+-50_50
I did not modify their code much so it should just work.
The (potential) problem, I believe, is that linear_sum_assignment is not computed on all the data at once.
Consider testing on data with novel classes only, and assume the highest logits (predictions) for all the samples come from the first head. Then the accuracy should be zero, because you explicitly train the first head to predict known classes. However, because you test only on novel classes, linear_sum_assignment can "distribute"/"match" logits from the first head to the novel classes, and the accuracy ends up above zero. Nothing prevents assigning those first-head logits to novel classes, because no "known" images are fed to linear_sum_assignment, so those first-head logits are "free" to be assigned to something else.
With a mixed dataset, however, linear_sum_assignment would most likely assign those first-head logits to known classes, because they would dominate. The novel classes would then have to be assigned to something else, marking all first-head predictions on novel samples as errors, and the final accuracy would be much lower.
But, once again, I think it doesn't affect evaluation for balanced and moderately large datasets like CIFAR/ImageNet.
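A toy numeric sketch of this failure mode (all numbers are assumed for illustration, not taken from the repo): known classes 0-1 are classified correctly, while every novel sample (classes 2-3) fires hardest on logit 0 from the first head. Matching the novel split separately inflates the accuracy; matching jointly catches the errors:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def best_mapping(y_true, y_pred, n):
    """Return the logit-index-to-label map that maximizes matched samples."""
    w = np.zeros((n, n), dtype=np.int64)
    for p, t in zip(y_pred, y_true):
        w[p, t] += 1
    row, col = linear_sum_assignment(w, maximize=True)
    return dict(zip(row, col))

n = 4  # classes 0-1 known, 2-3 novel
# Known samples: classified correctly; class 0 is frequent enough to dominate.
yt_known = [0] * 30 + [1] * 10
yp_known = [0] * 30 + [1] * 10
# Novel samples: the classifier always fires hardest on logit 0 (first head).
yt_novel = [2] * 10 + [3] * 10
yp_novel = [0] * 20

# Separate (novel-only) matching: logit 0 is "free", so it gets mapped to a
# novel class and half the novel samples are scored as correct.
m_sep = best_mapping(yt_novel, yp_novel, n)
acc_sep = np.mean([m_sep[p] == t for p, t in zip(yp_novel, yt_novel)])

# Joint matching: the known samples claim logit 0, so every novel prediction
# becomes an error, as it should.
m_joint = best_mapping(yt_known + yt_novel, yp_known + yp_novel, n)
acc_joint = np.mean([m_joint[p] == t for p, t in zip(yp_novel, yt_novel)])

print(acc_sep, acc_joint)  # 0.5 0.0
```

The gap only appears when a head systematically "wins" on the wrong split, which is why well-trained models on balanced datasets rarely show it.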