wyharveychen / closerlookfewshot
Source code for the ICLR'19 paper 'A Closer Look at Few-shot Classification'
License: Other
Hi, I want to use the ResNet in backbone.py for another task, but when I use multiple GPUs I get
AttributeError: 'Tensor' object has no attribute 'fast'
It seems that the added attribute "fast" is not automatically copied to the other GPUs by
model = nn.DataParallel(model.to(device))
Could you please give me some suggestions? Thanks a lot.
Thanks for releasing the code.
We are trying to replicate your results on CUB dataset for comparison purposes.
write_CUB_filelist.py doesn't seem to set a random seed, so the file list would change on every run.
Can you please release the JSON you generated for running the experiments in the paper?
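One way to make a generated file list repeatable is to seed the shuffle. The sketch below is only an illustration, assuming the per-class file order comes from Python's `random` module; `shuffled_files` and the seed value are hypothetical and not taken from write_CUB_filelist.py.

```python
import random

def shuffled_files(files, seed=0):
    """Return a shuffled copy of `files` that is identical on every run.

    A private Random instance is used so the global random state is untouched.
    `seed` is an arbitrary illustrative value.
    """
    rng = random.Random(seed)
    shuffled = sorted(files)   # sort first: directory listing order is not guaranteed
    rng.shuffle(shuffled)
    return shuffled

files = ["img_%03d.jpg" % i for i in range(10)]
first = shuffled_files(files)
second = shuffled_files(list(reversed(files)))
print(first == second)  # same seed and same set of files give the same order
```

Sorting before shuffling matters because the result of a seeded shuffle still depends on the input order.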
I get this error on training
json.decoder.JSONDecodeError: Invalid \escape: line 1 column 4570 (char 4569)
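An `Invalid \escape` error is what Python's `json` module reports when a string contains a backslash followed by a character that is not a valid JSON escape. This can happen when Windows paths are written into a JSON file by plain string formatting. A minimal sketch, assuming that is the cause here; the path and the `image_names` key are made up for illustration:

```python
import json

windows_path = "C:\\data\\images\\001.jpg"  # illustrative path with single backslashes

# Hand-built JSON keeps the raw backslashes, so "\d" and "\i" are invalid escapes.
hand_written = '{"image_names": ["%s"]}' % windows_path
try:
    json.loads(hand_written)
    hand_written_ok = True
except json.JSONDecodeError:
    hand_written_ok = False

# json.dumps escapes each backslash, so the text round-trips unchanged.
serialized = json.dumps({"image_names": [windows_path]})
restored = json.loads(serialized)["image_names"][0]

print(hand_written_ok, restored == windows_path)
```

Writing the file with `json.dump`, or storing paths with forward slashes, avoids having to patch the JSON by hand.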
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 21 and 7 in dimension 1 at /opt/conda/conda-bld/pytorch_1549635019666/work/aten/src/TH/generic/THTensorMoreMath.cpp:1307
Hello,
This code base is great for learning few-shot learning! I was able to follow the train/save_feature/test steps for the baseline and meta-learning algorithms separately and obtained results similar to the paper's. I'd like to try domain adaptation with these methods too. How should I run it to see adaptation from miniImagenet to CUB? What is the command line for it? For example,
Is #3 the correct way to run baseline fine-tuning on the new-domain CUB data, starting from the embedding trained on miniImagenet? How do I specify the miniImagenet-trained embedding from the command line?
Thank you very much in advance!!
Hi, I wanted to know if you had run baseline/baseline++ over the tiered ImageNet dataset. If so, can you share the performance results over it also?
I am getting around 80.78% accuracy (85.17% in the paper) on 5-shot with ResNet10 baseline++ and augmentation on.
Command used to train : python train.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
Command used to test : python test.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
Hi, thanks for releasing code!
Just to clarify, in the paper you train "60,000 episodes for 1-shot tasks" - say I'm running ProtoNets on the CUB dataset - by default each epoch consists of 100 sampled episodes, correct? So the results in the paper are reported with the best model (according to val) after 600 epochs?
In the training stage of the Baseline/Baseline++ methods, it seems that num_class is always set to 200, but every dataset has its own number of base classes (for example, miniImageNet has 64). Is it set to 200 for convenience, so that nobody needs to modify this parameter when running experiments on both CUB and miniImageNet? Wouldn't it be better to make num_class an optional command-line argument, or to set it to 64 for miniImageNet and 50 for CUB?
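A command-line option of the kind suggested above could look like the following sketch. The flag name `--num_classes` and the default of 200 mirror values mentioned in these issues; the parser itself is illustrative and is not the repository's option-parsing code.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="illustrative training options")
    parser.add_argument("--dataset", default="CUB")
    # Size of the classifier's output layer; it must be larger than the
    # largest label id that appears in the training data.
    parser.add_argument("--num_classes", type=int, default=200)
    return parser

args = build_parser().parse_args(["--dataset", "miniImagenet", "--num_classes", "64"])
print(args.dataset, args.num_classes)
```

Keeping a generous default means existing commands keep working, while a dataset-specific value can still be passed explicitly.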
When I run the code with train_aug=True, I get the error:
TypeError: __init__() missing 1 required positional argument: 'size'
Could you help me solve it? Thanks.
Hello, Thanks for your great work.
I have run your code with the following command several times, but the final accuracy is always about 73.99, which is lower than the 75.68 reported in the paper. Am I missing something, or do I have to tune some hyperparameters?
for file in train save_features test; do
python ./${file}.py --dataset miniImagenet --model ResNet18 --method baseline++ --train_aug --n_shot 5
done
Thanks very much.
In the case of image classification, we have a training set with inputs x, and the loss is computed from whether the neural network predicted the label correctly, i.e. from the difference between the ground truth and the predicted label.
So we get a training-set loss, then test the trained model on some test images and compute the test-set loss.
When using MAML, we have episodes (tasks) within the meta-training set, and each episode contains a training set (support set) and a test set (query set); the episodes within the meta-test set likewise contain a training set and a test set.
I am a bit confused about what 1-shot 5-way or 5-shot 5-way means, and how the loss is computed.
Does 1-shot 5-way mean that we have 1 image for each of the 5 classes in the training set (support set), compute softmax probabilities over these 5 classes, and compute the training-set loss accordingly, and then give the trained model a test input from the query set and compute the test-set loss?
Does 5-shot 5-way mean that we have 5 images for each of the 5 classes in the training set (support set), so 25 images, that we compute softmax probabilities over these 5 classes, and that the training-set loss for one epoch is the sum of the losses obtained for these 25 images?
And does the test set (query set) also contain 5 images of the same class, which are given to the trained model, with the loss summed over these 5 images?
I am confused. Can you clarify what the loss means when training MAML?
Thanks.
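To make the N-way K-shot terminology concrete, here is a small sketch of how one episode could be drawn. It assumes the usual convention that N-way means N classes per episode and K-shot means K labeled support images per class, with a separate set of query images per class. `sample_episode`, the toy dataset, and `n_query=16` (the value quoted from the paper elsewhere on this page) are illustrative choices, not the repository's code. In MAML as originally published, the support set drives the inner adaptation steps and the loss on the query set is what the meta-update minimizes.

```python
import random

def sample_episode(data_by_class, n_way=5, k_shot=1, n_query=16, rng=random):
    """Draw one episode: n_way classes, k_shot support and n_query query items each."""
    classes = rng.sample(sorted(data_by_class), n_way)
    support, query = [], []
    for label in classes:
        items = rng.sample(data_by_class[label], k_shot + n_query)
        support += [(x, label) for x in items[:k_shot]]   # used to adapt / fit
        query += [(x, label) for x in items[k_shot:]]     # used to compute the episode loss
    return support, query

# Toy dataset: 20 classes with 30 image ids each.
toy = {c: ["c%d_img%d" % (c, i) for i in range(30)] for c in range(20)}

support_1, query_1 = sample_episode(toy, n_way=5, k_shot=1)
support_5, query_5 = sample_episode(toy, n_way=5, k_shot=5)
print(len(support_1), len(support_5), len(query_5))
```

Under these assumptions a 5-way 1-shot episode has 5 support images, a 5-way 5-shot episode has 25, and the query set always covers the same 5 classes as the support set.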
CloserLookFewShot/data/datamgr.py
Line 13 in b3b5b8b
I am not sure how much the performance will change, but when we are not using an ImageNet-pretrained model, I guess we should not normalize with the mean and standard deviation computed from ImageNet. When we train the weights from scratch, we should normalize using the mean and std computed from each dataset. Using the ImageNet mean for mini-ImageNet may be fine because it is part of ImageNet, but for the CUB dataset I guess we should use different values.
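If dataset-specific statistics were wanted, they could be computed per channel over the training images only. A minimal sketch in plain Python, assuming pixels are already scaled to [0, 1]; `channel_stats` and the tiny input are hypothetical, and a real pipeline would more likely use NumPy or PyTorch tensors.

```python
from statistics import pstdev

def channel_stats(images):
    """Per-channel mean and population std.

    images: iterable of images, each a list of (r, g, b) pixels in [0, 1].
    """
    channels = ([], [], [])
    for image in images:
        for pixel in image:
            for c in range(3):
                channels[c].append(pixel[c])
    mean = tuple(sum(ch) / len(ch) for ch in channels)
    std = tuple(pstdev(ch) for ch in channels)
    return mean, std

# One toy "image" with two pixels, standing in for a training set.
train_images = [[(0.0, 0.5, 1.0), (1.0, 0.5, 0.0)]]
mean, std = channel_stats(train_images)
print(mean, std)
```

The resulting tuples play the same role as the fixed ImageNet mean and std passed to a normalization transform.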
Thank you for your impressive work, and for the code too!
Given that a ResNet18 backbone raises accuracy by ~5%, as shown in your paper, I tried to train RelationNet with a ResNet18 backbone. However, the model quickly overfits and ends up at ~60% test accuracy after 60,000 episodes of training.
I was using your code with Adam as the optimizer and LR=0.001, reducing the LR by 25% when it reaches a plateau. I also applied weight_decay=0.0005 and gradient clipping (max L2 norm of 1).
I wonder what the difference is between my setting and yours, and how you achieved 70% accuracy with ResNet18+RelationNet. Thanks a lot!
After many attempts, I finally found a robust set of parameters for re-implementing the refined cosine distance method:
1. We rewrote the cosine distance code.
2. We set the scale factor s = 30 throughout training.
3. We use the SGD optimizer with momentum m=0.9 and lr=0.1 at the beginning, decayed by a factor of 0.1 at epochs 100, 200, and 300.
4. For the Conv4 network we use weight decay = 5e-4; for the ResNetX networks we use weight decay = 1e-3.
5. When fine-tuning, we set the scale factor s = 5.
Here is the result on ResNet18:
mini-ImageNet:
1-shot 56.78 +- 0.84
5-shot 75.46 +- 0.63
Thanks for your work!
I'm not clear about the role of the feature backbone in the network. Is it right that it extracts features and passes them to the different models, such as MatchingNet and RelationNet?
I also found that relationnet.py defines a RelationModule, which also consists of some layers that extract features. So I don't understand the relationship between the RelationModule and the feature backbone (Conv-4, Conv-6, ResNet-18).
You also changed the MSE loss of the relation network to softmax to expedite training.
I want to ask whether the accuracy improves with the softmax loss.
Thanks a lot!
for dataset in dataset_list:
    with open(datasetmap[dataset] + ".txt", "r") as lines:
        for i, line in enumerate(lines):
            label = line.replace('\n','')
            folderlist.append(label)
            filelists[dataset][label] = [ join(data_path,label, f) for f in listdir( join(data_path, label))]
The last line in the snippet above is giving the error:
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\cygwin64\home\Adil\CloserLookFewShot\filelists\omniglot\images\'
I am a newbie to python and pytorch, so any help would be appreciated.
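The path in the error message ends with a bare separator, which is what `os.path.join` produces when its last argument is an empty string. One possible cause, stated here only as an assumption, is a blank line or a stray carriage return in the label file. A sketch of a more defensive way to read the labels (`read_labels` is a hypothetical helper, and the label strings are placeholders):

```python
def read_labels(lines):
    """Return non-empty labels with surrounding whitespace (including CR/LF) removed."""
    labels = []
    for line in lines:
        label = line.strip()
        if label:               # skip blank lines so no empty path component is produced
            labels.append(label)
    return labels

labels = read_labels(["images_background\r\n", "\n", "images_evaluation"])
print(labels)
```

Printing the joined path just before the `listdir` call is a quick way to confirm whether an empty label is really the culprit.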
In the fine-tuning and adaptation stage (Section 4.5 of the paper), the code in test.py uses the test dataset and, in each episode, randomly splits the data into a support set and a query set without overlap for fine-tuning and evaluation, iterating over many episodes. I think the support set and query set of different episodes may overlap, which means that evaluation data may be used to fine-tune the model and thus make the test accuracy higher than it should be. Can you explain this?
thanks~
Hey,
The imagenet link to download in order to use the mini-imagenet does not seem to be working at the moment, would it be possible to upload your own .tar.gz file somewhere and download from there?
Regards
hello,
very nice work~
I am running the demo with CUB data; however, in the test stage an error happened (and I found that the code was updated just a day ago):
Traceback (most recent call last):
  File "test.py", line 108, in <module>
    model.load_state_dict(tmp['state'])
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BaselineFinetune:
    Unexpected key(s) in state_dict: "classifier.weight", "classifier.bias".
I used the command below to try running the code:
python ./train.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
but I ran into the error:
Traceback (most recent call last):
  File "./train.py", line 100, in <module>
    base_datamgr = SimpleDataManager(image_size, batch_size = 16)
  File "/media/data2/manolis/CloserLookFewShot/data/datamgr.py", line 53, in __init__
    super(SimpleDataManager, self).__init__()
TypeError: super() argument 1 must be type, not classobj
Any insight on that?? Thanks in advance.
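The message `must be type, not classobj` is what Python 2 raises when `super()` is given a class that does not inherit from `object`, so one thing to check is which interpreter is running the script. The sketch below shows a class layout that works under both major versions; `DataManager` and the class bodies are illustrative stand-ins, not the repository's classes.

```python
import sys

class DataManager(object):            # explicit `object` base: new-style class on Python 2 as well
    def __init__(self):
        self.created_with = sys.version_info[0]

class SimpleDataManager(DataManager):
    def __init__(self, image_size, batch_size):
        super(SimpleDataManager, self).__init__()   # valid on Python 2 and 3
        self.image_size = image_size
        self.batch_size = batch_size

mgr = SimpleDataManager(84, batch_size=16)
print(mgr.image_size, mgr.batch_size, mgr.created_with)
```

Running the original command with an explicit `python3` is the simpler fix if the code base targets Python 3.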
Hi there,
Do I understand correctly that, for the baseline and baseline++ models trained on the base classes, the results reported in the paper use the epoch-399 model when training on miniImagenet?
Using the same epoch for all model structures seems fair for the comparison in the paper, but I am wondering whether there is an overfitting problem that may influence the evaluation performance, since model selection is not based on a validation set for the base classes.
Thanks!
Hello, I ran the baseline and baseline++ code on CUB. However, the results are lower than the reported ones.
python train.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
python save_features.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
python test.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
The results are around 80%, much lower than 85%. Is there anything wrong? Thanks!
Hi, first of all, thank you for the nice work.
There is something in the code that confuses me. To compute the cosine distance, the implementation is the following:
x_norm = torch.norm(x, p=2, dim =1).unsqueeze(1).expand_as(x)
x_normalized = x.div(x_norm+ 0.00001)
L_norm = torch.norm(self.L.weight.data, p=2, dim =1).unsqueeze(1).expand_as(self.L.weight.data)
self.L.weight.data = self.L.weight.data.div(L_norm + 0.00001)
cos_dist = self.L(x_normalized) #matrix product by forward function
scores = self.scale_factor* (cos_dist)
I notice that in the forward pass the weight of the Linear layer is directly replaced with its normalized version. This is not simply a way of computing the cosine distance.
It means that at training time the parameter theta is updated by
theta = normed(theta)
theta = theta - learning_rate * grad(theta)
rather than by just
theta = theta - learning_rate * grad(theta)
This is a bit different from the illustration in the paper.
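The distinction can be illustrated without any framework. The sketch below computes scaled cosine scores from normalized copies of the input and the weight rows, leaving the stored weights untouched. Overwriting the stored weights with their normalized values on every forward pass, as the quoted snippet does, gives the same scores but also changes the parameters themselves. `cosine_scores`, the toy numbers, and the scale factor of 2.0 are illustrative assumptions.

```python
import math

def l2_normalize(vec, eps=1e-5):
    """Divide a vector by its L2 norm (eps avoids division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def cosine_scores(x, weight_rows, scale_factor=2.0):
    """Scaled cosine similarity between x and each row of the weight matrix."""
    x_n = l2_normalize(x)
    scores = []
    for row in weight_rows:
        row_n = l2_normalize(row)          # normalized copy; `row` itself is not modified
        scores.append(scale_factor * sum(a * b for a, b in zip(x_n, row_n)))
    return scores

weights = [[3.0, 0.0], [0.0, 5.0]]
scores = cosine_scores([2.0, 0.0], weights)
print(scores, weights)
```

In a framework with automatic differentiation, the copy-based form also lets gradients flow through the normalization, whereas assigning to the stored weight data happens outside the computation graph.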
Hi, I re-ran the Conv4-ProtoNet 5-way 5-shot experiment,
but the accuracy is about 4% higher than in the paper.
python ./train.py --dataset CUB --model Conv4 --method protonet --train_aug
python ./save_features.py --dataset CUB --model Conv4 --method protonet --train_aug
python ./test.py --dataset CUB --model Conv4 --method protonet --train_aug
I ran several experiments with the same setting and got 74-75% accuracy,
but the paper reports 70.77%.
Is something wrong?
Thanks!
Thanks for sharing the code. When I ran:
"""
python ./train.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
python ./save_features.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
python ./test.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
"""
I only get 58.7%. However, according to the paper, it can reach 62.5%. I'm wondering if I did something wrong.
Hi,
Thanks for sharing the code. For the training of each meta-learning algorithm, do you use multi-GPU? What's the GPU memory consumption for each algorithm from your experience? (Currently, I cannot run the training of MAML on a single Titan GPU)
Thanks!
Very nice job! I tried to use ResNet (10, 12, 18) as the backbone of my model, but it didn't improve performance as we expected. On the contrary, in our experiments the deeper ResNets overfit much more easily than the 4-layer conv net, which confused us. I wonder if you ever encountered a similar problem in your experiments. Thank you very much.
Hi,
Firstly, I'd like to say that I really like your paper. You've done some excellent work.
My situation is that the goal of my master's thesis is to implement and comparatively evaluate few-shot learning methods, but then your paper arrived. I have not started my thesis yet, but for specific reasons the topic might be hard to change.
I am still discussing this with my supervisor, but it is possible that I will extend your project. Thus, I would like to ask whether you are still continuing this work. I will look for a different direction if you are.
Thank you.
Hi there,
Thanks for the great code and the fantastic work at ICLR 2019. Recently, I used the original code to test Matching Network and found that the result is about 1% higher than the reported one. Is this normal?
Hi,
It is nice work.
I am now facing some problems when running your code, so I want to make sure that I use the same PyTorch version as you. Which PyTorch version did you use when you implemented this project?
Thanks for your work!
I have a question: is the classifier at the test stage new?
I can't find where the classifier loads trained weights in the code.
Hi, I've run the baseline Conv4 on Omniglot. I've noticed that for 1-shot the result is far from the reported 94%. Any idea why?
Time: 20190416-134341, Setting: omniglot-novel-Conv4S-baseline 5shot 5way_test, Acc: 600 Test Acc = 98.62% +- 0.15%
Time: 20190416-133451, Setting: omniglot-novel-Conv4S-baseline 1shot 5way_test, Acc: 600 Test Acc = 89.30% +- 0.58%
The scripts used to run these experiments are:
python3 train.py --dataset omniglot --model Conv4 --method baseline --num_classes 4112
python3 save_features.py --dataset omniglot --model Conv4 --method baseline
python3 test.py --dataset omniglot --model Conv4 --method baseline --n_shot 5
python3 test.py --dataset omniglot --model Conv4 --method baseline --n_shot 1
I have trained your code with loss_type = 'dist' and train_aug (baseline++), but the 5-shot accuracy on mini-ImageNet is only 56%, much lower than your reported 66%. Could you give a reasonable explanation?
Hi,
Thanks for your contribution; the code is really well organized and well written.
From the code, I got the following understanding.
Take miniImageNet and ProtoNet training as an example:
The number of epochs: 400 for 5-shot.
In each epoch, the number of batches is n_episodes (default 100 in the code).
So the total number of episodes is 40,000 (400 * 100), which matches what you report in the paper.
However, I am confused about the following; please correct me if my understanding is wrong:
In each batch, you use EpisodicBatchSampler. This BatchSampler will be called n_episodes times for a batch; more specifically, the function __iter__ will be called n_episodes times.
However, inside __iter__ there is also an iteration over range(n_episodes), and each iteration yields an N-way iterator.
Does this mean that, if an episode is defined as one set of N-way classes, the actual number of episodes per epoch is 100*100?
Thanks!
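The counting question can be checked with a small stand-in for a batch sampler. In PyTorch, a DataLoader obtains one iterator from its batch sampler per pass over the data and builds one batch from each value that iterator yields, so a loop over `range(n_episodes)` inside `__iter__` produces `n_episodes` episodes per epoch rather than `n_episodes` squared. The class below is a plain-Python sketch written for this illustration (it uses `random.sample` rather than tensors) and is not copied from the repository.

```python
import random

class EpisodicBatchSampler:
    """Yields n_episodes lists of n_way class indices per pass."""

    def __init__(self, n_classes, n_way, n_episodes):
        self.n_classes = n_classes
        self.n_way = n_way
        self.n_episodes = n_episodes

    def __len__(self):
        return self.n_episodes

    def __iter__(self):
        for _ in range(self.n_episodes):
            yield random.sample(range(self.n_classes), self.n_way)

sampler = EpisodicBatchSampler(n_classes=64, n_way=5, n_episodes=100)

episodes_per_epoch = 0
for class_ids in sampler:        # one pass over the sampler == one epoch
    episodes_per_epoch += 1

total_episodes = 400 * episodes_per_epoch   # 400 epochs, as in the question above
print(episodes_per_epoch, total_episodes)
```

Each `for` loop over the sampler calls `__iter__` once, which is why the inner `range(n_episodes)` loop is the only place where the episode count is multiplied.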
Hi~
Thanks for your work!
I found that in your code all the data is normalized before being sent into the models. Is data normalization necessary, and is it important to the final performance?
Besides, data normalization requires the mean and std of the data. Do you use the entire dataset to calculate them, or are they computed using only the training data?
Thanks a lot!
Hello, I have run your code
python ./train.py --dataset miniImagenet --method protonet --model ResNet18 --train_aug
python ./save_features.py --dataset miniImagenet --method protonet --model ResNet18 --train_aug
python ./test.py --dataset miniImagenet --method protonet --model ResNet18 --train_aug
but the maximum test accuracy I got is 72.30%, while the reported one is 73.68%.
My PyTorch version is 1.0.0.
Could you give me some explanation and advice? Thank you!
Hi,
If I want to train baseline++ with the few-shot training method, I just need to change the dataloader class, right? I ask because in train.py the baseline and baseline++ are currently not trained in the few-shot manner.
Can I use the miniImagenet from
https://drive.google.com/file/d/1fJAK5WZTjerW7EWHHQAR9pRJVNg1T1Y7/view?usp=sharing
It's used in MetaOptNet (https://github.com/kjunelee/MetaOptNet), and it's much smaller.
Hello @wyharveychen,
In test.py, baselinefinetune.py was called for both baseline and baseline++; however, in train.py, baselinetrain.py was called. I'm not sure which one produced the results for the Baseline and Baseline++ few-shot classification methods in Figure 1 of your paper. It seems to me that one of them is no longer up to date. Thanks!
UPDATE
I re-read the section and figured this out, thanks!
Hello, I ran your code on miniImageNet. When I use the dataset with image size 84x84, the results cannot reach the reported accuracy. I then read your code that builds the dataset and found that you make miniImageNet from the original ImageNet images, so I wonder whether the image size in ./filelists/miniImagenet is 84x84 or larger?
I changed each \ to a double \ in the JSON files,
but got this error:
C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\site-packages\torchvision\transforms\transforms.py:208: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
Traceback (most recent call last):
  File "./train.py", line 177, in <module>
    model = train(base_loader, val_loader, model, optimization, start_epoch, stop_epoch, params)
  File "./train.py", line 32, in train
    model.train_loop(epoch, base_loader, optimizer ) #model are called by reference, no need to return
  File "D:\idm\CloserLookFewShot-master\CloserLookFewShot-master\methods\baselinetrain.py", line 38, in train_loop
    for i, (x,y) in enumerate(train_loader):
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 212, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 313, in _Popen
    return Popen(process_obj)
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\popen_spawn_win32.py", line 66, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x000001EB82AC5598>: attribute lookup <lambda> on data.dataset failed
  File "<string>", line 1, in <module>
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 116, in _main
    self = pickle.load(from_parent)
EOFError: Ran out of input
What should I do?
Hi Harvey,
In your paper it was stated that
"For each class, we pick k labeled instances as our support set and 16 instances for the query set for a k-shot task"
However, in the feature_evaluation function in your test.py code, the default number of query instances per class is 15: n_query = 15.
Since the setting used by Ravi et al. was also 15 samples per class for D_test, was it a minor typographical error in your paper?
Does the learning rate always stay the same (default 0.001 in the Adam optimizer)? If not, could you tell me how you set it in your experiments? Thanks!
I tried to run the script 'download_miniImagenet.sh', and the program broke down with the error messages below:
--2019-05-27 09:53:56-- https://raw.githubusercontent.com/twitter/meta-learning-lstm/master/data/miniImagenet/test.csv%0D
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.76.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.76.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2019-05-27 09:53:56 ERROR 404: Not Found.
': [Errno 2] No such file or directorygenet_filelist.py
': [Errno 2] No such file or directoryilelist.py
I think the main problem is that the file 'test.csv' is not available on the server. If my guess is correct, could you please upload this file to GitHub, since it is not so big? If there are other causes, could you give me a hand? Thanks.
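The `%0D` at the end of the requested URL is a URL-encoded carriage return, and the overwritten-looking `No such file or directory` lines are also typical of carriage returns in file names. That pattern usually means the shell script has Windows (CRLF) line endings. A minimal sketch of converting a script to Unix line endings, assuming that is the cause; `to_unix_line_endings`, `convert_file`, and the sample commands are hypothetical, and tools such as `dos2unix` do the same job.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Replace Windows CRLF line endings with LF."""
    return data.replace(b"\r\n", b"\n")

def convert_file(path):
    """Rewrite `path` in place with LF line endings."""
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(to_unix_line_endings(data))

# Illustrative script contents, not the real download script.
sample = b"wget https://example.com/test.csv\r\npython make_filelist.py\r\n"
converted = to_unix_line_endings(sample)
print(converted)
```

After conversion the URL and script names no longer carry a trailing carriage return, so the download and the follow-up Python calls receive the intended arguments.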
Hi, I have a question regarding your augmentation methods. In your code, scaling and center crop are still applied when augmentation is False, in both the training and validation stages. In the repo jakesnell/prototypical-networks, only scaling is applied. Is this intended? If it is, is there any specific reason?