wyharveychen / closerlookfewshot


source code to ICLR'19, 'A Closer Look at Few-shot Classification'

License: Other

Python 98.11% Shell 1.89%

closerlookfewshot's Issues

How to support multi-gpu?

Hi, I want to use the ResNet in backbone.py for another task, but when I use multiple GPUs I get:

AttributeError: 'Tensor' object has no attribute 'fast'

It seems that the added attribute "fast" is not carried over to the per-GPU replicas created by
model = nn.DataParallel(model.to(device)). Could you please give me some suggestions? Thanks a lot.

Random seed not assigned: Trying to replicate the CUB results

Thanks for releasing the code.

We are trying to replicate your results on the CUB dataset for comparison purposes.
write_CUB_filelist.py doesn't seem to set a random seed, so the file list changes on every run.

CloserLookFewShot/filelists/CUB/write_CUB_filelist.py

Line 25 in 4ca19c7

random.shuffle(classfile_list_all[i])

Can you please release the json you generated for running the experiments in the paper?
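
Until the original file list is available, one way to make a split repeatable is to shuffle with a dedicated, seeded generator. This is a minimal sketch, assuming the script shuffles each class's file list as the quoted line does; `shuffle_class_files` and `SEED` are hypothetical names, and a fixed seed reproduces your own split, not the split used in the paper.

```python
import random

SEED = 0  # hypothetical value; any fixed integer gives a repeatable order


def shuffle_class_files(classfile_list_all, seed=SEED):
    """Return a shuffled copy of each per-class file list, reproducibly."""
    rng = random.Random(seed)  # private generator, independent of global state
    shuffled = []
    for files in classfile_list_all:
        files = list(files)  # copy so the caller's lists are left untouched
        rng.shuffle(files)
        shuffled.append(files)
    return shuffled


if __name__ == "__main__":
    demo = [["a.jpg", "b.jpg", "c.jpg"], ["d.jpg", "e.jpg"]]
    # Two calls with the same seed give the same order.
    print(shuffle_class_files(demo) == shuffle_class_files(demo))
```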

json decode error on default training

I get this error on training
json.decoder.JSONDecodeError: Invalid \escape: line 1 column 4570 (char 4569)
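
An "Invalid \escape" error is what the JSON parser reports when a string contains a raw backslash followed by a character that is not a valid escape. One plausible cause, assuming the file list holds Windows-style paths written into the file by hand, is sketched below; the key names and `dump_filelist` are illustrative only.

```python
import json


def dump_filelist(image_names, image_labels):
    """Serialize a file list; json.dumps escapes backslashes in paths."""
    return json.dumps({"image_names": image_names, "image_labels": image_labels})


if __name__ == "__main__":
    windows_path = "C:\\data\\CUB\\001.jpg"  # one real backslash per separator

    # Hand-built JSON keeps the raw backslashes, so "\d" is an invalid escape.
    hand_built = '{"image_names": ["' + windows_path + '"]}'
    try:
        json.loads(hand_built)
    except json.JSONDecodeError as err:
        print("hand-built JSON failed:", err.msg)

    # json.dumps doubles each backslash, so the text round-trips cleanly.
    text = dump_filelist([windows_path], [0])
    print(json.loads(text)["image_names"][0] == windows_path)
```

Writing paths with forward slashes would avoid the problem as well, since they need no escaping in JSON.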

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 21 and 7 in dimension 1 at /opt/conda/conda-bld/pytorch_1549635019666/work/aten/src/TH/generic/THTensorMoreMath.cpp:1307

How to run domain adaptation test?

Hello,

This code base is great for learning few-shot learning! I was able to follow the train/save_feature/test steps for the baseline and meta-learning algorithms separately and obtained results similar to the paper. I'd like to try domain adaptation with these methods too. How should I run it to see adaptation from miniImagenet to CUB? What is the command line for it? For example,

python ./train.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
python ./save_feature.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
python ./test.py --dataset CUB --model Conv4 --method baseline --train_aug --adaptation

Is the third command the right way to fine-tune the baseline on the new domain (CUB) using the embedding trained on miniImagenet? How do I tell it from the command line to load the miniImagenet-trained embedding?

Thank you very much in advance!!

experiments over tiered ImageNet dataset

Hi, I wanted to know if you had run baseline/baseline++ over the tiered ImageNet dataset. If so, can you share the performance results over it also?

Can't reach specified accuracy on CUB dataset ResNet10

I get around 80.78% accuracy (85.17% in the paper) on 5-shot with ResNet10 baseline++ and augmentation on.
Training command: python train.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
Testing command: python test.py --dataset CUB --model ResNet10 --method baseline++ --train_aug

Clarify # of epochs

Hi, thanks for releasing code!

Just to clarify, in the paper you train "60,000 episodes for 1-shot tasks" - say I'm running ProtoNets on the CUB dataset - by default each epoch consists of 100 sampled episodes, correct? So the results in the paper are reported with the best model (according to val) after 600 epochs?
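
The arithmetic behind the question can be checked directly. This sketch assumes, as the question does, that an epoch is 100 sampled episodes; `epochs_for` is a hypothetical helper.

```python
def epochs_for(total_episodes, episodes_per_epoch=100):
    """Number of epochs needed to see `total_episodes` training episodes."""
    if total_episodes % episodes_per_epoch:
        raise ValueError("total_episodes is not a multiple of episodes_per_epoch")
    return total_episodes // episodes_per_epoch


if __name__ == "__main__":
    print(epochs_for(60000))  # 1-shot budget quoted from the paper -> 600
```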

params.num_class in Baseline/Baseline++ method

In the training stage of the Baseline/Baseline++ methods, it seems that num_class is always set to 200. But every dataset has a specific number of base classes (for example, miniImageNet has 64). Is it set to 200 for convenience, so that nobody needs to modify this parameter when running experiments on both CUB and miniImageNet? I think it would be better to make num_class an optional command-line argument, or to set it to 64 for miniImageNet and 50 for CUB.
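
The suggestion could look roughly like the following. It is a sketch, not the repository's argument parser: the per-dataset table is an assumption built from the 64-class figure quoted above, and 200 is kept as the fallback. An output layer larger than the number of base classes still trains, because the extra logits are simply never the target.

```python
import argparse

# Base-class count quoted in the question above; treat it as an assumption.
DEFAULT_NUM_CLASSES = {"miniImagenet": 64}
FALLBACK_NUM_CLASSES = 200  # the fixed value the question describes


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="baseline / baseline++ training")
    parser.add_argument("--dataset", default="CUB")
    parser.add_argument("--num_classes", type=int, default=None,
                        help="size of the classifier output layer")
    params = parser.parse_args(argv)
    if params.num_classes is None:
        # Fall back to a per-dataset default, then to the old fixed value.
        params.num_classes = DEFAULT_NUM_CLASSES.get(
            params.dataset, FALLBACK_NUM_CLASSES)
    return params


if __name__ == "__main__":
    print(parse_args(["--dataset", "miniImagenet"]).num_classes)
```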

datamgr get error

When I run the code with train_aug=True, I get the error:
TypeError: __init__() missing 1 required positional argument: 'size'
Could you help me solve it? Thanks

Can't reach reported accuracy on MiniImagenet dataset of baseline++ with ResNet18

Hello, Thanks for your great work.
I have run your code with the following command several times, but the final accuracy is always about 73.99, which is lower than the 75.68 reported in the paper. Am I missing something? Or do I have to tune some hyperparameters?

for file in train save_features test; do
    python ./${file}.py --dataset miniImagenet --model ResNet18 --method baseline++ --train_aug --n_shot 5
done

Thanks very much.

doubt regarding 1 shot n way

In ordinary image classification,

we have a training set of inputs x, and the loss is computed from whether the network predicts the correct label, i.e. from the difference between the ground truth and the predicted label.

So we get a training set loss, then test the trained model on some test images and compute the test set loss.

When using MAML, we have episodes (tasks) within our meta-training set, and each episode contains a training set (support set) and a test set (query set). Episodes within the meta-test set likewise contain a training set and a test set.

I am a bit confused about what 1-shot 5-way or 5-shot 5-way means, and how the loss is computed.

Does 1-shot 5-way mean that we have 1 image for each of the 5 classes in the training set (support set), compute softmax probabilities over these 5 classes, and compute the training set loss accordingly,

and then give the trained model a test input from the query set and compute the test set loss?

And does 5-shot 5-way mean that

we have 5 images for each of the 5 classes in the training set (support set), so 25 images, compute softmax probabilities over these 5 classes, and take the training set loss for one epoch to be the sum of the losses on these 25 images,

while the test set (query set) also contains 5 images of the same class, which are given to the trained model, and we sum the loss over these 5 images?

I am confused. Can you clarify what the loss means when training MAML?

thanks
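
In the usual convention, N-way K-shot means each episode contains N classes with K labeled support images per class, and the query images are drawn from the same N classes rather than from a single class. The sketch below only illustrates how such an episode is assembled; `sample_episode` and the toy data are hypothetical, and the default of 15 query images per class is an assumption.

```python
import random


def sample_episode(images_by_class, n_way=5, n_shot=1, n_query=15, rng=random):
    """Draw one N-way K-shot episode as (support, query) lists of (image, label)."""
    classes = rng.sample(sorted(images_by_class), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):  # labels are re-indexed 0..n_way-1
        picked = rng.sample(images_by_class[cls], n_shot + n_query)
        support += [(img, label) for img in picked[:n_shot]]
        query += [(img, label) for img in picked[n_shot:]]
    return support, query


if __name__ == "__main__":
    toy = {c: ["%s_%d.jpg" % (c, i) for i in range(30)] for c in "abcdefgh"}
    support, query = sample_episode(toy, n_way=5, n_shot=1)
    print(len(support), len(query))  # 5 support images, 75 query images
```

With n_shot=5 the support set grows to 25 images while the query set stays at 5 * 15 = 75. How the loss is computed on the two sets depends on the method and is not shown here.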

Should not use ImageNet mean and std?

CloserLookFewShot/data/datamgr.py

Line 13 in b3b5b8b

 normalize_param = dict(mean= [0.485, 0.456, 0.406] , std=[0.229, 0.224, 0.225]), 

I am not sure how much performance change will happen, but when we are not using ImageNet pretrained model, I guess we should not normalize with the mean and standard deviation computed from ImageNet. When we train weights from scratch, we should normalize using mean and std computed from each dataset. Maybe using ImageNet mean for mini-Imagenet is fine because it's part of ImageNet but for CUB dataset, I guess we should use different one.
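
Dataset-specific statistics of the kind proposed here can be computed in one pass. This is a plain-Python sketch that assumes images are available as sequences of (r, g, b) pixels scaled to [0, 1]; `channel_mean_std` is a hypothetical helper, and in practice it would be run over the training (base) split only.

```python
import math


def channel_mean_std(images):
    """Per-channel mean and std over an iterable of images.

    Each image is an iterable of (r, g, b) pixels with values in [0, 1].
    """
    count = 0
    total = [0.0, 0.0, 0.0]
    total_sq = [0.0, 0.0, 0.0]
    for image in images:
        for pixel in image:
            count += 1
            for c in range(3):
                total[c] += pixel[c]
                total_sq[c] += pixel[c] ** 2
    if count == 0:
        raise ValueError("no pixels")
    mean = [t / count for t in total]
    # Population variance: E[x^2] - E[x]^2, clamped at zero against rounding.
    std = [math.sqrt(max(sq / count - m * m, 0.0))
           for sq, m in zip(total_sq, mean)]
    return mean, std


if __name__ == "__main__":
    tiny = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]  # one image with two pixels
    print(channel_mean_std(tiny))
```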

Why is the batch_size 16 in baseline and baseline++? Can I change the value to 128 or 256?

Details on training with ResNet

Thank you for your impressive work, and the code too!
Since your paper shows that a ResNet18 backbone raises accuracy by ~5%, I tried to train RelationNet with a ResNet18 backbone. However, the model quickly overfits and ends up at ~60% test accuracy after 60,000 training episodes.
I was using your code with Adam as the optimizer and LR=0.001, reducing the LR by 25% on a plateau. I also applied weight_decay=0.0005 and gradient clipping (L2 norm of 1).

I wonder where's the difference between my setting and yours, and how you achieved the 70% accuracy in ResNet18+RelationNet. Thanks a lot!

Finally found proper hyperparameters for mini-ImageNet

After a lot of attempts, I finally found robust parameters to re-implement the refined cosine distance method:
1. We rewrote the cosine distance code.
2. We set the scale factor s = 30 throughout training.
3. We use the SGD optimizer with momentum m = 0.9 and lr = 0.1 at the beginning, decayed by a factor of 0.1 at epochs 100, 200 and 300.
4. For the Conv4 network we use weight decay = 5e-4; for ResNetX networks we use weight decay = 1e-3.
5. When fine-tuning, we set the scale factor s = 5.

Here are the results on ResNet18:
mini-ImageNet:
1 shot 56.78+0.84
5 shot 75.46+0.63
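
The learning-rate schedule in step 3 can be written as a small function. This is a sketch of the schedule as described, assuming the decay takes effect once the epoch counter reaches each milestone; `step_lr` is a hypothetical name.

```python
def step_lr(epoch, base_lr=0.1, milestones=(100, 200, 300), gamma=0.1):
    """Learning rate at `epoch` under step decay at the given milestones."""
    passed = sum(1 for m in milestones if epoch >= m)  # milestones reached so far
    return base_lr * gamma ** passed


if __name__ == "__main__":
    for epoch in (0, 99, 100, 200, 300):
        print(epoch, step_lr(epoch))
```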

The effect of feature backbone, like Conv-4, Resnet18

Thanks for your work!

I'm not clear about the role of the feature backbone in the network. Is it right that it extracts features and passes them to the different models, like MatchingNet and RelationNet?

I also found that relationnet.py defines a RelationModule, which also contains layers that extract features. So I don't understand the relationship between the RelationModule and the feature backbone (Conv-4, Conv-6, ResNet-18).

You also changed the MSE loss of the relation network to softmax to speed up training.
Does the softmax loss also improve accuracy?

Thanks a lot!

write_omniglot_filelist

    for dataset in dataset_list:
        with open(datasetmap[dataset] + ".txt", "r") as lines:
            for i, line in enumerate(lines):
                label = line.replace('\n','')
                folderlist.append(label)
                filelists[dataset][label] = [ join(data_path,label, f) for f in listdir( join(data_path, label))]

The last line in the snippet above is giving the error:

OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\cygwin64\home\Adil\CloserLookFewShot\filelists\omniglot\images\'

I am a newbie to python and pytorch, so any help would be appreciated.
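
The path in the error ends right after the images directory, which suggests the label joined onto it was empty or contained only line-ending characters (for example a blank line, or a "\r" left over from Windows line endings). That is a guess rather than a confirmed diagnosis. A defensive sketch, with `read_labels` as a hypothetical helper:

```python
from os.path import join


def read_labels(lines):
    """Return non-empty labels, one per line, with line endings removed."""
    labels = []
    for line in lines:
        label = line.strip()  # drops "\n", "\r\n" and stray spaces
        if label:             # skip blank lines so join() never receives ""
            labels.append(label)
    return labels


if __name__ == "__main__":
    raw = ["Latin\n", "\n", "Greek\r\n"]
    print(read_labels(raw))
    # An empty label yields a path that ends in a separator, as in the error.
    print(repr(join("images", "")))
```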

There is an error in the code that generates the json file in ./filelists/miniImagenet .

Doubt about the correctness of testing accuracy of fine-tuning stage and adaption stage.

In the fine-tuning and adaptation stages (Section 4.5 of the paper), the code in 'test.py' takes the test data set and, in each episode, randomly splits it into a support set and a query set without overlap, for fine-tuning and evaluation respectively, and repeats this for many episodes. I think the support set of one episode may overlap with the query set of another, which means evaluation data may be used to fine-tune the model, making the test accuracy higher than it should be. Can you explain this?

Can you release miniImageNet directly? ILSVRC is too large...

thanks~

imagenet link does not work

Hey,

The imagenet link to download in order to use the mini-imagenet does not seem to be working at the moment, would it be possible to upload your own .tar.gz file somewhere and download from there?

Regards

test.py _ classifier.weight

hello,

  very nice work~
  I am running the demo with CUB data; however, an error happened in the test stage (and I see that the code was updated just a day ago~):

Traceback (most recent call last):
File "test.py", line 108, in
model.load_state_dict(tmp['state'])
File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BaselineFinetune:
Unexpected key(s) in state_dict: "classifier.weight", "classifier.bias".

Error running the code

I used the below command to try running the code:
python ./train.py --dataset miniImagenet --model Conv4 --method baseline --train_aug

but I run into the error:
Traceback (most recent call last):
File "./train.py", line 100, in
base_datamgr = SimpleDataManager(image_size, batch_size = 16)
File "/media/data2/manolis/CloserLookFewShot/data/datamgr.py", line 53, in __init__
super(SimpleDataManager, self).__init__()
TypeError: super() argument 1 must be type, not classobj

Any insight on that?? Thanks in advance.
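
"super() argument 1 must be type, not classobj" is the error Python 2 raises when `super()` is used with an old-style class, that is, one that does not inherit from `object`. Running under Python 3 avoids it, since every class is new-style there. If Python 2 has to be used, an explicit `object` base works on both versions; the class bodies below are illustrative stand-ins, not the repository's code.

```python
class DataManager(object):  # explicit `object` base: new-style under Python 2 too
    def __init__(self):
        self.ready = True


class SimpleDataManager(DataManager):
    def __init__(self, image_size, batch_size):
        super(SimpleDataManager, self).__init__()  # valid on Python 2 and 3
        self.image_size = image_size
        self.batch_size = batch_size


if __name__ == "__main__":
    mgr = SimpleDataManager(84, batch_size=16)
    print(mgr.ready, mgr.image_size, mgr.batch_size)
```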

Model Selection for Baseline and Baseline++ Training with Base Classes

Hi there,

Do I understand correctly that, for the baseline and baseline++ models trained on the base classes, the results reported in the paper use the epoch-399 model when training on miniImagenet?

Using the same epoch for every model structure seems fair for comparison in the paper. However, could overfitting influence the evaluation performance, given that model selection is not based on a validation set for the base classes?

Thanks!

the accuracies about baseline and baseline++

Hello, I ran the code for baseline and baseline++ on CUB. However, the results are lower than the reported ones.
python train.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
python save_features.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
python test.py --dataset CUB --model ResNet10 --method baseline++ --train_aug
The results are around 80%, much lower than 85%. Is there anything wrong? Thanks!

questions about cosine distance in the code

Hi, thank you for the nice work first.
Something in the code confuses me. To compute the cosine distance, the implementation is the following:

    x_norm = torch.norm(x, p=2, dim =1).unsqueeze(1).expand_as(x)

    x_normalized = x.div(x_norm+ 0.00001)

    L_norm = torch.norm(self.L.weight.data, p=2, dim =1).unsqueeze(1).expand_as(self.L.weight.data)

    self.L.weight.data = self.L.weight.data.div(L_norm + 0.00001)

    cos_dist = self.L(x_normalized) #matrix product by forward function

    scores = self.scale_factor* (cos_dist)

I notice that in the forward pass the weight of the linear layer is directly replaced by its normalized version. This is not simply a way of computing the cosine distance.
It means that at training time the parameter theta is updated by:
theta = normed(theta)
theta = theta - learning_rate * grad(theta)
rather than just
theta = theta - learning_rate * grad(theta).

This is a bit different from the description in the paper.
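
The distinction raised here, normalizing inside the score computation versus overwriting the stored weights, can be illustrated in plain Python. This is a sketch and not the repository's layer: `cosine_scores` reads the weight rows without modifying them, the epsilon matches the 0.00001 in the snippet above, and the default `scale_factor` is an arbitrary placeholder.

```python
import math


def l2_normalize(vec, eps=1e-5):
    """Return a new list: `vec` divided by its L2 norm (plus eps)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def cosine_scores(x, weight, scale_factor=2.0):
    """Scaled cosine similarity between feature `x` and each row of `weight`.

    `weight` is only read here; the stored rows keep their original norms.
    """
    x_n = l2_normalize(x)
    return [scale_factor * sum(a * b for a, b in zip(x_n, l2_normalize(w)))
            for w in weight]


if __name__ == "__main__":
    weight = [[2.0, 0.0], [0.0, 3.0]]
    print(cosine_scores([1.0, 0.0], weight, scale_factor=1.0))
    print(weight)  # unchanged after scoring
```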

About the ProtoNet with Conv4 backbone result

Hi, I re-ran the Conv4 ProtoNet 5-way 5-shot experiment,
but the accuracy is about 4% higher than in the paper.

python ./train.py --dataset CUB --model Conv4 --method protonet --train_aug
python ./save_features.py --dataset CUB --model Conv4 --method protonet --train_aug
python ./test.py --dataset CUB --model Conv4 --method protonet --train_aug

I ran several experiments with the same setting and got 74-75% accuracy,
but the paper reports 70.77%.
Is there something wrong?

Thanks!

Did you use weight decay for ResNet backbone to get the results in your paper?

About accuracy of baseline in miniImagenet

Thanks for sharing the code. When I ran:
"""
python ./train.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
python ./save_features.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
python ./test.py --dataset miniImagenet --model Conv4 --method baseline --train_aug
"""
I only get 58.7%. However, according to the paper, it can reach 62.5%. I'm wondering if I did something wrong.

GPU memory issue

Hi,

Thanks for sharing the code. For the training of each meta-learning algorithm, do you use multi-GPU? What's the GPU memory consumption for each algorithm from your experience? (Currently, I cannot run the training of MAML on a single Titan GPU)

Thanks!

Some question about resnet

Very nice job!!! I tried to use ResNet (10, 12, 18) as the backbone of my model, but it didn't improve performance as we expected. On the contrary, in our experiments the deeper ResNets overfit much more easily than the 4-layer conv network, which confused us. I wonder if you encountered a similar problem during your experiments. Thank you very much.

Extend your work for my thesis?

Hi,

Firstly, I'd like to say that I really like your paper. You've done some excellent work.

The goal of my master's thesis is to implement and comparatively evaluate few-shot learning methods, but then your paper arrived. I have not started my thesis yet, but for specific reasons the topic might be hard to change.

I am still discussing this with my supervisor, but it is possible that I will extend your project. So I would like to ask whether you are still continuing this work. I will look for a different direction if you are.

Thank you.

Some Question for the evaluation Results

Hi there,

Thanks for the great code and the fantastic work at ICLR 2019. Recently I used the original code to test Matching Network and found the result is about 1% higher than the reported one. Is this normal?

pytorch version

Hi,
This is nice work.
I am facing some problems when running your code, so I want to make sure that I use the same PyTorch version as you. Which PyTorch version did you use when you implemented this project?

At the fine-tune stage, is the classifier new rather than fine-tuned from the trained one?

Thanks for your work!
I have a question: is the classifier at the test stage new?
I can't find where the code loads trained weights into the classifier.

Can't reach reported accuracy on Omniglot

Hi, I've run the baseline Conv4 on Omniglot. I've noticed that for 1-shot the result is far from the reported 94%. Any idea why?

Time: 20190416-134341, Setting: omniglot-novel-Conv4S-baseline 5shot 5way_test, Acc: 600 Test Acc = 98.62% +- 0.15%
Time: 20190416-133451, Setting: omniglot-novel-Conv4S-baseline 1shot 5way_test, Acc: 600 Test Acc = 89.30% +- 0.58%

The scripts used to run these experiments are:

python3 train.py --dataset omniglot --model Conv4 --method baseline --num_classes 4112
python3 save_features.py --dataset omniglot --model Conv4 --method baseline
python3 test.py --dataset omniglot --model Conv4 --method baseline --n_shot 5
python3 test.py --dataset omniglot --model Conv4 --method baseline --n_shot 1

can't reach the accuracy as report

I trained your code with loss_type = 'dist' and train_aug (baseline++), but the 5-shot accuracy on mini-ImageNet is only 56%, much lower than the 66% you report. Could you give a reasonable explanation?

Questions on SetDataManager

Hi,

Thanks for your contribution; the code is really well organized and well written.

From the code, I got the following understanding.
Take miniImageNet and ProtoNet training as an example:
The number of epochs is 400 for 5-shot.
In each epoch, the number of batches is n_episodes (default 100 in the code).
So the total number of episodes is 40,000 (400 * 100), which matches what you report in the paper.

However, I am confused by the following; please correct me if my understanding is wrong.

In each batch, you use EpisodicBatchSampler. This batch sampler will be called n_episodes times for a batch; more specifically, the function __iter__ will be called n_episodes times.

However, inside __iter__ there is also an iteration over range(n_episodes), and each pass of that iteration yields an N-way iterator.

Does this mean that, if an episode is defined as one set of N-way classes, the actual number of episodes per epoch is 100*100?

Thanks!
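
A data loader normally calls `__iter__` on its batch sampler once per pass over the data, and every `yield` inside it becomes one batch. Under that assumption the loop over range(n_episodes) is the only loop, so an epoch contains n_episodes episodes rather than n_episodes squared. The class below is a standard-library stand-in written for illustration, not the repository's sampler.

```python
import random


class EpisodicSampler(object):
    """Yield `n_episodes` batches per pass, each a list of `n_way` class ids."""

    def __init__(self, n_classes, n_way, n_episodes):
        self.n_classes = n_classes
        self.n_way = n_way
        self.n_episodes = n_episodes

    def __len__(self):
        return self.n_episodes

    def __iter__(self):
        # One call to __iter__ produces the whole epoch's episodes.
        for _ in range(self.n_episodes):
            yield random.sample(range(self.n_classes), self.n_way)


if __name__ == "__main__":
    sampler = EpisodicSampler(n_classes=64, n_way=5, n_episodes=100)
    episodes = list(sampler)  # a single pass, as one epoch would make
    print(len(episodes), len(episodes[0]))
```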

Dataset Normalization

Hi~

Thanks for your work!

I found that in your code all the data is normalized before being sent to the models. Is data normalization necessary, and is it important for the final performance?

Besides, data normalization requires the mean and std of the data. Do you use the entire dataset to calculate the mean and std, or are they computed using only the training data?

Thanks a lot!

accuracy about miniImagenet protonet ResNet18 train_aug

Hello, I have run your code:
python ./train.py --dataset miniImagenet --method protonet --model ResNet18 --train_aug
python ./save_features.py --dataset miniImagenet --method protonet --model ResNet18 --train_aug
python ./test.py --dataset miniImagenet --method protonet --model ResNet18 --train_aug
but the best test accuracy I get is 72.30%, while the reported one is 73.68%.
My PyTorch version is 1.0.0.
Could you give me some explanation and advice? Thank you!

Baseline and Baseline++ #way#shot method

Hi :
I am wondering: if I want to train baseline++ with the few-shot training method, do I just change the dataloader class? Right now in train.py, baseline and baseline++ are not trained in the few-shot manner.

Can I use the miniImagenet from here?

Can I use the miniImagenet from
https://drive.google.com/file/d/1fJAK5WZTjerW7EWHHQAR9pRJVNg1T1Y7/view?usp=sharing
It's used in MetaOptnet. https://github.com/kjunelee/MetaOptNet. And it's much smaller.

Is `baselinefinetune.py` or `baselinetrain.py` used in your paper?

Hello @wyharveychen,

In test.py, baselinefinetune.py was called for both baseline and baseline++, however, in train.py baselinetrain.py was called. I'm not sure which one is the result used in your paper Figure 1 Baseline and Baseline++ few-shot classification methods. It seems to me that either one is no longer up-to-date. Thanks!

UPDATE
I re-read the section and figured this out, thanks!

pixel size of the miniImagenet in this work

Hello, I ran your code on miniImageNet. When I use the dataset with image size 84x84, the results do not reach the reported accuracy. I then read your code that builds the dataset and found that you make miniImageNet from the original ImageNet images, so I wonder whether the image size in ./filelists/miniImagenet is 84x84 or larger?

issues with '\' and '/' when running on windows

I changed the \ to double \ in the json files
but got this error

C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\site-packages\torchvision\transforms\transforms.py:208: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
"please use transforms.Resize instead.")
Traceback (most recent call last):
File "./train.py", line 177, in
model = train(base_loader, val_loader, model, optimization, start_epoch, stop_epoch, params)
File "./train.py", line 32, in train
model.train_loop(epoch, base_loader, optimizer ) #model are called by reference, no need to return
File "D:\idm\CloserLookFewShot-master\CloserLookFewShot-master\methods\baselinetrain.py", line 38, in train_loop
for i, (x,y) in enumerate(train_loader):
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 819, in iter
return _DataLoaderIter(self)
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\site-packages\torch\utils\data\dataloader.py", line 560, in init
w.start()
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 212, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 313, in _Popen
return Popen(process_obj)
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\popen_spawn_win32.py", line 66, in init
reduction.dump(process_obj, to_child)
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\reduction.py", line 59, in dump
ForkingPickler(file, protocol).dump(obj)

_pickle.PicklingError: Can't pickle <function at 0x000001EB82AC5598>: attribute lookup on data.dataset failed

File "", line 1, in
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 106, in spawn_main
exitcode = _main(fd)
File "C:\Users\anonymous218\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 116, in _main
self = pickle.load(from_parent)
EOFError: Ran out of input

what to do?
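
On Windows, data-loader worker processes are started with "spawn", which pickles the dataset object and everything it references. The PicklingError above names a function in data.dataset; assuming that function is an anonymous lambda, a module-level named function is one way around it. The sketch below only demonstrates the pickling difference; `is_picklable` and `identity` are hypothetical names.

```python
import pickle


def is_picklable(obj):
    """True if `obj` survives pickle.dumps, which worker spawning relies on."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


def identity(x):  # module-level named function: a picklable stand-in for a lambda
    return x


if __name__ == "__main__":
    print(is_picklable(lambda x: x))  # lambdas cannot be looked up by name
    print(is_picklable(identity))
```

Loading data in the main process only (num_workers=0 on the DataLoader) avoids the pickling step altogether, at the cost of slower loading.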

Discrepancy on query size

Hi Harvey,

In your paper it was stated that

"For each class, we pick k labeled instances as our support set and 16 instances for the query set for a k-shot task"

However in your feature_evaluation function in test.py code, the default number of query instances per class was 15: n_query = 15.

Since the setting used by Ravi et al. was also 15 samples per class for D_test, was it a minor typographical error in your paper?

Does the learning rate always stay the same (default 0.001 in the Adam optimizer)?

Does the learning rate always stay the same (default 0.001 in the Adam optimizer)? If not, could you tell me how you set in your experiments? Thanks!

test.csv is unavailable

I tried to run the script 'download_miniImagenet.sh' and it failed with the error messages below:

--2019-05-27 09:53:56-- https://raw.githubusercontent.com/twitter/meta-learning-lstm/master/data/miniImagenet/test.csv%0D
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.76.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.76.133|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2019-05-27 09:53:56 ERROR 404: Not Found.

': [Errno 2] No such file or directorygenet_filelist.py
': [Errno 2] No such file or directoryilelist.py

I think the main problem is that the file 'test.csv' is not available on the server. If my guess is correct, could you please upload this file to GitHub, since it is not very big? If there are other causes, could you give me a hand? Thanks
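
The requested URL ends in %0D, which is the URL encoding of a carriage return, and the two "No such file or directory" lines are overwritten at the start in the way a stray carriage return typically causes. Both suggest the shell script was saved with Windows (CRLF) line endings, so the file may exist on the server after all. A minimal sketch for converting a script to LF endings, with hypothetical helper names:

```python
def to_unix_line_endings(data):
    """Convert CRLF line endings in `data` (bytes) to LF."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite the file at `path` in place with LF line endings."""
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(to_unix_line_endings(data))


if __name__ == "__main__":
    sample = b"wget https://example.invalid/test.csv\r\npython write_filelist.py\r\n"
    print(to_unix_line_endings(sample))
```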

Augmentation

Hi, I have a question regarding your augmentation methods. In your code, scaling and center crop are still applied when augmentation is False, in both the training and validation stages. In the repo jakesnell/prototypical-networks, only scaling is applied. Is this intended? If it is, is there any specific reason?
