clarifai / few-shot-ctm
Few-shot learning
License: Other
The ResNet used in the code is from the official PyTorch website and was pre-trained on ImageNet. Is that a fair comparison against other models?
I ran every config provided in the code with a 5-way 1-shot setting on mini-ImageNet, but the results are far from what you reported in the paper.
Also, when I run with the ResNet backbone, the code seems to download a pre-trained model from the official PyTorch website, so I doubt you really got the 62.05 without augmentation on 5-way 1-shot.
It seems many people find this result hard to reproduce.
When I run the code with `python main.py --gpu_id 0 1`, I get this error:

AttributeError: 'DataParallel' object has no attribute 'forward_CTM'

How can I run it on multiple GPUs?
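This error is a general PyTorch behavior: `nn.DataParallel` wraps the model and only delegates the standard `forward()`; any custom entry point such as `forward_CTM` must be reached through the `.module` attribute of the wrapper. A minimal, torch-free sketch of the pattern (class names here are stand-ins, not the repo's actual code):

```python
class DataParallelLike:
    """Mimics how torch.nn.DataParallel wraps a model: only the
    standard forward() is delegated; custom methods are not."""
    def __init__(self, module):
        self.module = module

    def forward(self, x):
        return self.module.forward(x)


class CTMNet:
    """Stand-in for the repo's model with a custom entry point."""
    def forward(self, x):
        return x

    def forward_CTM(self, x):  # custom method used instead of forward()
        return x * 2


wrapped = DataParallelLike(CTMNet())
# wrapped.forward_CTM(3)  would raise AttributeError, as in the issue
out = wrapped.module.forward_CTM(3)  # reach through .module instead
```

So the quickest workaround is to call `model.module.forward_CTM(...)` when the model is wrapped; note, however, that calling through `.module` bypasses DataParallel's input scattering, so true multi-GPU use would need the custom method folded into `forward()`.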
I just ran the command from the README:

python main.py --yaml_file configs/demo/mini/20way_1shot.yaml

and got this error:

```
Traceback (most recent call last):
  File "main.py", line 204, in <module>
    main()
  File "main.py", line 28, in main
    opts = Config(config['options']['ctrl.yaml_file'], config['options'])
  File "/home/few-shot-ctm/core/config.py", line 111, in __init__
    merge_cfg_from_file(yaml_file, self)
  File "/home/few-shot-ctm/tools/general_utils.py", line 278, in merge_cfg_from_file
    _merge_a_into_b(yaml_cfg, _config)
  File "/home/few-shot-ctm/tools/general_utils.py", line 334, in _merge_a_into_b
    _merge_a_into_b(v, b[k], stack=stack_push)
  File "/home/few-shot-ctm/tools/general_utils.py", line 324, in _merge_a_into_b
    raise KeyError('Non-existent config key: {}'.format(full_key))
KeyError: 'Non-existent config key: ctmnet.pred_source'
```
Python version 3.6.8, PyTorch version 1.2.0.
Any ideas?
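Judging from the traceback, `_merge_a_into_b` recursively merges the YAML file into the default config and raises `KeyError` for any key missing from the defaults, so the error means the demo YAML sets `ctmnet.pred_source` but the installed code's default config has no such key (typically a version mismatch between the YAML files and the code). A guess at the merge logic, reconstructed only from the traceback (`merge_into` and the sample values are hypothetical, not the repo's actual code):

```python
def merge_into(src, dst, prefix=""):
    """Merge dict `src` into dict `dst`, rejecting keys that the
    defaults do not already define -- mirroring the KeyError raised
    in tools/general_utils.py."""
    for k, v in src.items():
        full_key = f"{prefix}.{k}" if prefix else k
        if k not in dst:
            raise KeyError("Non-existent config key: {}".format(full_key))
        if isinstance(v, dict):
            merge_into(v, dst[k], full_key)   # recurse into sub-configs
        else:
            dst[k] = v                        # override the default value


defaults = {"ctmnet": {"lr": 0.1}}            # hypothetical default config
caught = None
try:
    # the YAML sets a key the defaults don't know about
    merge_into({"ctmnet": {"pred_source": "score"}}, defaults)
except KeyError as e:
    caught = str(e)
```

Under that reading, the fix is either to check out a commit of the repo whose default config defines `ctmnet.pred_source`, or to delete that key from the demo YAML.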
Thanks for your detailed code. If I understand correctly, Section 3.2.4 of the paper says
the combination is an element-wise multiplication of .....
but in core.model the implementation uses torch.matmul regardless of self.dnet_supp_manner. There is a distinct difference, both in the computation and in its intuitive interpretation. If I have misunderstood something, please tell me.
By the way, in my understanding self.dnet_supp_manner = 1 corresponds to Option 1 in Section 3.2.4, self.dnet_supp_manner = 3 corresponds to Option 2, and self.dnet_supp_manner = 2 corresponds to a mix of Options 1 and 2.
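For readers following this thread: the two operations really do compute different things, which is easy to see on a 2x2 example (spelled out here in plain Python so it is self-contained; `*` on tensors / torch.mul gives the element-wise product, torch.matmul gives the matrix product):

```python
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# element-wise product: multiply entries at the same position
# (what the paper's "element-wise multiplication" would mean)
elementwise = [[A[i][j] * B[i][j] for j in range(2)] for i in range(2)]

# matrix product: row-by-column dot products
# (what torch.matmul computes)
matmul = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]

print(elementwise)  # [[5, 12], [21, 32]]
print(matmul)       # [[19, 22], [43, 50]]
```

So swapping one for the other changes both the values and the meaning: the element-wise product keeps per-position feature interactions, while matmul mixes information across positions.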
After reading everyone's questions, I have to ask: is it even worth running this code?
To this day the author still has not released the pre-training code for the model. A reasonable guess is that he pre-trained the model on 80 classes (64 + 16). If so, the proposed method may not actually improve performance; the gains would just come from pre-training on the meta-training and meta-testing sets.
As I can see in the code, when I use ResNet18 as the backbone, the Concentrator, Projector, and Reshaper are all identical:

```python
self.main_component = nn.Sequential(self._make_layer(Bottleneck, out_size*2, 3, stride=1),
                                    self._make_layer(Bottleneck, out_size, 2, stride=1))
self.projection = nn.Sequential(self._make_layer(Bottleneck, out_size*2, 3, stride=1),
                                self._make_layer(Bottleneck, out_size, 2, stride=1))
self.reshaper = nn.Sequential(self._make_layer(Bottleneck, out_size*2, 3, stride=1),
                              self._make_layer(Bottleneck, out_size, 2, stride=1))
```

Am I right? If so, this is inconsistent with the paper, because the Projector and Reshaper should be simple CNNs as described. Moreover, this makes the model capacity much larger than before. Have I misunderstood the code?
I just ran the command `python main.py --yaml_file configs/demo/mini/20way_1shot.yaml`, after changing self.ctrl.device = 'cuda' and ctrl.device = 'cuda'. The code has been running for almost 48 hours on one GPU and so far reports only:

Evaluating at epoch 216, step 212, with eval_length 600 ... (be patient)
Current accuracy is 0.1530 < previous best accuracy is 0.1740 (ep55, iter212)

Can you give me some constructive advice?
Why is such a simple method implemented in such a complicated way? Could you release a simpler version of the project that reproduces the results in the paper?