fewshot-egnn

Introduction

This repository provides a PyTorch implementation of the following CVPR 2019 paper:
Title: "Edge-Labeling Graph Neural Network for Few-shot Learning"
Authors: Jongmin Kim, Taesup Kim, Sungwoong Kim, Chang D. Yoo

Institution: KAIST, Kakao Brain
Code: https://github.com/khy0809/fewshot-egnn
arXiv: https://arxiv.org/abs/1905.01436

Abstract: In this paper, we propose a novel edge-labeling graph neural network (EGNN), which adapts a deep neural network on the edge-labeling graph, for few-shot learning. The previous graph neural network (GNN) approaches in few-shot learning have been based on the node-labeling framework, which implicitly models the intra-cluster similarity and the inter-cluster dissimilarity. In contrast, the proposed EGNN learns to predict the edge-labels rather than the node-labels on the graph, which enables the evolution of an explicit clustering by iteratively updating the edge-labels with direct exploitation of both the intra-cluster similarity and the inter-cluster dissimilarity. It is also well suited for performing on various numbers of classes without retraining, and can be easily extended to perform a transductive inference. The parameters of the EGNN are learned by episodic training with an edge-labeling loss to obtain a well-generalizable model for unseen low-data problems. On both supervised and semi-supervised few-shot image classification tasks with two benchmark datasets, the proposed EGNN significantly improves the performance over the existing GNNs.

Citation

If you find this code useful, you can cite us with the following BibTeX:

@article{kim2019egnn,
  title={Edge-labeling Graph Neural Network for Few-shot Learning},
  author={Kim, Jongmin and Kim, Taesup and Kim, Sungwoong and Yoo, Chang D.},
  journal={arXiv preprint arXiv:1905.01436},
  year={2019}
}

Platform

This code was developed and tested with PyTorch 1.0.1.

Setting

You can download the miniImagenet dataset from here.

Download 'mini_imagenet_train/val/test.pickle' and place the files under 'tt.arg.dataset_root/mini-imagenet/compacted_dataset/'.

In train.py, replace the dataset root directory with your own, e.g. tt.arg.dataset_root = '/data/private/dataset'.
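For reference, here is a minimal sketch of the expected directory layout and of reading one of the pickle files back. The per-file structure shown (a dict mapping class names to images) and the stand-in contents are illustrative assumptions, not taken from the repository:

```python
import os
import pickle
import tempfile

# Build the expected directory layout under a throwaway root.
root = tempfile.mkdtemp()  # stands in for tt.arg.dataset_root
compacted = os.path.join(root, 'mini-imagenet', 'compacted_dataset')
os.makedirs(compacted)

# Stand-in for the real 'mini_imagenet_train.pickle'; the dict layout
# (class name -> list of images) is an assumption for illustration.
fake = {'n01532829': [b'<84x84x3 image bytes>']}
path = os.path.join(compacted, 'mini_imagenet_train.pickle')
with open(path, 'wb') as f:
    pickle.dump(fake, f)

# This mirrors how a loader would read the file back.
with open(path, 'rb') as f:
    data = pickle.load(f)
print(sorted(data.keys()))  # ['n01532829']
```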

Training

# ************************** miniImagenet, 5way 1shot *****************************
$ python3 train.py --dataset mini --num_ways 5 --num_shots 1 --transductive False
$ python3 train.py --dataset mini --num_ways 5 --num_shots 1 --transductive True

# ************************** miniImagenet, 5way 5shot *****************************
$ python3 train.py --dataset mini --num_ways 5 --num_shots 5 --transductive False
$ python3 train.py --dataset mini --num_ways 5 --num_shots 5 --transductive True

# ************************** miniImagenet, 10way 5shot *****************************
$ python3 train.py --dataset mini --num_ways 10 --num_shots 5 --meta_batch_size 20 --transductive True

# ************************** tieredImagenet, 5way 5shot *****************************
$ python3 train.py --dataset tiered --num_ways 5 --num_shots 5 --transductive False
$ python3 train.py --dataset tiered --num_ways 5 --num_shots 5 --transductive True

# **************** miniImagenet, 5way 5shot, 20% labeled (semi) *********************
$ python3 train.py --dataset mini --num_ways 5 --num_shots 5 --num_unlabeled 4 --transductive False
$ python3 train.py --dataset mini --num_ways 5 --num_shots 5 --num_unlabeled 4 --transductive True

Evaluation

The trained models are saved under './asset/checkpoints/' with the name 'D-{dataset}_N-{ways}_K-{shots}_U-{num_unlabeled}_L-{num_layers}_B-{batch_size}_T-{transductive}'. For example, to test the trained model for the 'miniImagenet, 5-way 1-shot, transductive' setting, pass the --test_model argument as follows:

$ python3 eval.py --test_model D-mini_N-5_K-1_U-0_L-3_B-40_T-True
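The checkpoint name can also be assembled programmatically; the helper below is a sketch following the naming pattern above, not a function from the repository:

```python
# Illustrative helper (not part of the repo) that builds a checkpoint
# name from the experiment settings, following the pattern
# D-{dataset}_N-{ways}_K-{shots}_U-{unlabeled}_L-{layers}_B-{batch_size}_T-{transductive}.
def checkpoint_name(dataset, ways, shots, unlabeled, layers, batch_size, transductive):
    return (f'D-{dataset}_N-{ways}_K-{shots}_U-{unlabeled}'
            f'_L-{layers}_B-{batch_size}_T-{transductive}')

name = checkpoint_name('mini', 5, 1, 0, 3, 40, True)
print(name)  # D-mini_N-5_K-1_U-0_L-3_B-40_T-True
```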

Results

Here are the experimental results reported in the paper. You should be able to reproduce them using the trained models, which can be downloaded from here.

miniImageNet, non-transductive

| Model                 | 5-way 5-shot acc (%) |
| --------------------- | -------------------- |
| Matching Networks [1] | 55.30                |
| Reptile [2]           | 62.74                |
| Prototypical Net [3]  | 65.77                |
| GNN [4]               | 66.41                |
| EGNN (ours)           | 66.85                |

miniImageNet, transductive

| Model                   | 5-way 5-shot acc (%) |
| ----------------------- | -------------------- |
| MAML [5]                | 63.11                |
| Reptile + BN [2]        | 65.99                |
| Relation Net [6]        | 67.07                |
| MAML + Transduction [5] | 66.19                |
| TPN [7]                 | 69.43                |
| TPN (Higher K) [7]      | 69.86                |
| EGNN (ours)             | 76.37                |

tieredImageNet, non-transductive

| Model                | 5-way 5-shot acc (%) |
| -------------------- | -------------------- |
| Reptile [2]          | 66.47                |
| Prototypical Net [3] | 69.57                |
| EGNN (ours)          | 70.98                |

tieredImageNet, transductive

| Model                   | 5-way 5-shot acc (%) |
| ----------------------- | -------------------- |
| MAML [5]                | 70.30                |
| Reptile + BN [2]        | 71.03                |
| Relation Net [6]        | 71.31                |
| MAML + Transduction [5] | 70.83                |
| TPN [7]                 | 72.58                |
| EGNN (ours)             | 80.15                |

miniImageNet, semi-supervised, 5-way 5-shot

| Model                           | 20%   | 40%   | 60%   | 100%  |
| ------------------------------- | ----- | ----- | ----- | ----- |
| GNN-LabeledOnly [4]             | 50.33 | 56.91 | -     | 66.41 |
| GNN-Semi [4]                    | 52.45 | 58.76 | -     | 66.41 |
| EGNN-LabeledOnly                | 52.86 | -     | -     | 66.85 |
| EGNN-Semi                       | 61.88 | 62.52 | 63.53 | 66.85 |
| EGNN-LabeledOnly (Transductive) | 59.18 | -     | -     | 76.37 |
| EGNN-Semi (Transductive)        | 63.62 | 64.32 | 66.37 | 76.37 |

miniImageNet, cross-way experiment

| Model | Train way | Test way | Accuracy (%) |
| ----- | --------- | -------- | ------------ |
| GNN   | 5         | 5        | 66.41        |
| GNN   | 5         | 10       | N/A          |
| GNN   | 10        | 10       | 51.75        |
| GNN   | 10        | 5        | N/A          |
| EGNN  | 5         | 5        | 76.37        |
| EGNN  | 5         | 10       | 56.35        |
| EGNN  | 10        | 10       | 57.61        |
| EGNN  | 10        | 5        | 76.27        |

References

[1] O. Vinyals et al. Matching Networks for One Shot Learning.
[2] A. Nichol, J. Achiam, J. Schulman. On First-Order Meta-Learning Algorithms.
[3] J. Snell, K. Swersky, R. S. Zemel. Prototypical Networks for Few-shot Learning.
[4] V. Garcia, J. Bruna. Few-Shot Learning with Graph Neural Networks.
[5] C. Finn, P. Abbeel, S. Levine. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks.
[6] F. Sung et al. Learning to Compare: Relation Network for Few-Shot Learning.
[7] Y. Liu, J. Lee, M. Park, S. Kim, Y. Yang. Transductive Propagation Network for Few-shot Learning.

fewshot-egnn's People

Contributors: jmkim0309
fewshot-egnn's Issues

5way 5shot experiment of miniImageNet

Hello,
your work is really good.
But when I try the 5-way 5-shot miniImageNet experiment (transductive), my result does not reach the 76.37% you reported. I simply ran the readme command 'python3 trainer.py --dataset mini --num_ways 5 --num_shots 5 --trainsductive True'; since I don't have a file named 'trainer.py', I changed it to 'train.py'. Did you make any other adjustments in your experiments?
Thank you very much!

About the evaluation setup

Hello, I'm just a little confused about the evaluation setup.

In Section 4.2 of your paper, it is said that 'For evaluation, each test episode was formed by randomly sampling 15 queries for each of 5 classes, and the performance is averaged over 600 randomly generated episodes from the test set.' I think it means every test episode has (5 * 15 =) 75 queries and 75 graphs are formed from each test episode under the non-transductive setting.

However, according to your released code, it seems that every val/test episode only has (5 * 1 =) 5 queries. And you randomly sample 10,000 episodes for validating/testing.

I'm just wondering which evaluation setup you use when obtaining the results in your paper. And have you tried them both? If so, is there any difference between the results obtained by following these two evaluation setups? Thank you!

the node and edge feature

Hi! Thanks for your amazing work. I was trying to load the features, but I wonder what I should set the node and edge features to if I load my own features, which are extracted from skeleton data.

a question about BCE loss

Hello, I found
full_edge_loss_layers = [self.edge_loss((1-full_logit_layer[:, 0]), (1-full_edge[:, 0])) for full_logit_layer in full_logit_layers]
Why is it self.edge_loss((1-full_logit_layer[:, 0]), (1-full_edge[:, 0]))
and not self.edge_loss(full_logit_layer[:, 0], full_edge[:, 0])?
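For reference (an observation, not an authoritative answer from the authors): for a plain, unweighted binary cross-entropy the two calls are mathematically identical, since BCE(p, y) = -(y log p + (1 - y) log(1 - p)) is unchanged by the substitution (p, y) -> (1 - p, 1 - y); the choice can matter once per-example weighting is applied. A quick sanity check in plain Python:

```python
import math

def bce(p, y, eps=1e-12):
    # Binary cross-entropy for one predicted probability p and target y.
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

# The two forms coincide for plain BCE on every (p, y) pair.
for p, y in [(0.9, 1.0), (0.2, 0.0), (0.7, 1.0), (0.4, 0.0)]:
    assert abs(bce(p, y) - bce(1 - p, 1 - y)) < 1e-9
```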

wget dataset

How to use the wget command to download ‘mini_imagenet_test.pickle’?

How the pickel is generated from origin images

Hi, this is good work and very helpful. But I wonder how you generated the pickle files from the original images, because when I use my own pickles (generated from the original images with Resize(84) and CenterCrop(84) from torchvision.transforms), the performance decreases significantly, from 77% to 73%.

How to open log file of DA-DL-02 format?

First, thanks for your great work!
I am having trouble opening the log file to see the printed information.
For example: events.out.tfevents.1566895592.DA-DL-02
I have already searched on Google, but I can't find a solution.

about tiered-imagenet dataset

Hi! Your work is really good!
But could you please provide the .csv file and the link to download tiered-imagenet images?
Thanks.

Node label sequence

Hello:
In the 5-way 1-shot setting, the query-set labels of each batch during training and testing are [0,1,2,3,4]; no shuffling is performed. Could this let the network memorize the setting and inflate the accuracy?
In the other few-shot papers I have read (GNN, Relation Network), the query-set labels are shuffled, so I followed that idea: starting from your unmodified source code, I randomly shuffled the query-set labels of each test batch, e.g. [1,4,2,0,3], [1,2,4,0,3], [2,0,4,3,1], and built init_edge from the modified labels (the result is still a 10*10 symmetric matrix). The accuracy is then only about 43%, far from the 66.27% I get with the unmodified code. Shuffling during both training and testing also gives about 43%. My initial understanding was that the order of node labels should not affect accuracy, since the data is structured as a graph and only relations matter, so this huge gap confuses me. Did I set something up incorrectly?
thank you very much!
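For reference (an observation, not a confirmed diagnosis of the setup above): a common pitfall when shuffling episode labels is permuting the query labels without applying the same permutation to the support labels, which breaks the support-query correspondence the graph relies on. A sketch of consistent per-episode relabeling, with illustrative variable names:

```python
import random

num_ways = 5
support_labels = list(range(num_ways))  # one support sample per class
query_labels = list(range(num_ways))    # one query sample per class

# Draw one random relabeling per episode and apply it to BOTH sets
# (and to anything derived from labels, such as init_edge).
perm = random.sample(range(num_ways), num_ways)
relabel = {old: new for old, new in enumerate(perm)}

support_shuffled = [relabel[y] for y in support_labels]
query_shuffled = [relabel[y] for y in query_labels]

# Correspondence is preserved: a query of class c still shares its
# label with the support samples of class c.
assert query_shuffled == support_shuffled
```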

About the generalization of Tranductive Setting

Hi, It is a nice work! But I have a question about the training and testing settings.

For training, each task has one query sample per class (5 * 1 = 5 query samples per task). When testing, the model's performance should not change much with the number of queries per task. But my experiments show that performance drops a lot when there is more than one query sample per class (> 5 * 1 queries per task) at test time. Specifically, accuracy drops to 11% when each test episode is formed by sampling 15 queries for each of the 5 classes.

I am wondering if the model is overfitting for the specific setting: 1 query for each of 5 classes for a task.

root question

Hi, your model is quite amazing. But as I use Windows, my root path is always invalid. What else should I change besides tt.arg.dataset_root in train.py? Changing only this, even to the absolute path of mini_imagenet_train.pickle, always raises FileNotFoundError.

Wrong labels of selected classes passed in batch

While going through your data loader, I noticed that in fewshot-egnn/data.py you randomly pick class labels with task_class_list = random.sample(full_class_list, num_ways), but never actually pass these labels in the batch. Instead we have

https://github.com/khy0809/fewshot-egnn/blob/205fa80ec7cb12550f7b52a63f921171f92dac4c/data.py#L106

https://github.com/khy0809/fewshot-egnn/blob/205fa80ec7cb12550f7b52a63f921171f92dac4c/data.py#L111

which assigns the wrong class labels ([0, ..., #ways]) to the picked data points. Shouldn't the support as well as the query label be assigned task_class_list[c_idx]?
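For reference (an observation, not a confirmed answer): in episodic few-shot training it is common to relabel the sampled classes to episode-local indices [0, ..., num_ways-1], since the model only needs to distinguish the classes within an episode and the global class identity does not enter the loss. A sketch of that convention, with illustrative class names not taken from data.py:

```python
import random

# A small stand-in pool of global class names (illustrative).
full_class_list = ['n01532829', 'n01558993', 'n01704323', 'n01749939',
                   'n01770081', 'n02089867', 'n02091244']
num_ways = 5

task_class_list = random.sample(full_class_list, num_ways)

# Map each sampled global class to an episode-local label 0..num_ways-1;
# the same mapping is shared by support and query samples.
label_of = {cls: idx for idx, cls in enumerate(task_class_list)}

episode_labels = [label_of[c] for c in task_class_list]
print(episode_labels)  # [0, 1, 2, 3, 4]
```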
