mboudiaf / pytorch-meta-dataset

A non-official 100% PyTorch implementation of META-DATASET benchmark for few-shot classification

Python 99.01% Makefile 0.99%

few-shot-learning few-shot-classifcation meta-dataset pytorch tfrecorddataset

pytorch-meta-dataset's Issues

ResNet structure

Hi,

The original TensorFlow implementation uses the standard structure for the first convolution layer, i.e., a 7x7 kernel with stride 2 and padding 3, followed by a 3x3 max pooling layer (link), while your implementation uses a 3x3 kernel and no max pooling (link). As a result, the feature map is much larger and costs more memory. I also noticed that in the PAMI version of TIM, the authors claim that the PyTorch versions of the baselines are much better than the original versions. I wonder whether the performance boost comes from this modification. The 'larger' version of ResNet does not seem very practical for Meta-Dataset, since it leads to OOM when trained with ProtoNet or other episodic methods. Please let me know if I have misunderstood the code.

Thanks.
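
To make the memory concern above concrete, here is a rough sketch of the feature-map arithmetic for an 84x84 input. The 7x7 convolution parameters come from the issue; the max-pool stride 2 and padding 1, and the stride 1 and padding 1 of the 3x3 convolution, are assumptions (the usual ResNet defaults) rather than values confirmed from either codebase.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 84  # input resolution, as in the [*, 3, 84, 84] tensors seen elsewhere

# Standard stem: 7x7 conv (stride 2, padding 3), then 3x3 max pool
# (stride 2 and padding 1 are assumed here).
standard = conv_out(conv_out(size, 7, 2, 3), 3, 2, 1)

# Modified stem: 3x3 conv (stride 1 and padding 1 assumed), no pooling.
modified = conv_out(size, 3, 1, 1)

# Ratio of activations per channel after the stem.
ratio = (modified / standard) ** 2
print(standard, modified, ratio)  # 21 84 16.0
```

Under these assumptions the modified stem keeps 16 times as many activations per channel entering the first residual stage, which is consistent with the larger memory footprint described above.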

Training the fine-tuned baseline with standard supervised learning with union/concatenation of labels

Hi @mboudiaf, I want to train the fine-tuned baseline from Meta-Dataset (MDS), i.e., concatenate/union all the datasets and all the labels and then train with standard supervised learning. Is this the right way to do it:

pytorch-meta-dataset/example.py

Line 173 in c6d6922

batch_loader = DataLoader(dataset=batch_dataset,

I am mainly asking because there needs to be some sort of relabeling that takes all the dataset labels into account, and I wanted to know how that is done.

Thank you!

squeeze() needs to be added to support, support_labels, query, query_labels

When support, query, support_labels and query_labels are returned from the DataLoader, their first dimension has size 1, which is redundant and usually will not work properly when fed into a network. Applying squeeze(x, dim=0) to each of them fixes this.

How to compute the dataset size from tfrecords for MDS (within Python)?
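
One possible approach, sketched here under the assumption that the files use the standard TFRecord framing (an 8-byte little-endian length, a 4-byte length CRC, the payload, and a 4-byte payload CRC), is to walk each file and count records without importing TensorFlow. The helper name `count_tfrecord_examples` is hypothetical and not part of this repository, and it skips CRC validation.

```python
import struct


def count_tfrecord_examples(path):
    """Count records in one TFRecord file by walking its length headers.

    Assumes the standard framing: uint64 length, uint32 length CRC,
    payload, uint32 payload CRC. The CRCs are skipped, not verified.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:  # end of file (or a truncated header)
                break
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC (4 bytes), the payload, and its CRC (4 bytes).
            f.seek(4 + length + 4, 1)
            count += 1
    return count
```

Summing this count over every record file in a converted dataset directory would give the total number of examples; a truncated final record would still be counted, so this is only a quick estimate for well-formed files.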

Meta-batch size hard coded to 1

Hi, thank you for your implementation.

The meta-training dataloader seems to have a batch size of 1 hard-coded in it. I would like to train MAML on this, and the default meta-batch size there is 4. Is there any particular reason why the meta-batch size is hard-coded to 1?

Thank you!

Normalize should map image tensors in the range -1 to 1 to be compatible with Meta-Dataset

Instead of the ImageNet friendly normalize transform:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

use the following:

normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])

as Meta-Dataset guarantees that all images will be normalized in the range -1 to 1.
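
As a quick check of the claim above, the sketch below applies the per-channel formula used by Normalize, (x - mean) / std, to the endpoints of the [0, 1] range. It assumes the inputs are already scaled to [0, 1] (as ToTensor does for 8-bit images); the helper name `normalize` is just for illustration.

```python
def normalize(value, mean, std):
    """Per-channel operation applied by transforms.Normalize."""
    return (value - mean) / std


# With mean 0.5 and std 0.5, the endpoints of [0, 1] land exactly on -1 and 1.
low, high = normalize(0.0, 0.5, 0.5), normalize(1.0, 0.5, 0.5)
print(low, high)  # -1.0 1.0

# With the ImageNet statistics (red channel shown), the range is wider
# than [-1, 1] and not symmetric.
im_low = normalize(0.0, 0.485, 0.229)
im_high = normalize(1.0, 0.485, 0.229)
```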

How can we use all classes all the time for episodic training even if the number of examples is small?

Buggy code in example.py?

example.py has the following lines of code:

        use_bilevel_ontology_list = [False]*len(datasets)
        # Enable ontology aware sampling for Omniglot and ImageNet.
        if 'omniglot' in datasets:
            use_bilevel_ontology_list[datasets.index('omniglot')] = True
        if 'imagenet' in datasets:
            use_bilevel_ontology_list[datasets.index('imagenet')] = True

        use_bilevel_ontology_list = use_bilevel_ontology_list
        use_dag_ontology_list = [False]*len(datasets)

shouldn't it be:

        use_bilevel_ontology_list = [False]*len(datasets)
        use_dag_ontology_list = [False]*len(datasets)

        # Enable ontology aware sampling for Omniglot and ImageNet.
        if 'omniglot' in datasets:
            use_bilevel_ontology_list[datasets.index('omniglot')] = True
        if 'ilsvrc_2012' in datasets:
            use_dag_ontology_list[datasets.index('ilsvrc_2012')] = True

as the 'imagenet' dataset is actually called 'ilsvrc_2012' and it should use the DAG ontology.

Feature Request: Ideally the episodes generated would be repeatable for a specified seed.

The official meta-dataset reader is not deterministic and repeatable, which is frustrating as test runs cannot be directly compared. If your reader could be deterministic (given a seed), that would be a huge win.
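
A minimal sketch of what seed-driven repeatability could look like is shown below. It is a toy stand-in, not this library's sampler: the function names, the 20 classes, and the 100 examples per class are all made up for illustration. The key idea is that every random draw goes through one private `random.Random(seed)` instance rather than global state.

```python
import random


def sample_episode(rng, class_ids, num_ways, num_shots):
    """Draw class ids and per-class example indices from the given RNG."""
    ways = rng.sample(class_ids, num_ways)
    # Pretend every class has 100 examples; pick num_shots indices for each.
    return {c: rng.sample(range(100), num_shots) for c in ways}


def episodes(seed, n):
    """Generate n episodes that depend only on the seed."""
    rng = random.Random(seed)  # private RNG, independent of global state
    return [sample_episode(rng, list(range(20)), 5, 3) for _ in range(n)]


# The same seed reproduces the same sequence of episodes.
assert episodes(seed=0, n=4) == episodes(seed=0, n=4)
```

With multiple DataLoader workers, each worker would additionally need its own seed derived from the base seed and the worker id for runs to stay comparable.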

no bash file

Hi, thanks for such a nice contribution.
I am a bit confused: your data preparation instructions say that once you obtain the converted data, you can run:
bash scripts/make_records/make_index_files.sh <path_to_converted_data>

However, I cannot find the /make_records/make_index_files.sh file in this repo.

How to run the code correctly?

Hi, thank you for your outstanding work! I ran into the following problems when trying to use your code.

After processing all the data, I ran the command bash scripts/train.sh protonet resnet18 ilsvrc_2012, but got the error: "use_hierarchy" is incompatible with "num_ways". I then set num_ways to -1 to avoid this error, but something strange still happened: RuntimeError: stack expects each tensor to be equal size, but got [50, 3, 84, 84] at entry 0 and [30, 3, 84, 84] at entry 1. This is the first question that puzzles me.

Second, how do I get a pre-trained model on ImageNet by training from scratch? There does not seem to be an option for obtaining a pre-trained model among the method options.

I am looking forward to your reply. Thank you!

version of tensorflow-gpu being used?

error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.11.0 requires absl-py>=1.0.0, but you have absl-py 0.11.0 which is incompatible.
tensorflow-gpu 2.11.0 requires absl-py>=1.0.0, but you have absl-py 0.11.0 which is incompatible.
datasets 2.6.1 requires tqdm>=4.62.1, but you have tqdm 4.54.1 which is incompatible.

Unexpected behavior from min_examples_in_class

The official meta-dataset does not have the parameter min_examples_in_class exposed (as far as I know). I set it to 1 because I wanted every class to have at least one example, and I get the following error message when I turn on use_bilevel_hierarchy for Omniglot (which is standard for Meta-Dataset):

"use_bilevel_hierarchy" is incompatible with "min_examples_in_class"

I don't understand why this restriction is required.

where is make_index_files.sh

Where can I find scripts/make_records/make_index_files.sh? Thank you very much.

DataConfig and EpisodeConfig constructors should directly accept arguments as opposed to encapsulated in an argparse.Namespace

First of all, thanks for writing this library. I have been using the official MetaDataset TensorFlow reader in conjunction with PyTorch (which works, but is very resource hungry), so I thought I would try out your library. It now works, but I needed to work around several issues. I'll file a series of issues for these.

The first issue I hit was the fact that the DataConfig and EpisodeConfig constructors took their arguments via an argparse.Namespace. This won't work for a real meta-dataset application as you typically need to set up several DataLoaders (say for train, validate and test) and you usually need to specify different parameters for each (e.g. max_support_set_size). My workaround was to modify your code and make constructors with a conventional set of arguments. Would be great if you could make the change!
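
The kind of constructor being requested might look roughly like the sketch below. The class name `EpisodeConfigSketch`, its defaults, and the `num_support` and `num_query` fields are hypothetical; only `num_ways` and `max_support_set_size` are parameter names mentioned in these issues. It keeps an optional bridge from an argparse.Namespace so that existing scripts would not need to change.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class EpisodeConfigSketch:
    """Hypothetical config with explicit keyword arguments."""
    num_ways: int = 5
    num_support: int = 5
    num_query: int = 15
    max_support_set_size: int = 500

    @classmethod
    def from_namespace(cls, args):
        """Build a config from an argparse.Namespace, ignoring unknown keys."""
        known = {f.name: getattr(args, f.name)
                 for f in fields(cls) if hasattr(args, f.name)}
        return cls(**known)


# Different settings per split, without juggling several namespaces.
train_cfg = EpisodeConfigSketch(max_support_set_size=500)
test_cfg = EpisodeConfigSketch(max_support_set_size=100)

# Existing argparse-based code can still be supported.
cli_cfg = EpisodeConfigSketch.from_namespace(argparse.Namespace(num_ways=10, lr=0.1))
```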

Too many unexpected Errors.

I tried to run this repository to reproduce the results, but ran into many errors. For example, when I set gpu: [1], the code still ran on GPU 0. In addition, when running forward_call, there is an error regarding asynchronously report...

No dataset_spec file found in directory

I was trying to run the code, but I got the "No dataset_spec file found in directory" error. I pass the directory that stores all the converted datasets (ilsvrc_2021/aircraft./...) to the bash file. It seems to look directly for the .json file of each dataset, but the json files are in the individual subfolders of the converted datasets folder. Do I need to move all the json files out, or should I pass each dataset directory separately during training?

how long does make index take?

how long does:

# - make_index_files.sh (takes...a while according to patrick)
#cd $HOME/pytorch-meta-dataset/
cd $HOME/diversity-for-predictive-success-of-meta-learning/pytorch-meta-dataset/
chmod +x make_index_files.sh
./make_index_files.sh

take?

shuffle buffer issue?

I suspect that your reader code is affected by the meta-dataset shuffle buffer issue 54. I did a full run with your reader, and the results were mostly consistent with what I get using the official meta-dataset reader, except for traffic signs (and a couple of other datasets), where the results were more optimistic, as happens when the data is not shuffled. From a quick look through your code, it seems that the shuffle buffer mechanism is not used.
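
For context, a shuffle buffer of the kind referred to above can be sketched in a few lines. This is a generic illustration of the technique (similar in spirit to the shuffle step of tf.data), not code from this repository or from the official reader; the name `shuffle_buffer` and its parameters are made up for the example.

```python
import random


def shuffle_buffer(iterable, buffer_size, rng):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    buffer = []
    for item in iterable:
        if len(buffer) < buffer_size:
            buffer.append(item)  # fill the buffer before emitting anything
            continue
        # Emit a random buffered item and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # The stream is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(shuffle_buffer(range(50), buffer_size=10, rng=random.Random(0)))
```

With `buffer_size=1` the stream comes out in its stored order, which is the unshuffled situation the issue warns about; a larger buffer mixes examples that sit near each other in the file.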

Main feature differences between pytorch-meta-dataset and original meta-dataset?

As I am currently using your great library extensively - I was wondering, what are the main feature differences between your dataloader and the one provided by the original meta-dataset authors (https://github.com/google-research/meta-dataset)?

I ask because I recently saw that in the original meta-dataset library, the end of this notebook (https://github.com/google-research/meta-dataset/blob/main/Intro_to_Metadataset.ipynb) describes how to integrate their dataloader with PyTorch, including an episodic dataloader that supports fixed ways, support shots and query shots. They also have a batch dataloader that samples batches from the datasets in a "non-episodic manner".

These features seem quite similar to your PyTorch meta-dataset wrapper, so I was wondering what was the main motivation of creating your PyTorch wrapper library. One thing I did notice is that they state that "If we want to use fixed num_ways... We advise using single dataset for using this feature". I assume your dataset supports this unlike the original meta-dataset library, since I have been using fixed ways with multiple data sources with no issue. Are there other major feature differences between your library and the original meta-dataset library?

Thanks again for providing a great tool!
Patrick

Sampling from episodic loader gives error - "Key image doesn't exist (select from [])!"

When sampling from the episodic loader, all usually goes fine until I get the following error:

Traceback (most recent call last):
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 145, in get_next
    sample_dic = next(self.class_datasets[class_id])
TypeError: 'TFRecordDataset' object is not an iterator
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 219, in get_next
    dataset = next(self.dataset_list[source_id])
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 121, in __iter__
    sample_dic = self.get_next(class_id)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 148, in get_next
    sample_dic = next(self.class_datasets[class_id])
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/utils.py", line 23, in cycle_
    yield next(iterator)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/tfrecord/reader.py", line 222, in example_loader
    feature_dic = extract_feature_dict(example.features, description, typename_mapping)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/tfrecord/reader.py", line 162, in extract_feature_dict
    raise KeyError(f"Key {key} doesn't exist (select from {all_keys})!")
KeyError: "Key image doesn't exist (select from [])!"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 201, in __iter__
    next_e = self.get_next(rand_source)
  File "/home/patrick/pytorch-meta-dataset/pytorch_meta_dataset/pipeline.py", line 222, in get_next
    dataset = next(self.dataset_list[source_id])
StopIteration

Just for info - I used an older version of your repo (https://github.com/mboudiaf/pytorch-meta-dataset/tree/c6d6922003380342ab2e3509425d96307aa925c5). I am sampling from the episodic loader. I use

episodic_dataset = pipeline.make_episode_pipeline(dataset_spec_list=all_dataset_specs,
                                                      split=split,
                                                      data_config=data_config,
                                                      episode_descr_config=episod_config)
episodic_loader = DataLoader(dataset=episodic_dataset,
                                 batch_size=meta_batch_size,
                                 num_workers=data_config.num_workers,
                                 worker_init_fn=seeded_worker_fn)
#Sample a batch of size [B, N*K, C, H, W] from episodic loader via next(iter(episodic_loader))
#where B = meta_batch_size, N*K = n_ways*k_shots, C = channels, H = height of image, W = width of image

Do you know what may be causing the KeyError: "Key image doesn't exist (select from [])!" StopIteration error? For the above error, I am setting 5-way 15-shots for train and 5-way 5-shot for test/validation, and meta_batch_size 2 for train and 4 for test/val.

EDIT: sometimes I'm also sampling from the episodic loader and encounter an infinite loop.

Thanks a lot in advance!

What are tricks to speed up training for SL and MAML?

e.g. can I increase the number of workers? is the code compatible with this?

CrossTransformers implementation

Hey. Thanks a lot for the quality code and instructions.

Do you have any plans for implementing CrossTransformers in the original repo?

Learn2Learn support?

Is there learn2learn support for this? See learnables/learn2learn#286.

I am happy to help add it. What would be a good starting point?

Exception when num_workers > 0 on Windows, works on linux

On Windows 10, if num_workers > 0, you get the following exception:

Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle generator objects
python-BaseException
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
python-BaseException
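
The traceback points at a general Python limitation rather than anything Windows-specific in the data: worker processes on Windows are started with spawn, which pickles the dataset object, and generator objects cannot be pickled. The sketch below reproduces that with the standard library only and shows one common workaround, which is to ship plain settings to the worker and build the generator there. The names `make_stream` and `spec` are hypothetical and unrelated to this repository's classes.

```python
import pickle


def make_stream(spec):
    """Build the record generator from plain, picklable settings."""
    return (i for i in range(spec["start"], spec["stop"]))


spec = {"start": 0, "stop": 3}

# A live generator cannot be pickled, which is what spawn trips over.
try:
    pickle.dumps(make_stream(spec))
    generator_pickles = True
except TypeError:
    generator_pickles = False

# The settings survive pickling, so the generator can be rebuilt afterwards,
# for example inside the worker process.
worker_spec = pickle.loads(pickle.dumps(spec))
worker_items = list(make_stream(worker_spec))
```

In a dataset class, the analogous change would be to create generators lazily in `__iter__` instead of storing them as attributes in `__init__`; whether that is enough for this library has not been verified here.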
