
sounakdey / doodle2search


Doodle to Search: Practical Zero Shot Sketch Based Image Retrieval

Home Page: https://sounakdey.github.io/doodle2search.github.io/

Shell 14.49% Python 85.51%

doodle2search's People

Contributors: sounakdey


doodle2search's Issues

Question on the Precision@200

In the code 'test.py' at line 108, Precision@200 is computed as:

# Precision@200 means at the place 200th
precision_200 = np.mean(sort_str_sim[:, 200])

Since sort_str_sim is actually the sorted version of a tensor of shape [n_sketches, n_images, n_classes], it only carries the true-label information for sketches and images; it has no information about precision or recall. The number should be computed with average_precision_score().

Does anyone know how to compute the Precision@200 value?
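A minimal sketch of what Precision@200 is usually taken to mean, assuming the rows of sort_str_sim are binary relevance flags already sorted by descending similarity (this is an illustration of the standard definition, not the repository's code — note it averages over the top 200 positions rather than reading only the 200th column):

```python
import numpy as np

def precision_at_k(sorted_relevance, k=200):
    """Precision@k: for each query, the fraction of relevant items among
    its top-k retrieved results, averaged over all queries.

    sorted_relevance: (n_queries, n_gallery) binary matrix whose row i
    holds the relevance of gallery items to query i, sorted by
    descending similarity.
    """
    return float(np.mean(sorted_relevance[:, :k].sum(axis=1) / k))

# toy example: 2 queries, 4 gallery items, k = 2
rel = np.array([[1, 1, 0, 0],
                [1, 0, 0, 1]])
print(precision_at_k(rel, k=2))  # (2/2 + 1/2) / 2 = 0.75
```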

Some problems with word embeddings

w2v_class.append(np.array(model[voca]))

hi~
Your work is quite excellent, but I'm a little confused about the word embeddings.
I'm wondering how to handle the case where a class label is not in the language model's vocabulary. For example, the class labels 'car_(sedan)' and 'hot_air_balloon' of the Sketchy-Extended dataset are not in the google_300_corpus model's vocabulary, but I cannot find any handling of this situation in your code.
By the way, judging from your released word embeddings, I guess you normalize the word vectors, but the code lacks this step.
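One common workaround for such out-of-vocabulary compound labels (an assumption about a reasonable approach, not necessarily what the authors did) is to split the label into tokens, average the vectors of the tokens that do appear in the vocabulary, and L2-normalize the result. A self-contained sketch, using a toy dict in place of the Google News vectors:

```python
import numpy as np

def class_embedding(label, model):
    """Hypothetical OOV handling (not the repository's method): split a
    compound label such as 'hot_air_balloon' into tokens, average the
    vectors of the in-vocabulary tokens, then L2-normalize."""
    tokens = label.replace('(', ' ').replace(')', ' ').replace('_', ' ').split()
    vecs = [model[t] for t in tokens if t in model]
    if not vecs:
        raise KeyError('no token of %r is in the vocabulary' % label)
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

# toy vocabulary standing in for the pretrained word vectors
toy = {'hot': np.array([1.0, 0.0]),
       'air': np.array([0.0, 1.0]),
       'balloon': np.array([1.0, 1.0])}
emb = class_embedding('hot_air_balloon', toy)  # unit-norm average vector
```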

Even with an unrestricted network, the dataset cannot be obtained

Even with an unrestricted network connection, the dataset cannot be obtained. I tried to run the .sh file, but it never worked; I also assembled the URL directly and searched on Google, but an error occurred halfway through the download. So how can I get the dataset that the .sh file is supposed to download?

Dataset link failed

Hi, I cannot download the dataset from the link you provided. Could you update the link? Thanks.

MAP@200 calculation might be wrong

Hi, I think your mAP calculation is wrong, especially mAP@200: it ignores the total number of relevant documents, which is why your mAP@200 value is so large. I hope I am wrong, but it seems you are not. You can check the mAP calculation of this paper, which you have not cited: https://github.com/qliu24/SAKE
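A sketch of the definition the commenter seems to have in mind, in which average precision over the top k results is normalized by the total number of relevant items (capped at k) rather than only by the number of hits. This is an illustration of that convention under the stated assumption, not the repository's or SAKE's exact code:

```python
import numpy as np

def map_at_k(sorted_relevance, k=200):
    """mAP@k normalized by the total number of relevant items (capped
    at k), averaged over queries. Each row of sorted_relevance is a
    binary relevance vector sorted by descending similarity."""
    aps = []
    for rel in sorted_relevance:
        rel_k = rel[:k]
        n_rel = min(int(rel.sum()), k)     # total relevant docs, capped at k
        if n_rel == 0:
            aps.append(0.0)
            continue
        hits = np.cumsum(rel_k)            # relevant items seen so far
        prec = hits / (np.arange(len(rel_k)) + 1)  # precision at each rank
        aps.append(float((prec * rel_k).sum() / n_rel))
    return float(np.mean(aps))

# toy example: one query, 2 relevant items total, k = 2
print(map_at_k(np.array([[1, 0, 1, 0]]), k=2))  # 1.0 / 2 = 0.5
```

Dividing by the count of retrieved hits instead of n_rel would report 1.0 here, which is one way an mAP@200 figure can come out inflated.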

How to produce the class semantic vector?

As I went through the code and tried to run the experiments, one of the key problems turned out to be producing the class semantic vectors with word2vec or another NLP method. The problem is that some class words do not exist in google-news-300. Even when similar words exist, I cannot find how to map them into the google-news-300 format. For example, I cannot find "axe" in google-news-300 in any form. Although the authors provided the generated semantic labels (in word2vec), we are still confused about how to generate them.

@sounakdey , could you please give us a hint on how you dealt with this problem? Thank you very much.

Training on sketchy TypeError: 'int' object is not iterable

Respected Authors,
I am unable to run the code on the Sketchy dataset. It reported that the class-labels file was missing, so I created a labels file for the dataset, but I cannot make sense of the error below. It would be really nice if you could let me know what I can do to reproduce the results.

python3 src/train.py sketchy_extend --data_path /data/anurag/doodledata/sketchy
Parameters:	Namespace(attn=False, batch_size=20, data_path='/data/anurag/doodledata/sketchy', dataset='sketchy_extend', decay=0.0005, early_stop=20, emb_size=256, epochs=1000, exp_idf=None, gamma=0.1, grl_lambda=0.5, learning_rate=0.0001, load=None, log=None, log_interval=20, momentum=0.9, ngpu=1, nopretrain=True, plot=False, prefetch=2, save=None, schedule=[], seed=42, w_domain=1, w_semantic=1, w_triplet=1)
Prepare data
Traceback (most recent call last):
  File "src/train.py", line 287, in <module>
    main()
  File "src/train.py", line 99, in main
    train_data, [valid_sk_data, valid_im_data], [test_sk_data, test_im_data], dict_class = load_data(args, transform)    
  File "/data/anurag/codes/baselines/doodle2search/src/data/generator_train.py", line 18, in load_data
    return Sketchy_Extended(args, transform)
  File "/data/anurag/codes/baselines/doodle2search/src/data/sketchy_extended.py", line 35, in Sketchy_Extended
    class_emb = create_class_embeddings(list_class, args.dataset)
  File "/data/anurag/codes/baselines/doodle2search/src/data/class_word2vec.py", line 9, in create_class_embeddings
    model = Word2Vec(google_300_corpus)
  File "/home/anurag/anaconda3/lib/python3.6/site-packages/gensim/models/word2vec.py", line 783, in __init__
    fast_version=FAST_VERSION)
  File "/home/anurag/anaconda3/lib/python3.6/site-packages/gensim/models/base_any2vec.py", line 759, in __init__
    self.build_vocab(sentences=sentences, corpus_file=corpus_file, trim_rule=trim_rule)
  File "/home/anurag/anaconda3/lib/python3.6/site-packages/gensim/models/base_any2vec.py", line 936, in build_vocab
    sentences=sentences, corpus_file=corpus_file, progress_per=progress_per, trim_rule=trim_rule)
  File "/home/anurag/anaconda3/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1592, in scan_vocab
    total_words, corpus_count = self._scan_vocab(sentences, progress_per, trim_rule)
  File "/home/anurag/anaconda3/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1561, in _scan_vocab
    for sentence_no, sentence in enumerate(sentences):
  File "/home/anurag/anaconda3/lib/python3.6/site-packages/gensim/models/keyedvectors.py", line 355, in __getitem__
    return vstack([self.get_vector(entity) for entity in entities])
TypeError: 'int' object is not iterable
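One reading of this traceback (an assumption, not confirmed by the authors): google_300_corpus is a pretrained KeyedVectors object, but it is passed to the Word2Vec constructor, which expects an iterable of tokenized sentences to train on. A minimal stand-in class reproduces the same error without gensim:

```python
class FakeKeyedVectors:
    """Minimal stand-in for gensim's KeyedVectors, drawn from the
    traceback: word lookup via __getitem__, and no __iter__ method."""
    def __getitem__(self, entities):
        # mirrors keyedvectors.py line 355 in the traceback: a non-string
        # key is treated as a list of words and iterated over
        return [self.get_vector(entity) for entity in entities]

    def get_vector(self, word):
        return [0.0] * 300

kv = FakeKeyedVectors()
try:
    # Word2Vec's vocabulary scan does enumerate(sentences); since the
    # object has __getitem__ but no __iter__, Python falls back to
    # calling kv[0], kv[1], ... and iterating the integer key 0 fails
    for sentence_no, sentence in enumerate(kv):
        pass
except TypeError as err:
    print(err)  # 'int' object is not iterable
```

If that reading is right, the fix would be to not train a new model at all and instead look class words up in the pretrained vectors directly (e.g. loading them with gensim's KeyedVectors.load_word2vec_format), rather than wrapping them in Word2Vec(...).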

can't reproduce accurate MAP

Respected Authors,
Your paper is great. I have two questions for you.
First, after training on the Sketchy-Extended dataset, I cannot reproduce your result: the paper reports an mAP of 0.3691, but I only got 0.0589. Could you publish your pretrained model?
Second, I cannot produce sketchy_semantic_label with word2vec; it tells me "word 'XXX' is not in vocabulary". Can you help me with this problem?
Thank you!
