Wanna grab these configuration files?
saahiluppal / catr
Image Captioning Using Transformer
License: Apache License 2.0
Hi,
Three pre-trained models are provided. Are they essentially the same model with the same configuration?
Thanks!
Can I perform an end-to-end training with this code on my own dataset (that contains pairs of images and captions), or do I need to first extract features/bounding boxes for the images?
Hi, I get this error while trying to run main.py
runfile('D:/COCO/imge_captioning_transform_github/1/catr-master/main.py', wdir='D:/COCO/imge_captioning_transform_github/1/catr-master')
Reloaded modules: datasets, datasets.utils, datasets.coco, configuration, engine
Initializing Device: cuda
Traceback (most recent call last):
File D:\COCO\imge_captioning_transform_github\1\catr-master\main.py:90 in <module>
main(config)
File D:\COCO\imge_captioning_transform_github\1\catr-master\main.py:23 in main
model, criterion = caption.build_model(config)
File ~\Desktop\models\caption.py:51 in build_model
File ~\Desktop\models\backbone.py:112 in build_backbone
File ~\Desktop\models\backbone.py:85 in __init__
File ~\anaconda3\envs\my_envir_gpu\lib\site-packages\torchvision\models\resnet.py:342 in resnet101
return _resnet("resnet101", Bottleneck, [3, 4, 23, 3], pretrained, progress, **kwargs)
File ~\anaconda3\envs\my_envir_gpu\lib\site-packages\torchvision\models\resnet.py:296 in _resnet
state_dict = load_state_dict_from_url(model_urls[arch], progress=progress)
File ~\anaconda3\envs\my_envir_gpu\lib\site-packages\torch\hub.py:595 in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File ~\anaconda3\envs\my_envir_gpu\lib\site-packages\torch\serialization.py:705 in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File ~\anaconda3\envs\my_envir_gpu\lib\site-packages\torch\serialization.py:243 in __init__
super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
Can you tell me what I should change in the code to fix this error?
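The message "failed finding central directory" suggests that the cached weights file starts like a zip archive but is incomplete, which can happen when a download is interrupted. A minimal sketch for spotting such files, assuming the weights sit in the default torch hub cache (`~/.cache/torch/hub/checkpoints`); the helper name `find_truncated_checkpoints` is made up for illustration and is not part of the repository or of PyTorch:

```python
import zipfile
from pathlib import Path

ZIP_MAGIC = b"PK\x03\x04"  # first four bytes of a zip-format file


def find_truncated_checkpoints(cache_dir):
    """Return .pth files that start like a zip archive but cannot be opened as one."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    suspects = []
    for path in sorted(cache_dir.glob("*.pth")):
        with open(path, "rb") as fh:
            starts_like_zip = fh.read(4) == ZIP_MAGIC
        # zipfile.is_zipfile looks for the end-of-central-directory record,
        # which is missing when the file was cut short.
        if starts_like_zip and not zipfile.is_zipfile(path):
            suspects.append(path)
    return suspects
```

Calling `find_truncated_checkpoints(Path.home() / ".cache" / "torch" / "hub" / "checkpoints")` lists candidates. Deleting a flagged file and running main.py again should trigger a fresh download. No change to the repository code would be needed if this is the cause.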
Sorry for bothering you again, but when I try to load the model as you suggested, it gives this error:
AttributeError: module 'torch' has no attribute 'load_state_dict'
And if I use model = model.load_state_dict instead of torch.load_state_dict, it gives the following error:
model=model.load_state_dict(torch.load('/content/drive/MyDrive/BanglalekhaDataset/model/checkpoint1.pth'))
NameError: name 'model' is not defined
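Both errors are consistent with how PyTorch splits the work: `torch.load` reads the file, while `load_state_dict` is a method of a model object that has to exist first. A minimal sketch using a tiny `nn.Linear` as a stand-in for the captioning model; the checkpoint layout (a dict with the weights under the key `"model"`) is an assumption, so the code also accepts a bare state dict:

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in for the captioning model (assumption); in the repository the
# model object would come from its own model-building code instead.
model = nn.Linear(4, 2)

# Save a checkpoint as a dict holding the state dict under "model"
# (assumed layout, for illustration only).
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
torch.save({"model": model.state_dict()}, path)

# 1. Create the model object first, so that the name exists.
restored = nn.Linear(4, 2)

# 2. Read the file with torch.load, then hand the weights to the
#    model's own load_state_dict method.
checkpoint = torch.load(path, map_location="cpu")
state_dict = checkpoint["model"] if "model" in checkpoint else checkpoint
restored.load_state_dict(state_dict)  # do not assign the return value to the model

restored.eval()  # switch to inference mode before predicting
```

Note that `load_state_dict` modifies the model in place and returns a report of missing and unexpected keys, so writing `model = model.load_state_dict(...)` would replace the model with that report.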
You are pulling images from .github/ such as .github/cake.png. Where is this located? Is it in your repo? In any case, I was able to point to a file in Colab's file system instead, i.e. /content/catr/test.jpg, so all is OK.
I tried two pictures and obtained peculiar results. For the attached, I got "Baby eating a donut with a spoon". Is the model trained on a particular corpus? Knowing that would tell me which sorts of images are likely to get more accurate captions.
It would be good if we could provide a model checkpoint file for prediction in predict.py.
I would like to know the training loss and validation loss of your pretrained model.
Hello
Thanks for sharing the code. However, I can't find the paper PDF anywhere.
Can you attach it?
Thanks
Hello, I'm new to image captioning. I have some questions about the image augmentation in coco.py.
Thank you.
Hi! I studied your code and have some questions.
It seems that the pad token contributes to the loss, too.
Line 55 in 8a1f770
In my opinion, the code above needs to be:
criterion = torch.nn.CrossEntropyLoss(ignore_index=config.pad_token_id)
Otherwise, the model must predict the [PAD] token, too.
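To illustrate what excluding pad positions changes, here is a small pure-Python sketch of a mean cross-entropy that skips targets equal to an ignore index, in the spirit of the `ignore_index` argument suggested above. It is a stand-in for illustration, not the repository's loss; the pad id `0` and the toy logits are made-up values:

```python
import math


def mean_cross_entropy(logits, targets, ignore_index=None):
    """Mean negative log-likelihood over positions whose target is not ignore_index."""
    total, count = 0.0, 0
    for scores, target in zip(logits, targets):
        if target == ignore_index:
            continue  # padded position: contributes nothing to the loss
        log_norm = math.log(sum(math.exp(s) for s in scores))
        total += log_norm - scores[target]
        count += 1
    return total / count if count else 0.0


PAD = 0  # hypothetical pad token id
logits = [[0.1, 2.0, 0.3], [0.2, 0.1, 1.5], [3.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
targets = [1, 2, PAD, PAD]

with_pad = mean_cross_entropy(logits, targets)
without_pad = mean_cross_entropy(logits, targets, ignore_index=PAD)
```

In this toy case the pad positions are easy to predict, so including them pulls the average down (`with_pad` is smaller than `without_pad`) and makes the reported loss look better than the loss on real tokens.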
Also, I wonder why you used FrozenBatchNorm. Was a batch size of 32 not sufficient for stable learning?
Thank you!!
I'm trying to implement beam search but I'm getting strange results. Was wondering if anybody has managed to implement beam search with this repo?
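For comparison when debugging, here is a generic beam search sketch that is independent of the repository. It assumes a hypothetical `step_fn(tokens)` that returns a dict of next-token log-probabilities; with a real captioning model that function would wrap a forward pass. Length normalization is left out to keep the sketch short:

```python
import math


def beam_search(step_fn, start_token, end_token, beam_size=3, max_len=10):
    """Return the best (tokens, log_prob) pair found by beam search."""
    beams = [([start_token], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for token, logp in step_fn(tokens).items():
                candidates.append((tokens + [token], score + logp))
        # Keep only the best `beam_size` extensions across all beams.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == end_token:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Fall back to unfinished beams if nothing reached the end token.
    return max(finished or beams, key=lambda c: c[1])


def toy_step(tokens):
    """Toy next-token distribution that depends only on the last token."""
    table = {
        "[CLS]": {"a": math.log(0.6), "the": math.log(0.4)},
        "a": {"dog": math.log(0.5), "cat": math.log(0.5)},
        "the": {"dog": math.log(0.9), "cat": math.log(0.1)},
        "dog": {"[SEP]": 0.0},
        "cat": {"[SEP]": 0.0},
    }
    return table[tokens[-1]]


best_tokens, best_score = beam_search(toy_step, "[CLS]", "[SEP]", beam_size=2)
```

In the toy table, greedy decoding would start with "a" (probability 0.6) and end at a total probability of 0.3, while a beam of size 2 keeps "the" alive and finds "the dog" at 0.36. Two things worth checking in a real implementation are that scores are summed as log-probabilities and that each beam is extended from its own prefix rather than from a shared one.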
prediction: [CLS] i was was was the the was was was i was i [SEP]
dataset: [CLS] you do he coming for you mother he alive not well i [SEP]
prediction: [CLS] i was was i was i of and the i the i [SEP]
dataset: [CLS] you do if you receive a letter from yourself with information only [SEP]
prediction: [CLS] i was was i i was i was was the the i [SEP]
dataset: [CLS] you do mean that she again you do even know what you [SEP]
...........
This is strange because the pre-trained catr model works fine. I converted my dataset to the COCO dataset format and made sure each image-caption pair was fed into training correctly (I printed the input images and captions during training). I also made a mini dataset (n<40) to check convergence (the final loss was 0.214xxxx; I expected it to reach about 0.001 on such a tiny dataset), but the problem did not go away. What could be going wrong in my procedure?
Thank you so much for your great work. I was trying to reproduce your work, but it seems I have a version problem. Could you please tell me the exact versions of your requirements, especially transformers? I would be grateful.
Hi! Could I ask about the model's performance, such as BLEU, CIDEr, and so on?
Hi,
The prediction code handles only one image at a time. How can I change it to predict all the images in a folder at once?
Thanks.
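One way to approach this is to wrap the single-image logic in a function and loop over the folder. A minimal sketch, where `caption_fn` is a hypothetical callable that takes an image path and returns a caption string (for example, the body of the existing single-image prediction code moved into a function); the list of suffixes is also an assumption:

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image types


def caption_folder(folder, caption_fn):
    """Apply caption_fn to every image file in folder; return {filename: caption}."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = caption_fn(path)
    return results
```

Loading the model once outside `caption_fn` and reusing it for every image avoids paying the model start-up cost per file.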
I got an error while trying to change the pre-trained BERT base model. I am trying to run this model on another language.
The error is:
` Initializing Device: cuda
Number of params: 83972666
Train: 18308
Valid: 1830
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:481: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
Start Training..
Epoch: 0
0% 0/1144 [00:00<?, ?it/s]/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py:2204: FutureWarning: The pad_to_max_length argument is deprecated and will be removed in a future version, use padding=True or padding='longest' to pad to the longest sequence in the batch, or use padding='max_length' to pad to a max length. In this case, you can give a specific length with max_length (e.g. max_length=45) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
FutureWarning,
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:702: indexSelectLargeIndex: block: [32,0,0], thread: [96,0,0] Assertion srcIndex < srcSelectDimSize failed.
... (same assertion repeated for threads [97,0,0] to [127,0,0] of block [32,0,0] and for threads [96,0,0] to [127,0,0] of block [26,0,0])
0% 0/1144 [00:08<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 98, in <module>
main(config)
File "main.py", line 74, in main
model, criterion, data_loader_train, optimizer, device, epoch, config.clip_max_norm)
File "/content/gdrive/My Drive/image captioning research work/image captioning/engine.py", line 25, in train_one_epoch
outputs = model(samples, caps[:, :-1], cap_masks[:, :-1])
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/gdrive/My Drive/image captioning research work/image captioning/models/caption.py", line 29, in forward
pos[-1], target, target_mask)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/gdrive/My Drive/image captioning research work/image captioning/models/transformer.py", line 48, in forward
tgt = self.embeddings(tgt).permute(1, 0, 2)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/content/gdrive/My Drive/image captioning research work/image captioning/models/transformer.py", line 293, in forward
input_embeds = self.word_embeddings(x)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/sparse.py", line 160, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2043, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered`
Can you tell me what I should change in the code to fix this error?
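The assertion `srcIndex < srcSelectDimSize` raised inside the embedding lookup means that some token id is not smaller than the number of rows in the embedding table. After switching tokenizers, one possible cause is that the new tokenizer's vocabulary is larger than the vocabulary size the model was built with. A minimal sketch for checking a batch of ids before it reaches the GPU; the function name and the example values are illustrative, and 30522 is the vocabulary size of bert-base-uncased, used here only as an assumed configuration value:

```python
def find_out_of_range_ids(batches, vocab_size):
    """Return (batch_index, position, token_id) for ids an embedding of size vocab_size cannot look up."""
    bad = []
    for b, ids in enumerate(batches):
        for pos, token_id in enumerate(ids):
            if not 0 <= token_id < vocab_size:
                bad.append((b, pos, token_id))
    return bad


# Hypothetical ids produced by a tokenizer with a larger vocabulary.
caps = [[101, 2023, 102], [101, 45012, 102]]
problems = find_out_of_range_ids(caps, vocab_size=30522)
```

If such ids turn up, the vocabulary size in the configuration would need to match the new tokenizer. Running once on CPU usually gives a clearer "index out of range" message than the device-side assert.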
Hi,
I tried only the pre-trained models, V1 and V3, by running predict.py, but the prediction returns only a list of "[unk] [unk] [unk] [unk] [unk] [unk] [unk] [unk] [unk] [unk]" on my image dataset.
What could be the problem?
Thanks for your clear code.
Hi,
It appears that this repository is heavily based on the DETR code, including files copied verbatim but stripped of the copyright header.
The original code was released under the Apache license. In particular, according to section 4.c:
You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works
In order to be in compliance with the license, could you reinstate appropriate copyrights and pointers to the original work?
Thanks in advance.
Hi there!
This project is super cool! Would you be interested in sharing the pretrained models in the Hugging Face Hub? The Hugging Face Hub offers free hosting of models (over 10,000 models have been uploaded by many research organizations) and it would make your work more accessible and visible to others. People would be able to try the model directly in the browser (we're implementing an image captioning widget at the moment). The only thing required would be to upload the models to the Hub. I'm happy to answer any questions about this.
Happy to hear your thoughts,
Omar
As I want the model to predict the <eos> token without having it as an input to the model, I simply slice the <eos> token off the end of the sequence. Thus:
trg = [sos, x_1, x_2, x_3, eos]
trg[:-1] = [sos, x_1, x_2, x_3]
This is the same as your implementation.
But many datasets contain sentences of different lengths, and thus the last elements of the padded sentences are <pad> tokens, such as:
trg = [sos, x_1, x_2, x_3, eos, pad, pad, pad]
trg[:-1] = [sos, x_1, x_2, x_3, eos, pad, pad]
In such a case, I can't slice off the <eos> token. How can I solve this issue?
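A common way to handle this is to leave the slice as it is and mask the loss instead: the positions where <eos> or <pad> appears in the decoder input have <pad> as their label, so they can be excluded from the loss. A minimal sketch with plain lists, using the token names from the example above; the helper name is made up for illustration:

```python
PAD = "pad"


def shift_for_teacher_forcing(trg):
    """Split a padded target into decoder input, labels, and a loss mask."""
    decoder_input = trg[:-1]  # everything except the last token
    labels = trg[1:]          # the same sequence shifted left by one
    loss_mask = [token != PAD for token in labels]  # False where the label is padding
    return decoder_input, labels, loss_mask


trg = ["sos", "x_1", "x_2", "x_3", "eos", "pad", "pad", "pad"]
decoder_input, labels, loss_mask = shift_for_teacher_forcing(trg)
```

Here `decoder_input` still contains `eos`, but the label at that position is `pad` and is masked out, so the model is never trained on what follows the end token. In PyTorch the same effect is usually obtained by passing the pad id as `ignore_index` to the loss.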
Hi, I was able to train a model by running your code, but when I try to predict a caption using your predict.py file, it downloads your pretrained model. How can I use the model I trained in predict.py? I am very new to this and would really appreciate your help. Thanks in advance.
Hi,
I want to get results using the GPU. What should I do?
I got this error with main.py:
result = torch.relu(input)
RuntimeError: CUDA out of memory. Tried to allocate 92.00 MiB (GPU 0; 5.80 GiB total capacity; 4.45 GiB already allocated; 40.94 MiB free; 4.52 GiB reserved in total by PyTorch)
Hi. Thank you for your impressive work.
I've read your work and want to understand your model clearly.
From #2, I know there is no paper, but I found a paper similar to your work.
Does the figure below explain your work?
Thank you!