yufengm / adaptive



PyTorch Implementation of Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning

Python 11.02% Shell 0.05% Jupyter Notebook 88.93%

image-captioning pytorch

adaptive's Issues

Does this project fully reproduce the results of the paper?

Problem when testing, please help

import argparse
from PIL import Image
import torch
import matplotlib.pyplot as plt
import numpy as np
import cv2
import pickle
from utils import CocoEvalLoader, to_var, show_images
from adaptive import Encoder2Decoder
from build_vocab import Vocabulary
from torch.autograd import Variable
from torchvision import transforms

def main():
    pretrained = 'models/adaptive-1.pkl'
    vocab_path = './data/vocab.pkl'
    with open(vocab_path, 'rb') as f:
        vocab = pickle.load(f)
    # Define model and load pretrained
    model = Encoder2Decoder(256, len(vocab), 512)
    model.load_state_dict(torch.load(pretrained, map_location={'cuda:1':'cuda:0'}))
    model.eval()
    # Image transformation
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406),
                             (0.229, 0.224, 0.225))])
    image = Image.open('./data/test/1.jpg')
    image = image.resize([224, 224], Image.LANCZOS)
    image = transform(image).unsqueeze(0)
    image_tensor = Variable(image, volatile=True)
    generated_captions, _, _ = model.sampler(image_tensor)
    captions = generated_captions.data.numpy()
    sampled_ids = captions[0]
    sampled_caption = []
    for word_id in sampled_ids:
        word = vocab.idx2word[word_id]
        if word == '<end>':
            break
        else:
            sampled_caption.append(word)
    sentence = ' '.join(sampled_caption)
    # return sentence
    print(sentence)

if __name__ == '__main__':
    main()
Error: TypeError: torch.index_select received an invalid combination of arguments - got (torch.FloatTensor, int, !torch.cuda.LongTensor!)

Please help! Thanks.

Different tokenizers used on training and validation data

Hi, first of all thank you for the great repo! I noticed that you are using different tokenizers for the training data - nltk.tokenize.word_tokenize in build_vocab.py - and the validation data - PTBTokenizer from coco/pycocoevalcap/tokenizer/ptbtokenizer.py. The first one doesn't split on punctuation, while the second one does, which leads to many <unk> tokens for captions that have hyphenated words. It would be great if you could have both training and captioning use the same tokenizer. Thanks!
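The mismatch described above can be illustrated with a minimal sketch. The two tokenizer functions below are simplified regex stand-ins written for this example only; they are not nltk.tokenize.word_tokenize or PTBTokenizer, and the captions are made up. The point is just that a vocabulary built with one splitting rule and queried with another yields <unk> for hyphenated words.

```python
import re


def tokenize_keep_hyphens(text):
    """Hypothetical stand-in for the training-side tokenizer: hyphenated words stay whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def tokenize_split_hyphens(text):
    """Hypothetical stand-in for the evaluation-side tokenizer: hyphens act as separators."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Build the vocabulary with the hyphen-keeping tokenizer (made-up captions).
train_captions = ["a double-decker bus", "a well-dressed man"]
vocab = {tok for cap in train_captions for tok in tokenize_keep_hyphens(cap)}


def to_vocab_tokens(text, tokenizer):
    """Map each token to itself if it is in the vocabulary, else to '<unk>'."""
    return [tok if tok in vocab else "<unk>" for tok in tokenizer(text)]


mismatched = to_vocab_tokens("a double-decker bus", tokenize_split_hyphens)
consistent = to_vocab_tokens("a double-decker bus", tokenize_keep_hyphens)

print(mismatched)  # ['a', '<unk>', '<unk>', 'bus']
print(consistent)  # ['a', 'double-decker', 'bus']
```

Using one tokenizer for both vocabulary building and evaluation removes this source of <unk> tokens, whichever splitting rule is chosen.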

How to find accuracy for train.py?

Please help me find the accuracy for each epoch. Is there any code for this?

Training only shows the cross-entropy loss and perplexity values.
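For context, the two reported numbers are directly related: perplexity is the exponential of the mean per-token cross-entropy when the loss is measured in nats, which is the convention of PyTorch's CrossEntropyLoss. If a per-token accuracy is still wanted, a rough sketch follows. The function names and the pad_id convention are assumptions for this example and are not taken from train.py.

```python
import math


def perplexity(mean_cross_entropy):
    """Perplexity from a mean per-token cross-entropy given in nats."""
    return math.exp(mean_cross_entropy)


def token_accuracy(predicted_ids, target_ids, pad_id=0):
    """Fraction of non-padding target tokens predicted exactly (hypothetical helper)."""
    correct = 0
    total = 0
    for pred, tgt in zip(predicted_ids, target_ids):
        if tgt == pad_id:
            continue  # ignore padding positions
        total += 1
        correct += int(pred == tgt)
    return correct / total if total else 0.0


print(perplexity(0.0))                               # 1.0
print(token_accuracy([1, 2, 3, 0], [1, 2, 4, 0]))    # 2 of 3 real tokens match
```

Note that word-level accuracy is a weak proxy for caption quality; the tracebacks elsewhere in these issues show the repository evaluating with coco_eval instead.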

Does batch_size_t work?

Hi, there is a 'batch_size_t' that filters out the samples with short lengths. Why is this done? I don't think it will work; won't it lead the model to train on very few samples?
batch_size_t = sum([l > timestep for l in decode_lengths])
current_input = inputs[:batch_size_t, timestep, :]

Please help me, thanks.
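A small sketch may help here. It assumes the batch is sorted by caption length in descending order, which is what makes slicing with [:batch_size_t] select exactly the captions that are still being decoded; the lengths below are made up. Under that assumption no sample is dropped: each caption simply stops contributing once its own length is reached, so no computation is spent on padding.

```python
# Made-up decode lengths, sorted in descending order (assumption).
decode_lengths = [5, 3, 2]

active_per_step = []
for timestep in range(max(decode_lengths)):
    # Number of captions that still have a token at this timestep.
    batch_size_t = sum(l > timestep for l in decode_lengths)
    active_per_step.append(batch_size_t)

print(active_per_step)        # [3, 3, 2, 1, 1]
print(sum(active_per_step))   # 10, equal to sum(decode_lengths)
```

The total number of processed positions equals the total number of real tokens, so every token of every caption is used once.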

pre-trained model?

Thanks for your work on this PyTorch version of 'Knowing When to Look'. Could you publish the pre-trained model?

Bug, help

How to run a test?

Could you please upload an example script?

When testing

Could you tell me how to test?

I only got a CIDEr score of 0.82. Could you please help me improve the score? Thanks.

How to count len( data_loader )?

Thanks for your contribution!
I have a question: during training, the total number of iterations in every epoch is 9446, and I am wondering where this comes from. As we know, there are about 80k + 40k - 10k = 110k images in the training set.
So the length of data_loader should be 110k / batch_size = 110k / 60 ≈ 1833, rather than 9446.
Can anyone help me?
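One plausible explanation, offered as an assumption rather than a confirmed fact about this repository, is that the dataset is indexed per caption instead of per image. COCO provides roughly five captions per image, so the sample count is about five times the image count. The arithmetic below uses only the numbers from the question; the helper names are made up, and it assumes the last partial batch is kept.

```python
import math


def num_batches(num_samples, batch_size):
    """Batches per epoch when the last partial batch is kept (assumption)."""
    return math.ceil(num_samples / batch_size)


def sample_range(batches, batch_size):
    """Smallest and largest sample counts that yield exactly `batches` batches."""
    return (batches - 1) * batch_size + 1, batches * batch_size


print(num_batches(110_000, 60))       # 1834 if each image is one sample
print(num_batches(5 * 110_000, 60))   # 9167 if each of ~5 captions is one sample
print(sample_range(9446, 60))         # (566701, 566760)
```

An epoch of 9446 batches therefore corresponds to about 566.7k samples, or roughly 113k images at five captions each, which is the same order as the 110k estimate in the question.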

No module named 'bleu_scorer' for image captioning

Traceback (most recent call last):
  File "train.py", line 9, in <module>
    from utils import coco_eval, to_var
  File "/data/DATA_DIR/utils.py", line 12, in <module>
    from coco.pycocoevalcap.eval import COCOEvalCap
  File "/data/DATA_DIR/coco/pycocoevalcap/eval.py", line 3, in <module>
    from .bleu.bleu import Bleu
  File "/data/DATA_DIR/coco/pycocoevalcap/bleu/bleu.py", line 11, in <module>
    from bleu_scorer import BleuScorer
ModuleNotFoundError: No module named 'bleu_scorer'
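The last frame of the traceback uses "from bleu_scorer import BleuScorer" inside a package. Python 3 no longer resolves such a name against the importing module's own directory, so a sibling module has to be imported explicitly, for example with "from .bleu_scorer import BleuScorer". The sketch below reproduces the behaviour with two throwaway packages; the package names demo_implicit and demo_explicit are invented for this demonstration and do not exist in the repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root, name, import_line):
    """Create a tiny package whose bleu.py imports a sibling module."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "bleu_scorer.py").write_text("class BleuScorer:\n    pass\n")
    (pkg / "bleu.py").write_text(import_line + "\n")


def try_import(module_name):
    """Return 'ok' or the name of the exception raised on import."""
    try:
        importlib.import_module(module_name)
        return "ok"
    except ImportError as exc:
        return type(exc).__name__


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_package(root, "demo_implicit", "from bleu_scorer import BleuScorer")
    make_package(root, "demo_explicit", "from .bleu_scorer import BleuScorer")
    sys.path.insert(0, str(root))
    importlib.invalidate_caches()
    try:
        implicit_result = try_import("demo_implicit.bleu")
        explicit_result = try_import("demo_explicit.bleu")
    finally:
        sys.path.remove(str(root))

print(implicit_result, explicit_result)  # ModuleNotFoundError ok
```

If this is the cause, the same change would likely be needed for the other sibling imports under coco/pycocoevalcap when running on Python 3.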

bug

@yufengm
After one epoch, I want to calculate the METEOR score, but I get the following error:
computing METEOR score...
Traceback (most recent call last):
  File "train_model.py", line 268, in <module>
    main(args)
  File "train_model.py", line 177, in main
    method_score = coco_eval(adaptive, args, epoch)
  File "/home1/haoyanlong/imagecaption/Adaptive_Attention/code/utils.py", line 173, in coco_eval
    cocoEval.evaluate()
  File "/home1/haoyanlong/imagecaption/Adaptive_Attention/code/coco/pycocoevalcap/eval.py", line 51, in evaluate
    score, scores = scorer.compute_score(gts, res)
  File "/home1/haoyanlong/imagecaption/Adaptive_Attention/code/coco/pycocoevalcap/meteor/meteor.py", line 38, in compute_score
    stat = self._stat(res[i][0], gts[i])
  File "/home1/haoyanlong/imagecaption/Adaptive_Attention/code/coco/pycocoevalcap/meteor/meteor.py", line 56, in _stat
    self.meteor_p.stdin.write('{}\n'.format(score_line))
IOError: [Errno 32] Broken pipe
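A broken pipe on the write to self.meteor_p.stdin means the child process had already exited when the scorer tried to talk to it. The METEOR scorer in pycocoevalcap runs a Java program, so one common cause is that no Java runtime is on the PATH; a missing or unreadable jar file would produce the same symptom. A minimal diagnostic sketch follows. The helper name is hypothetical, and passing this check does not rule out other causes.

```python
import shutil


def java_available(executable="java"):
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(executable) is not None


if java_available():
    print("java found on PATH; check the METEOR jar and its data files next")
else:
    print("java not found on PATH; the METEOR subprocess cannot start")
```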

When I try to run train.py the attribute error occurs

Hello,

When I try to run 'train.py', I get this error.

Traceback (most recent call last):
  File "train.py", line 263, in <module>
    main( args )
  File "train.py", line 176, in main
    cider = coco_eval( adaptive, args, epoch )
  File "/home/harryjhnam/Documents/final_project/Adaptive/utils.py", line 120, in coco_eval
    CocoEvalLoader( args.image_dir, args.caption_val_path, transform ).samples
AttributeError: 'CocoEvalLoader' object has no attribute 'samples'

I have modified your code for Python 3.5.2 and also tried Python 2.7 with your original code, but both produce the same error.

Encoder encodes the same image differently

Hi,

I trained a fine-tuned model on my own custom dataset, and the results are generally quite accurate. However, with some images I've noticed that if I sample the same image multiple times I get different captions. The differences are usually small, but for my use case I need consistency. I tried to debug by printing out the encoder tensors and it looks like the model encodes the same image differently at different samplings. Is this expected behavior? Is there a way to "stabilize" the encoder so it encodes the same image the same way each time?

Thanks!

train.py

Hi, this is a great project, but when I run train.py the following error occurs.
Here is the error:
RuntimeError: matrix and matrix expected at /opt/conda/conda-bld/pytorch_1501971235237/work/pytorch-0.1.12/torch/lib/THC/generic/THCTensorMathBlas.cu:237
My PyTorch version is 0.1.12.
Best,

After the 2nd epoch, I started to get results like:

"A yellow train", "A small bus", "An aeroplane." etc.

The rest of each sentence is missing.

Is it because of the data, or something else?

Thank you,
