
yukezhu / visual7w-qa-models


Visual7W visual question answering models

Home Page: http://ai.stanford.edu/~yukez/visual7w/

License: MIT License

Python 19.51% Shell 0.70% Lua 79.79%
deep-learning recurrent-neural-networks

visual7w-qa-models's People

Contributors

yukezhu

visual7w-qa-models's Issues

Segmentation fault when executing train_telling.lua on Jetson TX1

Hello @yukezhu
I am having an issue running the train_telling.lua script: I get a segmentation fault immediately after the CNN model is loaded successfully.

I am running on the GPU of a Jetson TX1, which has 4 GB of RAM shared between the CPU and GPU.

$ th train_telling.lua -gpuid 0 -mc_evaluation -verbose -finetune_cnn_after -1

QADatasetLoader loading dataset file: visual7w-toolkit/datasets/visual7w-telling/dataset.json
image size is 28653
QADatasetLoader loading json file: data/qa_data.json
vocab size is 3007
QADatasetLoader loading h5 file: data/qa_data.h5
max question sequence length in data is 15
max answer sequence length in data is 5
assigned 5678 images to split val
assigned 8609 images to split test
assigned 14366 images to split train
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 553432081
Successfully loaded cnn_models/VGG_ILSVRC_16_layers.caffemodel
Segmentation fault.

Do you have any suggestions on how to solve this?

Thanks
Enid

Training with Visual Genome dataset

Hello @yukezhu,

I am running the train_telling.lua code after generating qa_data.h5 and qa_data.json with prepare_dataset.py from the Visual Genome dataset (reading all the images from the VG_100K and VG_100K_2 folders). I am training in GPU mode with batch_size=1, and I get the following error after the 25th iteration. The same error occurs in CPU mode.

.
.
.
question 976648: where is this picture taken ? ten king doe while staying rc mason culture kitties familiar doe staying church lockers .	
evaluating validation performance... 250 (9.231342)	
validation loss: 	9.1918324186961	
wrote json checkpoint to checkpoints/model_id.json	
iter 1: 9.186291 (9.191735)	
iter 2: 9.096266 (9.191258)	
iter 3: 8.853861 (9.189571)	
iter 4: 8.986907 (9.188557)	
iter 5: 9.046180 (9.187845)	
iter 6: 8.980703 (9.186810)	
iter 7: 8.890981 (9.185331)	
iter 8: 8.588496 (9.182346)	
iter 9: 8.541151 (9.179140)	
iter 10: 8.375111 (9.175120)	
iter 11: 8.544146 (9.171965)	
iter 12: 8.388834 (9.168050)	
iter 13: 7.858945 (9.161504)	
iter 14: 7.269617 (9.152045)	
iter 15: 7.214907 (9.142359)	
iter 16: 7.827134 (9.135783)	
iter 17: 6.620107 (9.123205)	
iter 18: 7.074771 (9.112962)	
iter 19: 6.372057 (9.099258)	
iter 20: 7.658658 (9.092055)	
iter 21: 7.211345 (9.082651)	
iter 22: 5.337039 (9.063923)	
iter 23: 6.063521 (9.048921)	
iter 24: 6.835446 (9.037854)	
iter 25: 6.785455 (9.026592)	
/home/f/torch/install/bin/luajit: /home/f/torch/install/share/lua/5.1/hdf5/dataset.lua:114: attempt to perform arithmetic on a nil value
stack traceback:
	/home/f/torch/install/share/lua/5.1/hdf5/dataset.lua:114: in function 'rangesToOffsetAndCount'
	/home/f/torch/install/share/lua/5.1/hdf5/dataset.lua:136: in function 'partial'
	./misc/QADatasetLoader.lua:232: in function 'getBatch'
	train_telling.lua:177: in function 'lossFun'
	train_telling.lua:336: in main chunk
	[C]: in function 'dofile'
	...r226/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

I would really appreciate it if someone could help me with this.

Thanks in advance.

Feeding answer sequences during the evaluation

Inside the eval_split() function (line 249 of train_telling.lua), you feed the answer sequence as well as the question sequence. Why are the answers fed in evaluation mode?

Attention Visualization?

Hello,
Please correct me if I am wrong: to visualize attention, take the attention probabilities and multiply them with the corresponding image regions.
If anyone has attention visualization code, please share it.
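
A minimal sketch of the idea described in this question, in plain Python with no dependencies. It assumes the attention layer yields 196 probabilities that map onto a 14x14 row-major grid of image regions (196 matches the `512*196` input size used in build_attention_nn, but the grid layout is an assumption), and that the image is a grayscale 2D list. The function names are hypothetical and are not part of this repository.

```python
def attention_to_mask(attention, grid=14, height=224, width=224):
    """Upsample a flat list of grid*grid attention weights to a height x width mask.

    Uses nearest-neighbour upsampling: every pixel takes the weight of the
    grid cell it falls into. Assumes row-major ordering of the grid cells.
    """
    if len(attention) != grid * grid:
        raise ValueError("expected %d attention weights" % (grid * grid))
    mask = []
    for y in range(height):
        gy = y * grid // height          # grid row covering pixel row y
        row = []
        for x in range(width):
            gx = x * grid // width       # grid column covering pixel column x
            row.append(attention[gy * grid + gx])
        mask.append(row)
    return mask


def apply_attention(image, attention, grid=14):
    """Scale each pixel of a grayscale image (2D list) by its region's weight.

    Weights are divided by the maximum weight, so the most attended region
    keeps its original brightness and all other regions are dimmed.
    """
    height, width = len(image), len(image[0])
    mask = attention_to_mask(attention, grid, height, width)
    peak = max(attention) or 1.0         # avoid dividing by zero
    return [[image[y][x] * mask[y][x] / peak for x in range(width)]
            for y in range(height)]
```

For a color image the same mask can be applied to each channel. Nearest-neighbour upsampling gives a blocky overlay; smoothing the mask (for example with bilinear interpolation) is a common refinement.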

Cannot run build_attention_nn function

When I try to run only build_attention_nn, as shown below, I get the following error:

torch/install/share/lua/5.1/nn/View.lua:38: attempt to index local 'input' (a nil value)
stack traceback:

require 'nn'
function build_attention_nn()
  local conv_feat_maps = nn.Identity()()
  local prev_h = nn.Identity()()
  -- compute attention coefficients
  local flatten_conv = nn.View(-1):setNumInputDims(2)(conv_feat_maps)
  local f_conv = nn.Linear(512*196, 196)(flatten_conv)
end

build_attention_nn()

Out of Memory error

Hi @yukezhu

I ran your program to retrain the model, but I get an out-of-memory error. I have a GTX 1080 (8 GB) installed in my machine. Which GPU did you use to train the model, and how much memory did it need?

`
$ th train_telling.lua -gpuid 0 -mc_evaluation -verbose -finetune_cnn_after -1

QADatasetLoader loading dataset file: visual7w-toolkit/datasets/visual7w-telling/dataset.json
image size is 28653
QADatasetLoader loading json file: data/qa_data.json
vocab size is 3007
QADatasetLoader loading h5 file: data/qa_data.h5
max question sequence length in data is 15
max answer sequence length in data is 5
assigned 5678 images to split val
assigned 8609 images to split test
assigned 14366 images to split train
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 553432081
Successfully loaded cnn_models/VGG_ILSVRC_16_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
converting first layer conv filters from BGR to RGB...
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-1922/cutorch/lib/THC/generic/THCStorage.cu line=65 error=2 : out of memory
/home/jjhu/torch/install/bin/lua: /home/jjhu/torch/install/share/lua/5.2/nn/utils.lua:11: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-1922/cutorch/lib/THC/generic/THCStorage.cu:65
stack traceback:
[C]: in function 'resize'
/home/jjhu/torch/install/share/lua/5.2/nn/utils.lua:11: in function 'torch_Storage_type'
/home/jjhu/torch/install/share/lua/5.2/nn/utils.lua:57: in function 'recursiveType'
/home/jjhu/torch/install/share/lua/5.2/nn/Module.lua:152: in function 'type'
/home/jjhu/torch/install/share/lua/5.2/nn/utils.lua:45: in function 'recursiveType'
/home/jjhu/torch/install/share/lua/5.2/nn/utils.lua:41: in function 'recursiveType'
/home/jjhu/torch/install/share/lua/5.2/nn/Module.lua:152: in function </home/jjhu/torch/install/share/lua/5.2/nn/Module.lua:143>
(...tail calls...)
train_telling.lua:131: in main chunk
[C]: in function 'dofile'
...jjhu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: in ?
`
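
As a rough back-of-the-envelope check on the log above, the layer shapes it prints can be multiplied out to estimate how much memory the VGG-16 weights alone occupy. The sketch below is plain Python. It assumes 32-bit floats and ignores biases, gradients, activations, and the question-answering model itself, so it gives only a lower bound on what training needs, not a measurement of this code's actual usage. The function names are hypothetical.

```python
from functools import reduce
from operator import mul

# Layer shapes copied from the log above.
LAYER_SHAPES = {
    "conv1_1": (64, 3, 3, 3),
    "conv1_2": (64, 64, 3, 3),
    "conv2_1": (128, 64, 3, 3),
    "conv2_2": (128, 128, 3, 3),
    "conv3_1": (256, 128, 3, 3),
    "conv3_2": (256, 256, 3, 3),
    "conv3_3": (256, 256, 3, 3),
    "conv4_1": (512, 256, 3, 3),
    "conv4_2": (512, 512, 3, 3),
    "conv4_3": (512, 512, 3, 3),
    "conv5_1": (512, 512, 3, 3),
    "conv5_2": (512, 512, 3, 3),
    "conv5_3": (512, 512, 3, 3),
    "fc6": (1, 1, 25088, 4096),
    "fc7": (1, 1, 4096, 4096),
    "fc8": (1, 1, 4096, 1000),
}


def count_weights(shapes):
    """Total number of weight values: the product of each shape, summed."""
    return sum(reduce(mul, shape, 1) for shape in shapes.values())


def weight_bytes(shapes, bytes_per_value=4):
    """Memory for the weights, assuming bytes_per_value bytes each (float32 = 4)."""
    return count_weights(shapes) * bytes_per_value


if __name__ == "__main__":
    print("weights: %d" % count_weights(LAYER_SHAPES))
    print("float32 size: %.1f MB" % (weight_bytes(LAYER_SHAPES) / 1e6))
```

The total is 138,344,128 weights, or about 553 MB in float32, which is in line with the 553432081 bytes the log reports reading from the caffemodel. The CNN weights by themselves therefore do not account for an 8 GB card running out of memory.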
