zhegan27 / scn_for_video_captioning
Using Semantic Compositional Networks for Video Captioning
Hi zhegan,
When I extracted the C3D features, the output was a binary file (.pool5). How can I convert it into a .mat file to use in the experiment? Can you give me some advice?
Thank you for your patience!
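A minimal sketch of reading such a file in Python, not the author's conversion script. It assumes the layout used by the reader script that ships with C3D: five little-endian int32 header fields (num, channels, length, height, width) followed by float32 values. Please verify this against your C3D build. The function name `read_c3d_blob` is hypothetical.

```python
import struct

def read_c3d_blob(path):
    """Read one C3D output file; return (shape, flat list of floats)."""
    with open(path, "rb") as f:
        header = f.read(5 * 4)
        if len(header) != 5 * 4:
            raise ValueError("file too short to contain the header")
        # Assumed header order: num, channels, length, height, width.
        shape = struct.unpack("<5i", header)
        count = 1
        for dim in shape:
            count *= dim
        payload = f.read(count * 4)
        if len(payload) != count * 4:
            raise ValueError("file is shorter than its header claims")
        values = struct.unpack("<%df" % count, payload)
    return shape, list(values)
```

The returned values can then be stacked per video and written with `scipy.io.savemat` (SciPy assumed to be installed). The variable name the repository's scripts expect inside the .mat file is not stated in this thread, so check the loading code before saving.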
import cPickle  # Python 2 only

x = cPickle.load(open("./data/youtube2text/corpus.p", "rb"))
train, val, test = x[0], x[1], x[2]
wordtoix, ixtoword = x[3], x[4]
W = x[5]
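Going only by the snippet above, corpus.p appears to be a pickled sequence with at least six entries: three data splits, two vocabulary maps, and W. A sketch of loading it on Python 3 follows. The `encoding="latin1"` argument is an assumption that the file was written by Python 2, and `load_corpus` is a hypothetical helper, not part of the repository.

```python
import pickle

def load_corpus(path):
    """Load a corpus pickle and name its first six entries."""
    with open(path, "rb") as f:
        # latin1 is the usual choice when reading Python 2 pickles on Python 3.
        x = pickle.load(f, encoding="latin1")
    return {
        "train": x[0], "val": x[1], "test": x[2],
        "wordtoix": x[3], "ixtoword": x[4],
        "W": x[5],
    }
```

Printing `type()` and `len()` of each entry is a quick way to see what a file for a new dataset would need to contain.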
Hi, I walked through all the steps; it is a great project!
How can I feed in a video for captioning? Thanks.
Hi,
Thank you for providing the code for the paper. I want to apply this technique to a new dataset, so I need to know how the captioning features were extracted.
Any help is appreciated.
Regards
Kamal
Which 670 test samples did you choose? Or were they drawn from a shuffle?
Traceback (most recent call last):
  File "SCN_training.py", line 244, in <module>
    n_words=n_words)
  File "SCN_training.py", line 162, in train_model
    valid_negll = calu_negll(f_cost, prepare_data, valid, img_feats, tag_feats, kf_valid)
  File "SCN_training.py", line 53, in calu_negll
    return totalcost/totallen
ZeroDivisionError: float division by zero
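The failing line divides an accumulated cost by an accumulated length, so `totallen` was zero, which suggests that no validation tokens were counted (for example an empty validation split or empty minibatch index list). A hedged sketch of a guard that turns the bare ZeroDivisionError into a clearer message; `mean_negll` is a hypothetical helper, not the repository's function.

```python
def mean_negll(total_cost, total_len):
    """Average negative log-likelihood per token, with an explicit empty check."""
    if total_len == 0:
        raise ValueError(
            "no tokens were counted; check that the validation split "
            "and its minibatch indices are non-empty"
        )
    return total_cost / float(total_len)
```

Printing `len(valid[0])` and `len(kf_valid)` before the call is another quick way to confirm whether the split is empty, assuming `valid` is a tuple whose first element lists the samples.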
D:\Anaconda2\envs\Theano\python.exe "D:/PythonCode1/Semantic Compositional Networks/SCN_for_video_captioning-master/SCN_decode.py"
loading data...
loading learned params...
count how many captions we have generated...
start decoding @ 18:10:53.925000
D:/PythonCode1/Semantic Compositional Networks/SCN_for_video_captioning-master/SCN_decode.py:39: RuntimeWarning: overflow encountered in exp
return 1/(1+np.exp(-x))
Thank you!
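The warning comes from `np.exp(-x)` overflowing for large negative `x`. The result is still correct there, since `1/(1+inf)` evaluates to 0, so decoding can continue. If the warning is unwanted, a numerically stable variant can be sketched as below. `stable_sigmoid` is a hypothetical replacement, not code from the repository.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that only ever exponentiates non-positive numbers."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(-np.abs(x))  # always in (0, 1], so it cannot overflow
    # For x >= 0: 1 / (1 + exp(-x)).  For x < 0: exp(x) / (1 + exp(x)).
    return np.where(x >= 0, 1.0 / (1.0 + e), e / (1.0 + e))
```

Both branches are algebraically the same function; the split only keeps the exponent non-positive.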
Hi,
What are the files corpus.p and youtube2text_nbest.p, and how can I generate them when I want to caption my own raw video?
Thanks!
Traceback (most recent call last):
  File "SCN_training.py", line 243, in <module>
    n_words=n_words)
  File "SCN_training.py", line 109, in train_model
    f_grad_shared, f_update = Adam(tparams, cost, [x, mask, y, z], lr)
  File "/home/sk/SCN/model_scn_v2/optimizers.py", line 216, in Adam
    grads = tensor.grad(cost, tparams.values())
  File "/home/anaconda3/lib/python3.5/site-packages/theano/gradient.py", line 502, in grad
    " of type " + str(type(elem)))
TypeError: Expected Variable, got odict_values
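The paths in this traceback show Python 3.5. On Python 3, `OrderedDict.values()` returns a view object rather than a list, which is what the error message reports. The sketch below shows the difference without needing Theano; the dictionary contents are stand-ins for the shared variables.

```python
from collections import OrderedDict

# Stand-ins for the Theano shared variables stored in tparams.
tparams = OrderedDict([("Wemb", 1.0), ("bhid", 2.0)])

values_view = tparams.values()        # a view on Python 3
values_list = list(tparams.values())  # a real list on both Python 2 and 3

print(type(values_view).__name__)
print(type(values_list).__name__)
```

A likely fix is therefore `tensor.grad(cost, list(tparams.values()))` in optimizers.py. The code in this thread uses Python 2 idioms (`cPickle`, the `print` statement), so other changes may also be needed under Python 3.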
Traceback (most recent call last):
  File "SCN_evaluation.py", line 52, in <module>
    print score(refs, hypo)
  File "SCN_evaluation.py", line 25, in score
    (Meteor(),"METEOR"),
  File "/root/SCN_for_video_captioning/pycocoevalcap/meteor/meteor.py", line 24, in __init__
    stderr=subprocess.PIPE)
  File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
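When `subprocess` raises `OSError: [Errno 2]` while launching a child process, the program it tried to start was not found. The METEOR scorer in pycocoevalcap is, as far as I know, a wrapper around a Java program, so a missing `java` on the PATH is a plausible cause (an assumption, since the command itself is not shown in the traceback). A small check, written for Python 3; the helper name is hypothetical.

```python
import shutil

def find_missing_executables(names):
    """Return the subset of program names that cannot be found on PATH."""
    # shutil.which exists on Python 3.3+; on Python 2.7,
    # distutils.spawn.find_executable offers a similar lookup.
    return [name for name in names if shutil.which(name) is None]

missing = find_missing_executables(["java"])
if missing:
    print("not found on PATH: " + ", ".join(missing))
```

If `java` is reported missing, installing a Java runtime and re-running SCN_evaluation.py is the first thing to try.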
Hi,
I tried to run the first step (run 1_obtain_tags_youtube2text.py to obtain the ground-truth 300 tags for the Youtube2Text dataset) to obtain the tag file, but there is no file "./data/corpus_youtube2text.p". Can you upload it? And what is the format of this file?
For the Youtube2Text dataset, the dimensions of the C3D, ResNet, and tag features are (1970, 512), (1970, 2048), and (1970, 300). What does each of the 1970 rows represent: a single frame or a whole video?
Also, can you give a more detailed guide on how to train on other datasets?
Thanks!
How do you extract the video features from the raw video data in the YouTube dataset?
Please help me, thank you very much!
ImportError: No module named version
Traceback (most recent call last):
  File "SCN_training.py", line 243, in <module>
    n_words=n_words)
  File "SCN_training.py", line 137, in train_model
    cost = f_grad_shared(x, mask,y,z)
  File "/home/koeldc23/anaconda2/envs/theano/lib/python2.7/site-packages/theano/compile/function_module.py", line 917, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/home/koeldc23/anaconda2/envs/theano/lib/python2.7/site-packages/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/home/koeldc23/anaconda2/envs/theano/lib/python2.7/site-packages/theano/compile/function_module.py", line 903, in __call__
    self.fn() if output_subset is None else
  File "pygpu/gpuarray.pyx", line 693, in pygpu.gpuarray.pygpu_empty
  File "pygpu/gpuarray.pyx", line 301, in pygpu.gpuarray.array_empty
pygpu.gpuarray.GpuArrayException: cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory
Apply node that caused the error: GpuSoftmaxWithBias(GpuDot22.0, bhid)
Toposort index: 449
Inputs types: [GpuArrayType(float32, matrix), GpuArrayType(float32, vector)]
Inputs shapes: [(2944, 12594), (12594,)]
Inputs strides: [(50376, 4), (4,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuAdvancedSubtensor(GpuSoftmaxWithBias.0, ARange{dtype='int64'}.0, HostFromGpu(gpuarray).0), HostFromGpu(gpuarray)(GpuSoftmaxWithBias.0)]]
Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "SCN_training.py", line 243, in <module>
    n_words=n_words)
  File "SCN_training.py", line 104, in train_model
    (use_noise, x, mask, y, z, cost) = build_model(tparams,options)
  File "/home/koeldc23/SCN_for_video_captioning/model_scn_v2/video_cap.py", line 91, in build_model
    pred = tensor.nnet.softmax(pred_x)
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
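The shapes in the error report allow a rough size estimate for the tensor that failed to allocate. The sketch below only does the arithmetic for one float32 matrix of shape (2944, 12594). Reading 2944 as batch size times caption length and 12594 as vocabulary size is an assumption based on the softmax call, not something the log states.

```python
def tensor_bytes(shape, itemsize=4):
    """Bytes needed for a dense tensor; itemsize=4 corresponds to float32."""
    total = itemsize
    for dim in shape:
        total *= dim
    return total

softmax_bytes = tensor_bytes((2944, 12594))
print(softmax_bytes)                         # 148306944
print(round(softmax_bytes / 2.0 ** 20, 1))   # 141.4 (MiB)
```

Each copy of this matrix (input, output, gradient) costs about 141 MiB, so reducing the batch size shrinks the first dimension and is the usual first thing to try on a GPU with little free memory.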
How did you predict the semantic attributes? Could you release the code for predicting them?
Hello,
I'm trying to replicate what you did in the article by extracting C3D and ResNet feature vectors and then running SCN_decode.py. The article says the C3D vectors should have size 4096, since they are extracted from layer fc7. However, I've noticed that the C3D features in your dataset (the c3d.mat files provided) have dimension 512. Which step am I missing?
Thanks in advance.
Hi, if I want to apply the model for other videos, how can I extract the features?