
scn_for_video_captioning's People

Contributors

zhegan27


scn_for_video_captioning's Issues

About processing raw features

Hi zhegan,
When I extracted the C3D features, I got a binary file (.pool5). How can I convert it into a .mat file to use in the experiment? Can you give me some advice?

Thank you for your patience!
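A hedged sketch of such a conversion, assuming the file follows the standard C3D output blob layout (five little-endian int32 header fields: num, channel, length, height, width, followed by float32 data); the function and file names are illustrative, not the repo's own code:

    # Minimal sketch: convert a C3D binary blob (e.g. *.pool5) to a .mat file.
    # Assumes the standard C3D layout: five int32 header fields
    # (num, channel, length, height, width) followed by float32 data.
    import numpy as np
    import scipy.io as sio

    def c3d_blob_to_mat(blob_path, mat_path, var_name='feat'):
        with open(blob_path, 'rb') as f:
            num, channel, length, height, width = np.fromfile(f, dtype=np.int32, count=5)
            data = np.fromfile(f, dtype=np.float32,
                               count=num * channel * length * height * width)
        feat = data.reshape(num, channel, length, height, width)
        sio.savemat(mat_path, {var_name: feat})
        return feat

    # usage (hypothetical file names):
    # c3d_blob_to_mat('000001.pool5', '000001.mat')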

How to extract captioning features?

Hi

Thank you for providing the code for the paper. I want to apply this technique to a new dataset, so I need to know how the captioning features were extracted.
Any help is appreciated.

Regards
Kamal

ZeroDivisionError: float division by zero

Traceback (most recent call last):
File "SCN_training.py", line 244, in
n_words=n_words)
File "SCN_training.py", line 162, in train_model
valid_negll = calu_negll(f_cost, prepare_data, valid, img_feats, tag_feats, kf_valid)
File "SCN_training.py", line 53, in calu_negll
return totalcost/totallen
ZeroDivisionError: float division by zero
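This usually means kf_valid yielded no batches, so totallen in calu_negll stays zero (e.g. an empty or mis-indexed validation split). A minimal sketch of the safer pattern, with hypothetical names rather than the repo's exact code:

    def mean_negll(batch_results):
        # batch_results: iterable of (summed_cost, n_examples) per batch.
        totalcost, totallen = 0.0, 0
        for cost, n in batch_results:
            totalcost += cost
            totallen += n
        if totallen == 0:
            # An empty iterator would otherwise raise ZeroDivisionError here.
            raise ValueError('no validation batches: check the data split / kf_valid')
        return totalcost / totallen

    print(mean_negll([(32.0, 16), (24.0, 16)]))  # 1.75
    # mean_negll([])  # raises ValueError with a clear message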

RuntimeWarning: overflow encountered in exp

D:\Anaconda2\envs\Theano\python.exe "D:/PythonCode1/Semantic Compositional Networks/SCN_for_video_captioning-master/SCN_decode.py"
loading data...
loading learned params...
count how many captions we have generated...
start decoding @ 18:10:53.925000
D:/PythonCode1/Semantic Compositional Networks/SCN_for_video_captioning-master/SCN_decode.py:39: RuntimeWarning: overflow encountered in exp
return 1/(1+np.exp(-x))
Thank you!
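The warning comes from np.exp(-x) overflowing for large negative x; the result still saturates to 0 correctly, so the decoded output is usually fine, but a numerically stable variant silences it. A hedged drop-in sketch for the sigmoid on line 39 of SCN_decode.py, assuming x is a NumPy array:

    import numpy as np

    def sigmoid(x):
        # Split on sign so exp() only ever sees non-positive arguments.
        out = np.empty_like(x, dtype=np.float64)
        pos = x >= 0
        out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
        ex = np.exp(x[~pos])            # x < 0 here, so no overflow
        out[~pos] = ex / (1.0 + ex)     # sigmoid(x) = e^x / (1 + e^x)
        return out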

About corpus.p file

Hi,
I want to know what the corpus.p and youtube2text_nbest.p files are, and how I can generate them when I want to caption my own raw videos.
Thanks!

TypeError

Traceback (most recent call last):
File "SCN_training.py", line 243, in
n_words=n_words)
File "SCN_training.py", line 109, in train_model
f_grad_shared, f_update = Adam(tparams, cost, [x, mask, y, z], lr)
File "/home/sk/SCN/model_scn_v2/optimizers.py", line 216, in Adam
grads = tensor.grad(cost, tparams.values())
File "/home/anaconda3/lib/python3.5/site-packages/theano/gradient.py", line 502, in grad
" of type " + str(type(elem)))
TypeError: Expected Variable, got odict_values
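This is a Python 2 vs 3 issue: under Python 3, OrderedDict.values() returns an odict_values view, while theano.tensor.grad expects a list of Variables. The usual one-line fix for line 216 of optimizers.py (the code was written for Python 2):

    # Python 2: tparams.values() is already a list; Python 3: wrap it in list().
    grads = tensor.grad(cost, wrt=list(tparams.values()))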

OSError: [Errno 2] No such file or directory

Traceback (most recent call last):
File "SCN_evaluation.py", line 52, in
print score(refs, hypo)
File "SCN_evaluation.py", line 25, in score
(Meteor(),"METEOR"),
File "/root/SCN_for_video_captioning/pycocoevalcap/meteor/meteor.py", line 24, in init
stderr=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 710, in init
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
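Errno 2 from subprocess means the program being launched could not be found. The coco-caption METEOR wrapper spawns java -jar meteor-1.5.jar, so either java is not on the PATH or the jar is missing from pycocoevalcap/meteor/. A quick, Python 2 compatible environment check (the jar path is an assumption; adjust to your checkout):

    import os
    from distutils.spawn import find_executable

    assert find_executable('java'), 'java not found on PATH: install a JRE/JDK'
    jar = 'pycocoevalcap/meteor/meteor-1.5.jar'
    assert os.path.exists(jar), 'missing %s (large files are sometimes dropped)' % jar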

Feature dimension and missing corpus files

Hi,

I tried to run the first step (1_obtain_tags_youtube2text.py, to obtain the ground-truth 300 tags for the Youtube2Text dataset), but the file ./data/corpus_youtube2text.p is missing. Could you upload it? And what format is this file in?

For the Youtube2Text dataset, the dimensions of the C3D, ResNet, and tag features are (1970, 512), (1970, 2048), and (1970, 300). What does each of the 1970 rows correspond to: a single frame or a single video?

Also, could you give a more detailed guide on how to train on other datasets?
Thanks!
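On the 1970 question, a likely answer: the Youtube2Text/MSVD corpus contains 1,970 video clips, so each row should be one video, with the features pooled over its frames. A quick way to inspect the released files (the path and variable name are assumptions):

    import scipy.io as sio

    feats = sio.loadmat('./data/youtube2text_c3d.mat')
    print(feats.keys())            # locate the feature variable name
    # Expect a (1970, 512) array: one row per video clip, not per frame.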

problems running SCN_training.py

ImportError: No module named version
Traceback (most recent call last):
File "SCN_training.py", line 243, in
n_words=n_words)
File "SCN_training.py", line 137, in train_model
cost = f_grad_shared(x, mask,y,z)
File "/home/koeldc23/anaconda2/envs/theano/lib/python2.7/site-packages/theano/compile/function_module.py", line 917, in call
storage_map=getattr(self.fn, 'storage_map', None))
File "/home/koeldc23/anaconda2/envs/theano/lib/python2.7/site-packages/theano/gof/link.py", line 325, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/home/koeldc23/anaconda2/envs/theano/lib/python2.7/site-packages/theano/compile/function_module.py", line 903, in call
self.fn() if output_subset is None else
File "pygpu/gpuarray.pyx", line 693, in pygpu.gpuarray.pygpu_empty
File "pygpu/gpuarray.pyx", line 301, in pygpu.gpuarray.array_empty
pygpu.gpuarray.GpuArrayException: cuMemAlloc: CUDA_ERROR_OUT_OF_MEMORY: out of memory
Apply node that caused the error: GpuSoftmaxWithBias(GpuDot22.0, bhid)
Toposort index: 449
Inputs types: [GpuArrayType(float32, matrix), GpuArrayType(float32, vector)]
Inputs shapes: [(2944, 12594), (12594,)]
Inputs strides: [(50376, 4), (4,)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuAdvancedSubtensor(GpuSoftmaxWithBias.0, ARange{dtype='int64'}.0, HostFromGpu(gpuarray).0), HostFromGpu(gpuarray)(GpuSoftmaxWithBias.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "SCN_training.py", line 243, in
n_words=n_words)
File "SCN_training.py", line 104, in train_model
(use_noise, x, mask, y, z, cost) = build_model(tparams,options)
File "/home/koeldc23/SCN_for_video_captioning/model_scn_v2/video_cap.py", line 91, in build_model
pred = tensor.nnet.softmax(pred_x)

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
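The failing allocation is the word-prediction softmax output: a float32 matrix of shape (2944, 12594) is about 141 MB on its own, and training keeps several tensors of that size alive at once, so a small GPU runs out of memory. Reducing the batch size in SCN_training.py is the usual remedy, since 2944 here is batch size times time steps and 12594 is presumably the vocabulary size. The arithmetic, as a quick check:

    # Size of one float32 tensor with the failing shape.
    rows, cols = 2944, 12594               # (batch * timesteps, vocabulary size)
    print(rows * cols * 4 / 1024.0 ** 2)   # ~141.4 MB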

C3D feature vector of size 512

Hello,

I'm trying to replicate what you've done in the article by extracting C3D and ResNet feature vectors and then running SCN_decode.py. The article says the C3D vectors should be of size 4096, since they are extracted from layer fc7. However, I've noticed that the feature dimension in the provided c3d.mat files is 512. Which step am I missing?

Thanks in advance.
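A plausible explanation (an assumption, not an authoritative answer): the released features look like they come from C3D's pool5 layer, which has 512 channels (the .pool5 files mentioned in another issue point the same way), rather than the 4096-d fc7 layer. Averaging a pool5 blob over its temporal and spatial axes gives one 512-d vector per clip:

    import numpy as np

    # Hypothetical pool5 blob for one 16-frame clip: (channels, length, height, width).
    pool5 = np.random.rand(512, 1, 4, 4).astype(np.float32)

    feat = pool5.mean(axis=(1, 2, 3))   # average over time and space
    print(feat.shape)                   # (512,)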

Feature Extraction

Hi, if I want to apply the model to other videos, how can I extract the features?
