gsig / actor-observer

ActorObserverNet code in PyTorch from "Actor and Observer: Joint Modeling of First and Third-Person Videos", CVPR 2018

License: GNU General Public License v3.0

Python 99.68% Shell 0.32%

actor-observer's Issues

Some questions about code

I ran python third_to_first_person.py and hit two problems.

  1. Why does the CharadesEgo_val_video split always report 0 samples loaded?

cachefile ./caches/third_to_first_person//CharadesEgo_train.pkl
Loading cached result from './caches/third_to_first_person//CharadesEgo_train.pkl'
516965 samples loaded
cachefile ./caches/third_to_first_person//CharadesEgo_val.pkl
Loading cached result from './caches/third_to_first_person//CharadesEgo_val.pkl'
137321 samples loaded
cachefile ./caches/third_to_first_person//CharadesEgo_val_video.pkl
Loading cached result from './caches/third_to_first_person//CharadesEgo_val_video.pkl'
0 samples loaded

  2. Which dataset does the code use: CharadesEgo_v1_rgb or Charades_v1_rgb?

Thank you in advance!
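
One thing worth ruling out for the first question, offered as an assumption rather than a diagnosis: if CharadesEgo_val_video.pkl was written at a moment when no samples were found, later runs would keep loading that empty cache. The sketch below checks whether a cached pickle is empty and deletes it. The helper names are hypothetical, and it assumes both that the cached object supports len() and that the repository rebuilds a cache file when it is missing; neither is confirmed in this thread.

```python
import os
import pickle


def cache_is_empty(path):
    """Return True if the pickle at `path` unpickles to an object of length 0."""
    with open(path, "rb") as f:
        cached = pickle.load(f)
    # Assumption: the cached object supports len(); the real cache layout may differ.
    return len(cached) == 0


def drop_empty_cache(path):
    """Delete an empty cache file so it can be rebuilt. Returns True if deleted."""
    if os.path.exists(path) and cache_is_empty(path):
        os.remove(path)
        return True
    return False
```

For example, drop_empty_cache("./caches/third_to_first_person/CharadesEgo_val_video.pkl") would remove the file only if it holds zero entries. If the cache comes back empty after a rebuild, the cause lies in the data paths or label files rather than in the cache.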

How to get the fine-tune baseline results?

Hi,
How do I fine-tune the model trained on the third-person dataset on the first-person dataset?

I tried to directly fine-tune the trained model from https://github.com/gsig/charades-algorithms using this script: https://github.com/gsig/actor-observer/blob/master/exp/baseline_resnet152imagenet.py.

However, the result is only 22, the same as for the model fine-tuned from ImageNet, which suggests that the third-person pre-training has no effect on first-person performance.
Could you please explain how to correctly fine-tune the model trained on the third-person dataset?

Thanks.
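
A result identical to the ImageNet baseline can mean that the pre-trained weights were never actually loaded. That is a guess, not a confirmed cause. Checkpoints saved from a model wrapped in torch.nn.DataParallel carry a "module." prefix on every parameter name, and non-strict loading does not raise when names fail to match. The sketch below compares parameter names on plain dictionaries; both helper names are hypothetical and not part of this repository.

```python
def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }


def compare_keys(checkpoint_keys, model_keys):
    """Report which parameter names overlap between a checkpoint and a model."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "matched": sorted(ckpt & model),
        "missing_in_checkpoint": sorted(model - ckpt),
        "unused_from_checkpoint": sorted(ckpt - model),
    }
```

In practice the two inputs would be the keys of the loaded checkpoint's state dict and of model.state_dict(). An empty "matched" list would mean that fine-tuning effectively started from whatever initialization the model already had.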

Unclear about ActorObserverFC7.py line 28

Sorry to disturb you!
Should base_x be replaced with base_y?
w_y = self.third_fc(base_x).view(-1) * torch.exp(self.third_scale)

Also, what does this function in tasks.py do?
def best_one_sec_moment(mat, winsize=6):

Thank you!

Out of memory.

Hi, thank you for sharing this amazing code.
How many GPUs did you use to train the model, and what type were they?
Thank you in advance!

What does ResNet-152 Transfer mean?

Hi Gunnar,

I was just wondering what "ResNet-152 Transfer" stands for.
The paper says, "It uses the Charades model to predict the activities in the third person video,
and then uses those labels as supervision for the first-person video."

Does that mean you run the Charades model on the CharadesEgo_only3rd videos, transfer the predicted labels to the corresponding first-person videos, and then use those labels as supervision to fine-tune the original Charades model?

Thank you.

Best,
Ziwei
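
The label-transfer step described in the quoted sentence can be sketched as follows. This is only an illustration of the idea as the question reads it, not the authors' code: the function name, the video ids, the pairing dictionary, and the 0.5 threshold are all hypothetical, and the quoted passage does not say how scores are turned into labels.

```python
def transfer_labels(third_person_scores, pairs, threshold=0.5):
    """Turn third-person class scores into labels for the paired first-person videos.

    third_person_scores: {third_person_id: {class_id: score in [0, 1]}}
    pairs: {third_person_id: first_person_id}
    threshold: hypothetical cut-off; the quoted passage does not state one.
    """
    first_person_labels = {}
    for third_id, scores in third_person_scores.items():
        first_id = pairs.get(third_id)
        if first_id is None:
            continue  # no paired first-person video for this clip
        first_person_labels[first_id] = sorted(
            c for c, s in scores.items() if s >= threshold
        )
    return first_person_labels


# Illustrative ids and scores only.
scores = {"vid1": {"c011": 0.9, "c059": 0.2, "c106": 0.6}}
pairs = {"vid1": "vid1_ego"}
labels = transfer_labels(scores, pairs)
```

The resulting dictionary would then serve as the supervision for training on first-person videos, in place of ground-truth first-person annotations.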

Why does one frame belong to different actions?

Hi~
In the Charades and CharadesEgo datasets, one video always contains several actions. In the code, you divide the video into several clips according to the start and end times, but I have observed that one frame may belong to multiple action tags.
In this case, can the loss function still be trained normally?
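
Overlapping actions are commonly handled as a multi-label problem: each frame gets a 0/1 target per class, and every class is scored with its own sigmoid, so several classes can be active at once. The sketch below shows that general technique in plain Python. Whether this repository uses exactly this loss is not confirmed in the thread, and the helper names and the (class_index, start, end) annotation layout are assumptions made for illustration.

```python
import math

NUM_CLASSES = 157  # Charades defines 157 action classes


def multi_hot_target(annotations, t, num_classes=NUM_CLASSES):
    """Build a 0/1 target vector for time t from (class_index, start, end) intervals."""
    target = [0.0] * num_classes
    for cls, start, end in annotations:
        if start <= t <= end:
            target[cls] = 1.0
    return target


def sigmoid_bce(logits, target):
    """Mean binary cross-entropy with an independent sigmoid per class."""
    total = 0.0
    for z, y in zip(logits, target):
        # Numerically stable form of -[y*log(p) + (1-y)*log(1-p)], with p = sigmoid(z)
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)
```

With two intervals that overlap at time t, multi_hot_target sets both entries to 1, and the loss stays well defined because no class competes with another as it would under a softmax.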

Which version of the dataset?

Hi,
I see that you use version0 under the folder datasets/labels, but when I download the CharadesEgo dataset I get the version1 labels. Which version did you use to obtain the results in the paper?
Thanks.

Problem loading the pre-trained model

I ran python baseline_resnet152charades.py and got the error below.
Thank you in advance!

=> using pre-trained model 'resnet152'
loading pretrained-weights from /nfs.yoda/gsigurds/charades_pretrained/resnet_rgb.pth.tar
Traceback (most recent call last):
  File "baseline_resnet152charades.py", line 38, in <module>
    main()
  File "./main.py", line 60, in main
    model, criterion, optimizer = create_model(args)
  File "./models/__init__.py", line 10, in create_model
    model = load_architecture(args)
  File "./models/utils.py", line 78, in load_architecture
    model = generic_load(args.arch, args.pretrained, args.pretrained_weights, args)
  File "./models/utils.py", line 61, in generic_load
    model = model.dictarch
  File "./models/ActorObserverBase.py", line 55, in __init__
    model = load_sub_architecture(args)
  File "./models/utils.py", line 73, in load_sub_architecture
    model = generic_load(args.subarch, args.pretrained, args.pretrained_subweights, args)
  File "./models/utils.py", line 65, in generic_load
    chkpoint = torch.load(weights)
  File "/home/csdept/anaconda3/envs/py27/lib/python2.7/site-packages/torch/serialization.py", line 301, in load
    f = open(f, 'rb')
IOError: [Errno 2] No such file or directory: '/nfs.yoda/gsigurds/charades_pretrained/resnet_rgb.pth.tar'
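
The path in the error does not exist on the poster's machine, and the traceback shows it reaching torch.load through args.pretrained_weights or args.pretrained_subweights, so those values would need to point at a local copy of the checkpoint. The thread does not say where to download that file. As a minimal sketch, a small guard can fail early with a clearer message before torch.load is called; resolve_weights is a hypothetical helper, not part of the repository.

```python
import os


def resolve_weights(path, fallbacks=()):
    """Return the first existing file among `path` and `fallbacks`, or raise a clear error."""
    candidates = [path] + list(fallbacks)
    for candidate in candidates:
        if candidate and os.path.isfile(candidate):
            return candidate
    raise IOError(
        "No pre-trained weights found; tried: {}. "
        "Point the weights option at a local copy of the checkpoint.".format(candidates)
    )
```

Calling resolve_weights on the configured path, with any local alternatives as fallbacks, would surface a missing checkpoint at startup and list every location that was tried.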
