
vt-vl-lab / sdn


[NeurIPS 2019] Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition

Home Page: http://chengao.vision/SDN/

License: MIT License

Python 100.00%
action-recognition debiasing representation-learning activity-recognition video-understanding

sdn's Issues

Diving48 dataset issues

Hi,

Thanks for sharing the codebase.

I am interested in testing the SDN model on the Diving48 dataset. The Diving48 project page provides two sets of data, RGB and optical flow; which one did you use?

Thanks!

Difficulties installing Python requirements

Hello,
I am trying to install the Python requirements mentioned in your README, but when executing the pip command, I get the following error:

$ pip install -r sdn_packages.txt
ERROR: Invalid requirement: '_libgcc_mutex             0.1                        main' (from line 2 of sdn_packages.txt)

In my experience, requirements.txt files used with pip contain just the package names and optionally their versions, with no build or channel information.

I found out that channels are used with Anaconda, so I tried using a conda environment instead of the virtualenv I had created before. With pip I get the same error, and the conda equivalent does not seem to be able to parse the line either:

$  conda install --file sdn_packages.txt
CondaValueError: could not parse '_libgcc_mutex             0.1                        main' in: sdn_packages.txt

As I am not very experienced with Anaconda, I also tried creating a new environment with conda create --name SDNauto --file sdn_packages.txt, but this led to the same error as conda install.

Could you please provide a working command for installing the requirements?

I am using Python 3.6 and Ubuntu 18.04.

Kind Regards
Alexander
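The error message suggests that sdn_packages.txt is a `conda list`-style export (name, version, build columns) rather than a pip requirements file. As a sketch under that assumption, the build/channel columns can be stripped to produce pip-style `name==version` pins (the filename and format are taken from the error above; conda-only packages such as `_libgcc_mutex` will still not be installable via pip and may need to be dropped by hand):

```python
def conda_list_to_pip(lines):
    """Convert `conda list`-style export lines (name, version, build columns)
    into pip-style "name==version" strings, skipping comments and blanks."""
    reqs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            reqs.append(f"{parts[0]}=={parts[1]}")
    return reqs

if __name__ == "__main__":
    sample = [
        "# packages in environment",
        "_libgcc_mutex             0.1                        main",
        "numpy                     1.16.4           py36h7e9f1db_0",
    ]
    print("\n".join(conda_list_to_pip(sample)))
```

Writing the result to a new file and running `pip install -r` on it should get past the parse error, though some conda-specific entries may still have no PyPI equivalent.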

how to understand the permutation?

Hi Jinwoo Choi,

I am new to your project.
I noticed that in the dataset's `__getitem__`, it returns
clip = torch.stack(clip, 0).permute(1, 0, 2, 3) along with the target.
Does this mean the clip's shape is [3, 12, 224, 224], i.e., the channels, number of frames, height, and width of the input, respectively?
What is the purpose of the permute?

Thanks
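For reference: `torch.stack(clip, 0)` stacks T frames of shape [C, H, W] into a [T, C, H, W] tensor, and `.permute(1, 0, 2, 3)` swaps the first two axes to give [C, T, H, W], the channels-first, time-second layout that 3D convolution layers expect. A minimal sketch of the same axis swap using NumPy's `transpose`, which mirrors `torch.Tensor.permute` (the shapes here follow the question above):

```python
import numpy as np

# 12 stacked frames, each 3 x 224 x 224, as produced by torch.stack(clip, 0)
clip = np.zeros((12, 3, 224, 224))

# Equivalent of clip.permute(1, 0, 2, 3): move channels to axis 0 and
# frames to axis 1, yielding the [C, T, H, W] layout for 3D convolutions.
clip_cthw = np.transpose(clip, (1, 0, 2, 3))
print(clip_cthw.shape)  # (3, 12, 224, 224)
```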

3D-ResNet-18 result

@jinwchoi

The result reported in the paper for the 3D-ResNet-18 model (Table 1) is 83.5% on UCF101 and 53.6% on HMDB51.

Did you obtain these results by pre-training the model on Mini-Kinetics and fine-tuning on the UCF101 and HMDB51 datasets?

If so, your results are very close to those of the paper [20]. How can this be explained, given that the Mini-Kinetics dataset is half the size of the full Kinetics dataset?

Please clarify this situation.

Thank you!

Results on Diving48 Datasets without pretrained weights

Hi Choi,

Very interesting paper! Thanks for sharing the repo too!

I am trying to do action classification on the Diving48 dataset with an 18-layer 3D-ResNet without using any pre-trained weights, and I am getting a very poor result: 9.5% classification accuracy. Do you have any suggestions for improving this number, or is it expected? Could you also point me to a reference paper that reports 3D CNN results without pre-trained weights?

My training clips are 112 x 112 resolution with 16 frames.
I test on only a single 16-frame clip per video.

Hope to hear from you soon.

-Ishan

The baseline of 3D-ResNet-18 on Mini-Kinetics-200 trained from scratch

Hi Jinwoo,

Thanks for your work. One thing I want to know: it seems the top-1 accuracy of 3D-ResNet-18 trained from scratch on Mini-Kinetics-200 is not reported in the paper (or the reference paper). Could you kindly provide this number? It would also be great if you could share the corresponding .pth pre-trained model.

Regards,

Mini-Kinetics pseudo scene labels

@jinwchoi

In the paper, you write that "we obtain a pseudo scene label p̃ ∈ P̃ by running Places365 dataset pre-trained ResNet-50 [73] on the Kinetics dataset".

Since you pre-trained your models on the Mini-Kinetics dataset, why did you obtain pseudo scene labels for the full Kinetics dataset rather than for Mini-Kinetics? (I would guess that pseudo scene labels for the full Kinetics dataset are not needed in your experiments.)

Thank you!

TSN setting for Diving48

Hi, thanks for sharing your interesting work.

I have some questions about the TSN result in the paper, because I am running TSN/TRN on Diving48 and getting a much higher number.

  1. Where did this number come from? It looks like this repository doesn't include a TSN model, so did you use the original TSN code?
  2. How many frames did you input? I know that it's not 16 but was it 8 or 32?
  3. How did you sample the video? Sparsely sampled throughout the video (TSN strategy), or densely sampled (3D CNN strategy)?

I used 8-frame input and trained/tested with 25% of the Diving48 data (official split V2). I used sparse sampling, training-time scale jittering in the [224, 336] range, and a 224x224 input resolution, and I got well over 50% with TSN, which doesn't make sense.
My TSN/TRN code reproduces the baseline results on other datasets such as Something-Something and EPIC-Kitchens, so I am wondering what the difference in settings might be.

Thank you!
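For context on question 3, the two sampling strategies can be sketched as follows. These helper names are hypothetical, not the repo's code: TSN-style sparse sampling takes one frame from each of N equal segments spanning the whole video, while 3D-CNN-style dense sampling takes a run of consecutive frames:

```python
import random

def sparse_indices(num_frames, num_segments, train=True):
    """TSN-style sparse sampling: split the video into equal segments and
    pick one frame per segment (random offset at train time, segment
    center at test time)."""
    seg_len = num_frames / num_segments
    idx = []
    for s in range(num_segments):
        offset = random.random() * seg_len if train else seg_len / 2
        idx.append(min(int(s * seg_len + offset), num_frames - 1))
    return idx

def dense_indices(num_frames, clip_len, start=0):
    """3D-CNN-style dense sampling: clip_len consecutive frames from start,
    clamped to the last frame for short videos."""
    return [min(start + i, num_frames - 1) for i in range(clip_len)]

print(sparse_indices(300, 8, train=False))  # 8 frames spread over the video
print(dense_indices(300, 16, start=100))    # frames 100..115
```

Sparse sampling sees the whole video with few frames, which is one common explanation for large accuracy gaps between TSN-style and clip-based evaluation settings.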

Mini-Kinetics 200 pre-trained weights

Hi there, thank you for the great work and providing the code.

I was wondering if you could release the ResNet-18 weights trained on the Mini-Kinetics dataset with the debiasing (i.e., without fine-tuning on any downstream dataset). I would like to try fine-tuning this pre-trained model on other datasets. Please let me know if you can release these weights.

Thank you!

Scene labels

@jinwchoi, could you explain more about obtaining pseudo scene labels, please?

I would like to obtain UCF101 pseudo scene labels.

Thank you
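The paper's recipe quoted in an earlier issue — run a Places365-pre-trained ResNet-50 over a video and keep the top-scoring scene class as the pseudo label — can be sketched as below. The score values and helper name here are stand-ins for illustration, not the authors' pipeline; in practice `frame_scores` would be the per-frame softmax outputs of the Places365 model:

```python
def pseudo_scene_label(frame_scores):
    """Average per-frame scene-class scores over a video and return the
    argmax class index as the pseudo scene label. frame_scores is a list
    of per-frame score lists (e.g. softmax outputs over 365 scene classes)."""
    num_classes = len(frame_scores[0])
    avg = [sum(frame[c] for frame in frame_scores) / len(frame_scores)
           for c in range(num_classes)]
    return max(range(num_classes), key=lambda c: avg[c])

# Stand-in scores for 3 frames over 4 scene classes (not real model output).
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.7, 0.1, 0.1],
]
print(pseudo_scene_label(scores))  # class 1 dominates across frames
```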
