
vt-vl-lab / sdn


[NeurIPS 2019] Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition

Home Page: http://chengao.vision/SDN/

License: MIT License

Python 100.00%
action-recognition debiasing representation-learning activity-recognition video-understanding

sdn's Issues

Diving48 dataset issues

Hi,

Thanks for sharing the codebase.

I am interested in testing the SDN model on the Diving48 dataset. The Diving48 project page provides two sets of data, RGB and optical flow; which one did you use?

Thanks!

Difficulties installing Python requirements

Hello,
I am trying to install the Python requirements mentioned in your README, but when executing the pip command, I get the following error:

$ pip install -r sdn_packages.txt
ERROR: Invalid requirement: '_libgcc_mutex             0.1                        main' (from line 2 of sdn_packages.txt)

In my experience, requirements.txt files used with pip contain just the package names and optionally their versions, with no build or channel information.

I found out that channels are used with Anaconda, so I tried using a conda environment instead of the virtualenv I had created before. With pip I get the same error, and the conda equivalent does not seem to be able to parse the line either:

$  conda install --file sdn_packages.txt
CondaValueError: could not parse '_libgcc_mutex             0.1                        main' in: sdn_packages.txt

As I am not very experienced with Anaconda, I also tried creating a new environment with conda create --name SDNauto --file sdn_packages.txt, but this led to the same error as conda install.

Could you please provide a working command for installing the requirements?

I am using Python 3.6 and Ubuntu 18.04.

Kind Regards
Alexander
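The error message suggests that sdn_packages.txt is a `conda list`-style export (name, version, build columns) rather than a pip requirements file. As a sketch under that assumption, the build/channel columns can be stripped to produce pip-style `name==version` pins (the filename and format are taken from the error above; conda-only packages such as `_libgcc_mutex` will still not be installable via pip and may need to be dropped by hand):

```python
def conda_list_to_pip(lines):
    """Convert `conda list`-style export lines (name, version, build columns)
    into pip-style "name==version" strings, skipping comments and blanks."""
    reqs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            reqs.append(f"{parts[0]}=={parts[1]}")
    return reqs

if __name__ == "__main__":
    sample = [
        "# packages in environment",
        "_libgcc_mutex             0.1                        main",
        "numpy                     1.16.4           py36h7e9f1db_0",
    ]
    print("\n".join(conda_list_to_pip(sample)))
```

Writing the result to a new file and running `pip install -r` on it should get past the parse error, though some conda-specific entries may still have no PyPI equivalent.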

how to understand the permutation?

Hi Jinwoo Choi,

I am new to your project.
I noticed that in the dataset's `__getitem__`, it returns
clip = torch.stack(clip, 0).permute(1, 0, 2, 3) along with the target.
Does this mean the clip's shape is [3, 12, 224, 224], i.e., the channels, number of frames, height, and width of the input, respectively?
What is the purpose of the permute?

Thanks
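For reference: `torch.stack(clip, 0)` stacks T frames of shape [C, H, W] into a [T, C, H, W] tensor, and `.permute(1, 0, 2, 3)` swaps the first two axes to give [C, T, H, W], the channels-first, time-second layout that 3D convolution layers expect. A minimal sketch of the same axis swap using NumPy's `transpose`, which mirrors `torch.Tensor.permute` (the shapes here follow the question above):

```python
import numpy as np

# 12 stacked frames, each 3 x 224 x 224, as produced by torch.stack(clip, 0)
clip = np.zeros((12, 3, 224, 224))

# Equivalent of clip.permute(1, 0, 2, 3): move channels to axis 0 and
# frames to axis 1, yielding the [C, T, H, W] layout for 3D convolutions.
clip_cthw = np.transpose(clip, (1, 0, 2, 3))
print(clip_cthw.shape)  # (3, 12, 224, 224)
```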

3D-ResNet-18 result

@jinwchoi

The result reported in the paper for the 3D-ResNet-18 model (Table 1) is 83.5% on UCF101 and 53.6% on HMDB51.

Did you obtain these results by pre-training the model on Mini-Kinetics and fine-tuning on the UCF101 and HMDB51 datasets?

If so, your results are very close to those of the paper [20]. How can this be explained, given that the Mini-Kinetics dataset is half the size of the full Kinetics dataset?

Please clarify this situation.

Thank you!

Results on Diving48 Datasets without pretrained weights

Hi Choi,

Very interesting paper! Thanks for sharing the repo too!

I am trying to do action classification on the Diving48 dataset with an 18-layer 3D-ResNet without using any pre-trained weights, and I am getting a very poor result: 9.5% classification accuracy. Do you have any suggestions for improving this number, or is it expected? Could you also point me to a reference paper that reports 3D CNN results without pre-trained weights?

My training clips are 112 x 112 resolution with 16 frames.
I test on only a single 16-frame clip per video.

Hope to hear from you soon.

-Ishan

The baseline of 3D-ResNet-18 on Mini-Kinetics-200 trained from scratch

Hi Jinwoo,

Thanks for your work. One thing I want to know: it seems the top-1 accuracy of 3D-ResNet-18 trained from scratch on Mini-Kinetics-200 is not reported in the paper (or the reference paper). Could you kindly provide this number? It would also be great if you could share the corresponding .pth pre-trained model.

Regards,

Mini-Kinetics pseudo scene labels

@jinwchoi

In the paper, you write that "we obtain a pseudo scene label p̃ ∈ P̃ by running Places365 dataset pre-trained ResNet-50 [73] on the Kinetics dataset".

Since you pre-trained your models on the Mini-Kinetics dataset, why did you obtain pseudo scene labels for the full Kinetics dataset rather than for Mini-Kinetics? (I would guess that pseudo scene labels for the full Kinetics dataset are not needed in your experiments.)

Thank you!

TSN setting for Diving48

Hi, thanks for sharing your interesting work.

I have some questions about the TSN result in the paper, because I am running TSN/TRN on Diving48 and getting a much higher number.

  1. Where did this number come from? It looks like this repository doesn't include a TSN model, so did you use the original TSN code?
  2. How many frames did you input? I know that it's not 16 but was it 8 or 32?
  3. How did you sample the video? Sparsely sampled throughout the video (TSN strategy), or densely sampled (3D CNN strategy)?

I used 8-frame input and trained/tested with 25% of the Diving48 data (official split V2). I used sparse sampling, training-time scale jittering in the [224, 336] range, and a 224x224 input resolution, and I got well over 50% with TSN, which doesn't make sense.
My TSN/TRN code reproduces the baseline results on other datasets such as Something-Something and EPIC-Kitchens, so I am wondering what the difference in settings might be.

Thank you!
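For context on question 3, the two sampling strategies can be sketched as follows. These helper names are hypothetical, not the repo's code: TSN-style sparse sampling takes one frame from each of N equal segments spanning the whole video, while 3D-CNN-style dense sampling takes a run of consecutive frames:

```python
import random

def sparse_indices(num_frames, num_segments, train=True):
    """TSN-style sparse sampling: split the video into equal segments and
    pick one frame per segment (random offset at train time, segment
    center at test time)."""
    seg_len = num_frames / num_segments
    idx = []
    for s in range(num_segments):
        offset = random.random() * seg_len if train else seg_len / 2
        idx.append(min(int(s * seg_len + offset), num_frames - 1))
    return idx

def dense_indices(num_frames, clip_len, start=0):
    """3D-CNN-style dense sampling: clip_len consecutive frames from start,
    clamped to the last frame for short videos."""
    return [min(start + i, num_frames - 1) for i in range(clip_len)]

print(sparse_indices(300, 8, train=False))  # 8 frames spread over the video
print(dense_indices(300, 16, start=100))    # frames 100..115
```

Sparse sampling sees the whole video with few frames, which is one common explanation for large accuracy gaps between TSN-style and clip-based evaluation settings.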

Mini-Kinetics 200 pre-trained weights

Hi there, thank you for the great work and providing the code.

I was wondering if you could release the ResNet-18 weights trained on the Mini-Kinetics dataset with the debiasing (i.e., without fine-tuning on any downstream dataset). I would like to try fine-tuning this pre-trained model on other datasets. Please let me know if you can release these weights.

Thank you!

Scene labels

@jinwchoi, could you explain more about obtaining pseudo scene labels, please?

I would like to obtain UCF101 pseudo scene labels.

Thank you
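The paper's recipe quoted in an earlier issue — run a Places365-pre-trained ResNet-50 over a video and keep the top-scoring scene class as the pseudo label — can be sketched as below. The score values and helper name here are stand-ins for illustration, not the authors' pipeline; in practice `frame_scores` would be the per-frame softmax outputs of the Places365 model:

```python
def pseudo_scene_label(frame_scores):
    """Average per-frame scene-class scores over a video and return the
    argmax class index as the pseudo scene label. frame_scores is a list
    of per-frame score lists (e.g. softmax outputs over 365 scene classes)."""
    num_classes = len(frame_scores[0])
    avg = [sum(frame[c] for frame in frame_scores) / len(frame_scores)
           for c in range(num_classes)]
    return max(range(num_classes), key=lambda c: avg[c])

# Stand-in scores for 3 frames over 4 scene classes (not real model output).
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.7, 0.1, 0.1],
]
print(pseudo_scene_label(scores))  # class 1 dominates across frames
```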
