seth-park / multimodalexplanations

Code release for Park et al., "Multimodal Explanations: Justifying Decisions and Pointing to the Evidence," CVPR 2018.

License: BSD 2-Clause "Simplified" License

Python 100.00%

multimodalexplanations's People

Contributors: seth-park

multimodalexplanations's Issues

Unknown layer type when running the pretrained model

(screenshot of the error attached)

Hi SethPark, thank you for uploading this repository, but I am running into an error. No matter what I change in Makefile.config, the error persists, and my teacher told me that my virtual machine does not have enough RAM to run ResNet-152.

Training my own model

Hi. I've been trying to train my own models for VQA-X, but when I generate explanations from models trained with train.py, the VQA answers and explanations are very poor.

I train for 50k iterations and the printed results during training look great, but when I run generate_explanations.py I get only 1/1459 VQA answers correct, and the explanations look off. In fact, even on the training set I only get 15/29459 correct. However, with your pretrained model I get 1073/1459 correct on the validation set using generate_explanations.py.

Is there a step I'm missing between training my own model with train.py and generating explanations with generate_explanations.py? Training for more iterations (going from 30k to 50k) doesn't seem to help.

Thanks
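
For anyone reproducing these numbers, here is a minimal sanity check of generated answers against the VQA v2 annotations, a sketch assuming the official VQA file formats; the results file name and its [{"question_id": ..., "answer": ...}] layout are assumptions, not the repo's actual output format.

import json

# Ground-truth annotations in the official VQA v2 format.
anno = json.load(open("v2_mscoco_val2014_annotations.json"))
gt = {a["question_id"]: a["multiple_choice_answer"] for a in anno["annotations"]}

# Hypothetical results file: [{"question_id": ..., "answer": ...}, ...]
results = json.load(open("results.json"))
correct = sum(gt.get(r["question_id"]) == r["answer"] for r in results)
print("%d/%d correct" % (correct, len(results)))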

Explanation Evaluation Metric

Hey Seth,

I've revisited your code and have been running some experiments with your dataset in PyTorch! To evaluate the explanations, I've been using an adapted version of https://github.com/tylin/coco-caption. I've also used your pretrained models to generate explanations for comparison with my results, but I notice the numbers I get differ from those reported in your paper.

Using your pretrained model with --use_gt, I get 16.7 on BLEU-4 (19.8 in the paper), 51.3 for CIDEr (73.4 in the paper), and 39.6 for ROUGE (44.0 in the paper). What evaluation code did you use to compute your metrics, and what do you think could be the reason for this difference?

Thanks!

Jeff
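
For reference, this is how the tylin/coco-caption scorers are typically invoked; a minimal sketch in which the example captions and dict layout are illustrative, not the repo's actual evaluation code.

from pycocoevalcap.tokenizer.ptbtokenizer import PTBTokenizer
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.cider.cider import Cider
from pycocoevalcap.rouge.rouge import Rouge

# Ground-truth and generated explanations, keyed by question id.
gts = {"1": [{"caption": "because the man is swinging a bat"}]}
res = {"1": [{"caption": "because he is holding a bat"}]}

# coco-caption expects PTB-tokenized, lowercased strings.
tokenizer = PTBTokenizer()
gts, res = tokenizer.tokenize(gts), tokenizer.tokenize(res)

for name, scorer in [("Bleu", Bleu(4)), ("CIDEr", Cider()), ("ROUGE_L", Rouge())]:
    score, _ = scorer.compute_score(gts, res)
    print(name, score)  # Bleu returns a list covering Bleu-1 through Bleu-4

Mismatched preprocessing (e.g., applying PTBTokenizer to only one side) is a common source of metric gaps like the one described above.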

Explanations Annotations for Generating Explanations

Hi,
For VQA-X, I was able to generate explanations using the pre-trained model, but I am a bit confused about why it is necessary to pass --exp_file with the explanation labels. I'm referring to this command:

cd PJ-X-VQA/generate_vqa_exp
python generate_explanation.py --ques_file ../VQA-X/Questions/v2_OpenEnded_mscoco_val2014_questions.json --ann_file ../VQA-X/Annotations/v2_mscoco_val2014_annotations.json --exp_file ../VQA-X/Annotations/val_exp_anno.json --gpu 0 --out_dir ../VQA-X/results --folder ../model/ --model_path $PATH_TO_CAFFEMODEL --use_gt --save_att_map

As far as I can tell, the labels are not actually used when the explanations are created in PJ-X-VQA/generate_vqa_exp/generate_explanation.py. Could you please clarify?
Thanks,
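
If the labels really are unused at generation time, one possible change (a suggestion, not the script's current behavior) is to make the flag optional in the argument parser:

import argparse
import json

parser = argparse.ArgumentParser()
# A None default lets the script run without explanation labels.
parser.add_argument("--exp_file", type=str, default=None,
                    help="explanation annotations (only needed for evaluation)")
args = parser.parse_args()

exp_anno = None
if args.exp_file is not None:
    with open(args.exp_file) as f:
        exp_anno = json.load(f)  # load labels only when provided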

Extracting visual explanations from VQA-X

I understand that the visual explanations (in the form of heatmaps) are given in VQA-X/visual. How do I extract the images? I am using the following NumPy code:

img = np.load("11042002.npy")
shape = np.sort(img.shape)[::-1] # Reverse sorting shape so as to bring it in the form of (64x, 48x, 3)
plt.imshow(img.reshape(shape))

However, this leads to strange heatmaps like:

(resulting garbled heatmap from 11042002.npy)

What am I doing wrong here?
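
A minimal sketch of a likely fix, assuming the array is simply stored channel-first: np.reshape never reorders the underlying data, so forcing a sorted shape onto the buffer scrambles the pixels; move axes with transpose instead.

import numpy as np
import matplotlib.pyplot as plt

arr = np.load("11042002.npy")
print(arr.shape, arr.dtype)  # inspect the stored layout before touching it

# reshape() only relabels indices; if the axes are ordered (3, H, W)
# rather than (H, W, 3), move the channel axis with transpose.
if arr.ndim == 3 and arr.shape[0] == 3:
    arr = arr.transpose(1, 2, 0)

plt.imshow(arr.squeeze())
plt.show()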

Dataset files

Hi,

Could you please clarify what the files v2_mscoco_train2014_annotations.json and v2_mscoco_val2014_annotations.json in Annotations/ are meant to be, and likewise the v2_OpenEnded_mscoco_train2014_questions.json and v2_OpenEnded_mscoco_val2014_questions.json files in Questions/?

I couldn't find these exact names in the links mentioned, and I want to make sure I'm not using incorrect files if I just guess what they should be. Also, to be sure: should visual/val/ and visual/test/ be empty if we retrain the model from scratch, or should we use the files you provide in https://drive.google.com/drive/u/0/folders/1Cr9JRXDmjks_wmi-a9eIe4SWSwWKcCk7 ?

Thanks,
Oana

Help with EMD details

Hello Seth,
I am using pyemd as you posted, but I have some doubts about the EMD calculation.
Did you use emd or emd_samples?
If it is emd, what distance matrix did you use? And if it is emd_samples, how many bins?

Thanks!
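
For context, a minimal sketch of the pyemd.emd call with a Euclidean ground-distance matrix over grid cells; this is one common setup for comparing attention maps, not necessarily the authors' exact choice.

import numpy as np
from pyemd import emd

def attention_emd(map_a, map_b):
    # Flatten both attention maps and normalize them to unit mass.
    a = map_a.astype(np.float64).ravel()
    b = map_b.astype(np.float64).ravel()
    a, b = a / a.sum(), b / b.sum()
    # Ground distance: Euclidean distance between grid-cell coordinates.
    h, w = map_a.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(np.float64)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return emd(a, b, dist)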

Full VQA-X dataset

The VQA-X Google Drive location contains 1000 images each from val and test. Is there any way to access the rest of the data?

EMD and Rank Correlation calculation

Does this repo contain code to calculate the Earth Mover Distance (EMD) and Rank Correlation as given in the paper? If not, how can I go about calculating them?
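
If the repo does not ship that code, the rank-correlation side is commonly computed as Spearman's rho over the flattened attention maps; here is a minimal scipy sketch (the paper's exact protocol is not confirmed here).

from scipy.stats import spearmanr

def rank_correlation(map_a, map_b):
    # Spearman's rho correlates the rank orderings of the two heatmaps.
    rho, _ = spearmanr(map_a.ravel(), map_b.ravel())
    return rho

For the EMD side, see the pyemd sketch in the thread above.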

Content of visual directory

Hi Seth,

I am curious about what information the visual directory contains. This would be helpful to know, as I want to train this model on a completely different dataset.

Thank you!

Originally posted by @Seth-Park in #12 (comment)
