hcnmn's Introduction

Toward multi-granularity decision-making: explicit visual reasoning with hierarchical knowledge

Data and Knowledge Preparation

  1. Download and unpack the Visual Genome images along with the annotations, class info, and image metadata
  2. Get ConceptNet from the ConceptNet site or through the provided link
  3. Download WikiText-2
  4. Follow MAVEx to preprocess the v features from Visual Genome (VG)
  5. Download the GloVe pretrained word vectors
  6. WordNet is available through the nltk package
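
The later steps take a GloVe pickle file through --glove_pt. The source does not state how that pickle is laid out, so the following is only a minimal sketch under the assumption that it holds a plain word-to-vector dictionary. The function name parse_glove and the toy vectors are hypothetical and exist for illustration only.

```python
import io
import pickle


def parse_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ...") into {word: [floats]}."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Toy two-word sample standing in for a real GloVe file (values are made up).
sample = io.StringIO("cat 0.1 0.2 0.3\ndog 0.4 0.5 0.6\n")
glove = parse_glove(sample)

# Round-trip through pickle; the dict layout is an assumption, not the
# repository's documented format.
blob = pickle.dumps(glove)
restored = pickle.loads(blob)
print(sorted(restored), len(restored["cat"]))
```

With a real GloVe download, the same function could be fed an open text file handle instead of the in-memory sample.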

Prepare Hierarchical Concept Graph

  1. Preprocess the questions to obtain train_questions.pt and vocab.json:
     python preprocess_questions.py --glove_pt </path/to/generated/glove/pickle/file> --input_questions_json </your/path/to/v2_OpenEnded_mscoco_train2014_questions.json> --input_annotations_json </your/path/to/v2_mscoco_train2014_annotations.json> --output_pt </your/output/path/train_questions.pt> --vocab_json </your/output/path/vocab.json> --mode train
  2. Augment vocab.json with ontology information from WordNet to obtain the augmented concepts (vocab.json) and the concept hierarchy (hierarchy.json):
     python vocab_augmentation.py --input_vocab </your/input/path/vocab.json> --glove_pt </path/to/generated/glove/pickle/file> --vocab_json </your/output/path/vocab.json> --hierarchy </your/output/path/hierarchy.json> --wordnet_base
  3. Combine the WordNet categorical hierarchy (hierarchy.json) with the other downloaded knowledge sources to obtain the full hierarchical concept graph topology (topology.json) and the vocabulary of cross-concept relationships (relation.json):
     python knowledge_incorporation.py --input_vocab </your/input/path/vocab.json> --knowledge_dir </your/input/knowledge/dir> --glove_pt </path/to/generated/glove/pickle/file> --input_hierarchy </your/input/hierarchy.json> --topology_json </your/output/path/topology.json> --relation_vocab </your/output/path/relation.json> --hierarchy
  4. Extract the distinguishable concept property vectors (concept_property.json) and the property list (property.json) by incorporating knowledge from all downloaded knowledge sources:
     python knowledge_incorporation.py --input_vocab </your/input/path/vocab.json> --knowledge_dir </your/input/knowledge/dir> --glove_pt </path/to/generated/glove/pickle/file> --property_vocab </your/output/path/property.json> --concept_property </your/output/path/concept_property.json> --property
  5. Download the grounded features from the LXMERT repo
  6. Preprocess the features:
     python preprocess_features.py --input_tsv_folder /your/path/to/trainval_36/ --output_h5 /your/output/path/trainval_feature.h5
  7. Generate the hierarchical concept graph from all the downloaded data files to initialize the concept features:
     python preprocess_concepts.py --input_knowledge_folder /your/path/to/knowledge/sources --output_folder /your/output/path/concepts --glove_pt /your/path/to/glove/features --mavex_pt /your/path/to/mavex/features
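
To make the idea of vocabulary augmentation with a concept hierarchy more concrete, here is a toy sketch. It is not the repository's vocab_augmentation.py: the hypernym table, the function names, and the output layout are all assumptions chosen for illustration, and the real schema of hierarchy.json is not given in the source. The sketch walks each vocabulary word up its hypernym chain, adds every ancestor as a concept, and records each concept's parent.

```python
import json

# Hypothetical hypernym links (child -> parent), standing in for WordNet lookups.
HYPERNYM = {
    "puppy": "dog",
    "dog": "animal",
    "cat": "animal",
    "animal": "entity",
}


def ancestor_chain(concept, hypernym):
    """Return the chain from `concept` up to its root, following parent links."""
    chain = [concept]
    seen = {concept}
    while chain[-1] in hypernym:
        parent = hypernym[chain[-1]]
        if parent in seen:  # guard against cycles in noisy data
            break
        chain.append(parent)
        seen.add(parent)
    return chain


def build_hierarchy(vocab, hypernym):
    """Augment `vocab` with all ancestors and record each concept's parent."""
    concepts = []
    for word in vocab:
        for c in ancestor_chain(word, hypernym):
            if c not in concepts:
                concepts.append(c)
    parents = {c: hypernym.get(c) for c in concepts}
    return concepts, parents


concepts, parents = build_hierarchy(["puppy", "cat"], HYPERNYM)
# The JSON layout below is an assumed format, not the real hierarchy.json schema.
print(json.dumps({"concepts": concepts, "parents": parents}, indent=2))
```

In this toy setup the two input words grow into five concepts, with "entity" as the shared root. In the actual pipeline, WordNet accessed through nltk would presumably supply the hypernym links.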

Train and Evaluate the Model (HCNMN)

  1. Train the model and get validation results
python train.py --input_dir <path/to/preprocessed/data/folder> --concept <path/to/preprocessed/concept/folder> --concept_property <path/to/concept_property.json> --topology <path/to/topology.json> --relation_list <path/to/relation.json> --property_list <path/to/property.json> --save_dir </path/for/checkpoint> --model HCNMN  --T_ctrl 3 --stack_len 4 --cuda 1 --val
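
Because the training command depends on several files produced by earlier steps, it can help to check that they exist before launching a long run. The sketch below assembles the same flags shown above and reports any missing inputs. It assumes, purely for illustration, that all preprocessed outputs sit in one directory under the file names used in this README. The helper build_train_command is hypothetical and not part of the repository.

```python
import shlex
import tempfile
from pathlib import Path


def build_train_command(data_dir, save_dir):
    """Assemble the train.py argument list, assuming all inputs live in data_dir."""
    data_dir = Path(data_dir)
    args = {
        "--input_dir": data_dir,
        "--concept": data_dir / "concepts",
        "--concept_property": data_dir / "concept_property.json",
        "--topology": data_dir / "topology.json",
        "--relation_list": data_dir / "relation.json",
        "--property_list": data_dir / "property.json",
    }
    # Collect every expected input that is not on disk yet.
    missing = [str(p) for p in args.values() if not p.exists()]
    cmd = ["python", "train.py"]
    for flag, path in args.items():
        cmd += [flag, str(path)]
    cmd += ["--save_dir", str(save_dir), "--model", "HCNMN",
            "--T_ctrl", "3", "--stack_len", "4", "--cuda", "1", "--val"]
    return cmd, missing


# Demo on a temporary directory where property.json is deliberately absent.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "concepts").mkdir()
    for name in ("concept_property.json", "topology.json", "relation.json"):
        (root / name).write_text("{}")
    cmd, missing = build_train_command(root, root / "ckpt")
    print(shlex.join(cmd))
    print("missing:", missing)
```

If the list of missing paths is empty, the printed command can be run as is. Otherwise, the listed files point to the preparation step that still needs to be completed.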

hcnmn's People

Contributors

superjohnzhang
