This project forked from jinfengr/hcan

EMNLP'19: Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling

Bridging the Gap Between Relevance Matching and Semantic Matching for Short Text Similarity Modeling

This repo contains code and data for our paper published in EMNLP'19.

Reference

If you use this code or dataset, please cite the paper below:

@inproceedings{rao2019bridging,
  title={Bridging the Gap Between Relevance Matching and Semantic Matching for Short Text Similarity Modeling},
  author={Rao, Jinfeng and Liu, Linqing and Tay, Yi and Yang, Wei and Shi, Peng and Lin, Jimmy},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
  pages={5373--5384},
  year={2019}
}

Requirements

  • Python 2.7
  • TensorFlow (tested on 1.9.0)
  • Keras (tested on 2.1.5)

Install

  • Download our repo:
git clone https://github.com/jinfengr/hcan.git
cd hcan
  • Install the TensorFlow and Keras dependencies:
pip install -r requirements.txt
  • Install gdrive
  • Download the required data and word2vec embeddings:
chmod +x *.sh; ./download.sh
./generate_idf.sh
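
After installation, it can help to confirm that the environment matches the versions listed under Requirements. The snippet below is a minimal sketch and not part of the repository: the helper names parse_version and check_environment are hypothetical, and the only facts taken from this README are the tested versions (TensorFlow 1.9.0, Keras 2.1.5). It assumes each package exposes a `__version__` attribute.

```python
from __future__ import print_function

import importlib

# Versions this README reports as tested; everything else here is illustrative.
TESTED_VERSIONS = {"tensorflow": "1.9.0", "keras": "2.1.5"}


def parse_version(text):
    """Turn '1.9.0' into (1, 9, 0); trailing suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad '1.9' to (1, 9, 0)
        parts.append(0)
    return tuple(parts)


def check_environment(tested=TESTED_VERSIONS):
    """Return {package: status}; status is 'ok', 'differs (<version>)' or 'missing'."""
    report = {}
    for name, wanted in sorted(tested.items()):
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        found = getattr(module, "__version__", "unknown")
        if parse_version(found) == parse_version(wanted):
            report[name] = "ok"
        else:
            report[name] = "differs (%s)" % found
    return report


if __name__ == "__main__":
    for package, status in sorted(check_environment().items()):
        print(package, status)
```

A "differs" status does not necessarily mean the code will fail; it only signals that the installed version is not the one reported as tested.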

Run

  • Run on TrecQA/Quora/TwitterURL datasets:
CUDA_VISIBLE_DEVICES=0 python -u train.py --dataset TrecQA -j hcan

The paths of the best model and the output predictions are shown in the log.

  • Run on Twitter datasets (test on trec-2013):
CUDA_VISIBLE_DEVICES=0 python -u train.py --dataset twitter -t trec-2013 -j hcan

Note: you might need around 40GB of memory to create the twitter dataset (because of the large size of the IDF weights). Please file an issue if you have any problems creating the dataset.

  • Run a parameter sweep to find the best parameter set (make sure the dataset has been created before sweeping):
./param_sweep.sh TrecQA hcan 0 &

This command saves all outputs under the tune_logs folder.
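
The contents of param_sweep.sh are not reproduced in this README, so the following is only a rough sketch of how a comparable grid of runs could be generated from Python. The grid values and the helper names build_command and sweep_commands are assumptions made for illustration; the flags themselves (--dataset, -j, -l, --lr, -d, -b) are the ones documented in this README. The sketch prints the commands rather than executing them.

```python
from __future__ import print_function

import itertools


def build_command(dataset, model, options=None):
    """Build the argument list for a single train.py run."""
    cmd = ["python", "-u", "train.py", "--dataset", dataset, "-j", model]
    for flag, value in sorted((options or {}).items()):
        cmd.extend([flag, str(value)])
    return cmd


def sweep_commands(dataset, model, grid):
    """Yield one command per combination of the values in `grid`."""
    flags = sorted(grid)
    for values in itertools.product(*(grid[flag] for flag in flags)):
        options = dict(zip(flags, values))
        # Assumption: reuse the pre-created dataset, as the README advises for sweeps.
        options["-l"] = "true"
        yield build_command(dataset, model, options)


if __name__ == "__main__":
    # Hypothetical grid; pick values that suit your own experiments.
    grid = {"--lr": [0.01, 0.05], "-d": [0.1, 0.3], "-b": [64]}
    for cmd in sweep_commands("TrecQA", "hcan", grid):
        print("CUDA_VISIBLE_DEVICES=0 " + " ".join(cmd))
```

Each printed line can be pasted into a shell, or passed to a job scheduler, to launch one configuration.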

Command line parameters

| option | input format | default | description |
|---|---|---|---|
| -l | [true, false] | false | whether to load the pre-created dataset (set to true when the data is ready) |
| -j | [matching, biattention, hcan] | matching | attention choices: matching for relevance matching in Sec. 2.2, biattention for semantic matching in Sec. 2.3, hcan for the complete HCAN model |
| -e | [deepconv, wideconv, bilstm] | deepconv | encoder choices described in Sec. 2.1 |
| -w | [none, query] | none | whether to include IDF weighting: none to omit it, query to include it |
| --nb_layers | [1, n) | 5 | number of convolutional or BiLSTM layers |
| --nb_filters | [1, n) | 256 | number of convolutional filters or BiLSTM hidden dimension |
| --model_option | [complete, word-only] | complete | which input sources to use: complete for both word and character-level n-gram representations, word-only for word representations only |
| --conv_option | [normal, ResNet] | normal | convolutional model, normal or ResNet |
| --co-attention | [BiDAF, ESIM] | BiDAF | which biattention implementation to use |
| --highway | [true, false] | false | whether to include a highway layer |
| -t | [trec-2011, trec-2012, trec-2013, trec-2014] | trec-2013 | test set, only needed for the twitter datasets |
| --load_model | [true, false] | false | whether to load a pre-trained model |
| -b | [1, n) | 64 | batch size |
| -d | [0, 1] | 0.1 | dropout rate |
| -o | [sgd, adam, rmsprop] | sgd | optimization method |
| --lr | [0, 1] | 0.05 | learning rate |
| --epochs | [1, n) | 15 | number of training epochs |
| --trainable | [true, false] | true | whether to train the word embeddings |
| --val_split | (0, 1) | 0.15 | fraction of the training set sampled as the validation set |
| -v | [0, 1, 2] | 1 | verbosity (for logging): 0 for silent, 1 for interactive, 2 for per-epoch logging |
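
To make the options above concrete, here is a minimal argparse sketch that mirrors a subset of them with the listed defaults. It is an illustration only: the repository's own train.py may define these flags differently, and the str2bool helper, the build_parser function, and the dest names (load_dataset, attention, encoder, and so on) are hypothetical names introduced here.

```python
from __future__ import print_function

import argparse


def str2bool(text):
    """Map the 'true'/'false' strings used by the options onto Python booleans."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise argparse.ArgumentTypeError("expected true or false, got %r" % text)


def build_parser():
    """Build a parser covering a subset of the documented options."""
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    parser.add_argument("--dataset", required=True, help="e.g. TrecQA or twitter")
    parser.add_argument("-l", dest="load_dataset", type=str2bool, default=False)
    parser.add_argument("-j", dest="attention", default="matching",
                        choices=["matching", "biattention", "hcan"])
    parser.add_argument("-e", dest="encoder", default="deepconv",
                        choices=["deepconv", "wideconv", "bilstm"])
    parser.add_argument("-w", dest="idf_weighting", default="none",
                        choices=["none", "query"])
    parser.add_argument("--nb_layers", type=int, default=5)
    parser.add_argument("--nb_filters", type=int, default=256)
    # argparse stores --co-attention as args.co_attention.
    parser.add_argument("--co-attention", default="BiDAF", choices=["BiDAF", "ESIM"])
    parser.add_argument("-t", dest="test_set", default="trec-2013",
                        choices=["trec-2011", "trec-2012", "trec-2013", "trec-2014"])
    parser.add_argument("-b", dest="batch_size", type=int, default=64)
    parser.add_argument("-d", dest="dropout", type=float, default=0.1)
    parser.add_argument("-o", dest="optimizer", default="sgd",
                        choices=["sgd", "adam", "rmsprop"])
    parser.add_argument("--lr", type=float, default=0.05)
    parser.add_argument("--epochs", type=int, default=15)
    return parser


if __name__ == "__main__":
    # Parse the same options as the TrecQA example in the Run section.
    args = build_parser().parse_args(["--dataset", "TrecQA", "-j", "hcan"])
    print(args.attention, args.encoder, args.nb_layers, args.lr)
```

Restricting each option with `choices` makes a mistyped value fail at startup instead of partway through training.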
