
alibaba / esim-response-selection


ESIM for Multi-turn Response Selection Task

Home Page: https://arxiv.org/pdf/1901.02609.pdf

License: Apache License 2.0

Python 98.52% Shell 1.48%

esim-response-selection's Introduction

ESIM for Multi-turn Response Selection Task

Introduction

If you use this code as part of any published research, please acknowledge one of the following papers.

@inproceedings{chen2019sequential,
  title={Sequential Matching Model for End-to-end Multi-turn Response Selection},
  author={Chen, Qian and Wang, Wen},
  booktitle={ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={7350--7354},
  year={2019},
  organization={IEEE}
}
@article{DBLP:journals/corr/abs-1901-02609,
  author    = {Chen, Qian and Wang, Wen},
  title     = {Sequential Attention-based Network for Noetic End-to-End Response Selection},
  journal   = {CoRR},
  volume    = {abs/1901.02609},
  year      = {2019},
  url       = {http://arxiv.org/abs/1901.02609},
}

Requirements

  1. gensim

     pip install gensim

  2. TensorFlow 1.9-1.12 + Python 2.7

Steps

  1. Download the Ubuntu dataset released by (Xu et al., 2017)

  2. Unzip the dataset and put the data directory into data/

  3. Preprocess the dataset, including concatenating the context and building the vocabulary

     cd data
     python prepare.py

  4. Train word2vec

     bash run_train_word2vec.sh

  5. Train and test ESIM; the log information is written to the log.txt file. You can find an example log file in log_example.txt.

     cd scripts/esim
     bash run.sh
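The preprocessing in step 3 can be sketched roughly as follows. This is a minimal illustration with hypothetical helper names, not the repo's actual prepare.py: each context utterance is followed by the `__eou__ __eot__` markers, and a word-frequency vocabulary is counted over the resulting text.

```python
# Minimal sketch of step 3 (hypothetical helpers, not the repo's prepare.py):
# concatenate the multi-turn context with end-of-utterance/end-of-turn markers
# and build a word vocabulary from the tokenized text.
from collections import Counter

def concat_context(turns):
    # Append ' __eou__ __eot__ ' after every utterance, as prepare.py does.
    return ' __eou__ __eot__ '.join(turns) + ' __eou__ __eot__ '

def build_vocab(texts, min_count=1):
    counts = Counter(w for t in texts for w in t.split())
    kept = sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate(kept)}

context = concat_context(["how do i install java", "use apt-get install"])
vocab = build_vocab([context])
print(context)
print(len(vocab))  # 9 unique tokens, including __eou__ and __eot__
```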

esim-response-selection's People

Contributors

enningxie, lukecq1231


esim-response-selection's Issues

Chinese corpus

Is there a Chinese corpus available? What format is the corpus in?

Mismatch between training and real-world application

In the training data, the negative samples are drawn at random, so they tend to be wildly off-topic. During training, P@1 can therefore reach a very high value, because picking the best among bad candidates is easy.
In real-world use, however, the candidates a retrieval-style strategy returns for a query are of much higher quality than the randomly sampled negatives seen during training, so the model performs considerably worse.
How can this mismatch between training and deployment be eliminated?
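One common remedy, sketched below purely as an illustration (it is not part of this repository, and the lexical-overlap retriever is an assumption), is to mine "hard" negatives that resemble deployment-time retrieval candidates instead of sampling them uniformly at random:

```python
# Illustrative hard-negative mining (not from this repo): rank candidate
# responses by word overlap with the context and keep the top-k as training
# negatives, so negatives look like real retrieval candidates.
def overlap(a, b):
    return len(set(a.split()) & set(b.split()))

def hard_negatives(context, pool, k=2):
    # Stable sort by descending overlap, keep the k hardest candidates.
    return sorted(pool, key=lambda r: overlap(context, r), reverse=True)[:k]

pool = ["try sudo apt-get update", "i like pizza", "run apt-get install first"]
print(hard_negatives("how to apt-get install java", pool))
# ['run apt-get install first', 'try sudo apt-get update']
```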

Data preprocessing

In textIterator, the code below looks wrong. According to the comment, instances should be sorted by the combined length of the context and the response, but it is written as "current_length = ins[1] + ins[2]"; it should presumably be "current_length = len(ins[1]) + len(ins[2])".

            # sort by length of sum of target buffer and target_buffer
            length_list = []
            for ins in self.instance_buffer:
                current_length = ins[1] + ins[2]
                length_list.append(current_length)

            length_array = numpy.array(length_list)
            length_idx = length_array.argsort()
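The fix proposed above can be checked in isolation. This standalone sketch (with a made-up toy buffer) sorts instance indices by the summed token lengths of the context (ins[1]) and the response (ins[2]):

```python
import numpy

# Each toy instance: (label, context token ids, response token ids).
instance_buffer = [
    (1, [3, 4, 5], [6, 7]),         # total length 5
    (0, [1], [2]),                  # total length 2
    (1, [8, 9], [10, 11, 12, 13]),  # total length 6
]

# Corrected version: sum the *lengths* of the id lists, not the lists.
length_list = [len(ins[1]) + len(ins[2]) for ins in instance_buffer]
length_idx = numpy.array(length_list).argsort()
print(list(length_idx))  # [1, 0, 2] -- shortest instance first
```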

Why are the contents in the corpus run together?

Thanks to the Alibaba authors for generously sharing this work. After reading the corpus I have a question. For example, in one line of the Chinese E-commerce corpus (screenshot omitted):
the chat content is run together; only the first two and last two utterances are separated by tabs, while the dialogue turns in the middle are not split at all and are merged together.
Could the authors explain what is going on here? I don't understand it.

How to use the model

How do I use the model obtained after training? Is there a simple demo?

About the prepare_data function

for l_x, s_x, l_y, s_y, l in zip(lengths_x, seqs_x, lengths_y, seqs_y, labels):
    if l_x > maxlen_1:
        new_seqs_x.append(s_x[-maxlen_1:])
        new_lengths_x.append(maxlen_1)
    else:
        new_seqs_x.append(s_x)
        new_lengths_x.append(l_x)
    if l_y > maxlen_2:
        new_seqs_y.append(s_y[:maxlen_2])
        new_lengths_y.append(maxlen_2)
    else:
        new_seqs_y.append(s_y)
        new_lengths_y.append(l_y)

Why does s_x keep the last maxlen_1 tokens here, while s_y keeps the first maxlen_2 tokens?
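As a guess at the rationale (not confirmed by the authors): the context s_x keeps its last maxlen_1 tokens because the most recent turns are usually the most relevant to the response, while the response s_y keeps its first maxlen_2 tokens. The truncation itself can be exercised in a standalone sketch:

```python
# Standalone sketch of the truncation above (toy data, hypothetical helper).
def truncate_pair(s_x, s_y, maxlen_1, maxlen_2):
    # Context: keep the tail (most recent turns); response: keep the head.
    s_x = s_x[-maxlen_1:] if len(s_x) > maxlen_1 else s_x
    s_y = s_y[:maxlen_2] if len(s_y) > maxlen_2 else s_y
    return s_x, s_y

ctx, rsp = truncate_pair(list(range(10)), list(range(10)), maxlen_1=4, maxlen_2=3)
print(ctx, rsp)  # [6, 7, 8, 9] [0, 1, 2]
```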

Does TensorFlow need to be the GPU version?

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'CudnnRNNCanonicalToParams' with these attrs. Registered devices: [CPU], Registered kernels:

Performance is poor on the validation set using the e-commerce dataset

(screenshot omitted)

Can someone help or give advice on how to make it work?

I changed the hidden size to 100 so that batch size 32 can be used.

The hyper-parameters are as below:

CUDA_VISIBLE_DEVICES=6 python -u main.py \
  --train_file=$DATA_DIR/train.txt \
  --valid_file=$DATA_DIR/valid.txt \
  --test_file=$DATA_DIR/test.txt \
  --vocab_file=$DATA_DIR/vocab.txt \
  --output_dir=result \
  --embedding_file=../../data/embedding_w2v_d300.txt \
  --maxlen_1=300 \
  --maxlen_2=150 \
  --hidden_size=100 \
  --train_batch_size=32 \
  --valid_batch_size=16 \
  --test_batch_size=16 \
  --fix_embedding=True \
  --patience=1 \
  > log.txt 2>&1 &

context = ' __eou__ __eot__ '.join(arr[1:-1]) + ' __eou__ __eot__ '

In the data-preparation module, why is each utterance delimited with the two tokens "__eou__ __eot__" instead of a single, simpler token such as "__eou__"?
The paper says: "The multi-turn context was concatenated and two special tokens, __eou__ and __eot__, were inserted, where __eou__ denotes end-of-utterance and __eot__ denotes end-of-turn."
So what does "turn" mean here? Does it mean a question-answer pair of two utterances?
If so, a context should look like this, with an extra __eot__ only after every even-numbered utterance:
A1 __eou__ B1 __eou__ __eot__ A2 __eou__ B2 __eou__ __eot__ A3 __eou__
I hope the authors can clarify this. Thanks!
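For reference, what the quoted line actually produces can be checked directly. In the released code every utterance is followed by both markers, which amounts to treating each utterance as a one-utterance turn (this sketch only reproduces the quoted line on toy data; it does not settle the question above):

```python
# Reproduces the quoted prepare-step line on a toy example: the first field
# is the label, the last is the response, everything in between is context.
arr = ["label", "A1", "B1", "A2", "response"]
context = ' __eou__ __eot__ '.join(arr[1:-1]) + ' __eou__ __eot__ '
print(context)  # every utterance is followed by both __eou__ and __eot__
```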

model restore problem

Hi, I have a perplexing issue when I try to restore the model for inference on a corpus. The issue stems from using tf.contrib.cudnn_rnn.CudnnLSTM: after training with this function, I cannot restore the model. There is no issue when I train and restore with tf.contrib.rnn.LSTMCell instead. I found that many developers encounter the same issue and no one has solved it cleanly. Reading your code, you simply restore with saver.restore(sess, os.path.join(FLAGS.output_dir, "model_epoch_{}...)), but that does not work for me (screenshot omitted).
Please help me figure this issue out, thank you! My TensorFlow version is 1.10.0.

Hello, looking forward to your open-source release

I tried to reproduce the model following your paper, but the performance would not improve; perhaps it is a hyper-parameter issue. I look forward to your open-source release and would like to ask for your advice.

Multi-GPU parallelism

Is there a multi-GPU parallel version, or could you briefly explain how to modify the code for it?
