hltchkust / mem2seq

Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems

License: MIT License

Python 97.09% Perl 2.91%

memory-network deep-learning dialogue-systems end-to-end-learning

mem2seq's Introduction

Installation

git clone https://github.com/HLTCHKUST/hltchkust.github.io.git
cd hltchkust.github.io
git fetch
git checkout source

Make sure you have Ruby and Bundler installed on your system (hint: to make managing Ruby gems easier, consider using rbenv), then run

bundle install

To add a new blog post

Create a new Markdown file under _posts, following the format of the existing ones.

To add an announcement to News

Create a new Markdown file under _news, following the format of the existing ones.

To add a new publication

Add your bibliography entry to _bibliography/papers.bib. If you want it to appear under Selected publications, add the selected={true} field.

Deployment

To deploy new changes, run

./bin/deploy --user

then

git push origin source

Wait a few seconds, and refresh!

For local development

To build the site with your latest changes

bundle exec jekyll build

To run the server locally

bundle exec jekyll serve

The site is then served on localhost at port 4000.


mem2seq's Issues

ptr_UNK network failed.

When I run the ptr_UNK network,
the decoder's forward function is interrupted with the error
dimension out of range (expected to be in range of [-1, 0], but got 1)
at this line:
v = self.v.repeat(encoder_outputs.data.shape[0],1).unsqueeze(1)

Why use nn.Embedding rather than something pre-trained?

Sorry to bother you, but I would really like to ask: when encoding the words, why do you use nn.Embedding rather than a pre-trained embedding such as GloVe? I hope you can help me with this question. Thank you very much!

Code fails on evaluation for kvr

I'm running the following command:
python3 main_train.py -lr=0.001 -layer=1 -hdd=128 -dr=0.2 -dec=Mem2Seq -bsz=8 -ds=kvr -t=
I'm in the process of upgrading to torch 0.4.0 and have already changed the line to use top_ptr_i.squeeze()[i].item(). It only fails near the end of evaluation (batch 97 in the log below), which makes me think that only some unusual data triggers it.
Here is the error output:
02-27 00:04 STARTING EVALUATION
R:0.0855,W:72.8617: 99% 97/98 [00:22<00:00, 4.58it/s]
Traceback (most recent call last):
  File "main_train.py", line 62, in <module>
    acc = model.evaluate(dev,avg_best, BLEU)
  File "/content/Mem2Seq/models/Mem2Seq.py", line 281, in evaluate
    data_dev[2],data_dev[3],data_dev[4],data_dev[5],data_dev[6])
  File "/content/Mem2Seq/models/Mem2Seq.py", line 202, in evaluate_batch
    next_in = [top_ptr_i.squeeze()[i].item() if(int(toppi.squeeze()[i].item()) < input_lengths[i]-1) else topvi.squeeze()[i].item() for i in range(batch_size)]
  File "/content/Mem2Seq/models/Mem2Seq.py", line 202, in <listcomp>
    next_in = [top_ptr_i.squeeze()[i].item() if(int(toppi.squeeze()[i].item()) < input_lengths[i]-1) else topvi.squeeze()[i].item() for i in range(batch_size)]
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
BTW my branch is located here https://github.com/isaacmg/Mem2Seq

Thanks for the help.

acc_avg and wer_avg

Hi, thank you for your great work. I would like to ask what acc_avg and wer_avg are used for.
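For context, the evaluation progress bars in the training logs on this page print two running values, R and W. Assuming wer_avg is a running average of a word error rate (the repository does call utils.measures.wer, but its implementation is not shown here), a minimal sketch of WER as a word-level edit distance might look like the following. The function name word_error_rate is hypothetical and this is not the repository's code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length, in percent."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0 if not hyp else 100.0
    # prev[j] holds the edit distance between the processed ref prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Under this reading, lower W is better, while an accuracy-style average such as acc_avg is better when higher.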

About the environment versions

Could you please provide a requirements file listing the package versions of your environment?

Tokenisation script

Hi,
Could you share the script you use to tokenize the kvret dataset into txt files? There are several non-trivial transformations which are hard to replicate.

Thanks

Pre-processing questions

Hello, I was hoping you could clarify some of the preprocessing code, as I'm trying to convert MultiWOZ to the same format to use with the model. First of all, I'm assuming that read_entities.py is what produces train.txt, test.txt, etc. for the KVR dataset. The goal of preprocessing seems to be to get every line into a format like 0 2_miles road_block_nearby parking_garage poi dish_parking. I assume the 0 means that the line is a KB entry, correct? Then 1, 2, 3 are for the actual dialogue turns. Also, for the KB entries, how do you decide how many items to include? Some seem to be the standard KB triple, such as ravenswood_shopping_center distance 6_miles, whereas others are longer, such as 4_miles heavy_traffic coffee_or_tea_place poi philz. What are the criteria for determining that? Also, are there any other preprocessing subtleties I should be aware of?

Thanks.
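The sample printed in the training log further down this page suggests how such lines end up in memory: KB tokens appear in reverse order and every row is padded with PAD to five tokens, while each dialogue word becomes a row of word, speaker tag and turn tag. The sketch below only reproduces that observed layout. The helper names and the constant MEM_TOKEN_SIZE = 5 are assumptions read off the log, and it does not answer how the number of KB items per entry is chosen.

```python
MEM_TOKEN_SIZE = 5  # assumption: row width seen in the logged sample
PAD = "PAD"

def pad(row):
    """Right-pad a memory row with PAD up to MEM_TOKEN_SIZE tokens."""
    return row + [PAD] * (MEM_TOKEN_SIZE - len(row))

def kb_row(line):
    """Turn '0 subj rel obj' into ['obj', 'rel', 'subj', 'PAD', 'PAD']."""
    marker, *tokens = line.split()
    if marker != "0":
        raise ValueError("not a KB line: " + line)
    return pad(tokens[::-1])  # reversed, as in the logged sample

def utterance_rows(text, speaker, turn):
    """One memory row per word: [word, speaker tag, turn tag, PAD, PAD]."""
    return [pad([word, speaker, "t{}".format(turn)]) for word in text.split()]

rows = [kb_row("0 dish_parking distance 2_miles")] + utterance_rows("yes please", "$u", 2)
```

Here rows holds one KB row followed by two user-word rows, in the same shape as the Sample output in the log.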

I'm confused about the p_ptr variable in the decoder's ptrMemDecoder function

    def ptrMemDecoder(self, enc_query, last_hidden):
        embed_q = self.C[0](enc_query) # b * e
        output, hidden = self.gru(embed_q.unsqueeze(0), last_hidden)
        temp = []
        u = [hidden[0].squeeze()]
        for hop in range(self.max_hops):
            m_A = self.m_story[hop]
            if(len(list(u[-1].size()))==1): u[-1] = u[-1].unsqueeze(0) ## used for bsz = 1.
            u_temp = u[-1].unsqueeze(1).expand_as(m_A)
            prob_lg = torch.sum(m_A*u_temp, 2)
            prob_ = self.softmax(prob_lg)
            m_C = self.m_story[hop+1]
            temp.append(prob_)
            prob = prob_.unsqueeze(2).expand_as(m_C)
            o_k = torch.sum(m_C*prob, 1)
            if (hop==0):
                p_vocab = self.W1(torch.cat((u[0], o_k),1))
            u_k = u[-1] + o_k
            u.append(u_k)
        p_ptr = prob_lg
        return p_ptr, p_vocab, hidden
In the decoder's ptrMemDecoder function, why is p_ptr = prob_lg and not p_ptr = self.softmax(prob_lg)?
I read in the paper that:

P_ptr is generated using the attention weights at the last MemNN hop of the decoder: P_ptr = p_t^K
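One possible reading, offered as an assumption rather than the authors' stated reason: if the training loss applies a log-softmax to its input (the repository's call list on this page includes torch.nn.functional.log_softmax and masked_cross_entropy), then returning the raw scores loses nothing, because both the loss and the top-1 choice can be computed from them. The plain-Python sketch below illustrates that equivalence on made-up numbers.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, target):
    """Negative log-probability of target, computed without an explicit softmax."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

logits = [1.5, -0.3, 4.0, 0.2]   # made-up attention scores over four memory rows
probs = softmax(logits)
same_choice = argmax(logits) == argmax(probs)
```

Because softmax is monotonic, the row picked from the raw scores is the same as the row picked from the probabilities, and the loss from logits matches the negative log of the softmax probability.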

Entity F1 without KB

Hi,
Congratulations on the amazing paper.

I have a question about how to measure the Entity F1 score without KB input. I have a prediction file (system responses) and a ground-truth file. How should I measure Entity F1 overall?
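As a rough illustration of one way to do this, the sketch below computes a micro-averaged F1 over entity tokens in paired responses. It assumes you have a global list of entity strings and that entities are single underscore-joined tokens, as in the KVR text files. The function entity_f1 is hypothetical and is not the repository's compute_prf, so scores may differ from the paper's.

```python
def entity_f1(predictions, references, entities):
    """Micro-averaged F1 over entity tokens found in paired responses."""
    entities = set(entities)
    tp = fp = fn = 0
    for pred, ref in zip(predictions, references):
        pred_ents = set(pred.split()) & entities
        gold_ents = set(ref.split()) & entities
        tp += len(pred_ents & gold_ents)   # entities correctly produced
        fp += len(pred_ents - gold_ents)   # entities produced but not expected
        fn += len(gold_ents - pred_ents)   # expected entities that are missing
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

preds = ["the nearest is dish_parking at 550_alester_ave", "valero is 6_miles away"]
golds = ["dish_parking is at 550_alester_ave", "valero is 4_miles away"]
ents = ["dish_parking", "550_alester_ave", "valero", "6_miles", "4_miles"]
score = entity_f1(preds, golds, ents)
```

In this toy case three entities match, one is spurious and one is missing, so precision and recall are both 3/4.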

Data loading reverse KB entry

Hi, I was wondering why you invert the KB entry before prediction. It would be great to have an explanation of the reasoning behind it.

Could you tell me the difference between running with and without useKB?

Sorry to bother you. Is there any difference between running with and without the 'useKB' parameter? If I use the KB as part of the input, will it act as noise for the dialogue history? Or are there any benefits to using it?
Thanks a lot.

https://github.com/HLTCHKUST/Mem2Seq/blob/master/data/KVR/read_data.py

Originally posted by @jasonwu0731 in #5 (comment)

Thanks for the quick reply. There seems to be a lowercasing issue. I get errors like this:

#navigate#
0 2_miles road_block_nearby parking_garage poi dish_parking
Traceback (most recent call last):
File "read_data.py", line 119, in
print("0 {} {} ".format(poi, slot)+di[el[slot]])
KeyError: 'dish parking'

What's the meaning here?

Hi,
What does this line do?

Mem2Seq/utils/utils_NMT.py

Line 127 in 84a2ccd

index = [loc for loc, val in enumerate(eng) if (val[0] == key)]

And what's the meaning of cnt_ptr, cnt_voc, ptr_index, max_r_len?
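As a reading aid for the quoted line: it collects every position in eng whose first element equals key. The data below is made up and modelled on the memory rows printed in the training logs on this page. What eng and key actually hold in utils_NMT.py is an assumption here.

```python
# Hypothetical memory rows: [word, speaker tag, turn tag]
eng = [["where", "$u", "t1"], ["the", "$u", "t1"], ["nearest", "$u", "t1"], ["the", "$s", "t1"]]
key = "the"

# Same expression as in the quoted line: positions whose first element is key
index = [loc for loc, val in enumerate(eng) if (val[0] == key)]
```

With this data, index is [1, 3]. Note that if eng were a flat list of strings instead, val[0] would be the first character of each word.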

Mem2Seq model accuracy stuck at 0

Hi,

I am trying to reproduce your experiments by running the first command in the README. My PyTorch version is 0.3, as you can see below. I am evaluating after every epoch instead of only after the first epoch. As you can see at the bottom of the log, the model accuracy is close to 0 even after 8 epochs, and the BLEU score is about 4.5.

Is this expected behavior?

$ python -c 'import torch; print(torch.__version__)'
0.3.0.post4
$ python main_train.py -lr=0.001 -layer=1 -hdd=12 -dr=0.0 -dec=Mem2Seq -bsz=2 -ds=kvr -t= -evalp=1
{'dataset': 'kvr', 'task': '', 'decoder': 'Mem2Seq', 'hidden': '12', 'batch': '2', 'learn': '0.001', 'drop': '0.0', 'unk_mask': 1, 'layer': '1', 'limit': -10000, 'path': None, 'test': None, 'sample': None, 'useKB': 1, 'entPtr': 0, 'evalp': '1', 'addName': ''}
08-10 12:47 Reading lines from data/KVR/train.txt
08-10 12:47 Pointer percentace= 0.4208753595747005 
08-10 12:47 Max responce Len: 80
08-10 12:47 Max Input Len: 249
08-10 12:47 Avg. User Utterances: 2.593814432989691
08-10 12:47 Avg. Bot Utterances: 2.593814432989691
08-10 12:47 Avg. KB results: 64.69896907216494
08-10 12:47 Avg. responce Len: 8.732273449920509
Sample:  [['dish_parking', 'poi', 'parking_garage', 'road_block_nearby', '2_miles'], ['2_miles', 'distance', 'dish_parking', 'PAD', 'PAD'], ['road_block_nearby', 'traffic_info', 'dish_parking', 'PAD', 'PAD'], ['parking_garage', 'poi_type', 'dish_parking', 'PAD', 'PAD'], ['550_alester_ave', 'address', 'dish_parking', 'PAD', 'PAD'], ['stanford_oval_parking', 'poi', 'parking_garage', 'no_traffic', '6_miles'], ['6_miles', 'distance', 'stanford_oval_parking', 'PAD', 'PAD'], ['no_traffic', 'traffic_info', 'stanford_oval_parking', 'PAD', 'PAD'], ['parking_garage', 'poi_type', 'stanford_oval_parking', 'PAD', 'PAD'], ['610_amarillo_ave', 'address', 'stanford_oval_parking', 'PAD', 'PAD'], ['willows_market', 'poi', 'grocery_store', 'car_collision_nearby', '4_miles'], ['4_miles', 'distance', 'willows_market', 'PAD', 'PAD'], ['car_collision_nearby', 'traffic_info', 'willows_market', 'PAD', 'PAD'], ['grocery_store', 'poi_type', 'willows_market', 'PAD', 'PAD'], ['409_bollard_st', 'address', 'willows_market', 'PAD', 'PAD'], ['the_westin', 'poi', 'rest_stop', 'moderate_traffic', '2_miles'], ['2_miles', 'distance', 'the_westin', 'PAD', 'PAD'], ['moderate_traffic', 'traffic_info', 'the_westin', 'PAD', 'PAD'], ['rest_stop', 'poi_type', 'the_westin', 'PAD', 'PAD'], ['329_el_camino_real', 'address', 'the_westin', 'PAD', 'PAD'], ['toms_house', 'poi', 'friends_house', 'heavy_traffic', '1_miles'], ['1_miles', 'distance', 'toms_house', 'PAD', 'PAD'], ['heavy_traffic', 'traffic_info', 'toms_house', 'PAD', 'PAD'], ['friends_house', 'poi_type', 'toms_house', 'PAD', 'PAD'], ['580_van_ness_ave', 'address', 'toms_house', 'PAD', 'PAD'], ['pizza_chicago', 'poi', 'pizza_restaurant', 'heavy_traffic', '4_miles'], ['4_miles', 'distance', 'pizza_chicago', 'PAD', 'PAD'], ['heavy_traffic', 'traffic_info', 'pizza_chicago', 'PAD', 'PAD'], ['pizza_restaurant', 'poi_type', 'pizza_chicago', 'PAD', 'PAD'], ['915_arbol_dr', 'address', 'pizza_chicago', 'PAD', 'PAD'], ['valero', 'poi', 'gas_station', 
'car_collision_nearby', '6_miles'], ['6_miles', 'distance', 'valero', 'PAD', 'PAD'], ['car_collision_nearby', 'traffic_info', 'valero', 'PAD', 'PAD'], ['gas_station', 'poi_type', 'valero', 'PAD', 'PAD'], ['200_alester_ave', 'address', 'valero', 'PAD', 'PAD'], ['mandarin_roots', 'poi', 'chinese_restaurant', 'no_traffic', '2_miles'], ['2_miles', 'distance', 'mandarin_roots', 'PAD', 'PAD'], ['no_traffic', 'traffic_info', 'mandarin_roots', 'PAD', 'PAD'], ['chinese_restaurant', 'poi_type', 'mandarin_roots', 'PAD', 'PAD'], ['271_springer_street', 'address', 'mandarin_roots', 'PAD', 'PAD'], ['where', '$u', 't1', 'PAD', 'PAD'], ['s', '$u', 't1', 'PAD', 'PAD'], ['the', '$u', 't1', 'PAD', 'PAD'], ['nearest', '$u', 't1', 'PAD', 'PAD'], ['parking_garage', '$u', 't1', 'PAD', 'PAD'], ['the', '$s', 't1', 'PAD', 'PAD'], ['nearest', '$s', 't1', 'PAD', 'PAD'], ['parking_garage', '$s', 't1', 'PAD', 'PAD'], ['is', '$s', 't1', 'PAD', 'PAD'], ['dish_parking', '$s', 't1', 'PAD', 'PAD'], ['at', '$s', 't1', 'PAD', 'PAD'], ['550_alester_ave', '$s', 't1', 'PAD', 'PAD'], ['would', '$s', 't1', 'PAD', 'PAD'], ['you', '$s', 't1', 'PAD', 'PAD'], ['like', '$s', 't1', 'PAD', 'PAD'], ['directions', '$s', 't1', 'PAD', 'PAD'], ['there', '$s', 't1', 'PAD', 'PAD'], ['yes', '$u', 't2', 'PAD', 'PAD'], ['please', '$u', 't2', 'PAD', 'PAD'], ['set', '$u', 't2', 'PAD', 'PAD'], ['directions', '$u', 't2', 'PAD', 'PAD'], ['via', '$u', 't2', 'PAD', 'PAD'], ['a', '$u', 't2', 'PAD', 'PAD'], ['route', '$u', 't2', 'PAD', 'PAD'], ['that', '$u', 't2', 'PAD', 'PAD'], ['avoids', '$u', 't2', 'PAD', 'PAD'], ['all', '$u', 't2', 'PAD', 'PAD'], ['heavy_traffic', '$u', 't2', 'PAD', 'PAD'], ['if', '$u', 't2', 'PAD', 'PAD'], ['possible', '$u', 't2', 'PAD', 'PAD'], ['$$$$', '$$$$', '$$$$', '$$$$', '$$$$']] it looks like there is a road block being reported on the route but i will still find the quickest route to 550_alester_ave [70, 70, 54, 56, 48, 62, 70, 70, 70, 70, 70, 45, 63, 70, 70, 70, 70, 70, 45, 70, 63, 70, 51] [0, 0, 1, 
1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1] ['550_alester_ave']
08-10 12:47 Reading lines from data/KVR/dev.txt
08-10 12:47 Pointer percentace= 0.4167286798630749 
08-10 12:47 Max responce Len: 87
08-10 12:47 Max Input Len: 264
08-10 12:47 Avg. User Utterances: 2.5728476821192054
08-10 12:47 Avg. Bot Utterances: 2.5728476821192054
08-10 12:47 Avg. KB results: 63.847682119205295
08-10 12:47 Avg. responce Len: 8.647361647361647
Sample:  [['make', '$u', 't1', 'PAD', 'PAD'], ['an', '$u', 't1', 'PAD', 'PAD'], ['appointment', '$u', 't1', 'PAD', 'PAD'], ['to', '$u', 't1', 'PAD', 'PAD'], ['reserve', '$u', 't1', 'PAD', 'PAD'], ['conference_room_100', '$u', 't1', 'PAD', 'PAD'], ['later', '$u', 't1', 'PAD', 'PAD'], ['this', '$u', 't1', 'PAD', 'PAD'], ['week', '$u', 't1', 'PAD', 'PAD'], ['for', '$u', 't1', 'PAD', 'PAD'], ['a', '$u', 't1', 'PAD', 'PAD'], ['meeting', '$u', 't1', 'PAD', 'PAD'], ['what', '$s', 't1', 'PAD', 'PAD'], ['day', '$s', 't1', 'PAD', 'PAD'], ['and', '$s', 't1', 'PAD', 'PAD'], ['time', '$s', 't1', 'PAD', 'PAD'], ['should', '$s', 't1', 'PAD', 'PAD'], ['i', '$s', 't1', 'PAD', 'PAD'], ['set', '$s', 't1', 'PAD', 'PAD'], ['an', '$s', 't1', 'PAD', 'PAD'], ['appointment', '$s', 't1', 'PAD', 'PAD'], ['to', '$s', 't1', 'PAD', 'PAD'], ['reserve', '$s', 't1', 'PAD', 'PAD'], ['the', '$s', 't1', 'PAD', 'PAD'], ['conference', '$s', 't1', 'PAD', 'PAD'], ['room', '$s', 't1', 'PAD', 'PAD'], ['monday', '$u', 't2', 'PAD', 'PAD'], ['at', '$u', 't2', 'PAD', 'PAD'], ['3pm', '$u', 't2', 'PAD', 'PAD'], ['$$$$', '$$$$', '$$$$', '$$$$', '$$$$']] i have made an appointment for monday at 3pm for the meeting [17, 29, 29, 19, 20, 9, 26, 27, 28, 9, 23, 11] [1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1] ['meeting', 'monday', '3pm']
08-10 12:47 Reading lines from data/KVR/test.txt
08-10 12:47 Pointer percentace= 0.4224432239869378 
08-10 12:47 Max responce Len: 36
08-10 12:47 Max Input Len: 228
08-10 12:47 Avg. User Utterances: 2.6546052631578947
08-10 12:47 Avg. Bot Utterances: 2.6546052631578947
08-10 12:47 Avg. KB results: 64.84539473684211
08-10 12:47 Avg. responce Len: 8.34820322180917
Sample:  [['remind', '$u', 't1', 'PAD', 'PAD'], ['me', '$u', 't1', 'PAD', 'PAD'], ['to', '$u', 't1', 'PAD', 'PAD'], ['take', '$u', 't1', 'PAD', 'PAD'], ['my', '$u', 't1', 'PAD', 'PAD'], ['pills', '$u', 't1', 'PAD', 'PAD'], ['what', '$s', 't1', 'PAD', 'PAD'], ['time', '$s', 't1', 'PAD', 'PAD'], ['do', '$s', 't1', 'PAD', 'PAD'], ['you', '$s', 't1', 'PAD', 'PAD'], ['need', '$s', 't1', 'PAD', 'PAD'], ['to', '$s', 't1', 'PAD', 'PAD'], ['take', '$s', 't1', 'PAD', 'PAD'], ['your', '$s', 't1', 'PAD', 'PAD'], ['pills', '$s', 't1', 'PAD', 'PAD'], ['i', '$u', 't2', 'PAD', 'PAD'], ['need', '$u', 't2', 'PAD', 'PAD'], ['to', '$u', 't2', 'PAD', 'PAD'], ['take', '$u', 't2', 'PAD', 'PAD'], ['my', '$u', 't2', 'PAD', 'PAD'], ['pills', '$u', 't2', 'PAD', 'PAD'], ['at', '$u', 't2', 'PAD', 'PAD'], ['7pm', '$u', 't2', 'PAD', 'PAD'], ['$$$$', '$$$$', '$$$$', '$$$$', '$$$$']] ok setting your medicine appointment for 7pm [23, 23, 13, 23, 23, 23, 22] [0, 0, 1, 0, 0, 0, 1] ['7pm']
08-10 12:47 Read 6290 sentence pairs train
08-10 12:47 Read 777 sentence pairs dev
08-10 12:47 Read 807 sentence pairs test
08-10 12:47 Max len Input 265 
08-10 12:47 Vocab_size 1554 
08-10 12:47 USE_CUDA=False
08-10 12:47 Epoch:0
L:6.63, VL:4.80, PL:1.83: 100%|███████████████████████████| 3145/3145 [00:43<00:00, 72.70it/s]
08-10 12:48 STARTING EVALUATION
R:0.0746,W:77.2260: 100%|███████████████████████████████████| 389/389 [00:17<00:00, 21.75it/s]
08-10 12:48 F1 SCORE:	0.0
08-10 12:48 F1 CAL:	0.0
08-10 12:48 F1 WET:	0.0
08-10 12:48 F1 NAV:	0.0
08-10 12:48 BLEU SCORE:0.0
08-10 12:48 MODEL SAVED
08-10 12:48 Epoch:1
L:5.81, VL:4.15, PL:1.66: 100%|███████████████████████████| 3145/3145 [00:45<00:00, 68.87it/s]
08-10 12:49 STARTING EVALUATION
R:0.0874,W:76.1113: 100%|███████████████████████████████████| 389/389 [00:20<00:00, 19.21it/s]
08-10 12:49 F1 SCORE:	0.00974817221770918
08-10 12:49 F1 CAL:	0.0
08-10 12:49 F1 WET:	0.017167381974248927
08-10 12:49 F1 NAV:	0.009111617312072893
08-10 12:49 BLEU SCORE:0.0
08-10 12:49 MODEL SAVED
08-10 12:49 Epoch:2
L:5.47, VL:3.85, PL:1.62: 100%|███████████████████████████| 3145/3145 [00:47<00:00, 66.57it/s]
08-10 12:50 STARTING EVALUATION
R:0.0900,W:72.6340: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:23<00:00, 16.66it/s]
08-10 12:51 F1 SCORE:	0.01380991064175467
08-10 12:51 F1 CAL:	0.02147239263803681
08-10 12:51 F1 WET:	0.02145922746781116
08-10 12:51 F1 NAV:	0.0
08-10 12:51 BLEU SCORE:0.0
08-10 12:51 MODEL SAVED
Epoch     2: reducing learning rate of group 0 to 5.0000e-04.
08-10 12:51 Epoch:3
L:5.26, VL:3.68, PL:1.58: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:44<00:00, 30.20it/s]
08-10 12:52 STARTING EVALUATION
R:0.0797,W:72.2230: 100%|████████████████████████████████████| 389/389 [00:20<00:00, 18.97it/s]
08-10 12:53 F1 SCORE:	0.01299756295694557
08-10 12:53 F1 CAL:	0.015337423312883437
08-10 12:53 F1 WET:	0.023605150214592276
08-10 12:53 F1 NAV:	0.0
08-10 12:53 BLEU SCORE:1.7
08-10 12:53 MODEL SAVED
08-10 12:53 Epoch:4
L:5.16, VL:3.60, PL:1.56: 100%|████████████████████████████| 3145/3145 [00:54<00:00, 58.01it/s]
08-10 12:54 STARTING EVALUATION
R:0.0874,W:72.3398: 100%|████████████████████████████████████| 389/389 [00:21<00:00, 17.83it/s]
08-10 12:54 F1 SCORE:	0.014622258326563772
08-10 12:54 F1 CAL:	0.05214723926380368
08-10 12:54 F1 WET:	0.002145922746781116
08-10 12:54 F1 NAV:	0.0
08-10 12:54 BLEU SCORE:2.31
08-10 12:54 MODEL SAVED
08-10 12:54 Epoch:5
L:5.09, VL:3.54, PL:1.54: 100%|████████████████████████████| 3145/3145 [00:45<00:00, 69.83it/s]
08-10 12:55 STARTING EVALUATION
R:0.0925,W:70.8605: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:20<00:00, 19.22it/s]
08-10 12:55 F1 SCORE:	0.01949634443541836
08-10 12:55 F1 CAL:	0.049079754601227
08-10 12:55 F1 WET:	0.015021459227467811
08-10 12:55 F1 NAV:	0.002277904328018223
08-10 12:55 BLEU SCORE:0.0
08-10 12:55 Epoch:6
L:5.02, VL:3.49, PL:1.53: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:33<00:00, 33.74it/s]
08-10 12:57 STARTING EVALUATION
R:0.0771,W:72.1792: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:20<00:00, 19.14it/s]
08-10 12:57 F1 SCORE:	0.008123476848090982
08-10 12:57 F1 CAL:	0.027607361963190184
08-10 12:57 F1 WET:	0.002145922746781116
08-10 12:57 F1 NAV:	0.0
08-10 12:57 BLEU SCORE:2.62
08-10 12:57 MODEL SAVED
08-10 12:57 Epoch:7
L:4.98, VL:3.46, PL:1.52: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:15<00:00, 41.40it/s]
08-10 12:58 STARTING EVALUATION
R:0.0900,W:72.8743: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:24<00:00, 15.91it/s]
08-10 12:59 F1 SCORE:	0.006498781478472785
08-10 12:59 F1 CAL:	0.018404907975460124
08-10 12:59 F1 WET:	0.0
08-10 12:59 F1 NAV:	0.004555808656036446
08-10 12:59 BLEU SCORE:2.22
08-10 12:59 Epoch:8
L:4.94, VL:3.42, PL:1.52: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:27<00:00, 35.80it/s]
08-10 13:00 STARTING EVALUATION
R:0.0887,W:71.7623: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:23<00:00, 16.73it/s]
08-10 13:01 F1 SCORE:	0.016246953696181964
08-10 13:01 F1 CAL:	0.05214723926380368
08-10 13:01 F1 WET:	0.006437768240343348
08-10 13:01 F1 NAV:	0.0
08-10 13:01 BLEU SCORE:4.49
08-10 13:01 MODEL SAVED

multi-bleu problem

Hi, I'm trying to work with your open-source project, Mem2Seq.
It is great to be able to run this work, thanks a lot!
However, whenever I try to run the program,
it gives a 'multi-bleu' error.

FileNotFoundError: [Errno 2] No such file or directory:
'/home/Somedirectory/bin/tools/multi-bleu.perl': '/home/Somedirectory/bin/tools/multi-bleu.perl'

I tried downloading 'multi-bleu.perl' from the web and placing it
at the location the error mentions, but now it gives me a permission denied error.
Can I get an idea of how to handle this problem?

Thanks !
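A permission denied error when a downloaded script is launched as a subprocess often means the file lacks the executable bit. The sketch below sets that bit from Python, assuming a POSIX system. The path you pass is whatever location your own error message reports, and the helper name make_executable is hypothetical.

```python
import os
import stat

def make_executable(path):
    """Add execute permission for user, group and others, keeping the other bits."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Calling make_executable on the multi-bleu.perl path once should be enough. Perl itself also needs to be installed for the script to run.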

What operating system is installed on your computer?

What operating system is installed on your computer?
Could you please tell me?

ERROR When running

When I ran the following command: python3 main_train.py -lr=0.001 -layer=1 -hdd=128 -dr=0.2 -dec=Mem2Seq -bsz=8 -ds=babi -t=1, an error showed up and I don't know how to solve it. Please help me!

05-24 23:49 Dialog Accuracy: 0.088
Traceback (most recent call last):
File "F:/Code/Mem2Seq/Mem2Seq/main_train.py", line 62, in <module>
acc = model.evaluate(dev,avg_best, BLEU)
File "F:\Code\Mem2Seq\Mem2Seq\models\Mem2Seq.py", line 361, in evaluate
bleu_score = moses_multi_bleu(np.array(hyp), np.array(ref), lowercase=True)
File "F:\Code\Mem2Seq\Mem2Seq\utils\measures.py", line 97, in moses_multi_bleu
with open(hypothesis_file.name, "r") as read_pred:
PermissionError: [Errno 13] Permission denied: 'C:\Users\Stan\AppData\Local\Temp\tmpg1tkn92y'
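On Windows, a file created by tempfile.NamedTemporaryFile generally cannot be opened a second time by name while the original handle is still open. That is one plausible cause of the PermissionError above, though it is an assumption since utils/measures.py is not shown here. The sketch below shows a portable pattern: create the file with delete=False, close it, reopen it by name, then remove it manually. The helper name write_then_read is hypothetical.

```python
import os
import tempfile

def write_then_read(text):
    """Write text to a named temp file, then reopen it by name and read it back."""
    tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".txt",
                                      delete=False, encoding="utf-8")
    try:
        tmp.write(text)
        tmp.close()  # release the handle before reopening by name
        with open(tmp.name, "r", encoding="utf-8") as read_pred:
            return read_pred.read()
    finally:
        tmp.close()          # harmless if the file is already closed
        os.remove(tmp.name)  # delete=False means cleanup is manual
```

The same idea applies if an external program, such as a Perl script, needs to read the file: close it first, pass the name, and delete it afterwards.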

A question about Mem2Seq.py

Hi, I have a question about the GRU's hidden layers in Mem2Seq.py.
In the DecoderMemNN init function, the code self.gru = nn.GRU(embedding_dim, embedding_dim, dropout=dropout) does not set the GRU's number of layers, so does it default to a single layer?
In the train_batch and evaluate_batch functions, the code decoder_hidden = self.encoder(input_batches).unsqueeze(0) also implies that the GRU has one layer.
So I think that, during training and evaluation, the GRU always has only one layer regardless of args['layer']. Should the code be modified?

Look forward to your reply, thanks.

The meaning of comments in the code.

Hi! Thanks for sharing your code.
I'm studying your paper together with the code.
I have a question about the comments in your code.

EncoderMemNN in Mem2Seq.py

    def forward(self, story):
        story = story.transpose(0,1)
        story_size = story.size() # b * m * 3 
        if self.unk_mask:
            if(self.training):
                ones = np.ones((story_size[0],story_size[1],story_size[2]))
                rand_mask = np.random.binomial([np.ones((story_size[0],story_size[1]))],1-self.dropout)[0]
                ones[:,:,0] = ones[:,:,0] * rand_mask
                a = Variable(torch.Tensor(ones))
                if USE_CUDA: a = a.cuda()
                story = story*a.long()
        u = [self.get_state(story.size(0))]
        for hop in range(self.max_hops):
            embed_A = self.C[hop](story.contiguous().view(story.size(0), -1).long()) # b * (m * s) * e
            embed_A = embed_A.view(story_size+(embed_A.size(-1),)) # b * m * s * e
            m_A = torch.sum(embed_A, 2).squeeze(2) # b * m * e

What do b, m, s, e, and 3 mean?

Thank you!
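A likely reading of those shape comments, stated as an assumption since the code does not spell it out: b is the batch size, m the number of memory rows, s the number of tokens in each row (the literal 3 in the first comment would be that row width when rows are triples), and e the embedding size. The torch.sum(embed_A, 2) step then adds the s token embeddings of each row into a single vector. The plain-Python sketch below mimics this with nested lists instead of tensors, using a made-up embedding table.

```python
def embed_rows(story, table):
    """story: b x m x s token ids -> b x m x e, summing the s embeddings per row."""
    out = []
    for sample in story:                                      # loop over b
        rows = []
        for row in sample:                                    # loop over m
            vectors = [table[tok] for tok in row]             # s x e
            rows.append([sum(dim) for dim in zip(*vectors)])  # e
        out.append(rows)
    return out

# Hypothetical embedding table with e = 2; id 0 plays the role of padding
table = {0: [0.0, 0.0], 1: [1.0, 2.0], 2: [0.5, -1.0], 3: [3.0, 0.0]}
story = [[[1, 2, 0], [3, 3, 0]]]   # b = 1, m = 2, s = 3
m_A = embed_rows(story, table)     # shape b x m x e
```

Each memory row ends up as one e-dimensional bag-of-words vector, which is what the b * m * e comment describes.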

Why add $$$$ after each English sentence?

Hi,
Why add $$$$ after each English sentence?

Mem2Seq/utils/utils_NMT.py

Line 138 in 84a2ccd

eng = eng + ['$$$$']
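Judging only from the samples in the training log on this page, the memory ends with a row of $$$$ tokens, and the pointer target for a response word that is not found in memory is the index of that last row (with gate 0). Under that assumption, the sentinel gives the pointer a valid "nothing to copy" position. The sketch below reproduces the pointer and gate lists of the logged test sample. The helper pointer_targets is hypothetical, and the choice of the last matching position is inferred from the logs, not taken from the code.

```python
def pointer_targets(memory_words, response):
    """Return (pointer indices, gates) for each response word.

    memory_words must end with the '$$$$' sentinel; words not found in
    memory point at the sentinel and get gate 0.
    """
    sentinel = len(memory_words) - 1
    ptr, gate = [], []
    for word in response.split():
        # search every row except the sentinel itself
        index = [loc for loc, val in enumerate(memory_words[:-1]) if val == word]
        ptr.append(max(index) if index else sentinel)
        gate.append(1 if index else 0)
    return ptr, gate

memory = ("remind me to take my pills "
          "what time do you need to take your pills "
          "i need to take my pills at 7pm").split() + ["$$$$"]
ptr, gate = pointer_targets(memory, "ok setting your medicine appointment for 7pm")
```

For this dialogue the sentinel sits at index 23, so every generated (non-copied) word points there, matching the logged sample.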

Project dependencies may have API risk issues

Hi, in Mem2Seq, inappropriate dependency version constraints can cause risks.

Below are the dependencies and version constraints that the project is using:

cycler==0.10.0
joblib==0.14.1
kiwisolver==1.1.0
matplotlib==3.2.1
nltk==3.4.5
numpy==1.18.2
pandas==1.0.3
pyparsing==2.4.6
python-dateutil==2.8.1
pytz==2019.3
scikit-learn==0.22.2.post1
scipy==1.4.1
seaborn==0.10.0
six==1.14.0
sklearn==0.0
torch==1.1.0
tqdm==4.43.0

Pinning with == introduces a risk of dependency conflicts because the allowed version range is too strict.
Constraints with no upper bound, or *, introduce a risk of missing-API errors because the latest version of a dependency may have removed some APIs.

After further analysis of this project,
the version constraint for numpy can be changed to >=1.8.0,<=1.23.0rc3.

This change reduces dependency conflicts as much as possible
while allowing the newest versions that do not cause call errors in the project.

The invocation of the current project includes all the following methods.

The calling methods from the numpy

char.isdigit

The calling methods from the all methods

gete_s.Variable.transpose
input_batches.size
torch.bmm.transpose
zip
gate.append
argparse.ArgumentParser.add_argument
self.LuongAttnDecoderRNN.super.__init__
self.concat.cuda
embed_A.torch.sum.squeeze.size
super
target_gate.float
item.lower.replace
torch.sum
lengths.max.sequences.len.torch.ones.long
torch.nn.functional.log_softmax
LuongAttnDecoderRNN
correct.lstrip.rstrip.lstrip
torch.nn.functional.tanh
set
join.find
i.story.append
args.globals
story.contiguous
toppi.view.Variable.input_batches.torch.gather.transpose
read_langs
all_decoder_outputs_ptr.transpose.contiguous
args.globals.evaluate
PtrDecoderRNN
str
numpy.zeros
self.preprocess
lengths.max.sequences.len.torch.zeros.long
self.add_module
torch.autograd.Variable
h0_encoder.cuda.cuda
ID.append
self.evaluate_batch
toppi.view
self.decoder.train
el.replace.tokenizer.join.replace.lower
correct.lstrip.rstrip
st.lstrip.rstrip.lstrip
self.PtrDecoderRNN.super.__init__
torch.nn.utils.clip_grad_norm
logging.info
tqdm.set_description
char.isdigit
st.lstrip.rstrip.split
ast.literal_eval
self.decoder_optimizer.step
torch.nn.GRU
sequence_length.data.max
self.from_whichs.append
six.moves.urllib.request.urlretrieve
sequence_length.size
input.size
max_len.torch.arange.long
argparse.ArgumentParser.parse_args
self.compute_prf
kb_arr.append
torch.nn.utils.rnn.pad_packed_sequence
hop.self.C
hidden.squeeze
trg_seqs.Variable.transpose
decoder_ptr.data.topk
d.pop
gete_s.cuda.cuda
torch.LongTensor
tempfile.NamedTemporaryFile.write
AttrProxy
torch.nn.MSELoss
self.EncoderMemNN.super.__init__
e.keys
max_len.sequences.len.torch.zeros.long
context.squeeze.word_embedded.torch.cat.unsqueeze
energy.transpose.transpose
context.squeeze.squeeze
entity_replace
self.embedding_dropout.cuda
torch.autograd.Variable.float
torch.nn.functional.softmax
day.el.split.rstrip
self.W
torch.nn.functional.sigmoid
d.split
references.join.encode
masked_cross_entropy.item
line.strip.replace
self.encoder.cuda
nltk.wordpunct_tokenize
get_type_dict.keys
torch.nn.Softmax
torch.Tensor
torch.optim.Adam
line.strip.strip
logits.size
k.item.lower
self.index_word
day.el.split.split
k.item.lower.replace
entity.type_dict.append
VanillaDecoderRNN
entity_nav.append
utils.until_temp.entityList.append
utils.measures.wer
os.path.abspath
self.decoder
el.keys
ind_seqs.cuda.cuda
os.path.dirname
encoder_outputs.transpose.transpose
torch.Tensor.split
p.replace
correct.lstrip
self.concat
item.keys
tqdm.tqdm.set_description
globals
encoder_outputs.data.shape.self.v.repeat.unsqueeze
target_batches.transpose.contiguous
self.VanillaSeqToSeq.super.__init__
os.path.exists
sent_new.append
torch.gather
dict
i.top_ptr_i.item
embed_A.torch.sum.squeeze.view
hypotheses.join.encode
line.replace.replace
torch.nn.Parameter
self.C.unsqueeze
self.W1
seq_range.unsqueeze.expand
torch.nn.functional.sigmoid.squeeze
story.contiguous.view
os.path.join
enumerate
input_batches.transpose
encoder_outputs.transpose.size
tqdm.tqdm
all_decoder_outputs_gate.cuda.cuda
candid_all.append
unicodedata.category
context.split
self.softmax
s.re.sub.strip.lower
self.decoder.ptrMemDecoder
src_seqs.Variable.transpose
last_hidden.unsqueeze
input_batches.transpose.self.encoder.unsqueeze
r_index.append
torch.Tensor.append
embed_C.torch.sum.squeeze
unicode_to_ascii
sequence_length.unsqueeze.expand_as
tqdm
int
self.U
candid2candDL.keys
torch.save
join
self.gru
float
entity_list.append
os.makedirs
max_len.sequences.len.torch.ones.long
mask.float.sum
embed_A.torch.sum.squeeze
generate_memory
re.sub
a.long.size
args.globals.print_loss
args.globals.train_batch
torch.bmm
line.split.replace
open
a.cuda.long
target.size
torch.utils.data.DataLoader
hidden.squeeze.unsqueeze
torch.gather.squeeze
ref.append
all_decoder_outputs_vocab.cuda.transpose
torch.nn.utils.rnn.pack_padded_sequence
bleu_out.decode.decode
hyp.append
length.torch.LongTensor.Variable.cuda
target_batches.transpose
self.decoder.parameters
DecoderMemNN
torch.zeros
numpy.random.binomial
max
self.encoder
sum
os.chmod
conv_seqs.Variable.transpose
line.replace.split
get_type_dict
prepare_data_seq
entity_cal.append
prob_.unsqueeze.expand_as
r.split
seq_range_expand.cuda.cuda
torch.nn.Linear
self.embedding_dim.bsz.torch.zeros.Variable.cuda
math.sqrt
story.size
self.encoder_optimizer.step
EncoderRNN
topi.view
torch.rand
torch.cat
utils.until_temp.entityList
print
torch.nn.functional.softmax.bmm
all_decoder_outputs_ptr.cuda.cuda
range
self.W.cuda
decoder_input.cuda.cuda
json.load.keys
os.cpu_count
from_which.append
self.encoder_optimizer.zero_grad
max_len.s_t.repeat.transpose
torch.utils.data.append
conv_seqs.cuda.cuda
self.dropout
toppi.squeeze
self.W1.cuda
rnn_output.squeeze.squeeze
EncoderMemNN
x.str.lower
re.search
tempfile.NamedTemporaryFile.close
map
nltk.tokenize.word_tokenize
logging.basicConfig
self.U.cuda
global_temp.append
Lang.index_words
self.preprocess_gate
numpy.uint8.h.len.r.len.numpy.zeros.reshape
self.PTRUNK.super.__init__
src_seqs.cuda.cuda
self.out.cuda
self.DecoderMemNN.super.__init__
get_seq
bool
idx.token_array.isdigit
cleaner
temp.append
entityList
self.decoder.load_memory
embedded.view.view
sequence_mask.float
merge
st.lstrip.rstrip
any
s.lower.strip
last_hidden.unsqueeze.repeat
eng.lower
getattr
i.data_dev.dialog_acc_dict.append
new_token_array.pop
i.d.str.lower
self.get_state.unsqueeze
self.EncoderRNN.super.__init__
self.v.data.normal_
hidden.unsqueeze.repeat
slot.el.replace.tokenizer.join.lower
torch.nn.Embedding
candidates.append
vars
numpy.float32
s.re.sub.strip
loss.backward
self.get_state
Dataset
embed_C.torch.sum.squeeze.size
u.unsqueeze.expand_as
st.lstrip
self.criterion
a.long.transpose
y_seq.append
self.C
self.encoder.parameters
i.topvi.item
target.view
sequence_mask
unicodedata.normalize
x_seq.append
utils.measures.moses_multi_bleu
ptr_seq.append
torch.load
torch.gather.view
decoder_vacab.data.topk
self.encoder.train
json.load
el.replace
sequence_length.unsqueeze
sent.split
entity.append
dialog_acc_dict.keys
p.append
MEM_TOKEN_SIZE.lengths.max.sequences.len.torch.ones.long
self.preprocess.append
self.LuongSeqToSeq.super.__init__
trg_seqs.cuda.cuda
all_decoder_outputs_vocab.cuda.cuda
model.scheduler.step
hidden.squeeze.append
self.embedding.cuda
self.dropout.cuda
global_rp.keys
masked_cross_entropy
prob.unsqueeze.expand_as
subprocess.check_output
c0_encoder.cuda.cuda
list
topvi.squeeze
ptr_index.append
format
numpy.array
self.embedding_dropout
elm.split
self.v.repeat
sentence.split
torch.arange
all_decoder_outputs_vocab.transpose.contiguous
decoded_words.append
self.VanillaDecoderRNN.super.__init__
tempfile.NamedTemporaryFile.flush
join.replace
target_index.transpose.contiguous
self.preprocess_inde
self.out.squeeze
self.save_model
p.e.str.lower.replace
line.replace.strip
argparse.ArgumentParser
C.weight.data.normal_
day.el.split.rstrip.replace
min
all_decoder_outputs_ptr.cuda.transpose
self.v.size
random.random
entity_wet.append
slot.el.replace
get_type_dict.append
len.append
load_candidates
target_index.transpose
dict.items
el.replace.tokenizer.join.replace
p.e.str.lower
list.append
self.m_story.append
new_token_array.append
prob_.unsqueeze.expand_as.unsqueeze
i.toppi.item
fre.lower
DecoderrMemNN
temp_gen.append
torch.ones
torch.nn.Dropout
os.path.realpath
join.lstrip
torch.utils.data.sort
tempfile.NamedTemporaryFile
self.decoder_optimizer.zero_grad
self.lstm.cuda
line.split.split
day.el.split
bleu_out.re.search.group
a.bmm.squeeze
self.softmax.unsqueeze
a.cuda.cuda
numpy.transpose
self.v.cuda
loss.item
el_key.el.tokenizer.join.lower
embed_C.torch.sum.squeeze.view
numpy.ones
logits.view
len
torch.optim.lr_scheduler.ReduceLROnPlateau
input_seq.size
gate_seq.append
line.strip.split
item.lower
a.long.contiguous
p.str.replace
self.out
max_len.torch.arange.long.unsqueeze
self.decoder.cuda
input_batches.self.encoder.unsqueeze
numpy.size
candid2DL
names.keys
ind_seqs.Variable.transpose
self.embedding
length.float.sum
hidden.squeeze.size
torch.nn.LSTM
story.size.story.contiguous.view.long
self.lstm
Lang
join.split
self.DecoderrMemNN.super.__init__
self.Mem2Seq.super.__init__

@developer
Could you please help me check this issue?
May I open a pull request to fix it?
Thank you very much.

RuntimeError: cuDNN version mismatch: PyTorch was compiled against 7003 but linked against 7300

I ran into this problem. The README does not say which CUDA and cuDNN versions are required. Could you please clarify this so I can solve the problem? Thank you in advance.

About import

Can you tell me which version of Python you used?
I'm trying to run this code with Python 3.5. I've installed the latest version of pandas in my Python environment, but it always reports the error "ImportError: No module named 'pandas.compat'".

predict word is out of vocabulary

First of all, thank you for your great work!
When I run:
python3 main_train.py -lr=0.001 -layer=1 -hdd=12 -dr=0.0 -dec=Mem2Seq -bsz=2 -ds=babi -t=1
I get this error:
Traceback (most recent call last):
File "Mem2Seq/main_train.py", line 70, in
acc = model.evaluate(dev,avg_best, BLEU)
File "Mem2Seq\models\Mem2Seq.py", line 272, in evaluate
data_dev[2],data_dev[3],data_dev[4],data_dev[5],data_dev[6], data_dev[-4], data_dev[-3])
File "Mem2Seq\models\Mem2Seq.py", line 223, in evaluate_batch
temp.append(self.lang.index2word[ind])
KeyError: tensor(21, device='cuda:0')

I guess OOV may be the reason, so I ran a test. I printed 'ind' and the index2word dictionary, and found that 'ind' is 93 while the max key in index2word is 92 (rome).
I'm wondering how this can happen when the word 'UNK' is in the dictionary, and how I should handle it. Should I just ignore it and set it to 'UNK', or do something else?

Thank you again~
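
One thing worth checking, independent of the OOV question: the KeyError above reports a tensor as the key (tensor(21, device='cuda:0')), so the dictionary is being indexed with a tensor object rather than a Python int. In PyTorch 0.4 and later, a 0-dim tensor does not match an int key in a dict, so converting with .item() first is a common remedy. The sketch below is only an illustration under that assumption, not the authors' fix: index_to_word and FakeScalar are hypothetical names, and FakeScalar stands in for a 0-dim tensor so the example runs without PyTorch. Falling back to 'UNK' for an index outside the vocabulary is one option, as suggested in the question; whether it is appropriate for this model is not confirmed here.

```python
def index_to_word(index2word, ind, unk="UNK"):
    """Look up a word by index, accepting plain ints or tensor-like scalars."""
    # A 0-dim tensor exposes .item(); a plain Python int does not.
    key = ind.item() if hasattr(ind, "item") else int(ind)
    # Fall back to the unknown token when the index is outside the vocabulary.
    return index2word.get(key, unk)


class FakeScalar:
    """Hypothetical stand-in for a 0-dim tensor (assumption, not a PyTorch class)."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


# Toy vocabulary for illustration only.
index2word = {0: "UNK", 92: "rome"}

print(index_to_word(index2word, FakeScalar(92)))  # in-vocabulary, tensor-like key
print(index_to_word(index2word, 93))              # out-of-vocabulary index
```

In evaluate_batch, the equivalent change would be to convert 'ind' to an int before the self.lang.index2word lookup.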

a problem with main_test.py

After training and saving the model, I tried to use it for testing, but I get "IndexError: list index out of range". What is wrong?
I just ran: python3 main_test.py -dec=Mem2Seq -path=save\mem2seq-BABI -bsz=8 -ds=babi
Thank you very much!

Question about the predicted reply from trained model

Hello. I really like your work on this dialogue system. I recently tried to train a Mem2Seq model, and I got the predicted replies shown below, which do not look as good as the examples in your article. I just want to check whether I have made a mistake. I used the recommended hyperparameters from the article and simply printed the outputs when evaluating the model. Thank you.

Here are all predicted replies from the first 5 dialogues of the KVR test set:
what city do you want the temperature for
in alhambra temperature on friday low of 20f and high of 30f on friday
the temperature in alhambra will be low and high and of on and on and on and on and on and on and on and on on and on on and on on and on on and on on on and on sunday and
you re welcome
the weather 48 hours in fresno will be a low temperature of of of the tomorrow
you re welcome
there is heavy_traffic on our way but you should be able to reach there in a few minutes
i don t see your friend s home
chevron is moderate_traffic on the route to
chevron is located at 783_arcadia_pl
you re welcome
the closest grocery_store is sigona_farmers_market 4_miles away
sigona_farmers_market is the closest at
you re welcome
what time and time should i to to to to to take your medicine
okay i will to take to you to to take your medicine at 7pm

Can I run this code on pytorch 0.4 now?

question about function get_state?

The function get_state() returns Variable(torch.zeros(bsz, self.embedding_dim)).cuda(), which is then used in the forward function:
prob = self.softmax(torch.sum(m_A*u_temp, 2))
Isn't the result zero?
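
A note on the arithmetic: with an all-zero query, the scores inside the softmax are indeed all zero, but the softmax of an all-zero vector is a uniform distribution, not zero. The small sketch below shows this in plain Python; the names memory and query are illustrative and the numbers are made up, so it only demonstrates the softmax behaviour, not the repository's exact computation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three memory slots with embedding size 2 (made-up values).
memory = [[0.5, -1.0], [2.0, 0.3], [-0.7, 1.1]]
# Initial query of zeros, as returned by a zero-initialised state.
query = [0.0, 0.0]

# Dot product of each memory slot with the query: every score is zero.
scores = [sum(m * q for m, q in zip(slot, query)) for slot in memory]
prob = softmax(scores)

print(scores)
print(prob)  # each entry is 1/3: uniform attention, not zero
```

So on the first hop every memory slot would receive equal weight, and the weighted sum read from memory is generally non-zero.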

when i run PTRUNK, i got an error

Hello, when I try to run
'python3 main_train.py -lr=0.001 -layer=1 -hdd=128 -dr=0.2 -dec=PTRUNK -bsz=8 -ds=babi -t=1'
a TypeError is raised in models/enc_PTRUNK.py, line 347:
cannot assign 'torch.autograd.variable.Variable' as parameter 'v' (torch.nn.Parameter or None expected)
By the way, I use PyTorch 0.3. Can you help me, please?