I am trying to reproduce your experiments by running the first command in the README. My PyTorch version is 0.3, as shown below. I am evaluating after every epoch instead of only after the first epoch. As the bottom of the log shows, the model accuracy (the F1 scores) is still close to 0 after 8 epochs, and the BLEU score is only ~4.5.
$ python -c 'import torch; print(torch.__version__)'
0.3.0.post4
$ python main_train.py -lr=0.001 -layer=1 -hdd=12 -dr=0.0 -dec=Mem2Seq -bsz=2 -ds=kvr -t= -evalp=1
{'dataset': 'kvr', 'task': '', 'decoder': 'Mem2Seq', 'hidden': '12', 'batch': '2', 'learn': '0.001', 'drop': '0.0', 'unk_mask': 1, 'layer': '1', 'limit': -10000, 'path': None, 'test': None, 'sample': None, 'useKB': 1, 'entPtr': 0, 'evalp': '1', 'addName': ''}
08-10 12:47 Reading lines from data/KVR/train.txt
08-10 12:47 Pointer percentace= 0.4208753595747005
08-10 12:47 Max responce Len: 80
08-10 12:47 Max Input Len: 249
08-10 12:47 Avg. User Utterances: 2.593814432989691
08-10 12:47 Avg. Bot Utterances: 2.593814432989691
08-10 12:47 Avg. KB results: 64.69896907216494
08-10 12:47 Avg. responce Len: 8.732273449920509
Sample: [['dish_parking', 'poi', 'parking_garage', 'road_block_nearby', '2_miles'], ['2_miles', 'distance', 'dish_parking', 'PAD', 'PAD'], ['road_block_nearby', 'traffic_info', 'dish_parking', 'PAD', 'PAD'], ['parking_garage', 'poi_type', 'dish_parking', 'PAD', 'PAD'], ['550_alester_ave', 'address', 'dish_parking', 'PAD', 'PAD'], ['stanford_oval_parking', 'poi', 'parking_garage', 'no_traffic', '6_miles'], ['6_miles', 'distance', 'stanford_oval_parking', 'PAD', 'PAD'], ['no_traffic', 'traffic_info', 'stanford_oval_parking', 'PAD', 'PAD'], ['parking_garage', 'poi_type', 'stanford_oval_parking', 'PAD', 'PAD'], ['610_amarillo_ave', 'address', 'stanford_oval_parking', 'PAD', 'PAD'], ['willows_market', 'poi', 'grocery_store', 'car_collision_nearby', '4_miles'], ['4_miles', 'distance', 'willows_market', 'PAD', 'PAD'], ['car_collision_nearby', 'traffic_info', 'willows_market', 'PAD', 'PAD'], ['grocery_store', 'poi_type', 'willows_market', 'PAD', 'PAD'], ['409_bollard_st', 'address', 'willows_market', 'PAD', 'PAD'], ['the_westin', 'poi', 'rest_stop', 'moderate_traffic', '2_miles'], ['2_miles', 'distance', 'the_westin', 'PAD', 'PAD'], ['moderate_traffic', 'traffic_info', 'the_westin', 'PAD', 'PAD'], ['rest_stop', 'poi_type', 'the_westin', 'PAD', 'PAD'], ['329_el_camino_real', 'address', 'the_westin', 'PAD', 'PAD'], ['toms_house', 'poi', 'friends_house', 'heavy_traffic', '1_miles'], ['1_miles', 'distance', 'toms_house', 'PAD', 'PAD'], ['heavy_traffic', 'traffic_info', 'toms_house', 'PAD', 'PAD'], ['friends_house', 'poi_type', 'toms_house', 'PAD', 'PAD'], ['580_van_ness_ave', 'address', 'toms_house', 'PAD', 'PAD'], ['pizza_chicago', 'poi', 'pizza_restaurant', 'heavy_traffic', '4_miles'], ['4_miles', 'distance', 'pizza_chicago', 'PAD', 'PAD'], ['heavy_traffic', 'traffic_info', 'pizza_chicago', 'PAD', 'PAD'], ['pizza_restaurant', 'poi_type', 'pizza_chicago', 'PAD', 'PAD'], ['915_arbol_dr', 'address', 'pizza_chicago', 'PAD', 'PAD'], ['valero', 'poi', 'gas_station', 
'car_collision_nearby', '6_miles'], ['6_miles', 'distance', 'valero', 'PAD', 'PAD'], ['car_collision_nearby', 'traffic_info', 'valero', 'PAD', 'PAD'], ['gas_station', 'poi_type', 'valero', 'PAD', 'PAD'], ['200_alester_ave', 'address', 'valero', 'PAD', 'PAD'], ['mandarin_roots', 'poi', 'chinese_restaurant', 'no_traffic', '2_miles'], ['2_miles', 'distance', 'mandarin_roots', 'PAD', 'PAD'], ['no_traffic', 'traffic_info', 'mandarin_roots', 'PAD', 'PAD'], ['chinese_restaurant', 'poi_type', 'mandarin_roots', 'PAD', 'PAD'], ['271_springer_street', 'address', 'mandarin_roots', 'PAD', 'PAD'], ['where', '$u', 't1', 'PAD', 'PAD'], ['s', '$u', 't1', 'PAD', 'PAD'], ['the', '$u', 't1', 'PAD', 'PAD'], ['nearest', '$u', 't1', 'PAD', 'PAD'], ['parking_garage', '$u', 't1', 'PAD', 'PAD'], ['the', '$s', 't1', 'PAD', 'PAD'], ['nearest', '$s', 't1', 'PAD', 'PAD'], ['parking_garage', '$s', 't1', 'PAD', 'PAD'], ['is', '$s', 't1', 'PAD', 'PAD'], ['dish_parking', '$s', 't1', 'PAD', 'PAD'], ['at', '$s', 't1', 'PAD', 'PAD'], ['550_alester_ave', '$s', 't1', 'PAD', 'PAD'], ['would', '$s', 't1', 'PAD', 'PAD'], ['you', '$s', 't1', 'PAD', 'PAD'], ['like', '$s', 't1', 'PAD', 'PAD'], ['directions', '$s', 't1', 'PAD', 'PAD'], ['there', '$s', 't1', 'PAD', 'PAD'], ['yes', '$u', 't2', 'PAD', 'PAD'], ['please', '$u', 't2', 'PAD', 'PAD'], ['set', '$u', 't2', 'PAD', 'PAD'], ['directions', '$u', 't2', 'PAD', 'PAD'], ['via', '$u', 't2', 'PAD', 'PAD'], ['a', '$u', 't2', 'PAD', 'PAD'], ['route', '$u', 't2', 'PAD', 'PAD'], ['that', '$u', 't2', 'PAD', 'PAD'], ['avoids', '$u', 't2', 'PAD', 'PAD'], ['all', '$u', 't2', 'PAD', 'PAD'], ['heavy_traffic', '$u', 't2', 'PAD', 'PAD'], ['if', '$u', 't2', 'PAD', 'PAD'], ['possible', '$u', 't2', 'PAD', 'PAD'], ['$$$$', '$$$$', '$$$$', '$$$$', '$$$$']] it looks like there is a road block being reported on the route but i will still find the quickest route to 550_alester_ave [70, 70, 54, 56, 48, 62, 70, 70, 70, 70, 70, 45, 63, 70, 70, 70, 70, 70, 45, 70, 63, 70, 51] [0, 0, 1, 
1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1] ['550_alester_ave']
08-10 12:47 Reading lines from data/KVR/dev.txt
08-10 12:47 Pointer percentace= 0.4167286798630749
08-10 12:47 Max responce Len: 87
08-10 12:47 Max Input Len: 264
08-10 12:47 Avg. User Utterances: 2.5728476821192054
08-10 12:47 Avg. Bot Utterances: 2.5728476821192054
08-10 12:47 Avg. KB results: 63.847682119205295
08-10 12:47 Avg. responce Len: 8.647361647361647
Sample: [['make', '$u', 't1', 'PAD', 'PAD'], ['an', '$u', 't1', 'PAD', 'PAD'], ['appointment', '$u', 't1', 'PAD', 'PAD'], ['to', '$u', 't1', 'PAD', 'PAD'], ['reserve', '$u', 't1', 'PAD', 'PAD'], ['conference_room_100', '$u', 't1', 'PAD', 'PAD'], ['later', '$u', 't1', 'PAD', 'PAD'], ['this', '$u', 't1', 'PAD', 'PAD'], ['week', '$u', 't1', 'PAD', 'PAD'], ['for', '$u', 't1', 'PAD', 'PAD'], ['a', '$u', 't1', 'PAD', 'PAD'], ['meeting', '$u', 't1', 'PAD', 'PAD'], ['what', '$s', 't1', 'PAD', 'PAD'], ['day', '$s', 't1', 'PAD', 'PAD'], ['and', '$s', 't1', 'PAD', 'PAD'], ['time', '$s', 't1', 'PAD', 'PAD'], ['should', '$s', 't1', 'PAD', 'PAD'], ['i', '$s', 't1', 'PAD', 'PAD'], ['set', '$s', 't1', 'PAD', 'PAD'], ['an', '$s', 't1', 'PAD', 'PAD'], ['appointment', '$s', 't1', 'PAD', 'PAD'], ['to', '$s', 't1', 'PAD', 'PAD'], ['reserve', '$s', 't1', 'PAD', 'PAD'], ['the', '$s', 't1', 'PAD', 'PAD'], ['conference', '$s', 't1', 'PAD', 'PAD'], ['room', '$s', 't1', 'PAD', 'PAD'], ['monday', '$u', 't2', 'PAD', 'PAD'], ['at', '$u', 't2', 'PAD', 'PAD'], ['3pm', '$u', 't2', 'PAD', 'PAD'], ['$$$$', '$$$$', '$$$$', '$$$$', '$$$$']] i have made an appointment for monday at 3pm for the meeting [17, 29, 29, 19, 20, 9, 26, 27, 28, 9, 23, 11] [1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1] ['meeting', 'monday', '3pm']
08-10 12:47 Reading lines from data/KVR/test.txt
08-10 12:47 Pointer percentace= 0.4224432239869378
08-10 12:47 Max responce Len: 36
08-10 12:47 Max Input Len: 228
08-10 12:47 Avg. User Utterances: 2.6546052631578947
08-10 12:47 Avg. Bot Utterances: 2.6546052631578947
08-10 12:47 Avg. KB results: 64.84539473684211
08-10 12:47 Avg. responce Len: 8.34820322180917
Sample: [['remind', '$u', 't1', 'PAD', 'PAD'], ['me', '$u', 't1', 'PAD', 'PAD'], ['to', '$u', 't1', 'PAD', 'PAD'], ['take', '$u', 't1', 'PAD', 'PAD'], ['my', '$u', 't1', 'PAD', 'PAD'], ['pills', '$u', 't1', 'PAD', 'PAD'], ['what', '$s', 't1', 'PAD', 'PAD'], ['time', '$s', 't1', 'PAD', 'PAD'], ['do', '$s', 't1', 'PAD', 'PAD'], ['you', '$s', 't1', 'PAD', 'PAD'], ['need', '$s', 't1', 'PAD', 'PAD'], ['to', '$s', 't1', 'PAD', 'PAD'], ['take', '$s', 't1', 'PAD', 'PAD'], ['your', '$s', 't1', 'PAD', 'PAD'], ['pills', '$s', 't1', 'PAD', 'PAD'], ['i', '$u', 't2', 'PAD', 'PAD'], ['need', '$u', 't2', 'PAD', 'PAD'], ['to', '$u', 't2', 'PAD', 'PAD'], ['take', '$u', 't2', 'PAD', 'PAD'], ['my', '$u', 't2', 'PAD', 'PAD'], ['pills', '$u', 't2', 'PAD', 'PAD'], ['at', '$u', 't2', 'PAD', 'PAD'], ['7pm', '$u', 't2', 'PAD', 'PAD'], ['$$$$', '$$$$', '$$$$', '$$$$', '$$$$']] ok setting your medicine appointment for 7pm [23, 23, 13, 23, 23, 23, 22] [0, 0, 1, 0, 0, 0, 1] ['7pm']
08-10 12:47 Read 6290 sentence pairs train
08-10 12:47 Read 777 sentence pairs dev
08-10 12:47 Read 807 sentence pairs test
08-10 12:47 Max len Input 265
08-10 12:47 Vocab_size 1554
08-10 12:47 USE_CUDA=False
08-10 12:47 Epoch:0
L:6.63, VL:4.80, PL:1.83: 100%|███████████████████████████| 3145/3145 [00:43<00:00, 72.70it/s]
08-10 12:48 STARTING EVALUATION
R:0.0746,W:77.2260: 100%|███████████████████████████████████| 389/389 [00:17<00:00, 21.75it/s]
08-10 12:48 F1 SCORE: 0.0
08-10 12:48 F1 CAL: 0.0
08-10 12:48 F1 WET: 0.0
08-10 12:48 F1 NAV: 0.0
08-10 12:48 BLEU SCORE:0.0
08-10 12:48 MODEL SAVED
08-10 12:48 Epoch:1
L:5.81, VL:4.15, PL:1.66: 100%|███████████████████████████| 3145/3145 [00:45<00:00, 68.87it/s]
08-10 12:49 STARTING EVALUATION
R:0.0874,W:76.1113: 100%|███████████████████████████████████| 389/389 [00:20<00:00, 19.21it/s]
08-10 12:49 F1 SCORE: 0.00974817221770918
08-10 12:49 F1 CAL: 0.0
08-10 12:49 F1 WET: 0.017167381974248927
08-10 12:49 F1 NAV: 0.009111617312072893
08-10 12:49 BLEU SCORE:0.0
08-10 12:49 MODEL SAVED
08-10 12:49 Epoch:2
L:5.47, VL:3.85, PL:1.62: 100%|███████████████████████████| 3145/3145 [00:47<00:00, 66.57it/s]
08-10 12:50 STARTING EVALUATION
R:0.0900,W:72.6340: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:23<00:00, 16.66it/s]
08-10 12:51 F1 SCORE: 0.01380991064175467
08-10 12:51 F1 CAL: 0.02147239263803681
08-10 12:51 F1 WET: 0.02145922746781116
08-10 12:51 F1 NAV: 0.0
08-10 12:51 BLEU SCORE:0.0
08-10 12:51 MODEL SAVED
Epoch 2: reducing learning rate of group 0 to 5.0000e-04.
08-10 12:51 Epoch:3
L:5.26, VL:3.68, PL:1.58: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:44<00:00, 30.20it/s]
08-10 12:52 STARTING EVALUATION
R:0.0797,W:72.2230: 100%|████████████████████████████████████| 389/389 [00:20<00:00, 18.97it/s]
08-10 12:53 F1 SCORE: 0.01299756295694557
08-10 12:53 F1 CAL: 0.015337423312883437
08-10 12:53 F1 WET: 0.023605150214592276
08-10 12:53 F1 NAV: 0.0
08-10 12:53 BLEU SCORE:1.7
08-10 12:53 MODEL SAVED
08-10 12:53 Epoch:4
L:5.16, VL:3.60, PL:1.56: 100%|████████████████████████████| 3145/3145 [00:54<00:00, 58.01it/s]
08-10 12:54 STARTING EVALUATION
R:0.0874,W:72.3398: 100%|████████████████████████████████████| 389/389 [00:21<00:00, 17.83it/s]
08-10 12:54 F1 SCORE: 0.014622258326563772
08-10 12:54 F1 CAL: 0.05214723926380368
08-10 12:54 F1 WET: 0.002145922746781116
08-10 12:54 F1 NAV: 0.0
08-10 12:54 BLEU SCORE:2.31
08-10 12:54 MODEL SAVED
08-10 12:54 Epoch:5
L:5.09, VL:3.54, PL:1.54: 100%|████████████████████████████| 3145/3145 [00:45<00:00, 69.83it/s]
08-10 12:55 STARTING EVALUATION
R:0.0925,W:70.8605: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:20<00:00, 19.22it/s]
08-10 12:55 F1 SCORE: 0.01949634443541836
08-10 12:55 F1 CAL: 0.049079754601227
08-10 12:55 F1 WET: 0.015021459227467811
08-10 12:55 F1 NAV: 0.002277904328018223
08-10 12:55 BLEU SCORE:0.0
08-10 12:55 Epoch:6
L:5.02, VL:3.49, PL:1.53: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:33<00:00, 33.74it/s]
08-10 12:57 STARTING EVALUATION
R:0.0771,W:72.1792: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:20<00:00, 19.14it/s]
08-10 12:57 F1 SCORE: 0.008123476848090982
08-10 12:57 F1 CAL: 0.027607361963190184
08-10 12:57 F1 WET: 0.002145922746781116
08-10 12:57 F1 NAV: 0.0
08-10 12:57 BLEU SCORE:2.62
08-10 12:57 MODEL SAVED
08-10 12:57 Epoch:7
L:4.98, VL:3.46, PL:1.52: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:15<00:00, 41.40it/s]
08-10 12:58 STARTING EVALUATION
R:0.0900,W:72.8743: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:24<00:00, 15.91it/s]
08-10 12:59 F1 SCORE: 0.006498781478472785
08-10 12:59 F1 CAL: 0.018404907975460124
08-10 12:59 F1 WET: 0.0
08-10 12:59 F1 NAV: 0.004555808656036446
08-10 12:59 BLEU SCORE:2.22
08-10 12:59 Epoch:8
L:4.94, VL:3.42, PL:1.52: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3145/3145 [01:27<00:00, 35.80it/s]
08-10 13:00 STARTING EVALUATION
R:0.0887,W:71.7623: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 389/389 [00:23<00:00, 16.73it/s]
08-10 13:01 F1 SCORE: 0.016246953696181964
08-10 13:01 F1 CAL: 0.05214723926380368
08-10 13:01 F1 WET: 0.006437768240343348
08-10 13:01 F1 NAV: 0.0
08-10 13:01 BLEU SCORE:4.49
08-10 13:01 MODEL SAVED
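
To make the trend easier to compare against other runs, the per-epoch metrics in the log above can be collected with a short script. This is only a rough sketch: the function name `summarize` and the regular expressions are my own assumptions, written against the log lines pasted here, and are not part of the repository.

```python
import re

# Assumed line shapes, copied from the log above:
#   "08-10 12:59 Epoch:8"
#   "08-10 13:01 F1 SCORE: 0.016246953696181964"
#   "08-10 13:01 BLEU SCORE:4.49"
# The scheduler message "Epoch 2: reducing learning rate ..." has a space
# after "Epoch", so it is intentionally not matched as an epoch header.
EPOCH_RE = re.compile(r"Epoch:(\d+)")
METRIC_RE = re.compile(r"(F1 SCORE|F1 CAL|F1 WET|F1 NAV|BLEU SCORE):\s*([0-9.]+)")


def summarize(log_text):
    """Return {epoch: {metric_name: value}} for every epoch found in the log."""
    results = {}
    epoch = None
    for line in log_text.splitlines():
        match = EPOCH_RE.search(line)
        if match:
            epoch = int(match.group(1))
            results[epoch] = {}
            continue
        match = METRIC_RE.search(line)
        if match and epoch is not None:
            results[epoch][match.group(1)] = float(match.group(2))
    return results


# Small inline sample taken from the last epoch of the log above.
sample = """08-10 12:59 Epoch:8
08-10 13:01 F1 SCORE: 0.016246953696181964
08-10 13:01 BLEU SCORE:4.49
"""

for epoch, metrics in sorted(summarize(sample).items()):
    print(epoch, metrics)
```

Passing the full log text to `summarize` instead of `sample` should give one entry per epoch, which makes it easier to see that F1 stays below 0.02 throughout while BLEU moves between 0 and 4.49.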