jasonwu0731 / glmp

PyTorch code for ICLR 2019 paper: Global-to-local Memory Pointer Networks for Task-Oriented Dialogue https://arxiv.org/pdf/1901.04713

Python 93.83% Perl 6.17%


glmp's Issues

How to set the hyper-parameters

Hi,
How did you set the hyper-parameters to get the results in the paper? I tried setting the hidden size to 128 and the hop count K to 3, but the results are far from those in the paper. Below is the result of my run:
ACC SCORE: 0.1227
F1 SCORE: 0.5788
CAL F1: 0.7344
WET F1: 0.5486
NAV F1: 0.5058

I also tried a hidden size of 256, but the result is even worse than with a hidden size of 128. How do you set the hyper-parameters?

API calls

Hi, first of all - thank you for your great work and for sharing the code!

Would you please help me with the following question: according to the task formulation for the bAbI dataset, the system should make API calls to retrieve information, for instance about restaurants, correct? If I understood correctly, GLMP is also able to handle this, but I failed to find the place in the code where it actually happens.
Could you tell me whether GLMP makes API calls and, if so, where in the code this happens? One more question: do you think it is possible to replace these API calls with calls to ElasticSearch?

I would be very grateful for your answers!

Many thanks in advance!
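For context on the mechanics being asked about: in the bAbI dialogue data, an API call such as `api_call italian paris four moderate` is just another system utterance that the model learns to generate token by token, so no live API is executed anywhere in the training or evaluation code. Below is a minimal, hypothetical sketch of how a generated `api_call` string could be intercepted and routed to an external backend such as ElasticSearch; the function name and routing logic are assumptions, not part of this repo.

```python
import re

# In bAbI task-oriented dialogues, an API call is emitted as a plain
# system utterance, e.g. "api_call italian paris four moderate".
API_CALL_RE = re.compile(r"^api_call\s+(.*)$")

def route_response(generated: str):
    """If the decoded system response is an api_call, return its
    arguments so a caller can query a real backend (e.g. ElasticSearch);
    otherwise return None and treat the string as a normal reply."""
    m = API_CALL_RE.match(generated.strip())
    if m is None:
        return None
    # e.g. ['italian', 'paris', 'four', 'moderate']
    return m.group(1).split()
```

In a deployment, the returned argument list would become the query sent to the real search backend; the model itself only emits the text.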

Read data script

Hi,
First of all, great work.

I notice that in mem2seq you guys provided the read data script but in GLMP you didn't.

Can you provide the read_data script for kvr and babi datasets?

Thanks.
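Until an official script is shared, a minimal sketch of a reader for the bAbI dialogue format (each line is `<turn id> <user utterance>\t<system utterance>`, with dialogues separated by blank lines) might look like the following; the function name is hypothetical, and KB triple lines (which have no tab) are skipped for brevity.

```python
def read_babi_dialogs(path):
    """Parse a bAbI dialogue file into a list of dialogues, each a
    list of (user, system) turn pairs. Lines without a tab (KB
    triples) are skipped in this simplified sketch."""
    dialogs, turns = [], []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:                 # blank line ends a dialogue
                if turns:
                    dialogs.append(turns)
                    turns = []
                continue
            nid, _, rest = line.partition(" ")
            if "\t" in rest:             # "user utterance\tsystem utterance"
                user, system = rest.split("\t", 1)
                turns.append((user, system))
    if turns:                            # file may not end with a blank line
        dialogs.append(turns)
    return dialogs
```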

Will it work with real data?

Hi, first of all thanks for publishing your work, it's amazing. This is not a bug, but a question:
Do you think this kind of model is ready to work with real data? Say I have real transcriptions for an airline company, where the agent and the client talk with a goal (buying tickets, checking in, etc.). Would the model be able to produce the agent's sentences correctly, or is that still too difficult?
Do you think any end-to-end model is ready to do this at the moment? Or is the only way to label intents and entities as in Dialogflow, etc.?

Thanks for your opinion!

Hard to reproduce the results reported in this paper

Hi, I ran this code with "python3 myTrain.py -lr=0.001 -l=1 -hdd=128 -dr=0.2 -dec=GLMP -bsz=8 -ds=kvr" or "python3 myTrain.py -lr=0.001 -l=3 -hdd=128 -dr=0.2 -dec=GLMP -bsz=8 -ds=kvr", but the F1 score reaches only 56-58. I don't know why that happens; can you help me out?

Training for each task

Hello, thank you for sharing your code.

I have two questions, please help me answer them:

  1. Do you train a new model for each task, or do you take a model trained on a previous task and fine-tune it on another task?

  2. To build a full conversation chat, I should use the model trained on task 5, right?

Thank you for your answer.

How to report the results in your paper

When I run your code, I sometimes get better results (e.g. BLEU SCORE: 15.45; F1 SCORE: 0.6038309388456686; though not all F1 scores are better) than the reported scores in your paper.
Do you run the code several times and then average those results, or just select the best test results according to your best validation scores? I ask because you do not fix a random seed in your code.
Thanks for your code release!
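Related to the variance question: since no seed is fixed in the repo, some run-to-run difference is expected. A minimal sketch of seeding the relevant RNGs before training is shown below; the numpy/torch calls are guarded so the snippet also runs where those packages are not installed, and the function name is an assumption.

```python
import random

def set_seed(seed: int = 42):
    """Fix the RNG seeds that affect training for reproducibility.
    The numpy/torch calls are optional so this sketch is self-contained."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # no-op without CUDA
    except ImportError:
        pass
```

Note that even with fixed seeds, some CUDA kernels are nondeterministic, so exact bit-for-bit reproduction is not guaranteed; averaging over several runs is the more robust way to report scores.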

error in models/GLMP.py

In models/GLMP.py, F1_score is used without being defined.
Also, I am confused about data/KVR/train.txt, dev.txt, and test.txt: where do these files come from, and why are they processed in that format?
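For reference, the entity metric reported on KVR in this line of work is typically a micro-average: count how many gold entities appear in the predicted response and divide by the total number of gold entities. A simplified sketch of that computation (not the repo's exact implementation, and the function name is hypothetical):

```python
def micro_entity_score(responses):
    """Micro-averaged entity score over (gold_entities, pred_tokens)
    pairs: hits / total gold entities across all responses.
    Simplified sketch of the KVR-style entity metric."""
    hits, total = 0, 0
    for gold_entities, pred_tokens in responses:
        for entity in gold_entities:
            total += 1
            if entity in pred_tokens:
                hits += 1
    # Guard the division: responses with no gold entities contribute
    # nothing to the denominator.
    return hits / total if total else 0.0
```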

ZeroDivisionError: float division by zero when evaluating on BABI

Hi, thanks for posting the code for the paper.

I ran into an issue when training on bAbI. When running the following command
python myTrain.py -lr=0.001 -l=1 -hdd=128 -dr=0.2 -dec=GLMP -bsz=8 -ds=babi -t=1

It throws the following error:

L:0.58,LE:0.09,LG:0.38,LP:0.10: 100% 753/753 [01:34<00:00,  7.82it/s]
STARTING EVALUATION
100% 61/61 [00:13<00:00,  4.57it/s]
Traceback (most recent call last):
  File "myTrain.py", line 49, in <module>
    acc = model.evaluate(dev, avg_best, early_stop)
  File "/content/GLMP/models/GLMP.py", line 261, in evaluate
    F1_score = F1_pred / float(F1_count)
ZeroDivisionError: float division by zero

Just glancing at the code, it seems that the F1 score is initialized to zero and is only updated for the KVR dataset.
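A minimal guard along those lines (a hypothetical patch sketch, not the author's fix) would report entity F1 only when entity counts were actually accumulated, which happens only for the KVR dataset:

```python
def safe_f1(F1_pred, F1_count):
    """Return entity F1 only when entities were counted (i.e. on KVR).
    On bAbI, F1_count stays 0, so the caller should fall back to
    per-response accuracy instead of dividing by zero."""
    if F1_count == 0:
        return None
    return F1_pred / float(F1_count)
```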
