budzianowski / multiwoz

Source code for end-to-end dialogue model from the MultiWOZ paper (Budzianowski et al. 2018, EMNLP)

License: MIT License

Python 100.00%
machine-learning dialogue-systems dialogues dialogue seq2seq dialogue-manager dialogue-library natural-language-processing

multiwoz's People

Contributors

amiralikaboli, andy194673, bepoetree, budzianowski, gusalsdmlwlq, hpsun1109, hwaranlee, hwwancient, hydercps, jasonyux, jh-debug, jianguoz, leeshiyang, nflubis, pawel-polyai, qywu, radi-cho, sachin-r, shanetian, taoyds, tianjianh, tomiinek, tonynemo, vaishnavmenon, wise-east, xiaoxuezang, yinpeidai, yushi-hu, yxuansu, zlinao

multiwoz's Issues

[ERROR] in MultiWOZ 2.1

Some conversations have no dialog_act annotation in MultiWOZ 2.1:

  • PMUL4707.json
  • PMUL2245.json
  • PMUL4776.json
  • PMUL3872.json
  • PMUL4859.json

Is the labeling of this data not yet complete?
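Gaps like this can be listed automatically. A sketch, assuming the MultiWOZ 2.1 data.json layout where each dialogue's log alternates user/system turns and annotated system turns carry a dialog_act field (find_unannotated is an illustrative helper, not code from this repository):

```python
import json

def find_unannotated(data):
    """Return ids of dialogues whose system turns lack a dialog_act field."""
    missing = []
    for dial_id, dial in data.items():
        # System turns are the odd-indexed entries of the log.
        system_turns = dial["log"][1::2]
        if any("dialog_act" not in turn for turn in system_turns):
            missing.append(dial_id)
    return missing

# Usage (against the real file):
# data = json.load(open("data.json"))
# print(find_unannotated(data))
```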

Hi, thanks a lot for sharing the data!
I found that some slot labels in MultiWOZ 2.2 are incomplete, as in the following example.

In PMUL0698.json, turn 6, the user says "I am leaving from Cambridge and going to Norwich." while booking a train. This turn should have train-departure and train-destination slots, but there are no labels in the slots field. Is the labeling of this data not yet complete?

The following is the relevant part of MultiWOZ2.2/dev/dialogues_001.json:

{
        "frames": [
          {
            "actions": [],
            "service": "restaurant",
            "slots": [],
            "state": {
              "active_intent": "NONE",
              "requested_slots": [],
              "slot_values": {
                "restaurant-area": [
                  "centre"
                ],
                "restaurant-food": [
                  "chinese"
                ]
              }
            }
          },
          {
            "actions": [],
            "service": "train",
            "slots": [],
            "state": {
              "active_intent": "find_train",
              "requested_slots": [],
              "slot_values": {
                "train-day": [
                  "sunday"
                ],
                "train-departure": [
                  "cambridge"
                ],
                "train-destination": [
                  "norwich"                       # in slot-values but not in slots, and no slots fields contains it before this turn.
                ],
                "train-leaveat": [
                  "16:15"
                ]
              }
            }
          },
          {
            "actions": [],
            "service": "taxi",
            "slots": [],
            "state": {
              "active_intent": "NONE",
              "requested_slots": [],
              "slot_values": {}
            }
          },
          {
            "actions": [],
            "service": "bus",
            "slots": [],
            "state": {
              "active_intent": "NONE",
              "requested_slots": [],
              "slot_values": {}
            }
          },
          {
            "actions": [],
            "service": "police",
            "slots": [],
            "state": {
              "active_intent": "NONE",
              "requested_slots": [],
              "slot_values": {}
            }
          },
          {
            "actions": [],
            "service": "hotel",
            "slots": [],
            "state": {
              "active_intent": "NONE",
              "requested_slots": [],
              "slot_values": {}
            }
          },
          {
            "actions": [],
            "service": "attraction",
            "slots": [],
            "state": {
              "active_intent": "NONE",
              "requested_slots": [],
              "slot_values": {}
            }
          },
          {
            "actions": [],
            "service": "hospital",
            "slots": [],
            "state": {
              "active_intent": "NONE",
              "requested_slots": [],
              "slot_values": {}
            }
          }
        ],
        "speaker": "USER",
        "turn_id": "6",
        "utterance": "I am leaving from Cambridge and going to Norwich."
      }
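A quick consistency check over the 2.2 format shown above can surface such cases (a sketch; unspanned_slots is my own illustration, and note that categorical slots such as train-day legitimately never carry spans, so the output still needs filtering against the schema):

```python
def unspanned_slots(dialogue):
    """Flag state slot_values entries that never receive a span
    annotation anywhere in the dialogue's "slots" fields."""
    spanned = set()
    for turn in dialogue["turns"]:
        for frame in turn["frames"]:
            for slot in frame.get("slots", []):
                spanned.add(slot["slot"])
    flagged = []
    for turn in dialogue["turns"]:
        for frame in turn["frames"]:
            state = frame.get("state", {})
            for name in state.get("slot_values", {}):
                if name not in spanned:
                    flagged.append((turn["turn_id"], name))
    return flagged
```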

Python 2 Deprecation

Hi!

Your preprocessing script create_delex_data.py is a good starting point for working with MultiWOZ, but it was implemented in Python 2, which is deprecated. For my research project I refactored this script and the code depending on it to be compatible with Python 3, and I think this could help others. May I push the changes and open a pull request?

Thanks!

2.2 action annotations missing?

The 2.2 dataset doesn't appear to have any system action annotations at all. The json format for it is convenient, but it isn't useful to me without the action annotations. Will they be added soon?

The upper bound of Inform and Success rate?

I ran evaluate.py and got Matches (Inform): 90.40, Success: 82.3. Are these the upper bounds of the Inform and Success metrics? In some papers the Inform and Success rates exceed 90.40 and 82.3; for example, DAMD under the data augmentation setting reports Inform 95.4 and Success 87.2. This confuses me a lot.

data

Hi,

May I ask whether the three MultiWOZ datasets in your data/ folder are the same as those downloadable from https://www.repository.cam.ac.uk/handle/1810/294507?
Thanks for your feedback.

I wanted to understand the storage structure of this dataset, so I ran create_delex_data.py and read all of the source code in that script, but I am still confused about the "db" and "bs" variables. Could you explain what these two variables represent? Thanks in advance!

Mirror of the dataset

Hi Paweł,

Do you know if there are mirrors of the dataset anywhere? The Cambridge website has been reporting that it is under maintenance for a couple of weeks now.

Cheers,
Stephen

dataset_url = "https://www.repository.cam.ac.uk/bitstream/handle/1810/280608/MULTIWOZ2.zip?sequence=3&isAllowed=y"

Restaurant name error in the dataset "MUL1382.json"

In the Multi-WOZ 2.2 "data.json", line 7419586, dialog_id : MUL1382.json:
"text": "We've narrowed it down to 3. kihinoor, the gandhi, and mahal of cambridge. Would you like me to make a reservation for you?"
"text": "Yes please make a reservation for 3 people at 16:00 on Saturday at any of those choices."
"text": "I was able to book at Kohinoor for 16:00 on Saturday for 3 people. Your reference number is NTJ52ASI. The table will be held for 15 minutes."

Actually, in the database for restaurant domain, there is no restaurant named "kihinoor" but there is one restaurant named "kohinoor". And based on the next two utterances, I believe the first restaurant name in the first utterance should be "kohinoor".
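Mismatches like this can be caught by checking annotated names against the database (a sketch; check_restaurant_names is my own illustration, assuming the dialogue_acts.json layout {dialogue_id: {turn_id: {act: [[slot, value], ...]}}}):

```python
def check_restaurant_names(dialog_acts, db_names):
    """Yield (dialogue_id, turn_id, value) for Restaurant-Inform
    Name values that do not appear in the restaurant database."""
    names = {n.lower() for n in db_names}
    for dial_id, turns in dialog_acts.items():
        for turn_id, acts in turns.items():
            if not isinstance(acts, dict):
                continue  # some turns are marked "No Annotation"
            for slot, value in acts.get("Restaurant-Inform", []):
                if slot == "Name" and value.lower() not in names:
                    yield dial_id, turn_id, value
```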

system action annotation in MultiWOZ2.2

First of all, thanks for the launch of the MultiWOZ 2.2 dataset. I really appreciate the contribution and the corrections.

I found 15 errors in the system act annotations of MultiWOZ 2.2; please find the details below. For each annotation error I show the dialogue_id, turn_id, span_info, and the corresponding system response. Hopefully it helps. Thanks a lot.

MUL0963.json
13
['Taxi-Inform', 'arriveby', '9:15', 19, 19]
Ok, a white audi will pick you up at cafe jello gallery and bring you to Ali baba by 19:15. You can contact the driver at 07646811518. Anything else?

MUL1382.json
3
['Restaurant-Inform', 'name', 'kihinoor', 29, 37]
We've narrowed it down to 3. kohinoor, the gandhi, and mahal of cambridge. Would you like me to make a reservation for you?

PMUL0363.json
9
['Restaurant-Inform', 'food', 'French', 35, 41]
Restaurant Restaurant Two Two is an expensive French restaurant in the north with wonderful food. Would you like to book a table?

PMUL0363.json
9
['Restaurant-Inform', 'area', 'north', 60, 65]
Restaurant Restaurant Two Two is an expensive French restaurant in the north with wonderful food. Would you like to book a table?

PMUL0363.json
9
['Restaurant-Inform', 'pricerange', 'expensive', 25, 34]
Restaurant Restaurant Two Two is an expensive French restaurant in the north with wonderful food. Would you like to book a table?

PMUL0363.json
9
['Restaurant-Inform', 'name', 'Two Two', 11, 18]
Restaurant Restaurant Two Two is an expensive French restaurant in the north with wonderful food. Would you like to book a table?

PMUL2368.json
11
['Booking-Book', 'ref', '9Z58HWE1,general-reqmore:', 10, 10]
I have you booked at Charlie Chan on Saturday at 20:00 for 5 people. Your reference number is 9Z58HWE1. They hold the table for 15 minutes. Is there anything else?

PMUL2584.json
11
['Taxi-Inform', 'leaveat', '19:00,general-reqmore:', 11, 11]
A grey skoda will pick you up at the hotel by 19:00 to take you to the Castle Galleries. Your contact number is 07375156908. Will there be anything else today? : 07375156908

PMUL3093.json
5
['Train-Inform', 'arriveby', '1:54', 3, 3]
TR8659 leaves at 10:09 and arrives at 11:54, will that work for you?

PMUL3382.json
11
['Train-Inform', 'leaveat', '11:50', 2, 2]
TR0767 leaves at11:50 on Friday morning, arriving 12:07. Price is 4.40 pounds. Would you like me to book a seat?

PMUL4077.json
13
['Taxi-Inform', 'arriveby', '5:15', 5, 5]
Ok you will arrive at 15:15 in a yellow skoda Contact number :07710839987

PMUL4115.json
5
['Train-Inform', 'leaveat', '19:39', 2, 2]
TR3197 leaves atb19:39 and costs 13:39 pounds. is that fine with you?

PMUL4385.json
3
['Train-Inform', 'leaveat', '9:29', 20, 20]
You have a few options available if you're traveling from bishops stortford to cambridge. There is a train leaving at 09:29, Does that work for you?

SNG01733.json
5
['Train-Inform', 'leaveat', '5:40', 6, 6]
Train TR7213 departing from cambridge at 05:40 and arriving at stansted airport at 06:08 will be the best option for you.

SNG1041.json
9
['Hotel-Inform', 'type', 'guesthouse,general-reqmore:', 10, 11]
I remind you that you can check in to this guesthouse after 3:00 pm. You can leave your suitcases anytime.
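Most of the errors listed above are detectable automatically by re-checking each span against the whitespace-tokenized response. A sketch (span_matches is my own illustration; start/end are assumed to be word indices, as in the span_info of data.json):

```python
def span_matches(utterance, value, start, end):
    """Check whether tokens start..end of the whitespace-tokenized
    utterance equal the annotated value (modulo punctuation and case)."""
    tokens = utterance.split()
    if end >= len(tokens):
        return False
    # Strip surrounding punctuation so "19:15." can still match "19:15".
    span = " ".join(t.strip(".,!?") for t in tokens[start:end + 1])
    return span.lower() == value.lower()
```

For instance, the annotated value '9:15' against the token '19:15.' fails this check, which is exactly the first error above.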

The annotation tool

I want to annotate similar data for a different language, so I want to build a web-based annotation tool. Could you share the code of your annotation tool, or any suggestions?
Thanks.

About requestable slots for success rate

Hi, I have a question about requestable slots for success rate.

requestables = ['phone', 'address', 'postcode', 'reference', 'id']

In evaluate.py, requestables includes just these 5 slots ('phone', 'address', 'postcode', 'reference', 'id').
But there are more requestable slots in the user goals, e.g. train-price, taxi-car type, attraction-entrance fee.
Why are only these 5 slots evaluated?

Inform and Success metrics

Hi,
I don't seem to understand the inform metric very well. What exactly do you mean by providing the right entity, and why is the inform rate not 100% even with the oracle belief state? Does this mean that dialogue state prediction systems must do better than the oracle in order to improve the inform rate?

on benchmarks

I noticed that some of the results listed in the Benchmarks differ from those claimed in the original articles. For example, the SimpleTOD article reports a Joint Accuracy of 56.45, while your list shows 55.72.
How do you get these results? Is there a script for everyone, or did you rerun their models and report the results you got?

MultiWOZ2.2

I just ran convert_to_multiwoz_format.py in MultiWOZ2.2 and got the following error:

File "convert_to_multiwoz_format.py", line 85, in main
clean_dialogue = clean_data[dialogue_id]
KeyError: 'SNG01862.json'

It seems there is no 'SNG01862.json' entry in the dialogue_acts.json file. How can I fix it? @XiaoxueZang

Thanks in advance.

createDelexData problem

idx_acts +=1

Each act in dialogue_acts.json corresponds to a system response, and the number of acts in each dialogue equals the number of responses in that dialogue.
idx should therefore correspond to twice idx_acts, because only odd values of idx correspond to responses.
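The alternation described above can be made explicit. A sketch (iter_system_turns is my own illustration, assuming the log list strictly alternates user/system turns as in data.json):

```python
def iter_system_turns(log):
    """Pair each system turn with its 1-based act index: since the log
    alternates user/system, the k-th system response sits at log
    index 2*k - 1."""
    for idx_acts, idx in enumerate(range(1, len(log), 2), start=1):
        yield idx_acts, idx, log[idx]
```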

MultiWOZ2.2

Hey

Could you please upload the data for MultiWOZ 2.2 here as well?

DB missing?

Hi, the hospital-dbase.db and tax-dbase.db files in the db folder are empty.

License

Can you add the license for the baseline in case people want to use it? Thanks!

Hyperparameter of trade on MultiWOZ 2.1

Hello,

I'm conducting some experiments with TRADE on MultiWOZ 2.1. I simply replaced the dataset used by TRADE, which was originally evaluated on MultiWOZ 2.0, leaving the hyperparameters unchanged. However, this only reached an accuracy of approximately 35%, much lower than the result reported in the paper, so I suspect it may be an issue with the TRADE hyperparameters.

However, I'm not able to find any reference on this, and I don't even know whether the hyperparameters of TRADE, or of other models, change between these two datasets. Could I get the specific hyperparameter values of the models on MultiWOZ 2.1, so that I can reproduce the results? Thanks in advance.

Dataset annotation process

I have started to collect a new dataset for a new domain, but I don't know how to annotate the dataset.
Should I annotate them manually? Or is there a helpful tool to do it?

New results on context-to-response, and end-to-end evaluation from SOLOIST

Hi Paweł ,

We just released a paper last week: SOLOIST: Few-shot Task-Oriented Dialog with A Single Pre-trained Auto-regressive Model. SOLOIST is a pretraining-finetuning solution for building task-oriented dialog systems at scale with limited training examples and annotation effort. Details can be found at https://arxiv.org/pdf/2005.05298.pdf.

We have updated numbers for the context-to-response and end-to-end evaluation settings. @budzianowski could you please help update the leaderboard?

Context-to-response using MultiWOZ 2.0

Inform: 89.60
Success: 79.30
BLEU: 18.03

End-to-end Evaluation using MultiWOZ 2.0:

Inform: 85.50
Success: 72.90
BLEU : 16.54

New results on dialogue state tracking , policy optimization

We have released our new results on arxiv

A Simple Language Model for Task-Oriented Dialogue
TL;DR: SimpleTOD is a simple approach to task-oriented dialogue that uses a single causal language model trained on all sub-tasks recast as a single sequence prediction problem.
https://arxiv.org/abs/2005.00796

Belief Tracking:

version      joint acc
2.1          55.72

Policy Optimization:

version     Inform    Success       Bleu
2.0          84.4       70.1        15.01
2.1          85         70.5        15.23

@budzianowski will you update the leaderboard, or should we open a PR?
cc @bmccann

An error in data.json

Hi,
I found an error in the "span_info" of the text "The city centre north b and b has parking and wifi. It is in the north area. Would you like to book this hotel?". I think the index of the value 'north' should point to the second 'north'; the first 'north' is part of the name 'city centre north b and b'. Could you modify that?
Details follow:

"text": "The city centre north b and b has parking and wifi. It is in the north area. Would you like to book this hotel?",
"metadata": {
  "taxi": {
    "book": {"booked": []},
    "semi": {"leaveAt": "", "destination": "", "departure": "", "arriveBy": ""}
  },
  "police": {
    "book": {"booked": []},
    "semi": {}
  },
  "restaurant": {
    "book": {
      "booked": [
        {"name": "nandos city centre", "reference": "LYIENP77"}
      ],
      "people": "4",
      "day": "wednesday",
      "time": "15:00"
    },
    "semi": {
      "food": "not mentioned",
      "pricerange": "not mentioned",
      "name": "nandos city centre",
      "area": "not mentioned"
    }
  },
  "hospital": {
    "book": {"booked": []},
    "semi": {"department": ""}
  },
  "hotel": {
    "book": {"booked": [], "people": "", "day": "", "stay": ""},
    "semi": {
      "name": "not mentioned",
      "area": "not mentioned",
      "parking": "yes",
      "pricerange": "not mentioned",
      "stars": "0",
      "internet": "yes",
      "type": "guesthouse"
    }
  },
  "attraction": {
    "book": {"booked": []},
    "semi": {"type": "", "name": "", "area": ""}
  },
  "train": {
    "book": {"booked": [], "people": ""},
    "semi": {"leaveAt": "", "destination": "", "day": "", "arriveBy": "", "departure": ""}
  }
},
"dialog_act": {
  "Booking-Inform": [
    ["none", "none"]
  ],
  "Hotel-Inform": [
    ["Name", "city centre north b and b"],
    ["Area", "north"],
    ["Internet", "none"],
    ["Parking", "none"]
  ]
},
"span_info": [
  ["Hotel-Inform", "Name", "city centre north b and b", 1, 6],
  ["Hotel-Inform", "Area", "north", 3, 3]
]
},

Hospital database is not complete

The training data contains goals asking for hospital name, postcode and address, but the hospital database only contains departments and phone numbers. Any idea where the complete hospital database could be found? It must exist for the training data to have been created. For example: "Addenbrookes Hospital on Hills Rd".

DB: Booking availability is not provided in database

In the db pointer vector there is information about whether booking is available or not.
Is there a way to compute booking availability for the retrieved entities using the predicted belief states?

Example:
For the attached example with the following gold belief on restaurant, the retrieved entity has booking=available in the db pointer vector, but there is no booking information in the restaurant db.

belief : {'pricerange': 'cheap', 'area': 'centre', 'name': 'dojo noodle bar'}

retrieved restaurant: ('19225', '40210 Millers Yard City Centre', 'centre', 'asian oriental', 'dojo noodle bar serves a variety of japanese chinese vietnamese korean and malaysian dishes to eat in or take away sister restaurant to touzai', 'dojo noodle bar', '01223363471', 'cb21rq', 'cheap', 'NULL', 'restaurant')


@budzianowski
cc: @bmccann
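As far as I can tell, the booking bit cannot be recomputed from the db files alone, since they carry no booking tables; only the entity-matching part of the pointer can be rebuilt from the belief state. A sketch of that part under a toy schema (query_db and the columns mapping are my own illustration, not code from this repository):

```python
def query_db(db_rows, belief, columns):
    """Return rows matching all non-dontcare belief constraints.
    `columns` maps slot names to row tuple indices (toy schema)."""
    matches = []
    for row in db_rows:
        if all(row[columns[slot]] == value
               for slot, value in belief.items()
               if slot in columns and value not in ("", "dontcare")):
            matches.append(row)
    return matches
```

For booking availability itself, one workaround is to take it from the dialogue's own book/booked metadata rather than the db.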

Why is the slot value train-arriveby not updated?

Hi. When I use test_dials to test my DST model, I found that some slot values for train-arriveby are not updated in the ground truth.
For example, in MUL2294:
"transcript": "i need to travel on saturday from cambridge to london kings cross and need to leave after 18:30",
"system_transcript": "train tr0427 leaves at 19:00 on saturday and will get you there by 19:51. the cost is 18.88 pounds. want me to book it?",
"transcript": "yes please book the train for 1 person and provide the reference number"

From the above we can see that the slot value train-arriveby changed during the dialog. Why is this slot value not changed in the ground truth?
This confuses me a lot; I hope you can have a look. Thank you!

Comma missing in db/taxi_db.json

Hi! I'm Tianbao from Harbin Institute of Technology. I recently loaded the db from your JSON files to carry out some research on dialogue systems, and I noticed that db/taxi_db.json cannot be converted to a list of dicts by the Python json package, since

"taxi_types": ["toyota","skoda","bmw",'honda','ford','audi','lexus','volvo','volkswagen','tesla']
may be missing a comma at the end of this line. I'm a little confused: is something wrong here, or does the file have some other usage instead of being used as a table database? Looking forward to your reply!
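For what it's worth, the quoted line would be rejected by the json module regardless of any comma: the single-quoted strings alone make it invalid JSON. Since the file content is still a valid Python literal, one workaround is a fallback to ast.literal_eval (a sketch; parse_loose_json is my own helper, not part of this repository):

```python
import ast
import json

def parse_loose_json(text):
    """Parse JSON-like text, falling back to Python literal syntax."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Single-quoted strings are valid Python literals but not JSON.
        return ast.literal_eval(text)
```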

Conversion of non-categorial slot values to database keys.

Hello, I am trying to figure out whether there is a way to convert a value that appears in a slot to a canonical form suitable for querying the database.

For example, according to the database, there exists an attraction named sheep's green and lammas land park fen causeway which appears in various forms in the annotation, namely:

sheep's green and lammas land park
sheeps green and lammas land park fen
sheep's green
sheeps green
lammas land park

I believe that I need to convert all those names to the original form to successfully query the database. Is that true? And is there code that can do it for me (or a mapping, normalization, etc.)?
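In the absence of an official mapping, a heuristic that covers the variants above is token-overlap matching against the database names (a sketch; canonicalize and the apostrophe stripping are my own illustration, not code from this repository):

```python
def canonicalize(variant, db_names):
    """Map a surface form to the most plausible database name by token
    overlap, with apostrophes stripped. Returns None when not every
    variant token appears in the best candidate."""
    def toks(s):
        return set(s.lower().replace("'", "").split())
    v = toks(variant)
    best = max(db_names, key=lambda n: len(v & toks(n)) / len(v))
    return best if v <= toks(best) else None
```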

Run on GPU

Can the model run on a GPU? I get errors when I set no_cuda to False.

Does the preprocessing script work for 2.2?

There is no instruction in README on how to run preprocessing for 2.2. Instead, it implies that delexicalization is only compatible with earlier versions. Can someone confirm if that's the case? Thanks.

Normalization during test

Hi.

val2 = normalize(val2)

text = re.sub(timepat, ' [value_time] ', text)

In this line, the values of belief state are normalized before searching DB.

But the time values for trains (leaveAt, arriveBy) are also normalized to [value_time].

In this case, I think we cannot check whether the values meet the user's goal, right?

Is it intended?


Tokenization and BLEU score

Hello, I have a hard time evaluating my model.

First, the score for DAMD in the end-to-end modeling table should be 16.6 (as described in their paper) and not 18.6.

Second, I found out that the way I tokenize my responses highly affects the resulting BLEU score. I checked the systems from the end-to-end modeling table that have an open implementation and I am afraid that the numbers are not comparable:

  • DAMD (16.6) -- the tokens they use seem to be the same as the tokens that are predicted by their model (they do not use subwords).
  • LABES-S2S (18.3) -- The score seems to me too high concerning their rather low inform and success rates. However, there is no code or predictions.
  • LAVA (12.0) -- I cannot figure out why their score is so low. The outputs they provide are good, and they probably use the tokens corresponding to the words predicted by their model, including the tokens in the ground truth responses.
  • UBAR (17.0) -- I do not understand the code. It is adapted from DAMD, but they use subwords, so I am not sure about it. Besides that, their reported inform rate is higher than the theoretical upper bound which I hope is somewhere around 92.2.
  • SimpleTOD (15.01) -- They decode responses using the HF GPT2 tokenizer and split them by whitespace to get the tokens for computing the BLEU score. However, they do not handle punctuation or similar details, so the score is underestimated compared to DAMD.
  • MinTL (17.89) -- They decode responses using the HF BERT tokenizer, prepend all . , ! ? : 's with spaces, split them by whitespaces, and use the tokens for the BLEU score.
  • SOLOIST (16.56), SUMBT+LaRL (17.9) -- no code 😞
  • Others - More papers are evaluating MultiWOZ and comparing to these numbers. Some of them use the NLTK tokenizer and it probably results in overestimated scores compared to the DAMD or MinTL.

I evaluated my data using different tokenization approaches and there are results:

  • NLTK tokenization with special care of the delexicalized spans - 17.9
  • NLTK tokenization without special care of the delexicalized spans - 24.5
  • whitespace splitting - 14.0
  • whitespace splitting with prepending . , ! ? : 's - 16.9

I think this shows that the evaluation script in this repository should be modified so that it first normalizes the input strings (for example using tokenization and immediate detokenization with the Moses tokenizer), somehow resolves the delexicalized spans (removes spaces etc., removes [ and ]) and does the tokenization on its own. I would really appreciate a standalone script that would be able to output the score from the delexicalized responses with corresponding dialogue and turn ids (provided in a file in a predefined format).
Or at least a guide to the preferred tokenization would be highly appreciated (for future generations).

Similarly, it would also be very nice to have a standalone script for computing the inform and success rates that would accept just a file with delexicalized responses (taking into account that domain names do not have to be present in the spans) and the corresponding dialogue states in .json.
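To see how much the tokenizer alone moves the token (and hence n-gram) counts, compare plain whitespace splitting with an NLTK-like regex split on the same delexicalized response (a sketch: the regex is only an approximation of nltk.word_tokenize, and the [restaurant_name] placeholder is an illustrative example):

```python
import re

response = "The [restaurant_name] is in the centre. Would you like to book?"

# Plain whitespace splitting keeps punctuation glued to words.
whitespace = response.split()

# NLTK-like: punctuation becomes separate tokens, while delexicalized
# placeholders such as [restaurant_name] are kept as single tokens.
nltk_like = re.findall(r"\[\w+\]|\w+|[^\w\s]", response)

print(len(whitespace), len(nltk_like))  # → 11 13
```

Different token counts for the same string mean different n-gram statistics, so BLEU scores computed under the two schemes are not comparable.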

fail_book content is missing in some examples

In the example PMUL1848.json the fail_book field for the hotel domain is missing. The prompt message, on the other hand, does include instructions on what to do if booking fails:

If the booking fails how about <span class='emphasis'>friday</span>

Am I right that this is an inconsistency in the data, or am I missing something?

Here is the complete JSON goal:

 'police': {},
 'hospital': {},
 'hotel': {'info': {'area': 'east',
   'internet': 'yes',
   'type': 'guesthouse',
   'parking': 'yes'},
  'fail_info': {},
  'book': {'people': '5', 'day': 'thursday', 'invalid': True, 'stay': '5'}},
 'topic': {'taxi': False,
  'police': False,
  'restaurant': False,
  'hospital': False,
  'hotel': False,
  'general': False,
  'attraction': False,
  'train': False,
  'booking': False},
 'attraction': {},
 'train': {'info': {'destination': 'cambridge',
   'day': 'friday',
   'arriveBy': '14:00',
   'departure': 'stansted airport'},
  'fail_info': {},
  'book': {'invalid': True, 'people': '5'},
  'fail_book': {}},
 'message': ['You are planning your trip in Cambridge',
  "You are looking for a <span class='emphasis'>place to stay</span>. The hotel should <span class='emphasis'>include free parking</span> and should <span class='emphasis'>include free wifi</span>",
  "The hotel should be in the type of <span class='emphasis'>guesthouse</span> and should be in the <span class='emphasis'>east</span>",
  "Once you find the <span class='emphasis'>hotel</span> you want to book it for <span class='emphasis'>5 people</span> and <span class='emphasis'>5 nights</span> starting from <span class='emphasis'>thursday</span>",
  "If the booking fails how about <span class='emphasis'>friday</span>",
  "Make sure you get the <span class='emphasis'>reference number</span>",
  "You are also looking for a <span class='emphasis'>train</span>. The train should <span class='emphasis'>arrive by 14:00</span> and should be on <span class='emphasis'>the same day as the hotel booking</span>",
  "The train should depart from <span class='emphasis'>stansted airport</span> and should go to <span class='emphasis'>cambridge</span>",
  "Once you find the train you want to make a booking for <span class='emphasis'>the same group of people</span>",
  "Make sure you get the <span class='emphasis'>reference number</span>"],
 'restaurant': {}}
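Goals with this inconsistency can be enumerated. A sketch (missing_fail_book is a hypothetical helper, and the 'booking fails' substring test is a heuristic on the prompt messages):

```python
def missing_fail_book(goal):
    """Flag domains whose goal message mentions a booking fallback but
    whose fail_book field is absent or empty."""
    has_fallback_msg = any("booking fails" in m for m in goal.get("message", []))
    if not has_fallback_msg:
        return []
    flagged = []
    for domain, g in goal.items():
        if not isinstance(g, dict) or "book" not in g:
            continue
        # 'invalid': True marks a booking that is supposed to fail first.
        if g["book"].get("invalid") and not g.get("fail_book"):
            flagged.append(domain)
    return flagged
```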

generate SLU datasets

Is it possible to generate a dataset for the SLU task, i.e. a sequence labeling dataset? I need to obtain all possible slot values in each turn (whether or not the values appear in the DST results).
