Dataset and code for "Counterfactual Story Reasoning and Generation"
This repo contains the dataset and code for the following paper:
Counterfactual Story Reasoning and Generation
Lianhui Qin, Antoine Bosselut, Ari Holtzman, Chandra Bhagavatula, Elizabeth Clark and Yejin Choi
EMNLP 2019
Dataset: TimeTravel
The dataset can be downloaded from here.
Data files includes (see examples below):
train_supervised_small.json
: supervised training set (the training set used in the paper experiments)train_supervised_large.json
: supervised training set (a larger supervised training set as we annotated more)train_unsupervised.json
: unsupervised training setdev_data.json
: Dev settest_data.json
: Test set
Data format in each file:
- Supervised training data example
{
"story_id": "4fd7d150-b080-4fb1-a592-8c27fa6e1fc8",
"premise": "Andrea wanted a picture of her jumping.",
"initial": "She set the camera up.",
"counterfactual": "She asked her friend to draw one.",
"original_ending": "Then, she jumped in the air. The picture kept coming out wrong. It took twenty tries to get it right.",
"edited_ending": [
"Then, she jumped in the air to demonstrate how she wanted it to look.",
"The picture kept coming out wrong.",
"It took drawing it several times to get it right."
]
}
- Unsupervised training data example
{
"story_id": "da0e85f1-c586-4236-a8a3-ee6421c8e71d",
"premise": "Charles' mother taught her son to carry a pre-paid cell phone.",
"initial": "As a job seeker, Charles put his cell phone number on applications.",
"counterfactual": "As a job seeker, Charles used his cell phone to keep his information out of employers hands.",
"original_ending": "He needed a real cell phone, but kept up with his pre-paid cell phone. One afternoon he was in a phone interview with Apple Computers. He ran out of minutes and never reached Apple's hiring manager again."
}
- Dev / test data example
{
"story_id": "048f5a77-7c17-4071-8b0b-b8e43087132d",
"premise": "Neil was visiting Limerick in Ireland.",
"initial": "There, he saw a beautiful sight.",
"counterfactual": "It was the ugliest city he's ever seen.",
"original_ending": "He saw the large and lovely River Shannon! After a few minutes, he agreed with the locals. The River Shannon was beautiful.",
"edited_endings": [
[
"He saw the small and lonely River Shannon!",
"After a few minutes, he agreed with the locals.",
"The River Shannon was lonely."
],
[
"However, he saw the large and lovely River Shannon!",
"After a few minutes, he agreed with the locals.",
"The River Shannon was beautiful."
],
[
"However, he did think the large River Shannon was lovely!",
"After a few minutes, he agreed with the locals that Limerick wasn't as ugly as he though.",
"The River Shannon was beautiful."
]
]
}
Code
(The code is still under cleanup. More details of code usage will be added soon.)
- The code depends on Texar. Please install the version under third_party/texar. Follow the installation instructions in the README there.
- Use
prepare_data_rewriting.py
to preprocess the raw text data and transform into TFRecord format. An example command is (please see the code for more config options).
python prepare_data_rewriting.py --data_dir=raw_data_dir
- Run
run_[X].sh
for training/testing model[X]
. - Use
evaluate.py
for evaluation. An example command is
python evaluate.py --all-preds-dir data/100_output_proced --gold-file data/dev.jsonl &> 100_output_proced_metrics.log
- The
WMS
andW+SMS
metrics in the paper (Table.7) use the code here.
Citation
@inproceedings{qin-counterfactual,
title = "Counterfactual Story Reasoning and Generation",
author = "Qin, Lianhui and Bosselut, Antoine and Holtzman, Ari and Bhagavatula, Chandra and Clark, Elizabeth and Choi, Yejin",
booktitle = "2019 Conference on Empirical Methods in Natural Language Processing.",
month = "nov",
year = "2019",
address = "Hongkong, China",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/pdf/1909.04076.pdf",
}