nseg's Introduction

Graph-based Neural Sentence Ordering

Installation

The following packages are needed:

  • Python == 3.6
  • PyTorch >= 1.0
  • torchtext == 0.3
  • Stanford POS tagger or Dependency Parser
  • GloVe (100 dim)

Dataset Format

*.lower: each line is one document, written as its sentences in sequence: sentence_0 sentence_1 sentence_2

*.eg: entity1:i-r means that entity1 appears in sentence_i with role r.
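
The entity notation above can be read programmatically. The following is a minimal sketch, not the repository's own loader: it assumes that the tokens on a *.eg line are separated by whitespace and that the role r is kept as an opaque string. The names EntityMention, parse_mention and parse_eg_line are hypothetical.

```python
from typing import List, NamedTuple


class EntityMention(NamedTuple):
    entity: str          # surface form of the entity
    sentence_index: int  # the i in entity:i-r
    role: str            # the r in entity:i-r, kept as an opaque string


def parse_mention(token: str) -> EntityMention:
    """Parse a single 'entity:i-r' token."""
    # Split on the last colon so that an entity containing ':' stays intact.
    entity, _, position = token.rpartition(":")
    index, _, role = position.partition("-")
    if not entity or not index.isdigit() or not role:
        raise ValueError(f"not an entity:i-r token: {token!r}")
    return EntityMention(entity, int(index), role)


def parse_eg_line(line: str) -> List[EntityMention]:
    """Parse one line, assuming whitespace-separated tokens (an assumption)."""
    return [parse_mention(token) for token in line.split()]
```

For example, parse_mention("model:0-1") yields an EntityMention with entity "model", sentence index 0 and role "1".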

Other datasets are easy to access and process. We also recommend ROCStories, a high-quality dataset for sentence ordering.

Preprocessing

Use a dependency parser to get the POS tags and syntactic relations.

Select a word as an entity if its POS tag is a noun.

Find the nsubj and dobj relations to get the roles (or just use a POS tagger and ignore the roles if the dependency parser is too slow).
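
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' preprocessing script: the input is assumed to be already parsed into (word, POS tag, dependency relation) triples with Penn Treebank tags, the numeric role ids (1 for subject, 2 for object, 3 for other) are a guess that the source does not confirm, and lowercasing the entity is assumed only because the text files are named *.lower. The function name extract_entities is hypothetical.

```python
from typing import List, Tuple

# One token as (word, POS tag, dependency relation); hypothetical input shape.
Token = Tuple[str, str, str]

# Assumed role ids; the source does not confirm this mapping.
ROLE_SUBJECT, ROLE_OBJECT, ROLE_OTHER = 1, 2, 3


def extract_entities(sentences: List[List[Token]]) -> List[str]:
    """Return 'entity:i-r' strings for every noun in a parsed document."""
    mentions = []
    for i, sentence in enumerate(sentences):
        for word, pos, deprel in sentence:
            # Penn Treebank noun tags all start with NN (NN, NNS, NNP, NNPS).
            if not pos.startswith("NN"):
                continue
            if deprel == "nsubj":
                role = ROLE_SUBJECT
            elif deprel == "dobj":
                role = ROLE_OBJECT
            else:
                role = ROLE_OTHER
            mentions.append(f"{word.lower()}:{i}-{role}")
    return mentions
```

If only a POS tagger is used, every token can carry the same placeholder relation, so that all nouns fall into the "other" role.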

Training and Evaluation

bash run.sh

nseg's People

Contributors

aries-lm

nseg's Issues

Mismatch between test data file and test data decoding

Hi

I want to test the model on paragraphs whose sentences are in random order and have the model predict the correct sentence order.

For the NIPS dataset, I assume the test.lower file contains the gold paragraphs with sentences in the correct order, while the mdl.devorder file (created after model training) contains the model output (predicted values|||truth values).
However, the gold sentence order does not match the order of the truth values.
Could you please specify how the decoding works for test data?
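
To inspect such a mismatch, the two halves of each output line can be compared directly. The sketch below assumes, without confirmation from the source, that each side of the ||| separator is a whitespace-separated list of integer sentence indices. The function names parse_order_line and position_accuracy are hypothetical, and the accuracy computed here is a plain position-wise match rate, not necessarily the metric reported in the paper.

```python
from typing import List, Tuple


def parse_order_line(line: str) -> Tuple[List[int], List[int]]:
    """Split one 'predicted|||truth' line into two lists of indices."""
    predicted, _, truth = line.strip().partition("|||")
    return ([int(x) for x in predicted.split()],
            [int(x) for x in truth.split()])


def position_accuracy(predicted: List[int], truth: List[int]) -> float:
    """Fraction of positions where the predicted index equals the truth."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth orders differ in length")
    if not truth:
        return 0.0
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)
```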

Cannot get the same result.

When training on the NIPS dataset, my accuracy gets stuck at around 54%, much lower than the result in the paper. I want to know how I can improve it.

Could you provide the Graph-SE version code?

Hi, we recently wanted to apply your model in a text ordering pipeline, but we found that the code in the repo is a simplified version of the model in the original paper, called F-graph. Would you mind sharing the Graph-SE version of the code with us?

How to run for a custom dataset

Hi,
I have sentences in this order (example):
S1,S2,S3
S4,S1,S5

1. How can I convert the above sentence orders to run this algorithm?
2. Can I use this algorithm to predict the next sentence?
3. If I give a new sentence S6, how can I predict the next sentence?

Clarification regarding entity roles

The preprocessed entity file looks like this, as mentioned in the README:
*.eg: entity1:i-r means entity1 is in the sentence_i and its role is r.
The values for the roles (r) in the sample NIPS entity files are 1/2/3. Could you please clarify which role each of these numbers refers to?
From my reading of the paper, are they labeled as 1-subject, 2-object, 3-other?
Thanks!

Can't get attribute '_default_unk_index'

'bash run.sh' failed with the following error:
Traceback (most recent call last):
  File "main.py", line 203, in
    DOC.vocab = torch.load(args.vocab)
  File "anaconda3/envs/NSEG/lib/python3.6/site-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "anaconda3/envs/NSEG/lib/python3.6/site-packages/torch/serialization.py", line 574, in _load
    result = unpickler.load()
AttributeError: Can't get attribute '_default_unk_index' on <module 'torchtext.vocab' from '/anaconda3/envs/NSEG/lib/python3.6/site-packages/torchtext/vocab.py'>
