bibocai / writing_style

This project forked from roys174/writing_style

Code for classifying sentences according to their writing style

Perl 59.05% Python 27.57% Shell 13.39%

Writing Style

Code for classifying sentences according to their writing style. This is the code used for the style feature classification in the following paper:

The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task

Roy Schwartz, Maarten Sap, Yannis Konstas, Li Zilles, Yejin Choi and Noah A. Smith. In Proceedings of CoNLL 2017.

Requirements:

-- Python 2.7, with numpy, sklearn and spacy

-- Perl 5

Running:

1. pre_process.sh <ROC story dev file> <ROC story test file> <work directory = $PWD> <language model scores (dev set)> <language model scores (test set)>

-- This script generates files needed for training and testing. The files are stored in the input working directory.

-- The required arguments are the ROC story dev set and test set (see http://cs.rochester.edu/nlp/rocstories/).

-- Running the code without the last two arguments only uses the style classification features described in the paper (length, character n-grams and word n-grams).

-- To include the language model features described in the paper, provide two more arguments: the language model scores on the dev set and on the test set. To generate them, train a language model on the ROC story training set and apply it to the ROC story dev and test sets. The code and instructions for doing this are found in the writing_style_lm submodule.
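
To make the three style feature families named above (length, character n-grams and word n-grams) concrete, here is a minimal sketch in Python. It is not the repository's implementation: the function names, the n-gram orders (char_n=4, max_word_n=2) and the whitespace tokenization are all assumptions chosen for illustration. The actual scripts may tokenize with spacy and use different settings.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(tokens, n):
    """Count overlapping word n-grams in a list of tokens."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def style_features(sentence, char_n=4, max_word_n=2):
    """Build a sparse feature dict for one sentence.

    char_n and max_word_n are illustrative defaults (assumptions),
    not the values used by the repository.
    """
    lowered = sentence.lower()
    tokens = lowered.split()  # assumption: plain whitespace tokenization
    features = {"length": len(tokens)}
    for gram, count in char_ngrams(lowered, char_n).items():
        features["char:" + gram] = count
    for n in range(1, max_word_n + 1):
        for gram, count in word_ngrams(tokens, n).items():
            features["word:" + gram] = count
    return features
```

Calling style_features("She was happy .") yields a dictionary with a "length" entry plus one entry per observed character and word n-gram, which could then be vectorized and passed to a linear classifier.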

2. run_grid_search.PL <working directory>

-- This script runs a grid search over the regularization parameter and reports results for experiment 1 and for the ROC story cloze task on the train, dev and test sets.
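
The idea of a grid search over a single regularization parameter can be sketched as follows. This is a generic illustration and not the logic of run_grid_search.PL: grid_search, fit_and_score and the toy scoring function are hypothetical names, and the candidate values are assumptions. In practice fit_and_score would train the classifier with the given value and return its accuracy on the dev split.

```python
import math


def grid_search(candidates, fit_and_score):
    """Return (best_value, best_score) over candidate regularization values.

    fit_and_score(value) is expected to train a model with that value and
    return a dev-set score where higher is better. Ties keep the earliest
    candidate.
    """
    best_value, best_score = None, float("-inf")
    for value in candidates:
        score = fit_and_score(value)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score


def toy_score(c):
    """Toy stand-in for 'train and evaluate on dev'; peaks at c = 1.0."""
    return -(math.log10(c) ** 2)


best_c, best_dev_score = grid_search([0.01, 0.1, 1.0, 10.0], toy_score)
```

Selecting the value on the dev split and only then evaluating on the test set keeps the test set out of the tuning loop.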

Misc

-- The pre-processing step generates a different train/dev split each time it runs, so results will vary between runs (and, in particular, will differ from the results published in the paper).
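
The run-to-run variation comes from the random split. The sketch below shows, under stated assumptions, how a seeded shuffle makes a split repeatable. split_train_dev, dev_fraction and seed are hypothetical names; the repository's scripts are not known to expose a seed option, so this only illustrates the mechanism.

```python
import random


def split_train_dev(items, dev_fraction=0.1, seed=None):
    """Shuffle a copy of items and split it into (train, dev).

    With seed=None the split changes from run to run; passing a fixed
    seed reproduces the same split every time.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    return shuffled[n_dev:], shuffled[:n_dev]
```

For example, two calls to split_train_dev(range(100), seed=13) return identical train and dev lists, while calls without a seed generally do not.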

Contact

[email protected]
