rkly / rcnn-text-classification

This project forked from roomylee/rcnn-text-classification

Tensorflow Implementation of "Recurrent Convolutional Neural Network for Text Classification" (AAAI 2015)

Home Page: https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745

Python 100.00%

rcnn-text-classification's Introduction

Recurrent Convolutional Neural Network for Text Classification

Tensorflow implementation of "Recurrent Convolutional Neural Network for Text Classification".

(Figure: RCNN architecture)

Data: Movie Review

  • Movie reviews with one sentence per review. Classification involves detecting positive/negative reviews (Pang and Lee, 2005).
  • Download "sentence polarity dataset v1.0" at the Official Download Page.
  • The data is located in "data/rt-polaritydata/" in my repository.
  • rt-polarity.pos contains 5331 positive snippets.
  • rt-polarity.neg contains 5331 negative snippets.
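
The two files above can be read into a labeled list with a few lines of Python. This is only a minimal sketch, not the loader used by train.py: the function name `load_polarity_data`, the label convention (1 = positive, 0 = negative), and the `latin-1` encoding are assumptions made for illustration. `latin-1` is chosen because it decodes any byte sequence without error; switch to UTF-8 if your copy of the files uses it.

```python
def load_polarity_data(pos_path, neg_path, encoding="latin-1"):
    """Read one snippet per line and return (texts, labels).

    Labels are an illustrative convention: 1 = positive, 0 = negative.
    """
    texts, labels = [], []
    for path, label in ((pos_path, 1), (neg_path, 0)):
        with open(path, encoding=encoding) as f:
            for line in f:
                line = line.strip()
                if line:  # skip blank lines
                    texts.append(line)
                    labels.append(label)
    return texts, labels


if __name__ == "__main__":
    texts, labels = load_polarity_data(
        "data/rt-polaritydata/rt-polarity.pos",
        "data/rt-polaritydata/rt-polarity.neg",
    )
    print(len(texts), "snippets,", sum(labels), "positive")
```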

Implementation of Recurrent Structure

(Figure: recurrent structure)

  • A bidirectional RNN (Bi-RNN) is used to implement the left and right context vectors.
  • Each context vector is created by shifting the Bi-RNN output by one time step and concatenating a zero state that marks the start of the context.
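
The shift-and-pad step described above can be sketched in plain Python for a single sentence. This is an illustration under stated assumptions, not the repository's TensorFlow code: it assumes the forward outputs supply the left context and the backward outputs supply the right context, and the name `shift_contexts` is hypothetical.

```python
def shift_contexts(output_fw, output_bw):
    """Build left/right context vectors from Bi-RNN outputs for one sentence.

    output_fw, output_bw: lists of per-token vectors (lists of floats),
    one vector per token, all of the same width.
    """
    if len(output_fw) != len(output_bw):
        raise ValueError("forward and backward outputs must have the same length")
    if not output_fw:
        return [], []
    zero = [0.0] * len(output_fw[0])
    # Left context of token i is the forward output at i-1; token 0 gets zeros.
    c_left = [list(zero)] + output_fw[:-1]
    # Right context of token i is the backward output at i+1; the last token gets zeros.
    c_right = output_bw[1:] + [list(zero)]
    return c_left, c_right


fw = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
bw = [[4.0, 4.0], [5.0, 5.0], [6.0, 6.0]]
c_left, c_right = shift_contexts(fw, bw)
print(c_left)   # [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(c_right)  # [[5.0, 5.0], [6.0, 6.0], [0.0, 0.0]]
```

Because of the shift, the context for a token never includes that token's own RNN output, only what lies to its left or right.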

Usage

Train

  • Positive data is located in "data/rt-polaritydata/rt-polarity.pos".

  • Negative data is located in "data/rt-polaritydata/rt-polarity.neg".

  • "GoogleNews-vectors-negative300" is used as the pre-trained word2vec model.

  • Display help message:

     $ python train.py --help
  • Train Example:

     $ python train.py --cell_type "lstm" \
     --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
     --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
     --word2vec "GoogleNews-vectors-negative300.bin"

Evaluation

  • The Movie Review dataset has no test data.

  • To evaluate, you should build a test set from the training data or do cross validation. However, cross validation is not implemented in my project.

  • The example below simply uses the full rt-polarity dataset, i.e. the same data used for training.

  • Evaluation Example:

     $ python eval.py \
     --pos_dir "data/rt-polaritydata/rt-polarity.pos" \
     --neg_dir "data/rt-polaritydata/rt-polarity.neg" \
     --checkpoint_dir "runs/1523902663/checkpoints"
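
Since cross validation is not implemented in the project, the splits would have to be produced separately. The following is a minimal sketch of k-fold index generation using only the standard library; the function name `k_fold_indices` and its parameters are hypothetical, and writing each held-out fold back to files for eval.py is left out.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the folds reproducible
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


if __name__ == "__main__":
    # 5331 positive + 5331 negative snippets
    for fold, (train_idx, test_idx) in enumerate(k_fold_indices(10662, k=10)):
        print(fold, len(train_idx), len(test_idx))
```

Each sample lands in exactly one test fold, so averaging the per-fold accuracy gives an estimate that does not reuse training data for evaluation.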

Result

  • Comparison between the Recurrent Convolutional Neural Network and a Convolutional Neural Network.
  • dennybritz's cnn-text-classification-tf is used as the CNN model for comparison.
  • The same pre-trained word2vec is used for both models.

Accuracy for validation set

(Figure: validation accuracy)

Loss for validation set

(Figure: validation loss)

Reference

  • Recurrent Convolutional Neural Network for Text Classification (AAAI 2015), S Lai et al. [paper]
