likarajo / movie_sentiment

Movie Sentiment Analysis using Deep Learning

Jupyter Notebook 100.00%
keras tensorflow scikit-learn neural-network convolutional-neural-network recurrent-neural-network long-short-term-memory-models natural-language-processing

Movie Sentiment Analysis

Dataset

Kaggle: https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews

  • Records: 50,000
  • Columns: 2
    • review
    • sentiment - "positive" and "negative" => binary classification problem
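
Since the two sentiment values define a binary classification problem, a first step is usually to map them to 0/1 labels. The following is a minimal sketch, assuming the CSV has exactly the two columns listed above; the helper name `encode_sentiment`, the `label` column, and the two sample rows are illustrative and not taken from the notebooks.

```python
import io

import pandas as pd


def encode_sentiment(df):
    """Return a copy of df with a binary `label` column (1 = positive, 0 = negative)."""
    out = df.copy()
    out["label"] = out["sentiment"].map({"positive": 1, "negative": 0})
    return out


# Tiny in-memory stand-in for the Kaggle CSV (same two columns, made-up rows).
sample_csv = io.StringIO(
    "review,sentiment\n"
    '"A wonderful, moving film.",positive\n'
    '"Dull and far too long.",negative\n'
)

reviews = encode_sentiment(pd.read_csv(sample_csv))
print(reviews[["sentiment", "label"]])
```

With the real dataset, the path of the downloaded CSV file would be passed to `pd.read_csv` instead of the in-memory sample.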

Dependencies

  • Pandas
  • Seaborn
  • Numpy
  • Scikit-learn
  • Tensorflow
  • Keras
  • Matplotlib
  • Pickle

pip install -r requirements.txt

Deep Learning using Neural Networks

Simple Neural Network

  • Sequential model
  • One Embedding layer
  • Flattening layer
  • Dense layer
    • activation function
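
The layer stack above could be written in Keras roughly as follows. This is a sketch under assumptions, not the notebook's code: the vocabulary size, padded review length, embedding width, and the sigmoid output are illustrative choices for a binary classifier.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, Embedding, Flatten
from tensorflow.keras.models import Sequential

VOCAB_SIZE = 5000  # assumed vocabulary size
MAX_LEN = 100      # assumed padded review length, in tokens
EMBED_DIM = 32     # assumed embedding width


def build_simple_nn():
    """Embedding -> Flatten -> Dense(sigmoid) for binary sentiment."""
    model = Sequential([
        Input(shape=(MAX_LEN,), dtype="int32"),
        Embedding(VOCAB_SIZE, EMBED_DIM),  # token ids -> dense vectors
        Flatten(),                         # (MAX_LEN, EMBED_DIM) -> one long vector
        Dense(1, activation="sigmoid"),    # probability of "positive"
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


simple_nn = build_simple_nn()

# Untrained forward pass on two all-padding reviews, just to check shapes.
dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")
simple_probs = simple_nn.predict(dummy_batch, verbose=0)
print(simple_probs.shape)
```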

[Notebook](https://github.com/likarajo/movie_sentiment/blob/master/model_NN.ipynb)

Convolutional Neural Network (CNN)

CNNs are primarily used to classify 2D data such as images, but they also work well on 1D text data. The first layer tries to detect specific local features; in the following layers, the initially detected features are combined into larger features. Ref: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/

  • Sequential model
  • One Embedding layer
  • 1D convolutional layer
    • features or kernels
    • activation function
  • Global max pooling layer
    • reduce feature size
  • Dense layer
    • activation function
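
A possible Keras rendering of this stack is sketched below. The number of filters, kernel size, activations, and input dimensions are assumptions chosen for illustration; the notebook may use different values.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Conv1D, Dense, Embedding, GlobalMaxPooling1D
from tensorflow.keras.models import Sequential

VOCAB_SIZE = 5000  # assumed vocabulary size
MAX_LEN = 100      # assumed padded review length, in tokens
EMBED_DIM = 32     # assumed embedding width
NUM_FILTERS = 128  # assumed number of convolution kernels
KERNEL_SIZE = 5    # assumed window width, in tokens


def build_cnn():
    """Embedding -> Conv1D -> GlobalMaxPooling1D -> Dense(sigmoid)."""
    model = Sequential([
        Input(shape=(MAX_LEN,), dtype="int32"),
        Embedding(VOCAB_SIZE, EMBED_DIM),
        Conv1D(NUM_FILTERS, KERNEL_SIZE, activation="relu"),  # local n-gram features
        GlobalMaxPooling1D(),            # keep the strongest response per filter
        Dense(1, activation="sigmoid"),  # probability of "positive"
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


cnn = build_cnn()

# Untrained forward pass on two all-padding reviews, just to check shapes.
cnn_probs = cnn.predict(np.zeros((2, MAX_LEN), dtype="int32"), verbose=0)
print(cnn_probs.shape)
```

Global max pooling reduces each filter's output sequence to a single value, so the dense layer sees `NUM_FILTERS` inputs regardless of review length.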

[Notebook](https://github.com/likarajo/movie_sentiment/blob/master/model_CNN.ipynb)

Recurrent Neural Network (RNN)

Uses a Long Short-Term Memory (LSTM) network, a variant of recurrent neural networks.

  • Sequential model
  • One Embedding layer
  • LSTM layer
    • neurons
  • Dense layer
    • activation function
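
The same pattern with an LSTM layer might look like this in Keras. As before, this is only a sketch: the number of LSTM units, the input dimensions, and the sigmoid output are assumed values, not ones confirmed by the notebook.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM, Dense, Embedding
from tensorflow.keras.models import Sequential

VOCAB_SIZE = 5000  # assumed vocabulary size
MAX_LEN = 100      # assumed padded review length, in tokens
EMBED_DIM = 32     # assumed embedding width
LSTM_UNITS = 64    # assumed number of LSTM units ("neurons")


def build_lstm():
    """Embedding -> LSTM -> Dense(sigmoid)."""
    model = Sequential([
        Input(shape=(MAX_LEN,), dtype="int32"),
        Embedding(VOCAB_SIZE, EMBED_DIM),
        LSTM(LSTM_UNITS),                # reads the review token by token
        Dense(1, activation="sigmoid"),  # probability of "positive"
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


lstm_model = build_lstm()

# Untrained forward pass on two all-padding reviews, just to check shapes.
lstm_probs = lstm_model.predict(np.zeros((2, MAX_LEN), dtype="int32"), verbose=0)
print(lstm_probs.shape)
```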

[Notebook](https://github.com/likarajo/movie_sentiment/blob/master/model_RNN.ipynb)

Techniques used

  • Keras Embedding Layer
  • Stanford NLP GloVe word embeddings
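
One common way to combine the two techniques is to build an embedding matrix from the GloVe vectors and use it to initialise the Keras Embedding layer. The sketch below assumes the plain-text GloVe format (one word per line followed by its vector components) and a word-to-index mapping that starts at 1, with row 0 reserved for padding; the helper names and the three-dimensional toy vectors are hypothetical.

```python
import io

import numpy as np


def load_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ...") into a dict of word -> vector."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector of the word with index i; unknown words stay zero."""
    matrix = np.zeros((len(word_index) + 1, dim), dtype="float32")
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None:
            matrix[idx] = vec
    return matrix


# Toy stand-ins for a GloVe file and a tokenizer's word index.
glove_text = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
word_index = {"good": 1, "bad": 2, "plotless": 3}

embedding_matrix = build_embedding_matrix(word_index, load_glove(glove_text), dim=3)
print(embedding_matrix)
```

Such a matrix is typically handed to the Embedding layer as its initial weights (for example via `embeddings_initializer` with a constant initializer) with `trainable=False`, so the pretrained vectors are not updated during training.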

Conclusion

  • The gap between training and test accuracy is much smaller for the Recurrent NN than for the Simple NN and the Convolutional NN.
  • The difference between training and test loss is negligible for the Recurrent NN.
    • Model is NOT overfitting

So the RNN is the best model for our text classification task.
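
The comparison above rests on the gap between training and test metrics. A small helper like the one below can make that gap explicit for each model; the function name and the numbers are hypothetical and are not results from the notebooks.

```python
def generalization_gap(train_metrics, test_metrics):
    """Absolute train/test difference for each metric present in train_metrics."""
    return {
        name: abs(train_metrics[name] - test_metrics[name])
        for name in train_metrics
    }


# Hypothetical numbers for illustration only; NOT results from the notebooks.
train = {"accuracy": 0.95, "loss": 0.15}
test = {"accuracy": 0.75, "loss": 0.60}

gap = generalization_gap(train, test)
print(gap)
```

A large gap in either metric suggests overfitting, while a gap close to zero suggests the model generalises to unseen reviews about as well as it fits the training set.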

Considerations

The number of layers and neurons, the hyperparameter values, the activation functions, etc. can be changed to find the best NN model.
