1. Import the required libraries: pandas, NLTK, the scikit-learn vectorizer and model modules, and pickle.
2. Load the data and check whether it contains any null values.
3. Clean the reviews: (1) remove HTML tags, (2) remove special characters, (3) convert everything to lowercase, (4) remove stopwords, (5) apply stemming.
4. Create the model:
   - Create a bag-of-words (BoW) representation.
   - Split the data into training and test sets.
   - Define the models and train them.
   - Predict on the test set and compare accuracy metrics to choose the best model.
5. Create pickle files of the CountVectorizer and the trained NLP model.
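The five cleaning steps from step 3 can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the `clean_review` name and the tiny `STOPWORDS` set are assumptions made for the example (the project likely uses NLTK's full English stop-word list, which needs a one-time `nltk.download("stopwords")`).

```python
import re

from nltk.stem import PorterStemmer

# Small stand-in stop-word set (assumption for this sketch); NLTK's
# stopwords corpus would be the natural replacement in the real project.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "was", "of", "to", "in"}

stemmer = PorterStemmer()


def clean_review(text: str) -> str:
    """Apply the five cleaning steps to one review (hypothetical helper)."""
    text = re.sub(r"<[^>]+>", " ", text)       # 1. remove HTML tags
    text = re.sub(r"[^a-zA-Z\s]", " ", text)   # 2. remove special characters (and digits)
    text = text.lower()                        # 3. convert to lowercase
    tokens = [t for t in text.split() if t not in STOPWORDS]  # 4. remove stopwords
    tokens = [stemmer.stem(t) for t in tokens]                # 5. stemming
    return " ".join(tokens)


cleaned = clean_review("This movie was <br />AMAZING!!! I loved it.")
print(cleaned)
```

With pandas, a function like this could be applied to a whole review column via `Series.apply`.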
Three Naive Bayes models were trained in this project to compare their accuracy:
- GaussianNB = 0.7843
- MultinomialNB = 0.831
- BernoulliNB = 0.8386
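A comparison like the one above could be produced along these lines. This is only a sketch under stated assumptions: the reviews and labels are made-up toy data, the split parameters are arbitrary, and the scores it prints will not match the figures listed above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

# Toy, made-up reviews (1 = positive, 0 = negative); not the project's data.
reviews = [
    "great film loved it", "wonderful cast great story", "brilliant and moving",
    "enjoyed every minute", "fantastic acting superb plot", "loved the ending",
    "terrible plot boring acting", "awful waste of time", "dull and predictable",
    "hated every minute", "poor script bad acting", "boring from start to end",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Bag of words: one column per vocabulary word, values are word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews).toarray()  # dense, since GaussianNB rejects sparse input

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)

models = {
    "GaussianNB": GaussianNB(),
    "MultinomialNB": MultinomialNB(),
    "BernoulliNB": BernoulliNB(),
}

# Train each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best_name = max(scores, key=scores.get)
print(scores, "best:", best_name)
```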
The Naive Bayes algorithm is used because predicting on this data is a classification problem.
The BernoulliNB model's pickle file is used for deployment because it has the highest accuracy of the three (0.8386).
See the deployed app: https://movie-reviews-predictor.herokuapp.com/
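Saving and reloading the vectorizer and model for deployment might look roughly like this. It is a sketch, not the project's code: the file names `vectorizer.pkl` and `model.pkl`, the temporary directory, and the toy training data are all assumptions for illustration.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

# Toy, made-up training data (1 = positive, 0 = negative).
reviews = [
    "great film loved it",
    "terrible plot boring acting",
    "wonderful cast great story",
    "awful boring waste of time",
]
labels = [1, 0, 1, 0]

vectorizer = CountVectorizer()
model = BernoulliNB().fit(vectorizer.fit_transform(reviews), labels)

with tempfile.TemporaryDirectory() as tmp:
    vec_path = Path(tmp) / "vectorizer.pkl"  # hypothetical file name
    model_path = Path(tmp) / "model.pkl"     # hypothetical file name

    # Save both objects: the model is only usable with the fitted vocabulary.
    with open(vec_path, "wb") as f:
        pickle.dump(vectorizer, f)
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # What a deployed app would do at startup: load both files back.
    with open(vec_path, "rb") as f:
        loaded_vectorizer = pickle.load(f)
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

new_review = ["boring and awful film"]
prediction = loaded_model.predict(loaded_vectorizer.transform(new_review))
print(prediction)
```

Both objects are pickled because new text has to be transformed with the same fitted vocabulary the model was trained on. Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.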