Movie Genre Classification Service

This service uses machine learning to predict the genre of a movie based on its overview. It provides an API that accepts a movie overview and returns the predicted genre.

Prerequisites

  • Python 3
  • Jupyter
  • Docker

About Jupyter Notebook

Movie_Genre_Classification_Assignment.ipynb: This notebook contains the exploratory data analysis (EDA), the data preprocessing, and the model training.

To run the notebook locally, install JupyterLab with pip install jupyterlab, then run jupyter-lab in a terminal to start Jupyter.

* Run the command `jupyter notebook` in a terminal window.
* Load the dataset into a pandas DataFrame.
* Preprocess the data by selecting relevant columns and cleaning the overview text.
* Split the data into training and testing sets.
* Choose a machine learning algorithm for text classification, such as Naive Bayes, Support Vector Machines (SVM), or Random Forests.
* Train the model on the preprocessed data.
* Save the model in the model folder.
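
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the notebook's actual code: the source does not say which algorithm or library the notebook uses, so the sketch uses a small hand-written multinomial Naive Bayes from the standard library, a hypothetical in-memory `SAMPLES` list instead of movies_metadata.csv, and an in-memory pickle round trip instead of a file in the model folder.

```python
import math
import pickle
import random
import re
from collections import Counter, defaultdict

# Hypothetical toy data; the real notebook loads movies_metadata.csv instead.
SAMPLES = [
    ("A detective hunts a killer through the city.", "Crime"),
    ("Police chase a gang of bank robbers.", "Crime"),
    ("A detective investigates a murder.", "Crime"),
    ("A robber plans a heist on the city bank.", "Crime"),
    ("Astronauts travel to Mars in a spaceship.", "Science Fiction"),
    ("Aliens invade Earth with a giant spaceship.", "Science Fiction"),
    ("A robot travels through space and time.", "Science Fiction"),
    ("Scientists build a spaceship to reach a distant planet.", "Science Fiction"),
]


def clean(text):
    """Lowercase the overview and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle a copy of the rows and hold out the last test_ratio share."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_ratio))
    return rows[:cut], rows[cut:]


def train(rows):
    """Collect per-genre word counts and document counts."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, genre in rows:
        doc_counts[genre] += 1
        word_counts[genre].update(clean(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return {"word_counts": dict(word_counts), "doc_counts": doc_counts, "vocab": vocab}


def predict(model, text):
    """Return the genre with the highest log-posterior, using Laplace smoothing."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_genre, best_score = None, -math.inf
    for genre, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][genre]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for word in clean(text):
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_genre, best_score = genre, score
    return best_genre


train_rows, test_rows = train_test_split(SAMPLES)
model = train(train_rows)
accuracy = sum(predict(model, text) == genre for text, genre in test_rows) / len(test_rows)

# Saving and reloading; writing to a file would use pickle.dump with an open file instead.
restored = pickle.loads(pickle.dumps(model))
```

With a real dataset, the cleaning step would typically be richer (the retraining instructions mention spaCy) and the held-out accuracy is what tells you whether the chosen algorithm is adequate.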

Getting Started

These instructions explain how to set up and run the movie genre classification service.

  1. Clone the repository: git clone https://github.com/noopurdhawan/movie_genre_classification.git
  2. Go to the repository folder using cd <folder_name>
  • Option 1: To run the model as a service
  1. Run sh install.sh in a terminal, which starts the service on localhost, port 8000. Make sure that port 8000 is not already in use.
  • Option 2: To retrain the model
  1. Download the dataset: download The Movies Dataset from Kaggle, extract the file movies_metadata.csv from the downloaded zip file, and save it in the data folder.

  2. Set up the environment: install the required Python packages by running pip3 install -r ./requirements.txt

  3. Download the spaCy language model used for preprocessing the data by running python -m spacy download en_core_web_sm

    • Load the dataset into a pandas DataFrame.
    • Preprocess the data by selecting relevant columns and cleaning the overview text.
    • Split the data into training and testing sets.
    • Choose a machine learning algorithm for text classification, such as Naive Bayes, Support Vector Machines (SVM), or Random Forests.
    • Train the model on the preprocessed data.
    • Save the model in the model folder.
  4. Preprocess the data and train the model by running python model.py

  5. Run the service locally: start the Flask application by running python predict.py in a terminal. The service should now be running on http://localhost:8000.

  6. Test the API: use a tool such as curl to send a POST request to the API endpoint:

curl -d '{"overview": "A movie about penguins in Antarctica building a spaceship to go to Mars."}' -H "Content-Type: application/json" -X POST http://localhost:8000

The response should contain the predicted genre in JSON format.
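
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the curl command above (same URL, header, and JSON body). The function names `build_request` and `predict_genre` are hypothetical, and because the source does not specify the response fields, the decoded JSON is returned as-is.

```python
import json
import urllib.request


def build_request(overview, url="http://localhost:8000"):
    """Build the same POST request as the curl command."""
    body = json.dumps({"overview": overview}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_genre(overview, url="http://localhost:8000"):
    """Send the request and decode the JSON response (needs the running service)."""
    with urllib.request.urlopen(build_request(overview, url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not contact the server.
request = build_request(
    "A movie about penguins in Antarctica building a spaceship to go to Mars."
)
```

Calling `predict_genre(...)` only works while the service is running on port 8000.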

To run the unit tests, run python -m unittest test.py in a terminal.
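
For readers unfamiliar with unittest, here is a minimal sketch of what such a test can look like. It does not reproduce the repository's test.py, whose contents are not shown here: `clean_overview` is a hypothetical helper defined only for this example, and the suite is run programmatically so the snippet can be pasted into a Python session.

```python
import re
import unittest


def clean_overview(text):
    """Hypothetical helper: lowercase the text and drop everything but letters."""
    return " ".join(re.findall(r"[a-z]+", text.lower()))


class CleanOverviewTest(unittest.TestCase):
    def test_lowercases_and_strips_punctuation(self):
        self.assertEqual(
            clean_overview("Penguins, in Antarctica!"), "penguins in antarctica"
        )

    def test_empty_overview(self):
        self.assertEqual(clean_overview(""), "")


# Load and run the test case without calling unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CleanOverviewTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```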
