mangosplit's Introduction

NETFLIX MOVIE DATA ANALYSIS PROJECT

The Netflix Movies and TV Shows dataset used in this project is available on Kaggle: https://www.kaggle.com/datasets/shivamb/netflix-shows

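Create a conda environment:

conda create --name netflix
conda activate netflix

Example of creating an environment with specific package versions:

conda create -n env-01 python=3.9 scipy=0.15.0 numpy

Install or upgrade the plotting libraries:

pip install --upgrade seaborn matplotlib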

0. Ask Questions

  1. Which movie has the highest country viewing/releases? (Done)
  2. Which actors are most likely to work together?
  3. What type of content is added over the months, e.g. around holiday seasons (December, July, January), and what quantity of content is released each month?
  4. Which countries have the largest quantity of released content, grouped by content type? What are the most common genres in the top 5 countries? Visualise the type of content produced by each country.
  5. Explore the “age” of content on Netflix, i.e. the gap between when movies/shows are released and when they are added.
  6. See how the content age (item 5) varies per country.
  7. Find out more about the movie and TV ratings; visualise TV vs movies and group them by target audience, e.g. kids, young adults, teenagers, adults.
  8. Visualise the ratings (item 7) by country.
  9. Movie and TV show genres, and the quantity of content released in each genre.
  10. Group the genres (item 9) by content type.
  11. Netflix titles
  12. Netflix descriptions
  13. Splitting the date_added column (see the second reference link).
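
The date_added split and the content "age" question from the list above could be approached roughly as follows. This is a minimal sketch, not the project's actual code: the sample rows are made up, the date format ("December 15, 2020") is assumed to match the Kaggle file, and the new column names year_added, month_added and content_age are hypothetical.

```python
import pandas as pd

# Made-up sample rows; date_added and release_year follow the Kaggle column names.
titles = pd.DataFrame({
    "title": ["Old Film", "New Series"],
    "date_added": ["December 15, 2020", " July 1, 2021"],
    "release_year": [2010, 2021],
})

# Parse the text dates (assumed format: "Month day, year"); strip stray spaces first.
added = pd.to_datetime(
    titles["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
)

# Split the parsed date into separate columns (hypothetical column names).
titles["year_added"] = added.dt.year
titles["month_added"] = added.dt.month_name()

# "Age" of content: years between release and being added to the catalogue.
titles["content_age"] = titles["year_added"] - titles["release_year"]

print(titles[["title", "month_added", "year_added", "content_age"]])
```

Grouping content_age by country or counting titles per month_added would then address the per-country and per-month questions.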

Data cleaning

Data Exploration

  1. How do the variables correlate?
  2. What type of content have they been focussing on over the years?
  3. Movie and TV show duration
  4. What are the top 10 genres on Netflix?
  5. Find out more about the movie and TV ratings and group them by target audience, e.g. kids, young adults, teenagers, adults

Data Visualisation

  6. Which countries have contributed the most movies in recent years?
  7. What is the content release at Netflix like?
  8. What is the distribution of Netflix’s content by origin, or country?
  9. What type of content have they been focussing on over the years?
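
Two of the questions above, the top genres and the grouping of ratings by target audience, might be sketched like this. The sample rows are made up, and the AUDIENCE_BY_RATING mapping is an illustrative, partial assumption rather than an official classification; listed_in is assumed to hold comma-separated genres as in the Kaggle file.

```python
import pandas as pd

# Made-up sample rows using the Kaggle column names type, listed_in and rating.
titles = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "TV Show"],
    "listed_in": ["Dramas, Comedies", "Dramas", "Kids' TV", "TV Dramas, Crime TV Shows"],
    "rating": ["R", "PG-13", "TV-Y", "TV-MA"],
})

# Top genres: split the comma-separated genre string, one row per genre, then count.
genre_counts = (
    titles["listed_in"].str.split(", ").explode().value_counts().head(10)
)

# Assumed, partial mapping from rating to target audience (illustrative only).
AUDIENCE_BY_RATING = {
    "TV-Y": "Kids", "G": "Kids",
    "PG-13": "Teenagers", "TV-14": "Teenagers",
    "R": "Adults", "TV-MA": "Adults",
}
titles["audience"] = titles["rating"].map(AUDIENCE_BY_RATING).fillna("Unknown")

# Number of titles per content type and audience group.
audience_by_type = titles.groupby(["type", "audience"]).size()

print(genre_counts)
print(audience_by_type)
```

Either result can be passed to a bar plot, for example with seaborn or matplotlib.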

1. Data collection

Data Preparation

After downloading the dataset, I load it into a dataframe for the data cleaning process.
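
Loading the data could look something like the sketch below. It assumes the downloaded file is named netflix_titles.csv; so that the example runs on its own, a two-row inline sample with the same column names stands in for the real file, and load_titles is a hypothetical helper name.

```python
import io
import pandas as pd

# Tiny made-up sample standing in for the downloaded file
# (assumption: the real file is netflix_titles.csv with these columns).
SAMPLE_CSV = """show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
s1,Movie,Example Film,Jane Doe,"Actor A, Actor B",United States,"September 25, 2021",2020,PG-13,90 min,"Dramas, Comedies",A made-up description.
s2,TV Show,Example Series,,,,"July 1, 2019",2018,TV-MA,2 Seasons,TV Dramas,Another made-up description.
"""

def load_titles(source):
    """Read the CSV from a file path or file-like object into a DataFrame."""
    return pd.read_csv(source)

# With the real download this would be: df = load_titles("netflix_titles.csv")
df = load_titles(io.StringIO(SAMPLE_CSV))

print(df.shape)         # rows and columns
print(df.isna().sum())  # missing values per column, a first look before cleaning
```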

2. Data cleaning and processing

  1. Fill in the NaN values in the dataset, making sure there aren't any NULL values left so that the data is consistent. Columns with null values include:

    • rating
    • date_added
    • director
    • cast
    • country
    • duration
  2. Deleting redundant columns.

    • Handling invalid values in the date_added column: some values in date_added are earlier than the corresponding release_year (i.e. the title appears to have been added before it was released).
    • Drop such invalid rows to ensure data accuracy.
  3. Dropping duplicates.

  4. Cleaning individual columns.
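
The cleaning steps above can be sketched as one small function. This is a minimal sketch under assumptions: the sample rows are made up, "Unknown" is just one possible placeholder for missing text values, and clean_titles is a hypothetical helper, not the project's actual code.

```python
import pandas as pd

# Made-up sample rows; the column names follow the Kaggle dataset.
raw = pd.DataFrame({
    "title": ["A", "B", "B", "C"],
    "director": ["Jane Doe", None, None, "John Roe"],
    "country": ["India", "Japan", "Japan", None],
    "rating": ["PG", "TV-MA", "TV-MA", None],
    "date_added": ["March 1, 2020", " July 4, 2019", " July 4, 2019", "May 5, 2015"],
    "release_year": [2019, 2018, 2018, 2017],
})

def clean_titles(df):
    """Fill missing text values, drop impossible dates, drop duplicate rows."""
    df = df.copy()

    # 1. Fill missing text fields with a placeholder (assumed choice: "Unknown").
    for col in ["director", "country", "rating"]:
        df[col] = df[col].fillna("Unknown")

    # 2. Parse date_added, then drop rows added before their release year.
    #    Rows whose date could not be parsed are kept here.
    df["date_added"] = pd.to_datetime(
        df["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
    )
    valid = df["date_added"].isna() | (df["date_added"].dt.year >= df["release_year"])
    df = df[valid]

    # 3. Drop exact duplicate rows.
    return df.drop_duplicates().reset_index(drop=True)

cleaned = clean_titles(raw)
print(cleaned)
```

In the sample, row "C" is dropped because it was added in 2015 but released in 2017, and the repeated "B" row is removed as a duplicate.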

Data wrangling

Data Transformation

3. Exploratory analysis & Visualization

Links I've used for reference:

  1. https://www.analyticsvidhya.com/blog/2021/07/visualizing-netflix-data-using-python/
  2. https://jovian.com/janecww415/netflix-movies-and-shows-analysis
  3. https://www.kaggle.com/code/thiagopanini/insights-from-netflix-the-show-must-go-on/notebook ***
  4. https://www.dataquest.io/blog/comical-data-visualization-in-python-using-matplotlib/ ***
  5. https://www.kaggle.com/code/nikunjmalpani/netflix-movies-and-tv-shows-data-visualization
  6. https://jobymathew97.medium.com/netflix-movies-and-tv-shows-data-visualization-using-matplotlib-f1b4e91b5226
  7. https://www.nomidl.com/python/netflix-data-analysis-project-using-python/
  8. https://github.com/nataliafonseca/netflix-data-analysis/blob/main/notebook.ipynb
  9. https://app.datacamp.com/workspace/w/cc0a1d5f-0b59-4555-bc66-87d9dd3d5e96 ***
  10. https://medium.com/@linhvu.nt/data-analysis-and-recommendations-on-netflix-content-28707163553a ***
  11. https://jovian.com/astha1998/netflix-data-analysis-project
