Data Visualization of IMDb Movies
Archana Rao
Ironhack | Amsterdam | Data Analytics Bootcamp - 26-09-2020
- Project Description
- Questions & Hypotheses
- Dataset
- Database
- Workflow
- Organization
- Links
This project analyzes trends and other factors behind movies listed on the popular website IMDb (Internet Movie Database). With thousands of movies produced and released each year and audience tastes constantly changing, a movie's success is hard to predict. The investigation focuses mainly on movie genres, though the influence of the IMDb score is also analyzed.
- Trends of different genres over the years: What kind of movies does the audience like? Are the genres with high IMDb scores also highly profitable?
- What does a good IMDb score indicate about a movie? What other factors are related to it?
The dataset is taken from Kaggle, which in turn sourced it from the IMDb database itself. While there are many movie datasets to choose from besides IMDb, this one has information on movies released from 1916 to 2016.
- Choose an interesting topic and gather a dataset with appropriate details and file format.
- Understand the data and come up with problem statements.
- Perform data wrangling/cleaning.
- Update the .ipynb / Python file with the interpretation, using Pandas/Python for the analysis and Seaborn and Matplotlib plots for the visualization.
- Present the data insights as a story in a slide deck, and evaluate how the facts deviate from the hypotheses, with proper reasoning.
- Create a README file that describes the project.
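The cleaning and genre-level comparison steps above could look roughly like the following. This is only a minimal sketch, not the project's actual code: the rows are made up for illustration, and the column names (`genres`, `imdb_score`, `budget`, `gross`), the `|` separator between genres, and the helper `genre_summary` are all assumptions rather than details taken from the project files.

```python
import pandas as pd

# Hypothetical rows for illustration only; the column names are assumed.
movies = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D"],
        "genres": ["Action|Drama", "Comedy", "Drama", None],
        "imdb_score": [7.0, 6.0, 8.0, 5.5],
        "budget": [10.0, 5.0, 2.0, 8.0],
        "gross": [30.0, 4.0, 12.0, 9.0],
    }
)


def genre_summary(df):
    """Return the mean IMDb score and mean profit per genre."""
    # Drop rows missing any field needed for the comparison.
    clean = df.dropna(subset=["genres", "imdb_score", "budget", "gross"]).copy()
    clean["profit"] = clean["gross"] - clean["budget"]
    # One row per (movie, genre), so a multi-genre movie counts toward each genre.
    clean["genre"] = clean["genres"].str.split("|")
    exploded = clean.explode("genre")
    return (
        exploded.groupby("genre")[["imdb_score", "profit"]]
        .mean()
        .sort_values("imdb_score", ascending=False)
    )


summary = genre_summary(movies)
print(summary)
```

A table like `summary` puts score and profit side by side per genre, which is one way to check whether highly rated genres are also the profitable ones. It could then be plotted, for example with Seaborn's `barplot` on `summary.reset_index()`.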
No separate Git repository was created for this project. The main deliverables are the Python file, an updated README, and the presentation (PPT). The .py / .ipynb files containing the analysis are pushed to the individual project folder of the class repository. Only the README file is available in the data folder. Raw files, image files, the data file (dataset), and the presentation file (.ppt) are uploaded to the Google Drive folder.
- Presentation: https://drive.google.com/drive/folders/1RoaPxJ66y_ZHWJpA9vJjeauQ0uxfo3J5
- Raw files (CSV file / dataset): https://drive.google.com/drive/folders/1N3u6K0tNsq2VbfoL3BSYDqfFJTjM0223