This dataset contains information about 10,000 movies collected from The Movie Database (TMDb). The aim of this analysis is to find out how revenue relates to numeric variables such as budget, popularity, and runtime, and how vote_count relates to profitability, measured here by revenue. The dataset was collected, reorganized, and made available through Kaggle.
Source: https://www.kaggle.com/tmdb/tmdb-movie-metadata
Conclusions
Based on the analysis and visualizations above, the revenue of movies in the dataset is highly correlated with budget and popularity, but much less so with runtime. The vote_count variable is also positively correlated with profitability. Because of the limited sample size, we cannot say that these results fully represent the population of all movies. In addition, because some of the variables are categorical, only descriptive statistics were used, not inferential statistics.
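As a rough illustration of how the correlations summarized here could be checked, the sketch below computes the Pearson correlation of revenue with each numeric variable using pandas. It is not necessarily the exact procedure used in this analysis. The column names are taken from the text; the helper name `revenue_correlations`, the choice of Pearson correlation, and dropping rows with missing values are assumptions made for the example.

```python
import pandas as pd

# Numeric variables named in the analysis; "revenue" is the target.
NUMERIC_COLS = ["budget", "popularity", "runtime", "vote_count"]


def revenue_correlations(df: pd.DataFrame) -> pd.Series:
    """Return the Pearson correlation of each numeric column with revenue.

    Hypothetical helper for illustration: rows with missing values in any
    of the relevant columns are dropped before computing correlations.
    """
    cols = NUMERIC_COLS + ["revenue"]
    subset = df[cols].dropna()
    # Full correlation matrix, then keep only the "revenue" column
    # and remove revenue's trivial correlation with itself.
    corr = subset.corr(method="pearson")["revenue"].drop("revenue")
    return corr.sort_values(ascending=False)
```

To use it, load the movies table into a DataFrame (for example with `pd.read_csv` on the downloaded file) and call `revenue_correlations(df)`. The result is a Series indexed by variable name, sorted from the strongest positive correlation downward.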