My basic goal for this mini-project was to do some data mining on video game data - a subject I enjoy greatly.
For this project, I built my own scraper and pulled data from the video game section of metacritic.com. metacritic.csv is the scraper's output and contains roughly 13,000 observations of video games released on platforms from the PS4 all the way back to the Nintendo 64.
I wanted to accomplish a few things with these notebooks:
- Test out some data processing/cleanup
- Refresh my memory on some unsupervised learning (KMeans clustering)
- Refresh my memory on some supervised learning (linear regression)
- Finally, I wanted to get reacquainted with Python classes and methods
metacriticReviews2.ipynb is more of an exploratory notebook that accomplishes points 1, 2 and 3 above. Here I practice a bunch of pandas and sklearn functions and inspect the data often. There are a lot of ad hoc calls to look at the data, and I build a few graphs and a KMeans clustering model. Go here to see some of my thought process and how I looked at the data.
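As a rough illustration of the kind of clustering step explored in that notebook, here is a minimal sketch. The column names (metascore, user_score), the tiny inline data, and the choice of 3 clusters are all assumptions for illustration; they are not taken from metacritic.csv or from the notebook itself.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for a few rows of scraped data; the real columns may differ.
games = pd.DataFrame({
    "metascore": [95, 91, 88, 52, 48, 45, 72, 70],
    "user_score": [9.1, 8.8, 8.5, 4.9, 4.2, 5.0, 7.1, 6.8],
})

# Scale the features so both scores contribute comparably to the distances.
features = StandardScaler().fit_transform(games[["metascore", "user_score"]])

# Fit KMeans with an arbitrary k=3; random_state makes the run repeatable.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
games["cluster"] = kmeans.fit_predict(features)

# Summarize each cluster by its average scores.
print(games.groupby("cluster")[["metascore", "user_score"]].mean())
```

On real data the number of clusters would need to be chosen more carefully, for example by comparing inertia across several values of k.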
metacriticAnalysis.ipynb is a more professional-looking notebook. This is what I would have done in school or in some of my past work to create a clean program to process and analyze any new data. Technically, this accomplishes all 4 points above; I just don't make calls like .head() to check that my code is working.
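To give a feel for what wrapping the cleanup and regression steps in a class can look like, here is a small sketch. The class name GameAnalysis, its methods, the column names, and the "tbd" placeholder value are hypothetical and chosen for illustration only; this is not the code from metacriticAnalysis.ipynb.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression


class GameAnalysis:
    """Hypothetical wrapper around cleanup and a simple linear regression."""

    def __init__(self, frame: pd.DataFrame):
        self.frame = frame.copy()
        self.model = LinearRegression()

    def clean(self) -> "GameAnalysis":
        # Coerce non-numeric placeholders (assumed here to be "tbd") to NaN,
        # then drop the rows that are missing either score.
        for col in ("metascore", "user_score"):
            self.frame[col] = pd.to_numeric(self.frame[col], errors="coerce")
        self.frame = self.frame.dropna(subset=["metascore", "user_score"])
        return self

    def fit(self) -> "GameAnalysis":
        # Regress user score on critic score.
        self.model.fit(self.frame[["metascore"]], self.frame["user_score"])
        return self

    def predict_user_score(self, metascore: float) -> float:
        query = pd.DataFrame({"metascore": [metascore]})
        return float(self.model.predict(query)[0])


# Made-up rows, including one unusable user score.
raw = pd.DataFrame({
    "metascore": [90, 80, 70, 60, 50],
    "user_score": ["9.0", "8.0", "tbd", "6.0", "5.0"],
})
analysis = GameAnalysis(raw).clean().fit()
print(round(analysis.predict_user_score(75), 2))
```

Returning self from clean() and fit() lets the steps be chained, so new data can be processed in one line without intermediate inspection calls.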