fagan2888 / imdb-ratings-scraper

This project forked from jainanuj7/imdb-ratings-scraper

🎬🎦 IMDb scraper written in Python for dumping ratings, episode titles, total votes and air dates for all episodes of a TV show

License: MIT License

Python 100.00%

imdb-ratings-scraper's Introduction

IMDb Scraper in Python for TV Shows

This scraper can be used to generate the following data for any TV show on IMDb:

  1. Episode Title
  2. IMDb Rating
  3. Total Votes
  4. Air Date

How to run?

  1. Download the repository, or clone it with git clone https://github.com/jainanuj7/IMDb-ratings-scraper.git
  2. Sample usage of IMDB_Web_Scrape class:
# Create an object of the IMDB_Web_Scrape class, passing the IMDb TV show id and the number of seasons
# E.g. creating an object for The Office (US): https://www.imdb.com/title/tt0386676/
TVShow = IMDB_Web_Scrape("tt0386676", 9)

# pull_seasons() returns the resulting pandas DataFrame
dataset = TVShow.pull_seasons()

# Write dataset to csv
dataset.to_csv("results.csv")
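Once results.csv exists, it can be read back for a quick sanity check. The following is a minimal sketch using only the standard library, not part of the scraper itself. The column names ("Season", "IMDb Rating", and so on) and every row in SAMPLE_CSV are assumptions made up for illustration, so check the header of your own file before relying on them.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Placeholder data standing in for results.csv. The column names and every
# value below are made up for illustration; check your own file's header.
# The empty first header mirrors the index column that to_csv writes by default.
SAMPLE_CSV = """,Season,Episode Title,IMDb Rating,Total Votes,Air Date
0,1,Example Episode A,7.5,1000,2005-03-24
1,1,Example Episode B,8.5,3000,2005-03-29
2,2,Example Episode C,9.0,2000,2005-09-20
"""


def mean_rating_by_season(csv_file):
    """Map each season number to the mean rating of its episodes."""
    ratings = defaultdict(list)
    for row in csv.DictReader(csv_file):
        ratings[int(row["Season"])].append(float(row["IMDb Rating"]))
    return {season: mean(values) for season, values in sorted(ratings.items())}


averages = mean_rating_by_season(io.StringIO(SAMPLE_CSV))
print(averages)
```

To run this against a real file, pass open("results.csv", newline="") instead of the io.StringIO object, and adjust the column names to match.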

Why was this scraper developed?

To date, IMDb doesn't have an official API. Yes, there are many sophisticated alternatives like TMDb and OMDb, but:

  1. OMDb is not free anymore: an API key is required, and the free-tier key has a limited number of calls. And the premium API key... WAIT, who wants to pay for a personal project anyway?
  2. TMDb is a whole new database with no connection to IMDb.
  3. IMDb has some official dumps of all TV shows and movies, but I wasn't able to find ratings in them. Also, nothing is mentioned about when the datasets were last updated. Check out the official datasets at: https://www.imdb.com/interfaces/

Improvements in the scraper

Let me know in the 'Issues' section how this scraper can be further improved.

Office Fan?

Check out my Data Analysis of The Office (US) at https://github.com/jainanuj7/bears-beets-battlestar-analytica
