e-tay / bida_twitter_crawler

A Twitter crawler that collects details of Twitter users and tweets containing certain keywords.

Jupyter Notebook 100.00%

Twitter Crawler

General Information

  • Given the handle of a Twitter user of interest, this crawler scrapes that user's profile and social network (i.e., friends and followers). It also scrapes the 50 most popular tweets containing the keywords "coronavirus" and/or "vaccination".
  • This project is an interim project assigned to learners as part of the Microsoft Business Intelligence and Data Analyst Course.
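The keyword search described above could be sketched roughly as follows. This is an illustration under assumptions, not the notebook's actual code: `build_query` and `fetch_popular_tweets` are hypothetical helper names, the use of `result_type="popular"` to get the "most popular" tweets is a guess, and `api.search` is the Tweepy 3.x method name (Tweepy 4.x renamed it `search_tweets`).

```python
from typing import Iterable, List


def build_query(keywords: Iterable[str], match_all: bool = False) -> str:
    """Join keywords into a Twitter standard-search query string.

    In the standard search operators a space means AND and the literal
    word OR means OR; quoting keeps each keyword intact.
    """
    quoted = [f'"{kw.strip()}"' for kw in keywords if kw.strip()]
    if not quoted:
        raise ValueError("at least one keyword is required")
    return (" " if match_all else " OR ").join(quoted)


def fetch_popular_tweets(api, keywords: Iterable[str], limit: int = 50) -> List[dict]:
    """Return up to `limit` matching tweets as plain dicts (hypothetical helper)."""
    # Imported lazily so build_query can be used without Tweepy installed.
    import tweepy

    cursor = tweepy.Cursor(
        api.search,             # Tweepy 3.x name; search_tweets in Tweepy 4.x
        q=build_query(keywords),
        result_type="popular",  # assumption: how "most popular" is requested
        tweet_mode="extended",  # expose the untruncated text as full_text
    )
    return [
        {
            "tweet_id": status.id,
            "author": status.user.screen_name,
            "text": status.full_text,
            "retweets": status.retweet_count,
            "likes": status.favorite_count,
        }
        for status in cursor.items(limit)
    ]
```

Calling `fetch_popular_tweets(api, ["coronavirus", "vaccination"])` with an authenticated `tweepy.API` object would request tweets matching either keyword; the search endpoint may return fewer than 50 results.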

Technologies Used

  • PostgreSQL 9.5 or higher
  • pgAdmin 4 5.1 or higher
  • Python 3.8 or higher

Python Packages Used

  • Tweepy 3.10.0
  • SQLAlchemy 1.3.19
  • Pandas 1.2.4
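The README does not describe how these packages fit together. One plausible arrangement, sketched here as an assumption, is to collect crawled rows in a Pandas DataFrame and append them to a PostgreSQL table through an SQLAlchemy engine. The helper names, the default database name `twitter`, and the `psycopg2` driver in the URL are all hypothetical and do not come from the project.

```python
from urllib.parse import quote


def build_postgres_url(user, password, host="localhost", port=5432, database="twitter"):
    """Build an SQLAlchemy URL for PostgreSQL (psycopg2 driver assumed).

    User name and password are percent-encoded so that characters such
    as '@' or '/' do not break URL parsing.
    """
    return (
        f"postgresql+psycopg2://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def save_rows(rows, table_name, db_url):
    """Append a list of dicts to a database table and return the row count."""
    # Imported lazily so the URL helper works without the heavier packages.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine(db_url)
    frame = pd.DataFrame(rows)
    # if_exists="append" creates the table on first use and adds rows afterwards.
    frame.to_sql(table_name, engine, if_exists="append", index=False)
    return len(frame)
```

For example, `save_rows(tweets, "tweets", build_postgres_url("crawler", "secret"))` would write a list of tweet dicts to a table named `tweets`, which could then be inspected in pgAdmin.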

Features

This crawler has the ability to:

  • find the Twitter details of a user of your choice
  • find out who the friends and followers of your chosen user are
  • extract tweets containing the keywords "coronavirus" and/or "vaccination"
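The first two features could look roughly like the sketch below. It is a minimal illustration, not the project's code: `user_to_row` and `fetch_profile_and_network` are hypothetical helpers, the choice of profile fields is an assumption, and `friends_ids` / `followers_ids` are the Tweepy 3.x method names (Tweepy 4.x renamed them `get_friend_ids` / `get_follower_ids`).

```python
def user_to_row(user) -> dict:
    """Flatten a Tweepy User object into a dict of selected profile fields."""
    return {
        "user_id": user.id,
        "screen_name": user.screen_name,
        "name": user.name,
        "followers_count": user.followers_count,
        "friends_count": user.friends_count,
    }


def fetch_profile_and_network(api, screen_name, max_ids=5000):
    """Return (profile, friend_ids, follower_ids) for one handle."""
    # Imported lazily so user_to_row can be used without Tweepy installed.
    import tweepy

    profile = user_to_row(api.get_user(screen_name=screen_name))
    # Cursor follows the pagination of the ID endpoints; items() caps the total.
    friend_ids = list(
        tweepy.Cursor(api.friends_ids, screen_name=screen_name).items(max_ids)
    )
    follower_ids = list(
        tweepy.Cursor(api.followers_ids, screen_name=screen_name).items(max_ids)
    )
    return profile, friend_ids, follower_ids
```

The ID endpoints are rate limited, so for accounts with large followings it helps to build the `api` object with `wait_on_rate_limit=True` and keep `max_ids` modest.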

Twitter API Setup

To crawl data through the Twitter API, the crawler needs the following credentials to authenticate with Twitter:

  • API Key
  • API Secret
  • Access Token
  • Access Token Secret

Here are the steps to obtain them:

  1. Create a Twitter account: https://twitter.com/i/flow/signup
  2. Apply for a developer account: https://developer.twitter.com/en/apply-for-access
  3. After your application is approved, create a Twitter Developer App.
  4. Navigate to the App you just created and open the "Keys and Tokens" tab, which shows your API Key and Secret as well as your Access Token and Secret.
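Once you have the four values, one way to pass them to Tweepy is sketched below. The environment variable names and both helper functions are assumptions made for illustration; the notebook may handle credentials differently. Reading them from the environment keeps the secrets out of the notebook itself.

```python
import os

# Hypothetical environment variable names, one per credential listed above.
REQUIRED_KEYS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None) -> dict:
    """Read the four credentials from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing Twitter credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


def build_api(creds: dict):
    """Create an authenticated tweepy.API client using OAuth 1a user context."""
    # Imported lazily so load_credentials works without Tweepy installed.
    import tweepy

    auth = tweepy.OAuthHandler(creds["TWITTER_API_KEY"], creds["TWITTER_API_SECRET"])
    auth.set_access_token(
        creds["TWITTER_ACCESS_TOKEN"], creds["TWITTER_ACCESS_TOKEN_SECRET"]
    )
    # wait_on_rate_limit makes Tweepy sleep instead of failing when limits are hit.
    return tweepy.API(auth, wait_on_rate_limit=True)
```

With the variables exported in your shell, `api = build_api(load_credentials())` yields a client that can be used for the crawling calls.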

Project Status

The project is complete. Future versions with more advanced capabilities are being considered.

Room for Improvement

  • Ability to pull at least 1,000 tweets per run
  • Allow the user to input keywords of their choice for tweet extraction
  • Sentiment analysis of tweets extracted

Acknowledgements

  • This project was inspired by and jointly developed with my amazing teammates Jacob, Tziqing, Weicheng and Yisheng.
  • Special thanks to my instructors Calvin and Nelson for their valuable input and guidance.
